Re: [HACKERS] Word-smithing doc changes

2011-11-30 Thread Greg Smith

Excerpts from Greg Stark's message of sáb jun 25 21:01:36 -0400 2011:

I think this commit was ill-advised:
http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=a03feb9354bda5084f19cc952bc52ba7be89f372
Seems way to implementation-specific and detailed for a user to make
heads or tails of. Except in the sections talking about locking
internals we don't talk about shared locks on virtual transactions
identifiers we just talk about waiting for a transaction to complete.


Looks like I missed this when it passed by before, and looks like Greg 
Stark may have missed the message on pgsql-docs that kicked this all 
off:  
http://archives.postgresql.org/message-id/4ddb64cb.7070...@2ndquadrant.com


I will happily accept that the description there may have suffered from 
me not using all of the terms optimally, and that the resulting commit 
could be improved.  Some more feedback to get the description correct 
and useful would be much appreciated.


What I cannot agree with is that idea that the implementation details I 
suggested documenting should not be.  There are extremely user-hostile 
things that can happen here, and that are unique to this command.  
Saying this is too complicated for users to make heads or tails of may 
very well be true in many cases, but I think it's not giving PostgreSQL 
users very much credit.  And when problems with this happen, and I 
wouldn't have spent any time on this if they didn't, right now the only 
way to make heads or tails of it is to read the source code.


If the code was simple, quick, and had no failure modes, it would be 
fine to not describe it.  This is complicated, the run time cannot be 
bounded, and it can ripple to nasty lock queue issues--at some 
impossible to predict future time, long after you start the creation.  I 
don't have a good idea how to unload the potential foot gun.  The best I 
could think of after being shot with it was describing how it fires.



And looping over the transactions one by one is purely an
implementation detail and uninteresting to users.


That particular suggestion came from me having a painful session I 
didn't want anyone else to ever go through again.  By the end of that, 
this implementation detail felt like the most important missing piece of 
PostgreSQL documentation in the world to me--I'm too busy to send in doc 
patches describing things that I haven't been shot by.


To provide some more context, the server I ran into this on always has 
at least 2 reports that take 10 to 16 hours to run active.  I happily 
kicked off a concurrent index build on a heavily bloated 1GB table whose 
indexes are typically 5GB, which I expect to take a few minutes given 
the small size involved.  Six hours later, when I come back and discover 
it's still not done, I find a single lock waiting for a transaction to 
finish.Since I'm used to multiple locks forming into a tree 
structure, and I only see one, I expect I'm OK once that's done.  Fine; 
I estimate how much time that report has left and leave for a bit.


Four hours later, I come back.  That original transaction lock is gone.  
Now it's created a new one I didn't expect, moving onto the second 
oldest report active.  I actually have six more hours to go still.  This 
locking pattern is unique to this command, and if I'd had the slightest 
idea that it worked this way I'd have approached the rebuild differently.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronous commit vs. hint bits

2011-11-30 Thread YAMAMOTO Takashi
hi,

 On Tue, Nov 29, 2011 at 1:42 AM, YAMAMOTO Takashi
 y...@mwd.biglobe.ne.jp wrote:
 On Mon, Nov 7, 2011 at 5:26 PM, Simon Riggs si...@2ndquadrant.com wrote:

 5. Make the WAL writer more responsive, maybe using latches, so that
 it doesn't take as long for the commit record to make it out to disk.

 I'm working on this already as part of the update for power
 reduction/group commit/replication performance.

 I extracted this from my current patch for you to test.

 is it expected to produce more frequent fsync?
 
 Yes, I would expect that.  What kind of increase are you seeing?  Is
 it causing a problem for you, or are you just making an observation?

i was curious because my application uses async commits mainly to
avoid frequent fsync.  i have no numbers right now.

YAMAMOTO Takashi

 
 -- 
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] GIN internal query-plan alternatives.

2011-11-30 Thread jesper
Hi

I have a feeling that the internal query-plans (or alternative) 
query-plans when executing GIN-searches are not being exhausted as much as
they generally are in PG.

More specifically a query like:

select id from table where fts @@ to_tsquery('english','verycommon 
veryrare');

Can do an search on the very rare and postfilter it on the
very-common keyword. I had problems trying to force the query-planner into
executing a query that forced that behavior, but here is my attempt:


2011-11-30 11:33:41.010 testdb=# explain analyze select id from
testdb.testtable where id in (select id from testdb.testtable where fts @@
pptsquery('veryrare')) and fts @@ pptsquery('verycommon') order by id desc
limit 300;
QUERY
PLAN
























--
 Limit  (cost=14203.33..14204.08 rows=300 width=4) (actual
time=62.561..62.567 rows=14 loops=1)
   -  Sort  (cost=14203.33..14206.91 rows=1430 width=4) (actual
time=62.561..62.566 rows=14 loops=1)
 Sort Key: testdb.testtable.id
 Sort Method:  quicksort  Memory: 25kB
 -  Nested Loop  (cost=4799.17..14137.35 rows=1430 width=4)
(actual time=11.225..62.530 rows=14 loops=1)
   -  HashAggregate  (cost=4799.17..4813.47 rows=1430
width=4) (actual time=6.792..7.941 rows=2409 loops=1)
 -  Bitmap Heap Scan on testtable
(cost=1943.29..4795.59 rows=1430 width=4) (actual
time=0.962..5.174 rows=2409 loops=1)
   Recheck Cond: (fts @@ '''veryrare'''::tsquery)
-  Bitmap Index Scan on testtable_gin_idx
(cost=0.00..1942.93 rows=1430 width=0) (actual
time=0.635..0.635 rows=2419 loops=1)
 Index Cond: (fts @@
'''veryrare'''::tsquery)
   -  Index Scan using testtable_pkey on testtable
(cost=0.00..6.51 rows=1 width=4) (actual time=0.022..0.022
rows=0 loops=2409)
 Index Cond: (testdb.testtable.id =
testdb.testtable.id)
 Filter: (testdb.testtable.fts @@
'''verycommon'''::tsquery)
 Total runtime: 62.679 ms
(14 rows)

Time: 125.899 ms
2011-11-30 11:40:59.673 testdb=# explain analyze select id from
testdb.testtable where fts @@ pptsquery('verycommon veryrare') order by id
desc limit 300;
   QUERY PLAN

-
 Limit  (cost=2522.12..2522.87 rows=300 width=4) (actual
time=1282.967..1282.972 rows=14 loops=1)
   -  Sort  (cost=2522.12..2523.88 rows=704 width=4) (actual
time=1282.965..1282.968 rows=14 loops=1)
 Sort Key: id
 Sort Method:  quicksort  Memory: 25kB
 -  Bitmap Heap Scan on testtable  (cost=1081.67..2489.63
rows=704 width=4) (actual time=1282.902..1282.948 rows=14
loops=1)
   Recheck Cond: (fts @@ '''verycommon'' 
''veryrare'''::tsquery)
   -  Bitmap Index Scan on testtable_gin_idx
(cost=0.00..1081.49 rows=704 width=0) (actual
time=1282.880..1282.880 rows=17 loops=1)
 Index Cond: (fts @@ '''verycommon'' 
''veryrare'''::tsquery)
 Total runtime: 1283.274 ms
(9 rows)

Time: 1300.587 ms
2011-11-30 11:41:13.217 testdb=#

This may of-course not always be the optimal query-plan, but in this
situation the alternative plan is roughly 20-times better.
(both queries cached).

The in-clause is not the best way, a regular filter on the results of 
the veryrare-term would be natural, but the query-planner cleverly
collapses that to be the exact same thing.

This can be stressed by just adding the same common keyword several time
to the GIN-search, where the query, even producing the same results gets
slower and slower:

2011-11-30 11:51:10.239 testdb=# select count(id) from testdb.testtable
where id in (select id from testdb.testtable where fts @@
pptsquery('veryrare')) and fts @@ pptsquery('verycommon verycommon
verycommon');
 count
---
14
(1 row)


Time: 90.389 ms
2011-11-30 11:51:16.777 testdb=# select count(id) from testdb.testtable
where fts @@ to_tsquery('english','veryrare  verycommon  verycommon 
verycommon');
 count
---
14
(1 row)

Time: 2207.125 ms

PG does a lot of query-rewriting and testing different query-plans, there
seems to be room for improvements here. I'll craft a test-dataset
where it can be reproduced. Then someone with internal PG knowledge can
tell me if it just isn't implemented or I might have some configuration 
option that prevents my system from doing it correctly.

-- 
Jesper





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Java LISTEN/NOTIFY client library work-around

2011-11-30 Thread Joel Jacobson
Hi,

As you know, LISTEN/NOTIFY is broken in the Java client library. You have
to do a SELECT 1 in a while-loop to receive the notifications.

http://jdbc.postgresql.org/documentation/head/listennotify.html

Is there some other library with a proper implementation where you don't
have to spam the database with queries to get the notifications?

If not, my company is willing to sponsor development of a patch fixing this
problem.

-- 
Best regards,

Joel Jacobson


Re: [HACKERS] Java LISTEN/NOTIFY client library work-around

2011-11-30 Thread Jan Urbański

On 30/11/11 13:07, Joel Jacobson wrote:

Hi,

As you know, LISTEN/NOTIFY is broken in the Java client library. You have
to do a SELECT 1 in a while-loop to receive the notifications.

http://jdbc.postgresql.org/documentation/head/listennotify.html

Is there some other library with a proper implementation where you don't
have to spam the database with queries to get the notifications?


You mean a Java library? If Java is not a requirement, the psycopg2 
Python library supports notifies well.


Cheers,
Jan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronous commit vs. hint bits

2011-11-30 Thread Robert Haas
On Wed, Nov 30, 2011 at 1:37 AM, YAMAMOTO Takashi
y...@mwd.biglobe.ne.jp wrote:
 Yes, I would expect that.  What kind of increase are you seeing?  Is
 it causing a problem for you, or are you just making an observation?

 i was curious because my application uses async commits mainly to
 avoid frequent fsync.  i have no numbers right now.

Oh, that's interesting.  Why do you want to avoid frequent fsyncs?  I
thought the point of synchronous_commit=off was to move the fsyncs to
the background, but not necessarily to decrease the frequency.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Word-smithing doc changes

2011-11-30 Thread Robert Haas
On Wed, Nov 30, 2011 at 3:02 AM, Greg Smith g...@2ndquadrant.com wrote:
 I will happily accept that the description there may have suffered from me
 not using all of the terms optimally, and that the resulting commit could be
 improved.  Some more feedback to get the description correct and useful
 would be much appreciated.

 What I cannot agree with is that idea that the implementation details I
 suggested documenting should not be.  There are extremely user-hostile
 things that can happen here, and that are unique to this command.  Saying
 this is too complicated for users to make heads or tails of may very well
 be true in many cases, but I think it's not giving PostgreSQL users very
 much credit.  And when problems with this happen, and I wouldn't have spent
 any time on this if they didn't, right now the only way to make heads or
 tails of it is to read the source code.

+1.

If we only document approximately how it works, then that's less work,
but it's also less useful.  Greg's attempt to document *exactly* how
it works was kind of klunky, but I think that can and should be
improved, not replaced with wording that's more vague and therefore
easier to write.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Java LISTEN/NOTIFY client library work-around

2011-11-30 Thread Andrew Dunstan



On 11/30/2011 07:27 AM, Jan Urbański wrote:

On 30/11/11 13:07, Joel Jacobson wrote:

Hi,

As you know, LISTEN/NOTIFY is broken in the Java client library. You 
have

to do a SELECT 1 in a while-loop to receive the notifications.

http://jdbc.postgresql.org/documentation/head/listennotify.html

Is there some other library with a proper implementation where you don't
have to spam the database with queries to get the notifications?


You mean a Java library? If Java is not a requirement, the psycopg2 
Python library supports notifies well.





... and probably most libraries that (unlike JDBC) are libpq-based, like 
the perl DBD::Pg.



cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reserved words and delimited identifiers

2011-11-30 Thread Alvaro Herrera

Excerpts from Joe Abbate's message of mié nov 30 02:15:09 -0300 2011:

 Thanks Tom and Robert.  I think I understand the problem now.  I guess
 I'll have to work around this quirk by dealing specially with type
 names and not quote them when they're in the shorter list of SQL
 Standard reserved words.

I wonder if it would simpler to just not quote type names except when
absolutely necessary.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reserved words and delimited identifiers

2011-11-30 Thread Andrew Dunstan



On 11/30/2011 09:02 AM, Alvaro Herrera wrote:

Excerpts from Joe Abbate's message of mié nov 30 02:15:09 -0300 2011:


Thanks Tom and Robert.  I think I understand the problem now.  I guess
I'll have to work around this quirk by dealing specially with type
names and not quote them when they're in the shorter list of SQL
Standard reserved words.

I wonder if it would simpler to just not quote type names except when
absolutely necessary.



Yeah, and very much less ugly. Ploughing through masses of unnecessary 
quotes is they way to a headache. quote_ident() gets this right.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reserved words and delimited identifiers

2011-11-30 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 11/30/2011 09:02 AM, Alvaro Herrera wrote:
 I wonder if it would simpler to just not quote type names except when
 absolutely necessary.

 Yeah, and very much less ugly. Ploughing through masses of unnecessary 
 quotes is they way to a headache. quote_ident() gets this right.

But, per this discussion, you can't just blindly apply quote_ident to a
type name, because it may not *be* an identifier.

One possible solution, if you're getting type information about columns
from the server, is to cast the type OID to regtype and let the regtype
output converter make all the decisions.  It's less notation than a join
to pg_type anyway.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: CHECK FUNCTION statement

2011-11-30 Thread Albe Laurenz
Pavel Stehule wrote:
 updated patch:
 
 * recheck compilation and initdb
 * working routines moved to pl_exec.c
 * add entry to catalog.sgml about lanchecker field
 * add node's utils

Documentation:
--

This patch has no documentation for CHECK FUNCTION or CHECK TRIGGER.
The last patch had at least something.
\h check function in psql does not show anything.

The patch should also add documentation about the handler function
to plhandler.sgml. In particular, the difference between the
validator function and the check function should be pointed out.

Usability:
--

Do I understand right that the reason why the check function is
different from the validator function is that it would be more difficult
to add the checks to the validator function?

Is that a good enough argument? From a user's perspective it is
difficult to see why some checks are performed at function creation
time, while others have to be explicitly checked with CHECK FUNCTION.
I think it would be much more intuitive if CHECK FUNCTION does
the same as function validation with check_function_bodies on.

Submission, Compilation, Regression tests:
--

The patch applies and compiles fine and passes regression tests.
The tests cover the functionality.
initdb succeeds.

pg_dump:


pg_dump support does not work right.
If I create a language like this:

CREATE LANGUAGE mylang HANDLER plpgsql_call_handler
   INLINE plpgsql_inline_handler
   VALIDATOR plpgsql_validator
   CHECK plpgsql_checker;
the dump will contain:
CREATE OR REPLACE PROCEDURAL LANGUAGE mylang;

This is not a problem of this patch though (same in 9.1);
it seems that pg_dump support for languages without definition
in pg_pltemplate is broken in general.

Feature test:
-

CHECK FUNCTION and CHECK TRIGGER work, I couldn't crash it.

Error messages could be better:
CHECK TRIGGER atrigger;
ERROR:  syntax error at or near ;
LINE 1: CHECK TRIGGER atrigger;
  ^
Something like expected keyword 'ON' might be nice.


There are a lot of things that CHECK FUNCTION does not check, some
examples:

1)

CREATE OR REPLACE FUNCTION t(i integer) RETURNS integer
  LANGUAGE plpgsql STRICT AS
  $$DECLARE j integer;
BEGIN
  IF i=1 THEN
FOR I IN 1..4 BY -1 LOOP
   RAISE NOTICE '%', i;
END LOOP;
RETURN -1;
  ELSE
RETURN 2*i;
  END IF;
END;$$;

CHECK FUNCTION t(integer); -- no error

SELECT t(1);
ERROR:  BY value of FOR loop must be greater than zero
CONTEXT:  PL/pgSQL function t line 4 at FOR with integer loop variable

2)

CREATE OR REPLACE FUNCTION t(i integer) RETURNS integer
  LANGUAGE plpgsql STRICT AS
  $$DECLARE j integer;
BEGIN
  IF i=1 THEN
j=999;
RETURN j;
  ELSE
RETURN 2*i;
  END IF;
END;$$;

CHECK FUNCTION t(integer); -- no error

SELECT t(1);
ERROR:  value 999 is out of range for type integer
CONTEXT:  PL/pgSQL function t line 4 at assignment

3)

CREATE OR REPLACE FUNCTION t(i integer) RETURNS integer
  LANGUAGE plpgsql STRICT AS
  $$DECLARE j atable;
BEGIN IF i=1 THEN
  j=12;
  RETURN j;
ELSE
  RETURN 2*i;
END IF;
  END;$$;

CHECK FUNCTION t(integer); -- no error

SELECT t(1);
ERROR:  cannot assign non-composite value to a row variable
CONTEXT:  PL/pgSQL function t line 3 at assignment

4)

CREATE TABLE atable(
  id integer PRIMARY KEY,
  val text NOT NULL
);

INSERT INTO atable VALUES (1, 'eins');

CREATE OR REPLACE FUNCTION atrigger() RETURNS trigger
  LANGUAGE plpgsql STRICT AS
  $$BEGIN
  NEW.id=22;
  RETURN NEW;
END;$$;

CREATE TRIGGER atrigger AFTER DELETE ON atable FOR EACH ROW
  EXECUTE PROCEDURE atrigger();

CHECK TRIGGER atrigger ON atable; -- no error
NOTICE:  checking function atrigger()

DELETE FROM atable;
ERROR:  record new is not assigned yet
DETAIL:  The tuple structure of a not-yet-assigned record is indeterminate.
CONTEXT:  PL/pgSQL function atrigger line 2 at assignment


I can try and come up with more if desired.

Maybe case 2) and 4) cannot reasonably be covered.

It is probably very hard to check everything possible, but the
usefulness of CHECK FUNCTION is proportional to the number of
cases covered.


I'll mark the patch as Waiting on Author until there is more
documentation, I understand the answers to the questions
raised in usability above, and until it is agreed that the things
checked are sufficient.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] %TYPE and array declaration patch review

2011-11-30 Thread Pavel Stehule
Hello

2011/11/28 Greg Smith g...@2ndquadrant.com:
 I'm trying to find someone for the [PL/pgSQL] %TYPE and array declaration -
 second patch patch submitted recently:
  https://commitfest.postgresql.org/action/patch_view?id=666

 Not too many people work on the PL/pgSQL code, and I see you reviewed an
 earlier version of this patch.  Do you think you could find time to review
 the update to it as well?


This patch is not applyed cleanly now

bash-4.2$ patch -p1  type_array.patch
patching file doc/src/sgml/plpgsql.sgml
patching file src/pl/plpgsql/src/gram.y
Hunk #5 succeeded at 2540 (offset -12 lines).
Hunk #6 succeeded at 2554 (offset -12 lines).
Hunk #7 succeeded at 2595 (offset -12 lines).
patching file src/pl/plpgsql/src/pl_comp.c
Hunk #1 succeeded at 1586 (offset 2 lines).
Hunk #2 succeeded at 1883 (offset 2 lines).
Hunk #3 FAILED at 1901.
Hunk #4 succeeded at 2034 (offset 3 lines).
1 out of 4 hunks FAILED -- saving rejects to file
src/pl/plpgsql/src/pl_comp.c.rej
patching file src/pl/plpgsql/src/plpgsql.h
Hunk #2 succeeded at 895 (offset 8 lines).
patching file src/test/regress/expected/plpgsql.out
patching file src/test/regress/sql/plpgsql.sql


I dislike using macros without parameters

+#define word1 strVal(linitial(idents))
+#define word2 strVal(lsecond(idents))
+#define word3 strVal(lthird(idents))

and
-   nse = plpgsql_ns_lookup(plpgsql_ns_top(), false,
-
strVal(linitial(idents)),
-
strVal(lsecond(idents)),
-   NULL,
+   var = (PLpgSQL_var *) plpgsql_get_variable2(
+   word1,
+   word2,
+
 PLPGSQL_NSTYPE_VAR,
+   NULL);

This change is useless, and smudges a code - a list operations are
well known and is not neccessary hide it.

macros

#define linitial_str(lc) strVal(linitial(lc))
#define lsecond_str(lc) strVal(lsecond(lc))
#define lthird_str(lc) strVal(lthird(lc))

these macros should be defined only once per module - #undef is not
used usually in pg source code, don't use it in this case

Regress tests are really large - it is question if about 900 lines is
necessary - should be more compact

Regards

Pavel





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: CHECK FUNCTION statement

2011-11-30 Thread Tom Lane
Albe Laurenz laurenz.a...@wien.gv.at writes:
 Do I understand right that the reason why the check function is
 different from the validator function is that it would be more difficult
 to add the checks to the validator function?

 Is that a good enough argument? From a user's perspective it is
 difficult to see why some checks are performed at function creation
 time, while others have to be explicitly checked with CHECK FUNCTION.
 I think it would be much more intuitive if CHECK FUNCTION does
 the same as function validation with check_function_bodies on.

I think the important point here is that we need to support more than
one level of validation, and that the higher levels can't really be
applied by default in CREATE FUNCTION because they may fail on perfectly
valid code.

The real reason why we need a separate check function is that the API
for validators doesn't include any parameter about check level.

It's conceivable that instead of adding a check-function entry point,
we could generalizee check_function_bodies into a more-than-2-level
setting, and expect validators to pay attention to the GUC's value
when deciding how aggressively to validate.  However, it's far from
clear to me that that's a more usable definition than having a separate
CHECK FUNCTION command.  An example of where a separate CHECK command
could come in handy is: you just did some ALTER TABLE commands, and now
you would like to check if your functions all still work with the
modified table schemas.

 [ snip examples of some additional checks that could be made ]
 It is probably very hard to check everything possible, but the
 usefulness of CHECK FUNCTION is proportional to the number of
 cases covered.

I don't think that the initial patch needs to check everything that
could conceivably be checked.  We can always add more checking later.
The initial patch obviously has to create the infrastructure for
optional checking, and the specific check that Pavel wants to add
is to run parse analysis on each SQL statement in a plpgsql function.
That seems to me to be a well-defined and useful check, so I think the
scope of the patch is entirely adequate for now.

A bigger issue is that once you think about more than one kind of check,
it becomes apparent that we might need some user-specifiable options to
control which checks are applied.  And I see no provision for that here.
This is not something we can add later, at least not without breaking
the API for the check function --- and if we're willing to break API,
why not just add some more parameters to the validator and avoid having
a second function?

On the whole, it might not be a bad idea to have two allowed signatures
for the validator function, rather than inventing an additional column
in pg_language.  But the fundamental point IMHO is that there needs to
be a provision to pass language-dependent validation options to the
function, whether it's the existing validator or a separate checker
entry point.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: CHECK FUNCTION statement

2011-11-30 Thread Pavel Stehule
Hello


 CREATE OR REPLACE FUNCTION t(i integer) RETURNS integer
  LANGUAGE plpgsql STRICT AS
  $$DECLARE j integer;
    BEGIN
      IF i=1 THEN
        FOR I IN 1..4 BY -1 LOOP
           RAISE NOTICE '%', i;
        END LOOP;
        RETURN -1;
      ELSE
        RETURN 2*i;
      END IF;
    END;$$;

 CHECK FUNCTION t(integer); -- no error

 SELECT t(1);
 ERROR:  BY value of FOR loop must be greater than zero
 CONTEXT:  PL/pgSQL function t line 4 at FOR with integer loop variable

 2)

 CREATE OR REPLACE FUNCTION t(i integer) RETURNS integer
  LANGUAGE plpgsql STRICT AS
  $$DECLARE j integer;
    BEGIN
      IF i=1 THEN
        j=999;
        RETURN j;
      ELSE
        RETURN 2*i;
      END IF;
    END;$$;

 CHECK FUNCTION t(integer); -- no error

 SELECT t(1);
 ERROR:  value 999 is out of range for type integer
 CONTEXT:  PL/pgSQL function t line 4 at assignment


This kind of check are little bit difficult. It is solveable but I
would to have a skelet in core, and then this skelet can be enhanced
step by step.

Where is problem? PL/pgSQL usually don't work with numeric constant.
Almost all numbers are expressions - and checking function ensure only
semantic validity of expression, but don't try to evaluate expression.
So isn't possible to check runtime errors now.

Regards

Pavel

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: CHECK FUNCTION statement

2011-11-30 Thread Robert Haas
On Wed, Nov 30, 2011 at 10:53 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 On the whole, it might not be a bad idea to have two allowed signatures
 for the validator function, rather than inventing an additional column
 in pg_language.  But the fundamental point IMHO is that there needs to
 be a provision to pass language-dependent validation options to the
 function, whether it's the existing validator or a separate checker
 entry point.

Something like:

CHECK FUNCTION proname(proargs) WITH (...fdw-style elastic options...)

?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: CHECK FUNCTION statement

2011-11-30 Thread Alvaro Herrera

Excerpts from Tom Lane's message of mié nov 30 12:53:42 -0300 2011:

 A bigger issue is that once you think about more than one kind of check,
 it becomes apparent that we might need some user-specifiable options to
 control which checks are applied.  And I see no provision for that here.
 This is not something we can add later, at least not without breaking
 the API for the check function --- and if we're willing to break API,
 why not just add some more parameters to the validator and avoid having
 a second function?

How about

CHECK (parse, names=off) FUNCTION foobar(a, b, c)

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review of VS 2010 support patches

2011-11-30 Thread Andrew Dunstan

On 11/29/2011 10:01 AM, Andrew Dunstan wrote:



I don't have a VS2010 machine available to test it on unfortunately. 
I'll see what I can do about arranging one, at least temporarily. 
Meanwhile I'll test it on my VS2005 and VS2008 machines to make sure 
it doesn't break anything.





I can confirm that it does work on my 2005 and 2008 platforms.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: CHECK FUNCTION statement

2011-11-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Nov 30, 2011 at 10:53 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 On the whole, it might not be a bad idea to have two allowed signatures
 for the validator function, rather than inventing an additional column
 in pg_language.  But the fundamental point IMHO is that there needs to
 be a provision to pass language-dependent validation options to the
 function, whether it's the existing validator or a separate checker
 entry point.

 Something like:
 CHECK FUNCTION proname(proargs) WITH (...fdw-style elastic options...)

Great minds think alike ... that was pretty much exactly the syntax that
was in the back of my mind.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: CHECK FUNCTION statement

2011-11-30 Thread Pavel Stehule
2011/11/30 Alvaro Herrera alvhe...@commandprompt.com:

 Excerpts from Tom Lane's message of mié nov 30 12:53:42 -0300 2011:

 A bigger issue is that once you think about more than one kind of check,
 it becomes apparent that we might need some user-specifiable options to
 control which checks are applied.  And I see no provision for that here.
 This is not something we can add later, at least not without breaking
 the API for the check function --- and if we're willing to break API,
 why not just add some more parameters to the validator and avoid having
 a second function?

 How about

 CHECK (parse, names=off) FUNCTION foobar(a, b, c)

this syntax is relative consistent with EXPLAIN, is it ok for all?

Pavel




 --
 Álvaro Herrera alvhe...@commandprompt.com
 The PostgreSQL Company - Command Prompt, Inc.
 PostgreSQL Replication, Consulting, Custom Development, 24x7 support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add minor version to v3 protocol to allow changes without breaking backwards compatibility

2011-11-30 Thread Robert Haas
On Mon, Nov 28, 2011 at 10:18 AM, Mikko Tiihonen
mikko.tiiho...@nitorcreations.com wrote:
 Here is a working patch that exports a supported_binary_minor constant and a
 binary_minor session variable and a that can be used by clients to enable
 newer features.

Can you add this patch here so we don't lose track of it?

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: CHECK FUNCTION statement

2011-11-30 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 2011/11/30 Alvaro Herrera alvhe...@commandprompt.com:
 How about
 CHECK (parse, names=off) FUNCTION foobar(a, b, c)

 this syntax is relative consistent with EXPLAIN, is it ok for all?

It seems pretty awkward to me, particularly putting the options before
the second keyword of the command --- that could bite us if we ever want
some other flavors of CHECK command.  I prefer Robert's suggestion of a
WITH clause at the end.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reserved words and delimited identifiers

2011-11-30 Thread Joe Abbate
On 11/30/2011 09:55 AM, Tom Lane wrote:
 One possible solution, if you're getting type information about columns
 from the server, is to cast the type OID to regtype and let the regtype
 output converter make all the decisions.  It's less notation than a join
 to pg_type anyway.

Unfortunately, Pyrseas' yamltodb gets (some) type information from a
YAML input file, so we can't do that.  However, since user defined TYPEs
are processed before TABLE definitions, we could search for TYPEs in the
parallel catalogs (Python dictionaries) maintained in memory.  So, given
this:

schema public:
  description: standard public schema
  table myuser:
columns:
- info:
type: user
  type user:
attributes:
- name: text
- pass: text

we could generate the following SQL against an empty database:

CREATE TYPE user AS (name text,
pass text);
CREATE TABLE myuser (
info user);

Joe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reserved words and delimited identifiers

2011-11-30 Thread Kevin Grittner
Joe Abbate j...@freedomcircle.com wrote:
 On 11/30/2011 09:55 AM, Tom Lane wrote:
 One possible solution, if you're getting type information about
 columns from the server, is to cast the type OID to regtype and
 let the regtype output converter make all the decisions.  It's
 less notation than a join to pg_type anyway.
 
 Unfortunately, Pyrseas' yamltodb gets (some) type information from
 a YAML input file, so we can't do that.  However, since user
 defined TYPEs are processed before TABLE definitions, we could
 search for TYPEs in the parallel catalogs (Python dictionaries)
 maintained in memory.  So, given this:
 
 schema public:
   description: standard public schema
   table myuser:
 columns:
 - info:
 type: user
   type user:
 attributes:
 - name: text
 - pass: text
 
 we could generate the following SQL against an empty database:
 
 CREATE TYPE user AS (name text,
 pass text);
 CREATE TABLE myuser (
 info user);
 
You are prepared to handle the difference between char and char, I
hope.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: CHECK FUNCTION statement

2011-11-30 Thread Pavel Stehule
2011/11/30 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 2011/11/30 Alvaro Herrera alvhe...@commandprompt.com:
 How about
 CHECK (parse, names=off) FUNCTION foobar(a, b, c)

 this syntax is relative consistent with EXPLAIN, is it ok for all?

 It seems pretty awkward to me, particularly putting the options before
 the second keyword of the command --- that could bite us if we ever want
 some other flavors of CHECK command.  I prefer Robert's suggestion of a
 WITH clause at the end.

we can provide both versions - as can be fine for people. Is is simple
in parser. I like both variants and I am thinking so much more
important is a API of checker function and behave of CHECK FUNCTION
statement.

Just idea - don't kill me :). Because CHECK FUNCTION  is not
destructive , then complete signature is not necessary, and when
function name is unique, then parameters should be optional - it can
be comfortable for manual work, so just CHECK FUNCTION name; can work.
I see a usage for option - a entering  parameter's values instead
signature. When I started with overloaded functions, sometimes I had a
problem with identification of function that was executed - CHECK
FUNCTION can help

CHECK FUNCTION name(10,20) WITH (values);
Notice: checking function name(int, int, int default 20)

I would to design API of checker function be friendly to direct call.
There was some ideas to design CHECK FUNCTION for possibility to check
all functions in schema or language. It should be, but we have a
inline statement and system catalog, so anybody can write own scripts
per your requests. And It was one point to decision for separate
checker function from validate function.

Regards

Pavel


                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add minor version to v3 protocol to allow changes without breaking backwards compatibility

2011-11-30 Thread Merlin Moncure
On Mon, Nov 28, 2011 at 9:18 AM, Mikko Tiihonen
mikko.tiiho...@nitorcreations.com wrote:
 Hi,

 As discussed few days ago it would be beneficial if we could change the v3
 backend-client protocol without breaking backwards compatibility.

 Here is a working patch that exports a supported_binary_minor constant and a
 binary_minor session variable and a that can be used by clients to enable
 newer features.

 I also added an example usage where the array encoding for constant size
 elements is optimized if binary_minor version is new enough.

 I have coded the client side support for binary_minor for jdbc driver and
 tested that it worked. But lets first discuss if this is an acceptable path
 forward.

 On 11/25/2011 02:20 AM, Oliver Jowett wrote:
 Re list vs. always-incrementing minor version, you could just use an
 integer and set bits to represent features, which would keep it simple
 but also let clients be more selective about which features they
 implement (you could support feature 21 and 23 without supporting 22)

 I decided not to use a feature flag because when features start to depend on
 each other we need multiple negotiation round trips until the final feature
 set can be known.
 If in your example above the feature 23 depends on server side on feature
 22, but the client only requests 21,23. The server must inform back that
 combination 21,23 is reduced to 21. And if then the client can not support
 21 without 23 the final feature set will not contain 21 or 23.

Regarding your TODO in the code comments about interactions with
replication:  I think it should be removed.  WAL streaming depends on
more things being identical than what is guaranteed here which is
basically the protocol + data formats.  I think all references to
'protocol' should be removed;  Binary wire formats != protocol: the
protocol could bump to v4 but the wire formats (by happenstance) could
all still continue to work -- therefore the version isn't minor (it's
not dependent on protocol version but only on itself).  Therefore,
don't much like the name 'supported_binary_minor'.  How about
binary_format_version?

Also, is a non granular approach really buying us anything here?  ISTM
*something* is likely to change format on most releases of the server
so I'm wondering what's the point (if you don't happen to be on the
same x.x release of the server, you might as well assume to not match
or at least 'go on your own risk).   The value added to the client
version query is quite low.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: CHECK FUNCTION statement

2011-11-30 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 2011/11/30 Tom Lane t...@sss.pgh.pa.us:
 It seems pretty awkward to me, particularly putting the options before
 the second keyword of the command --- that could bite us if we ever want
 some other flavors of CHECK command.  I prefer Robert's suggestion of a
 WITH clause at the end.

 we can provide both versions - as can be fine for people. Is is simple
 in parser. I like both variants and I am thinking so much more
 important is a API of checker function and behave of CHECK FUNCTION
 statement.

I think you missed my point: I don't want the options list at the front
because I'm afraid it will prevent us from making good extensions in the
future.  Offering both syntaxes does not fix that.

 Just idea - don't kill me :). Because CHECK FUNCTION  is not
 destructive , then complete signature is not necessary, and when
 function name is unique, then parameters should be optional - it can
 be comfortable for manual work, so just CHECK FUNCTION name; can work.

Well, there was some discussion of having a bulk check or wildcard
capability in the CHECK command, but this seems like an awfully
constricted version of that.

The thing I'd prefer to see in the first cut is some notation for check
all functions owned by me that are in language FOO.  The reason for the
language restriction is that if we think the options are
language-specific, there's no reason to believe that different
validators would accept the same options.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large number of open(2) calls with bulk INSERT into empty table

2011-11-30 Thread Robert Haas
On Sun, Nov 27, 2011 at 10:24 AM, Florian Weimer fwei...@bfk.de wrote:
 I noticed that a bulk INSERT into an empty table (which has been
 TRUNCATEd in the same transaction, for good measure) results in a
 curious number of open(2) calls for the FSM resource fork:

That's kind of unfortunate.  It looks like every time we extend the
relation, we try to read the free space map to see whether there's a
block available with free space in it.  But since we never actually
make any entries in the free space map, the fork never gets created,
so every attempt to read it involves a system call to see whether it's
there.

I set up the following test case to try to measure the overhead on my
MacBook Pro:

create table bob (a integer, b text);

pgbench -f foo -t 100, with the following contents for foo:

begin;
truncate bob;
insert into bob select g,
random()::text||random()::text||random()::text||random()::text from
generate_series(1,1) g;
commit;

I tried whacking out the call to GetPageWithFreeSpace() in
RelationGetBufferForTuple(), and also with the unpatched code, but the
run-to-run randomness was way more than any difference the change
made.  Is there a better test case?

I've had the thought before that maybe we should cache the size of
some limited number of relation forks in shared memory.  That would
potentially eliminate not only the open() calls but also the lseek()
calls.  The trouble is, to get any benefit from such a change, we'd
need to have a userspace cache which was at least as concurrent as
what the kernel implements.  We're currently well behind the Linux
kernel in terms of synchronization techniques, so that would represent
a considerable investment of time and energy.

In this particular case, it seems like there's probably some way to be
smarter.  If we knew that the relation was created or truncated in the
current transaction, and we knew that we hadn't created the free space
map for it, we could presumably deduce that it still doesn't exist.
Not sure exactly how to make that work, though, and it doesn't solve
the more general problem where you create in one transaction and then
insert in the next.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large number of open(2) calls with bulk INSERT into empty table

2011-11-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sun, Nov 27, 2011 at 10:24 AM, Florian Weimer fwei...@bfk.de wrote:
 I noticed that a bulk INSERT into an empty table (which has been
 TRUNCATEd in the same transaction, for good measure) results in a
 curious number of open(2) calls for the FSM resource fork:

 That's kind of unfortunate.  It looks like every time we extend the
 relation, we try to read the free space map to see whether there's a
 block available with free space in it.  But since we never actually
 make any entries in the free space map, the fork never gets created,
 so every attempt to read it involves a system call to see whether it's
 there.

I wonder whether it'd help if we went ahead and created the FSM file,
with length zero, as soon as the relation is made (or maybe when it
first becomes of nonzero length).  That would at least save the failed
open()s.  We'd still be doing lseeks on the FSM file, but those ought
to be cheaper.

A less shaky way to do it would be to just create the first page of the
FSM file immediately, but that would represent an annoying percentage
increase in the disk space needed for small tables.  Or maybe we could
compromise and do that once the rel reaches N pages, for some N.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reserved words and delimited identifiers

2011-11-30 Thread Joe Abbate
On 11/30/2011 11:26 AM, Kevin Grittner wrote:
 You are prepared to handle the difference between char and char, I
 hope.

We have not implemented a type verifier in Pyrseas.  It currently
generates SQL based on the type given in the input.  In normal usage,
dbtoyaml is expected to be invoked first, and it will generate quoted
types if necessary, e.g.,

schema public:
  description: standard public schema
  table myuser:
columns:
- info:
type: 'user'
- active:
type: 'char'
- logons:
type: integer
  type user:
attributes:
- name: text
- pass: text

The quotes above are because it selects format_type(atttypid, atttypmod)
from pg_attribute.  The YAML output can then be fed into yamltodb and
will generate (assuming the user type and the first column of myuser
already exist):

ALTER TABLE myuser
ADD COLUMN active char;
ALTER TABLE myuser
ADD COLUMN logons integer;

In other words, Pyrseas depends on the ultimate type verifier: the
PostgreSQL parser (and related routines).

Joe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why so few built-in range types?

2011-11-30 Thread Jeff Davis
On Tue, 2011-11-29 at 12:01 -0500, Tom Lane wrote:
 One thing that bothered me while looking at the range types patch is
 that it seemed you'd been mighty conservative about creating built-in
 range types.

During development, I didn't want to juggle the OIDs for too many range
types. That was really the only reason.

 In particular, I don't understand why there's not a
 standard float8range type; that seems like a pretty common case.
 I'd have also expected to see a standard textrange type.  What was
 the rationale for leaving these out?

A built-in textrange type would have to have collation C, right? Do
you think that would be useful to enough people?

One that I'd like to see is an IP address type, but that's complicated
because inet and cidr support netmasks.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why so few built-in range types?

2011-11-30 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 On Tue, 2011-11-29 at 12:01 -0500, Tom Lane wrote:
 In particular, I don't understand why there's not a
 standard float8range type; that seems like a pretty common case.
 I'd have also expected to see a standard textrange type.  What was
 the rationale for leaving these out?

 A built-in textrange type would have to have collation C, right? Do
 you think that would be useful to enough people?

No, its collation could be set to default, which would match the
database's LC_COLLATE setting.  Probably the more interesting
implementation problem is to come up with a subtype_diff function ...

 One that I'd like to see is an IP address type, but that's complicated
 because inet and cidr support netmasks.

Yeah, it's not clear what if anything to do with the netmask.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Word-smithing doc changes

2011-11-30 Thread Greg Stark
On Wed, Nov 30, 2011 at 8:02 AM, Greg Smith g...@2ndquadrant.com wrote:
 Except in the sections talking about locking
 internals we don't talk about shared locks on virtual transactions
 identifiers we just talk about waiting for a transaction to complete.

 What I cannot agree with is that idea that the implementation details I
 suggested documenting should not be.

What I'm suggesting is translating things like shared locks on
virtual transaction identifiers into what that means for users.
Namely saying something like waiting for a transaction to complete.

Given your confusion it's clear that we have to explain that it will
wait one by one for each transaction that was started before the index
was created to finish. I don't think we need to explain how that's
implemented. If we do it should be set aside in some way, something
like (see virtual transaction id locks in href...).

I just want to keep in mind that the reader here is trying to
understand how to use create index concurrently, not understand how
Postgres's locking infrastructure works in general.



-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: CHECK FUNCTION statement

2011-11-30 Thread Pavel Stehule
Hello

I rechecked a possibility to use a validator function together with
checker function.

The main issue is a different interface of both functions. Validator
needs just function oid and uses global variable
check_function_bodies. Checker function needs function oid and
relation oid (possible some other params). When we mix these two
functions together we need a

validator(oid) or validator(oid, oid, variadic any)

one parameter function is old validator, three parameters function is checker.

Question:

What is a correct signature for this function? We cannot use a
overloading, because we can have only one validator function per
language. We can change a validator to support both forms, and we have
to be carefully and correct if we will work with our validators(3 and
more params) or foreign validators (1 param). We should to support
both (compatibility reasons). We are not careful about validators now
- there are some places, where one parameter is hardly expected - this
should be changed. So using validator for checking doesn't mean
smaller patch, but is true, so these functions has similar semantic -
validator is usually low level checker.

What is your opinion?

Regards

Pavel

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why so few built-in range types?

2011-11-30 Thread Robert Haas
On Wed, Nov 30, 2011 at 1:08 PM, Jeff Davis pg...@j-davis.com wrote:
 One that I'd like to see is an IP address type, but that's complicated
 because inet and cidr support netmasks.

A CIDR address defines a range all by itself, without packing any
other type on top.  It just needs GIST support, and an indexable
operator for contains or is contained by; then, you can define an
exclusion constraint over a CIDR column to enforce a
no-duplicate-or-overlapping-IP-ranges rule.  I started working on that
at one point, but I didn't have as much enthusiasm as the task needed
so I gave up before accomplishing anything particularly useful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: CHECK FUNCTION statement

2011-11-30 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 I rechecked a possibility to use a validator function together with
 checker function.

 The main issue is a different interface of both functions. Validator
 needs just function oid and uses global variable
 check_function_bodies. Checker function needs function oid and
 relation oid (possible some other params). When we mix these two
 functions together we need a

 validator(oid) or validator(oid, oid, variadic any)

Right, although if you want it to be callable from SQL I think that
variadic any is too loose.

 What is a correct signature for this function? We cannot use a
 overloading, because we can have only one validator function per
 language.

So?  You have one validator function, it has either signature;
if it has the old signature then CHECK isn't supported by the language.
We have plenty of examples of this sort of thing already.

One issue that would need to be considered is how the validator tells
the difference between a CREATE FUNCTION call and a CHECK call with
default parameters (no WITH clause).  Those shouldn't do exactly the
same thing, presumably.  Maybe that's a sufficient reason to have two
entry points.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large number of open(2) calls with bulk INSERT into empty table

2011-11-30 Thread Robert Haas
On Wed, Nov 30, 2011 at 12:29 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sun, Nov 27, 2011 at 10:24 AM, Florian Weimer fwei...@bfk.de wrote:
 I noticed that a bulk INSERT into an empty table (which has been
 TRUNCATEd in the same transaction, for good measure) results in a
 curious number of open(2) calls for the FSM resource fork:

 That's kind of unfortunate.  It looks like every time we extend the
 relation, we try to read the free space map to see whether there's a
 block available with free space in it.  But since we never actually
 make any entries in the free space map, the fork never gets created,
 so every attempt to read it involves a system call to see whether it's
 there.

 I wonder whether it'd help if we went ahead and created the FSM file,
 with length zero, as soon as the relation is made (or maybe when it
 first becomes of nonzero length).  That would at least save the failed
 open()s.  We'd still be doing lseeks on the FSM file, but those ought
 to be cheaper.

 A less shaky way to do it would be to just create the first page of the
 FSM file immediately, but that would represent an annoying percentage
 increase in the disk space needed for small tables.

Well, unfortunately, we're not really doing a good job dodging that
problem as it is.  For example:

rhaas=# create table foo (a int);
CREATE TABLE
rhaas=# select pg_relation_size('foo'), pg_table_size('foo');
 pg_relation_size | pg_table_size
--+---
0 | 0
(1 row)

rhaas=# insert into foo values (1);
INSERT 0 1
rhaas=# select pg_relation_size('foo'), pg_table_size('foo');
 pg_relation_size | pg_table_size
--+---
 8192 |  8192
(1 row)

rhaas=# vacuum foo;
VACUUM
rhaas=# select pg_relation_size('foo'), pg_table_size('foo');
 pg_relation_size | pg_table_size
--+---
 8192 | 40960
(1 row)

rhaas=#

Yikes!  A table with 4 bytes of useful data is consuming 40kB on disk
- 8kB in the main form, 8kB in the VM fork, and 24kB in the FSM fork.
Ouch!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reserved words and delimited identifiers

2011-11-30 Thread Kevin Grittner
Joe Abbate j...@freedomcircle.com wrote:
 On 11/30/2011 11:26 AM, Kevin Grittner wrote:
 You are prepared to handle the difference between char and
 char, I hope.
 
 Pyrseas depends on the ultimate type verifier: the
 PostgreSQL parser (and related routines).
 
OK.  I just wanted to be sure that you were aware of that one; it
surprises people sometimes that PostgreSQL includes both a char
reserved word for a type and a char type which is completely
different:
 
test=# create table x (noq char, withq char);
CREATE TABLE
test=# \x on
Expanded display is on.
test=# select attnum, attname, atttypid, atttypid::regtype,
   (select typname from pg_type where oid = atttypid),
   attlen, atttypmod, attbyval, attstorage, attalign
  from pg_attribute
  where attrelid = 'x'::regclass and attnum  0;
-[ RECORD 1 ]-
attnum | 1
attname| noq
atttypid   | 1042
atttypid   | character
typname| bpchar
attlen | -1
atttypmod  | 5
attbyval   | f
attstorage | x
attalign   | i
-[ RECORD 2 ]-
attnum | 2
attname| withq
atttypid   | 18
atttypid   | char
typname| char
attlen | 1
atttypmod  | -1
attbyval   | t
attstorage | p
attalign   | c
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large number of open(2) calls with bulk INSERT into empty table

2011-11-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Yikes!  A table with 4 bytes of useful data is consuming 40kB on disk
 - 8kB in the main form, 8kB in the VM fork, and 24kB in the FSM fork.
 Ouch!

Yeah, ouch.  Why is the FSM fork eating so much space --- I'd have
expected 8k there, but 24?

Also, if VACUUM is going to cause the FSM to be created anyway, there
may not be a lot of point to refraining from creating the first page
right away.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] autovacuum and default_transaction_isolation

2011-11-30 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 For the moment I duplicated the existing logic of overriding
 relevant GUC variables in the process's Main() function,
 
Thanks!
 
 but I wonder if we ought to be setting these things in some more
 centralized place, like InitPostgres().  That function already
 knows quite a bit about what sort of process it's initializing ...
 
It does seem like the sort of thing which might get missed when
creating a new type of process or a new GUC which needs this type of
treatment.  Whichever placement seems most likely to get noticed
seems best; one centralized placement seems to me most likely to
attract notice and the necessary thought on the topic
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Java LISTEN/NOTIFY client library work-around

2011-11-30 Thread Joel Jacobson
On 30 nov 2011, at 13:27, Jan Urbański wulc...@wulczer.org wrote:

 On 30/11/11 13:07, Joel Jacobson wrote:
 Hi,
 
 As you know, LISTEN/NOTIFY is broken in the Java client library. You have
 to do a SELECT 1 in a while-loop to receive the notifications.
 
 http://jdbc.postgresql.org/documentation/head/listennotify.html
 
 Is there some other library with a proper implementation where you don't
 have to spam the database with queries to get the notifications?
 
 You mean a Java library? If Java is not a requirement, the psycopg2 Python 
 library supports notifies 

I need a Java-library for this

 
 Cheers,
 Jan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Accounting for toast in query planner. (gin/gist indexes).

2011-11-30 Thread Jesper Krogh

Hi list.

I have currently hit a problem which I dug into finding the cause for, in
particular, searching in GIN indices seems in some situations to
un-fairly favor Sequential Scans.

Googling a bit I found this page:
http://postgis.refractions.net/docs/ch06.html#id2635817

Describing the excact problem.

It seemed to be discussed back in the pre 8.1 days and wasn't
solved there, is there a chance someone may address it in 9.2 ?

http://archives.postgresql.org/pgsql-performance/2005-02/msg00041.php

Would you coin it a hard task or can a fairly naive C-coder, with
a fair amount of PG experience approach it?

Test-dataset can be created with:

CREATE table ftstest (id serial unique, fts tsvector);

DO
$$DECLARE r RECORD;
BEGIN
FOR r in SELECT generate_series(1,5000)
LOOP insert into ftstest(fts) (select 
strip(to_tsvector('english',string_agg(test,' '))) from (select 'test' 
|| generate_series(1,(select (random()*1)::int)) as test ) as foo);

END LOOP;
END;
$$;

CREATE INDEX ON ftstest using gin(fts);
ANALYZE;



2011-11-30 21:13:30.302 jktest=# explain ( buffers on, analyze on  ) 
select count(id) from ftstest where fts @@ to_tsquery('english','test500');

QUERY PLAN
--
 Aggregate  (cost=122.37..122.38 rows=1 width=4) (actual 
time=1114.096..1114.097 rows=1 loops=1)

   Buffers: shared hit=13384 read=24445 written=3002
   -  Seq Scan on ftstest  (cost=0.00..110.50 rows=4748 width=4) 
(actual time=0.567..1112.447 rows=4748 loops=1)

 Filter: (fts @@ '''test500'''::tsquery)
 Rows Removed by Filter: 252
 Buffers: shared hit=13384 read=24445 written=3002
 Total runtime: 1114.134 ms
(7 rows)

Time: 1114.945 ms
2011-11-30 21:14:30.382 jktest=# set enable_seqscan to off;
SET
Time: 0.132 ms
2011-11-30 21:14:50.965 jktest=# explain ( buffers on, analyze on  ) 
select count(id) from ftstest where fts @@ to_tsquery('english','test500');

 QUERY PLAN
-
 Aggregate  (cost=184.02..184.03 rows=1 width=4) (actual 
time=2.502..2.502 rows=1 loops=1)

   Buffers: shared hit=1 read=56 written=3
   -  Bitmap Heap Scan on ftstest  (cost=64.80..172.15 rows=4748 
width=4) (actual time=1.160..1.989 rows=4748 loops=1)

 Recheck Cond: (fts @@ '''test500'''::tsquery)
 Buffers: shared hit=1 read=56 written=3
 -  Bitmap Index Scan on ftstest_fts_idx  (cost=0.00..63.61 
rows=4748 width=0) (actual time=1.137..1.137 rows=4748 loops=1)

   Index Cond: (fts @@ '''test500'''::tsquery)
   Buffers: shared hit=1 read=8
 Total runtime: 2.529 ms
(9 rows)

Time: 3.016 ms


--
Jesper

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: index support for regexp search

2011-11-30 Thread Robert Haas
On Tue, Nov 22, 2011 at 2:38 PM, Alexander Korotkov
aekorot...@gmail.com wrote:
 WIP patch with index support for regexp search for pg_trgm contrib is
 attached.
 In spite of techniques which extracts continuous text parts from regexp,
 this patch presents technique of automatum transformation. That allows more
 comprehensive trigrams extraction.

Please add this patch here so it does not get lost in the shuffle:

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inverse convertion for pg_mb2wchar

2011-11-30 Thread Robert Haas
On Mon, Nov 21, 2011 at 11:49 AM, Alexander Korotkov
aekorot...@gmail.com wrote:
 I've a question about pg_mb2wchar function. Is there any way for inverse
 convertion pg_wchar* to char*?
 I've looked to pg_wchar_tbl table definition, and I didn't find anything
 about inverse transformation. So, any change to get inverse convertion?
 I'm experimenting with index support for regexp search and I'm trying to get
 some characters back from color map.

Well, any char can presumably also be represented as a wchar, but the
reverse isn't necessarily true...

(What's a color map?)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] loss of transactions in streaming replication

2011-11-30 Thread Robert Haas
On Mon, Oct 24, 2011 at 8:40 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Fri, Oct 21, 2011 at 12:01 PM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Oct 20, 2011 at 9:51 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Thu, Oct 20, 2011 at 1:05 AM, Robert Haas robertmh...@gmail.com wrote:
 OK, so this is an artifact of the changes to make libpq communication
 bidirectional.  But I'm still confused about where the error is coming
 from.  In your OP, you wrote In 9.2dev and 9.1, when walreceiver
 detects an error while sending data to WAL stream, it always emits
 ERROR even if there are data available in the receive buffer.  So
 that implied to me that this is only going to trigger if you have a
 shutdown together with an awkwardly-timed error.  But your scenario
 for reproducing this problem doesn't seem to involve an error.

 Yes, my scenario doesn't cause any real error. My original description was
 misleading. The following would be closer to the truth:

    In 9.2dev and 9.1, when walreceiver detects the termination of 
 replication
    connection while sending data to WAL stream, it always emits ERROR
    even if there are data available in the receive buffer.

 Ah, OK.  I think I now agree that this is a bug and that we should fix
 and back-patch.

 The patch that I posted before is well-formed enough to be adopted?

Does this still need to be worked on?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Refactoring on DROP/ALTER SET SCHEMA/ALTER RENAME TO statement

2011-11-30 Thread Robert Haas
On Sun, Nov 27, 2011 at 3:14 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 If we add new properties that required by AlterObjectNamespace, as
 you suggested, it will allow to reduce number of caller of this routine
 mechanically with small changes.
 Should I try this reworking with this way?

Yeah, let's try that for starters, and then see if anything further
suggests itself.

 At least, AlterObjectNamespace already consolidate the point to check
 permissions.

Right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: cross column stats revisited ...

2011-11-30 Thread Robert Haas
2011/9/13 PostgreSQL - Hans-Jürgen Schönig postg...@cybertec.at:
 hello everybody,

 here is the next version of the cross column patch.
 in the meantime zoli and i managed to make the cross column sampling work.
 some prototype syntax is already working and we are able to store cross 
 column data.
 next on the list is further planner integration and finally some support for 
 joins (which is the main point of our work).

 feedback is highly welcome ;).

Did this get added to a CommitFest at any point?  If not, and you
still want feedback, please add it here:

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why so few built-in range types?

2011-11-30 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 A CIDR address defines a range all by itself, without packing any
 other type on top.  It just needs GIST support, and an indexable
 operator for contains or is contained by; then, you can define an
 exclusion constraint over a CIDR column to enforce a
 no-duplicate-or-overlapping-IP-ranges rule.  I started working on that
 at one point, but I didn't have as much enthusiasm as the task needed
 so I gave up before accomplishing anything particularly useful.

Erm, isn't there a contrib type that already does all that for you..?
ip4r or whatever?  Just saying, if you're looking for that capability..

I do think it'd be kind of interesting to offer both that and a
straight-up 'ip_address' type w/ range types..

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Why so few built-in range types?

2011-11-30 Thread Robert Haas
On Wed, Nov 30, 2011 at 3:58 PM, Stephen Frost sfr...@snowman.net wrote:
 Erm, isn't there a contrib type that already does all that for you..?
 ip4r or whatever?  Just saying, if you're looking for that capability..

Oh, huh, good to know.  Still, I'm not sure why you need to load a
separate type to get this... there's no reason why the built-in CIDR
type couldn't support it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FlexLocks

2011-11-30 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Why is it OK to drop these lines from the else condition in
 ProcArrayEndTransaction()?:

/* must be cleared with xid/xmin: */
proc-vacuumFlags = ~PROC_VACUUM_STATE_MASK;
 
 It's probably not.  Oops.
 
OK.  I see that's back now.
 
 I believe the attached patch versions address your comments
 regarding the flexlock patch as well; it is also rebased over the
 PGXACT patch, which has since been committed.
 
Applies cleanly again.
 
 The extraWaits code still looks like black magic to me, so unless
 someone can point me in the right direction to really understand
 that, I can't address whether it's OK.
 
 I don't think I've changed the behavior, so it should be fine. 
 The idea is that something like this can happen:
 
 [explanation of the extraWaits behavior]
 
Thanks.  I'll spend some time reviewing this part.  There is some
rearrangement of related code, and this should arm me with enough of
a grasp to review that.
 
 [gripes about modularity compromise and lack of pluggability]
 
 let me think about that.  I haven't addressed that in this
 version.
 
OK.  There are a few things I found in this pass which missed in the
last.  One contrib module was missed, I found another typo in a
comment, and I think we can reduce the include files a bit.  Rather
than describe it, I'm attaching a patch file over the top of your
patches with what I think might be a good idea.  I don't think
there's anything here to merit a new round of benchmarking.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FlexLocks

2011-11-30 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 OK.  There are a few things I found in this pass which missed in the
 last.  One contrib module was missed, I found another typo in a
 comment, and I think we can reduce the include files a bit.  Rather
 than describe it, I'm attaching a patch file over the top of your
 patches with what I think might be a good idea.
 
This time with it really attached.
 
-Kevin

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c 
b/contrib/pg_stat_statements/pg_stat_statements.c
index 51b24d0..6167e36 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -260,7 +260,7 @@ _PG_init(void)
 * resources in pgss_shmem_startup().
 */
RequestAddinShmemSpace(pgss_memsize());
-   RequestAddinLWLocks(1);
+   RequestAddinFlexLocks(1);
 
/*
 * Install hooks.
diff --git a/src/backend/postmaster/postmaster.c 
b/src/backend/postmaster/postmaster.c
index 59d18eb..a07a4c9 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -109,7 +109,6 @@
 #include postmaster/syslogger.h
 #include replication/walsender.h
 #include storage/fd.h
-#include storage/flexlock_internals.h
 #include storage/ipc.h
 #include storage/pg_shmem.h
 #include storage/pmsignal.h
diff --git a/src/backend/storage/lmgr/flexlock.c 
b/src/backend/storage/lmgr/flexlock.c
index f96437b..6145951 100644
--- a/src/backend/storage/lmgr/flexlock.c
+++ b/src/backend/storage/lmgr/flexlock.c
@@ -22,17 +22,16 @@
 #include postgres.h
 
 #include miscadmin.h
+#include pg_trace.h
 #include access/clog.h
 #include access/multixact.h
 #include access/subtrans.h
 #include commands/async.h
+#include storage/flexlock.h
 #include storage/flexlock_internals.h
-#include storage/lwlock.h
 #include storage/predicate.h
-#include storage/proc.h
 #include storage/procarraylock.h
 #include storage/spin.h
-#include utils/elog.h
 
 /*
  * We use this structure to keep track of flex locks held, for release
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 173b7cb..10ec83b 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -755,7 +755,7 @@ ProcKill(int code, Datum arg)
 #endif
 
/*
-* Release any felx locks I am holding.  There really shouldn't be any, 
but
+* Release any flex locks I am holding.  There really shouldn't be any, 
but
 * it's cheap to check again before we cut the knees off the flex lock
 * facility by releasing our PGPROC ...
 */
diff --git a/src/include/storage/flexlock_internals.h 
b/src/include/storage/flexlock_internals.h
index d1bca45..a5c5711 100644
--- a/src/include/storage/flexlock_internals.h
+++ b/src/include/storage/flexlock_internals.h
@@ -16,8 +16,6 @@
 #ifndef FLEXLOCK_INTERNALS_H
 #define FLEXLOCK_INTERNALS_H
 
-#include pg_trace.h
-#include storage/flexlock.h
 #include storage/proc.h
 #include storage/s_lock.h
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review of VS 2010 support patches

2011-11-30 Thread Andrew Dunstan



On 11/29/2011 04:32 PM, Brar Piening wrote:

Andrew Dunstan wrote:


Some minor nitpicks:

Do we really need to create all those VSProject.pm and 
VSSolution.pm files? They are all always included anyway. Why not 
just stash all the packages in Solution.pm and Project.pm? 

We certainly don't *need* them.
Having different files separates the tasks of generating different 
target file formats into different source files. In my opinion this 
makes it easier to find the code that is actually generating the files 
that get used in a specific build environment.
While the VSSolution.pm and VC200nProject.pm files are indeed not 
much more than stubs that could eventually be extended in future (and 
probably never will) VC2010Project.pm contains the whole code for 
generating the new file format which would significantly bloat up the 
code in Project.pm that currently contains the common code for 
generating the old file formats.






Does anyone else have an opinion on this. I want to wrap this up ASAP so 
we can get a VS2010 buildfarm member working.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Java LISTEN/NOTIFY client library work-around

2011-11-30 Thread Kris Jurka


On Wed, 30 Nov 2011, Joel Jacobson wrote:

 As you know, LISTEN/NOTIFY is broken in the Java client library. You have to
 do a SELECT 1 in a while-loop to receive the notifications.
 
 http://jdbc.postgresql.org/documentation/head/listennotify.html

This documentation is out of date.  Currently you can get notifications 
without polling the database if you are not using a SSL connection.  You 
still must poll the driver, using PGConnection.getNotifications, but it 
will return new notifications received without an intermediate database 
query.  This doesn't work over SSL and potentially some other connection 
types because it uses InputStream.available that not all 
implementations support.

Kris Jurka

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Accounting for toast in query planner. (gin/gist indexes).

2011-11-30 Thread Tom Lane
Jesper Krogh jes...@krogh.cc writes:
 I have currently hit a problem which I dug into finding the cause for, in
 particular, searching in GIN indices seems in some situations to
 un-fairly favor Sequential Scans.

 Googling a bit I found this page:
 http://postgis.refractions.net/docs/ch06.html#id2635817
 Describing the excact problem.

 It seemed to be discussed back in the pre 8.1 days and wasn't
 solved there, is there a chance someone may address it in 9.2 ?
 http://archives.postgresql.org/pgsql-performance/2005-02/msg00041.php

Don't hold your breath.  There's a huge amount of work to do there,
and nobody's even thinking about it.  The problem has actually gotten
worse since 8.1, because the index access patterns are more complex
and the planner has less not more info about them (because we pushed
the determination of what's lossy to run time).  Accumulating stats
about how toasty the heap tuples are would be only the first step
towards producing better cost estimates here --- you'd also need some
way of estimating how lossy an index is, for instance, so you could
guess how many heap tuples will be visited.  And on top of that,
for a lot of these complicated operators even our estimates of the final
number of matches are pretty crummy.  I'm not sure if having an explicit
model of what's happening inside the index would help.  Maybe it would,
since the condition actually being tested by the index is probably
simpler than the operator proper, but right now the planner has no clue
that they're different.

 Would you coin it a hard task or can a fairly naive C-coder, with
 a fair amount of PG experience approach it?

We're a long way from needing C coding here.  The first thing would be
to come up with a design for what we're going to model, what statistical
estimates have to be made along the way, and how we can compartmentalize
the knowledge needed given that we want operator classes and index
access methods to be pluggable.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Word-smithing doc changes

2011-11-30 Thread Greg Smith

On 11/30/2011 10:20 AM, Greg Stark wrote:

Given your confusion it's clear that we have to explain that it will
wait one by one for each transaction that was started before the index
was created to finish.


When the index was created is a fuzzy thing though.  It looked to me 
like it makes this check at the start of Phase 2.  If I read when the 
index was created in the manual, I would assume that meant the instant 
at which CREATE INDEX CONCURRENTLY started.  I don't think that's 
actually the case though; it's actually delayed to the beginning of 
Phase 2 start, which can easily be hours later for big indexes.  Please 
correct me if I'm reading that wrong.



  I don't think we need to explain how that's
implemented. If we do it should be set aside in some way, something
like (see virtual transaction id locks inhref...).


Fair enough.  There is this wording in the pg_locks documentation:  
When one transaction finds it necessary to wait specifically for 
another transaction, it does so by attempting to acquire share lock on 
the other transaction ID (either virtual or permanent ID depending on 
the situation). That will succeed only when the other transaction 
terminates and releases its locks.


Linking to that instead of trying to duplicate it is a good idea.  
Another good, related idea might be to expand the main Concurrency 
Control chapter to include this information.  When I re-read Explicit 
Locking again:  
http://developer.postgresql.org/pgdocs/postgres/explicit-locking.html it 
strikes me it would be nice to add Transaction Locks as an full entry 
there.  The trivia around how those work is really kind of buried in the 
pg_locks section.



I just want to keep in mind that the reader here is trying to
understand how to use create index concurrently, not understand how
Postgres's locking infrastructure works in general.


To the extent they can be ignorant of the locking infrastructure, that's 
true.  This operation is complicated enough that I don't think we can 
hide too many of the details from the users, while still letting them 
assess the risk and potential duration accurately.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large number of open(2) calls with bulk INSERT into empty table

2011-11-30 Thread Heikki Linnakangas

On 30.11.2011 20:45, Tom Lane wrote:

Robert Haasrobertmh...@gmail.com  writes:

Yikes!  A table with 4 bytes of useful data is consuming 40kB on disk
- 8kB in the main form, 8kB in the VM fork, and 24kB in the FSM fork.
Ouch!


Yeah, ouch.  Why is the FSM fork eating so much space --- I'd have
expected 8k there, but 24?


The FSM is a three-level tree (with the default BLCKSZ), and the code 
creates all three levels right from the start. That keeps the addressing 
simple, you can easily calculate the location of the FSM entry for a 
given page. I tried to come up with an addressing scheme that would 
allow adding upper-level pages only as needed, but failed. If you have 
any ideas on that, I'm all ears. I have a feeling that there's got to be 
some well-known scheme that does that, but I didn't find it. The current 
addressing scheme is documented in storage/freespace/README, in section 
Higher-Level Structure.



Also, if VACUUM is going to cause the FSM to be created anyway, there
may not be a lot of point to refraining from creating the first page
right away.


Perhaps we should refrain from creating a FSM unless the table is larger 
than 1 block.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review of VS 2010 support patches

2011-11-30 Thread Magnus Hagander
On Thu, Dec 1, 2011 at 01:06, Andrew Dunstan and...@dunslane.net wrote:


 On 11/29/2011 04:32 PM, Brar Piening wrote:

 Andrew Dunstan wrote:

 Some minor nitpicks:

 Do we really need to create all those VSProject.pm and
 VSSolution.pm files? They are all always included anyway. Why not just
 stash all the packages in Solution.pm and Project.pm?

 We certainly don't *need* them.
 Having different files separates the tasks of generating different target
 file formats into different source files. In my opinion this makes it easier
 to find the code that is actually generating the files that get used in a
 specific build environment.
 While the VSSolution.pm and VC200nProject.pm files are indeed not much
 more than stubs that could eventually be extended in future (and probably
 never will) VC2010Project.pm contains the whole code for generating the new
 file format which would significantly bloat up the code in Project.pm that
 currently contains the common code for generating the old file formats.


 Does anyone else have an opinion on this. I want to wrap this up ASAP so we
 can get a VS2010 buildfarm member working.

I guess the most likely one would be me, but not really. My perl-fu is
well below this level, so I will happily +1 whatever you more
experienced perl guys say :-) I don't see a big problem with a couple
of more files - it's not like we're going to support 20 different
versions of VS anyway, once we get to 4 i'm sure the earliest one is
well out of support already and can be removed. But in summary I'd
vote for whatever matches the general perl pest practices at this
time.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers