Re: [GENERAL] libpq ASYNC with PQgetResult and PQisBusy

2010-12-21 Thread Raimon Fernandez
hi again,

On 20dic, 2010, at 18:48 , Tom Lane wrote:

 So, now I'm using the PQisBusy to check if postgres is still busy and I can 
 safely call the PQgetResult without blocking, or just wait *some time* 
 before sending a new PQisBusy.
 
 Your proposed code is still a busy-wait loop.  What you should be doing
 is waiting for some data to arrive on the socket.  Once you see
 read-ready on the socket, call PQconsumeInput, then check PQisBusy to
 see if the query is complete or not.  If not, go back to waiting on the
 socket.  Typically you'd use select() or poll() to watch for both data
 on libpq's socket and whatever other events your app is interested in.
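
(For reference, a minimal C sketch of the loop Tom describes; the function
name is illustrative, and connection setup, error reporting and the
surrounding event loop are left out:)

/* Wait on libpq's socket, then drain input until a whole result is buffered. */
#include <sys/select.h>
#include <libpq-fe.h>

static PGresult *wait_result(PGconn *conn)
{
    int sock = PQsocket(conn);
    if (sock < 0)
        return NULL;

    while (PQisBusy(conn)) {            /* result not fully buffered yet */
        fd_set readable;
        FD_ZERO(&readable);
        FD_SET(sock, &readable);
        if (select(sock + 1, &readable, NULL, NULL, NULL) < 0)
            return NULL;                /* select() failed */
        if (!PQconsumeInput(conn))      /* pull newly arrived bytes */
            return NULL;                /* connection trouble */
    }
    return PQgetResult(conn);           /* guaranteed not to block now */
}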

As I'm doing this as an exercise (and who knows), I'm still playing with my 
previous approach; it's the same question, but put a different way:

Why is PQisBusy telling me that it's still busy if, the moment I call 
PQgetResult, I obtain the result?

Now I'm not in a loop: I just send the PQsendQuery from a button, and then I 
click on another button that simply calls PQconsumeInput and PQisBusy. I'm 
clicking on it every second and it always returns busy, but if I call 
PQgetResult I get the result.

So why is libpq not updating its status?

thanks again,

r.



[GENERAL] Can the query planner create indexes?

2010-12-21 Thread Dario Beraldi

Hello,

This question is just for my curiosity...

When an index is available for a query, the planner decides whether to
use it or not depending on whether it would make the query perform
better, right? However, if an index which does not exist would make
the query run better, the planner is not able (allowed?) to create such
an index, use it, and drop it once the query is done. Why is it so?
Why is the query planner not allowed to create indexes, but only
allowed to use or not use what's available?


Am I misunderstanding the way the planner works? Any explanation /
clarification much appreciated!


All the best and Christmas wishes!
Dario

--

Dr. Dario Beraldi
Institute of Evolutionary Biology
University of Edinburgh
West Mains Road
Edinburgh EH9 3JT
Scotland, UK


--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.





Re: [GENERAL] How to guard PostgreSQL from overcharged and/or malicious (read-only) queries?

2010-12-21 Thread Stefan Keller
Hi Alban

Many thanks for your answers.

You answered:
 1. Filter out all SQL commands which are *not* read-only (no DROP
 Most people do this using permissions.

Oh, yes: forgot to mention that; that's obvious. What I was also looking
for was PL/pgSQL's EXECUTE command-string.

 2. Get the estimated time (units) from PostgreSQL planner in a
 reliable way (if possible standard/ANSI).

Ok; again, keep in mind that I have a read-only database. Therefore the
statistics should be up-to-date (after a VACUUM ANALYZE). What I have
in mind is exposing the database to the 'public' for exercising and
testing, in a way similar to the following (try a query like SELECT
ST_AsText(ST_GeometryFromText('POINT(8.5 47.7)', 4326));):
http://openlayers-buch.de/beispiele/chapter-09/postgis-terminal.html

See below my attempt to write such a function, which I called
secure_execute(text). It's still not functioning, and I have
indicated two problems there. What do you think?
I like the idea of abandoning the query if it's obviously(!) wrong
or if the planner alerts me about very high costs.
Or should I rather abandon the idea of such a function and simply rely
on read-only privileges and a session statement_timeout?

Yours, S.

--
-- Executes a query. Aborts if it contains ';' or takes too long.
-- Returns: TABLE
--
DROP FUNCTION secure_execute(text);
--
CREATE OR REPLACE FUNCTION secure_execute(text)
RETURNS SETOF real  --  PROBLEM 1: Want to return the resultset of the query here as table (SETOF RECORD?).
LANGUAGE 'plpgsql' STRICT
AS $$
DECLARE
  query text := $1;
  cost_estimate_txt text;
  max_cost_estimate integer;
  rec RECORD;
BEGIN
  -- Abort if ';' is in query
  -- tbd.

  -- Get max_cost_estimate:
  EXECUTE 'EXPLAIN' || ' ' || query INTO cost_estimate_txt;
  SET statement_timeout TO DEFAULT;
  -- txt example: 'Function Scan on generate_series id (cost=0.00..12.50 rows=1000 width=0)'
  max_cost_estimate := round(CAST(substring(cost_estimate_txt,50,6) AS numeric),0);

  -- Execute query (abort if too time consuming)!
  IF max_cost_estimate < 100 THEN  -- in units (production config.: depends on machine)
    SET statement_timeout TO 10; -- in ms (production config.: set to a minute = 60000 ms)
    EXECUTE $1 INTO rec;
    SET statement_timeout TO DEFAULT;
  END IF;
  RETURN; --  PROBLEM 2: want to return rec here.

  -- Error handling: Catch all
  EXCEPTION WHEN OTHERS THEN
SET statement_timeout TO DEFAULT;
RAISE NOTICE 'ERROR';
RETURN;
END;
$$;
-- Test (positive):
SELECT secure_execute('SELECT random() FROM generate_series(1, 100) AS id');
-- Test (not ok):
SELECT secure_execute('SELECT random() FROM generate_series(1, 10) AS id');  -- timeout
SELECT secure_execute('SELECT random() FROM generate_series(1, 100) AS id;SELECT * FROM generate_series(1, 100)'); -- two commands
SELECT secure_execute('DROP TABLE IF EXISTS dummy');  -- malicious!
SELECT secure_execute('SELECT random() FROM generate_series(1, 100) AS id;DROP TABLE IF EXISTS dummy'); -- two commands, one malicious
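
(An editor's aside: the fixed-offset substring(cost_estimate_txt,50,6) above
is fragile, since the cost's position depends on the plan's first line. A
regexp is one sturdier way to pull out the total cost; a sketch, assuming the
usual 'cost=start..total' format and standard_conforming_strings = on:)

-- Extracts the upper cost bound from the first line of EXPLAIN output.
SELECT round(CAST(
         substring('Function Scan on generate_series id  (cost=0.00..12.50 rows=1000 width=0)'
                   from '\.\.([0-9]+\.[0-9]+)') AS numeric)) AS max_cost_estimate;
-- returns 13 (12.50, rounded)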


2010/12/20 Alban Hertroys dal...@solfertje.student.utwente.nl:
 On 20 Dec 2010, at 10:05, Stefan Keller wrote:

 I'd like to guard postgres from overcharged and/or malicious queries.

 The queries are strictly read-only (from a SQL user's perspective).
 For doing this I'd like to code two functions (preferably pl/pgsql):

 1. Filter out all SQL commands which are *not* read-only (no
 DROP/DELETE/UPDATE/TRUNCATE).

 Most people do this using permissions.

 2. Get the estimated time (units) from PostgreSQL planner in a
 reliable way (if possible standard/ANSI).


 I don't think there's a way to do that directly, not without hacking the 
 source.

 What you can do is to have all users go through a SECURITY DEFINER type of 
 function that does this for them. That function can then read the output of 
 EXPLAIN query for its estimates. Those aren't exactly times, but cost 
 estimates. The actual time taken depends on your hardware, you would need to 
 do some measurements to see how planned costs and actual time relate.

 I'm not sure this is a good idea though.
 Firstly, testing the query plan adds a little overhead to every query coming 
 in. It's not a lot, but if someone fires a lot of small fast queries it could 
 become a problem. You would be hurting the people who're using your database 
 correctly, instead of the people who're abusing it.

 Secondly, you could achieve a similar effect by limiting the amount of time a 
 query is allowed to run. I'm pretty sure there are configuration options that 
 cause long-running queries to get killed after a set time.

 Thirdly... Reliable estimates??? Lol!
 Seriously, as hard as Postgres tries to keep statistics that make sense, I 
 don't think they can ever be considered entirely reliable. You may not be 
 vacuuming frequently enough, your statistics target may be too small or your 
 data might not be suitable for statistical analysis (random numbers and very 
 unbalanced distributions are good [...]

Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Gabriele Bartolini

Ciao Dario,

On Tue, 21 Dec 2010 09:14:36 +0000, Dario Beraldi 
dario.bera...@ed.ac.uk wrote:

 the query run better the planner is not able (allowed?) to create such
 index, use it, and drop it once the query is done. Why is it so?


Because it is not its responsibility. This is the simplest and most 
rational answer.


I would reply with some questions that will hopefully give you the 
answer. How can you tell a 'good' query from a 'bad' query? Consider 
the case where a user launches a wrong query and the planner you propose 
starts creating an index. I believe that would be a mess.


Then ... how could you project this scenario onto a concurrent context 
where multiple users launch queries that 'need' an index?


I suggest that you look at the documentation for more information. 
Otherwise, I strongly suggest that you read the chapter on the planner 
from Greg Smith's book on high performance (which you can find from here: 
http://www.postgresql.org/docs/books/)


Merry Christmas to you too!

Cheers,
Gabriele

--
 Gabriele Bartolini - 2ndQuadrant Italia
 PostgreSQL Training, Services and Support
 gabriele.bartol...@2ndquadrant.it - www.2ndQuadrant.it



Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Massa, Harald Armin
Hello Dario,

When an index is available for a query, the planner decides whether to use
 it or not depending on whether it would make the query perform better,
 right? However if an index, which does not exist, would make the query run
 better the planner is not able (allowed?) to create such index, use it, and
 drop it once the query is done. Why is it so?


From my knowledge there is more than one answer:

a) There is a proposal (and, at the time being, also some code on pgfoundry)
creating hypothetical indexes
http://postgresql.1045698.n5.nabble.com/Hypothetical-Indexes-PostgreSQL-extension-PGCON-2010-td3288085.html
http://archives.postgresql.org/pgsql-hackers/2010-12/msg00087.php

The idea is: play with what-would-be-if-there-were-an-index.

With the keywords 'hypothetical index', 'index advisor' and 'virtual index'
there is some research googleable

b) creating an index requires reading the data-to-be-indexed. So, to have an
index pointing at the interesting rows for your query, the table has to be
read ... which would be the perfect time to already select the interesting
rows. And after having the interesting rows, the index is worthless

c) PostgreSQL is in fact doing something quite similar to temporary
indexes during the query, only for the selected rows. The words to google
for are 'hash maps' and 'bitmap access'.

Why is the query planner not allowed to create indexes, but only allowed to
 use or not use what's available?


as in b): creating an index is quite expensive

additionally: having multiple possible plans also creates a new decision
problem: which of the possible plans will lead to the better result, again
with several meanings of 'better': faster result, less processor usage,
less memory usage or fewer disk accesses. So adding additional indices during
planning would worsen this problem, which has to be balanced against
possible gains.

So: hypothetical indexes are a good idea, BUT current limitations would most
likely force them to be used outside the live query process.

Best wishes,

Harald

-- 
GHUM GmbH
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607

Amtsgericht Stuttgart, HRB 734971
-
persuadere.
et programmare


Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Gabriele Bartolini

Hi Harald,

On Tue, 21 Dec 2010 11:42:40 +0100, Massa, Harald Armin 
c...@ghum.de wrote:

 a) There is a proposal (and, at the time being, also some code on
 pgfoundry) creating hypothetical indexes

http://postgresql.1045698.n5.nabble.com/Hypothetical-Indexes-PostgreSQL-extension-PGCON-2010-td3288085.html
http://archives.postgresql.org/pgsql-hackers/2010-12/msg00087.php


I totally missed this. Thanks for posting it.

Cheers,
Gabriele

--
 Gabriele Bartolini - 2ndQuadrant Italia
 PostgreSQL Training, Services and Support
 gabriele.bartol...@2ndquadrant.it - www.2ndQuadrant.it



Re: [GENERAL] How to guard PostgreSQL from overcharged and/or malicious (read-only) queries?

2010-12-21 Thread Alban Hertroys
On 21 Dec 2010, at 10:57, Stefan Keller wrote:

 You answered:
 1. Filter out all SQL commands which are *not* read-only (no DROP
 Most people do this using permissions.
 
 Oh, yes: forgot to mention that; that's obvious. What I also looked
 for was the PL/pgSQL's EXECUTE command-string.

I'm not sure what you're getting at here, but if you're saying that you have to 
catch SQL commands called from EXECUTE separately, I think you're wrong. I 
wouldn't expect Postgres not to apply permissions in such cases. A simple 
test case can prove that.

 2. Get the estimated time (units) from PostgreSQL planner in a
 reliable way (if possible standard/ANSI).
 
 Ok; again keep in mind that I have a read-only database. Therefore the
 statistics should be up-to-date (after a vacuum analyse).

Up-to-date? Probably. Correct or adequate? That depends, as I pointed out in my 
previous message.

 See below my attempt to write such a function I called
 secure_execute(text). It's still not functioning and I have
 indicated two problems there. What do you think?

Your problems seem to stem from a lack of experience with set-returning 
functions. You'd best look up the documentation for those; it explains them better 
than I could. Make sure you look at the docs for the version of Postgres that 
you're using or expect to use, as there's a relatively new feature in this 
domain: returning a set as a table.
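
(For reference, a minimal sketch of the set-returning pattern Alban points at;
the function name is illustrative, and RETURN QUERY EXECUTE needs PostgreSQL
8.4 or later:)

-- A set-returning wrapper around a dynamic statement.
CREATE OR REPLACE FUNCTION run_query(q text)
RETURNS SETOF record
LANGUAGE plpgsql STRICT AS $$
BEGIN
  RETURN QUERY EXECUTE q;  -- streams the statement's rows to the caller
END;
$$;

-- SETOF record requires a column definition list at the call site:
SELECT * FROM run_query('SELECT random() FROM generate_series(1, 5)') AS t(r double precision);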

 I like the idea letting abandon the query if it's obviously(!) wrong
 or if the planner alerts me about very high costs?
 Or should I rather abandon the idea of such a function and simply rely
 on read-only privileges and a session statement_timeout?


Although I don't think what you're attempting is wrong in any way, I'd first 
see whether it's necessary to do so. You can use the built-in features 
(permissions & statement_timeout) and see whether that's adequate for your 
use-case. If it's not, then it's time to look into tightening things up.
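
(A minimal sketch of that built-in route; the role, database and schema names
are illustrative, and GRANT ... ON ALL TABLES needs PostgreSQL 9.0+:)

-- A read-only login role with a per-statement time limit.
CREATE ROLE webuser LOGIN PASSWORD 'secret';
GRANT CONNECT ON DATABASE gis TO webuser;
GRANT USAGE ON SCHEMA public TO webuser;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO webuser;
ALTER ROLE webuser SET statement_timeout = '60s';  -- one minute per statement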

Of course, having some experience with the solution through experimentation 
can't hurt if you can afford to.

What you're doing would by many on this list be pointed out as premature 
optimisation, although that usually involves query performance ;)

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.







[GENERAL] How to use pgbouncer

2010-12-21 Thread Adarsh Sharma

Dear all,

I am not able to find any useful document regarding configuring and 
running PgBouncer with Postgres 8.4.2.


How does it help, and is it able to boost performance?

Or is there another useful tool available for connection pooling? 
Please guide me on this.



Thanks & Regards

Adarsh Sharma



[GENERAL] [feature request] left/right join + limit optimization

2010-12-21 Thread pasman pasmański
hello.

I think that left/right joins with LIMIT could be optimized.
When there are no WHERE conditions, this could be executed as below:

Limit N
Merge Left Join
  Sort Top N
Bitmap Heap Scan
  ...
  Sort
Bitmap Heap Scan
  ...
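
(Editor's note: a hypothetical query of the shape being described, where a
top-N sort of the outer side would be enough to satisfy the LIMIT:)

-- Hypothetical tables a and b; an outer join, ordered and limited, no WHERE.
SELECT *
FROM a
LEFT JOIN b ON b.a_id = a.id
ORDER BY a.id
LIMIT 10;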



pasman



Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Dario Beraldi
Ok, thanks a lot to all of you for your answers! (Always impressed by  
the prompt feedback you get on this list!)


Quoting Gabriele Bartolini gabriele.bartol...@2ndquadrant.it:

 [...]

--

Dr. Dario Beraldi
Institute of Evolutionary Biology
University of Edinburgh
West Mains Road
Edinburgh EH9 3JT
Scotland, UK


--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.





Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Jeremy Harris

On 2010-12-21 10:42, Massa, Harald Armin wrote:

b) creating an index requires reading the data-to-be-indexed. So, to have an
index pointing at the interesting rows for your query, the table has to be
read ... which would be the perfect time to already select the interesting
rows. And after having the interesting rows, the index is worthless


... until another similar query comes along, when suddenly it's a massive win.
Why not auto-create indices for some limited period after database load
(copy?  any large number of inserts from a single connection?), track those
that actually get re-used and remove the rest?   Would this not provide
a better out-of-the-box experience for neophytes?

[...]

Why is the query planner not allowed to create indexes, but only allowed to

use or not use what's available?



 as in b): creating an index is quite expensive


How much more so than doing that full-table-scan plus sort, which your
query is doing anyway?

Cheers,
Jeremy





[GENERAL] Trouble uninstalling old postgresql installs on osx

2010-12-21 Thread Geoff Bowers
Hi there,

Have been trying to uninstall old instances of Postgres from my Snow
Leopard install, preparing to install 9.0

Not sure how old these instances are (probably dates back to 7). I can
see them in the active process list, but I'm not sure how to
permanently stop them. Any old timers have some tips on nuking old
installs? Don't need any settings or data from them.. Just looking for
scorched earth.

Any help much appreciated,

GB



Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread tv
 On 2010-12-21 10:42, Massa, Harald Armin wrote:
 b) creating an index requires reading the data-to-be-indexed. So, to have an
 index pointing at the interesting rows for your query, the table has to be
 read ... which would be the perfect time to already select the interesting
 rows. And after having the interesting rows, the index is worthless

 ... until another similar query comes along, when suddenly it's a massive win.
 Why not auto-create indices for some limited period after database load
 (copy?  any large number of inserts from a single connection?), track those
 that actually get re-used and remove the rest?   Would this not provide
 a better out-of-the-box experience for neophytes?

Say you have a table with several columns (A,B,C), and the query is using
some of them. What indexes would you create? One index on every column? A
multi-column index on all columns? Indexes for each combination of columns?

There really is no automatic way to solve this puzzle using a single
query. Indexing strategy is a very tough design discipline, and it
requires a complex knowledge of the workload. One slow query does not mean
the index should be created - what if that was just an ad-hoc query and
will not be executed ever again?

So although this 'automatic index creation' seems nice, it really does not
work in practice.

I really don't want to work with products that try to be smarter than me
(and I admit I'm pretty dumb from time to time) and act rather randomly
due to this 'artificial intelligence'. I've already been there and I don't
want to repeat this experience.


 [...]
 Why is the query planner not allowed to create indexes, but only allowed
 to
 use or not use what's available?


 as in b): creating an index is quite expensive

 How much more so than doing that full-table-scan plus sort, which your
 query is doing anyway?

A lot. It consumes a lot of CPU, it may consume a lot of memory (up to
maintenance_work_mem) etc. So imagine a few users, building indices on a
big table simultaneously - no, I really don't want to see this on a
production server.

Building an index is just one side of the problem - maintenance of the
indexes is another thing. Each index has an impact on write operations
(INSERT/UPDATE) and may prevent HOT updates from working (thus
causing unnecessary bloat of the index). And the indexes may need a lot of
space on disk.

But the real show stopper is probably locking. Building an index takes a
write lock on the table, effectively blocking writes. Sure, you can use
'CREATE INDEX ... CONCURRENTLY', introduced in 8.2, but there are some
disadvantages to that (see
http://www.postgresql.org/docs/9.0/static/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY).
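
(For reference, the syntax in question; the table and column names are
hypothetical:)

-- Builds the index without blocking concurrent writes; the price is two
-- table scans and a longer total build time.
CREATE INDEX CONCURRENTLY idx_tbl_a ON tbl (a);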

cheers
Tomas




Re: [GENERAL] PostgreSQL Trusted Startup

2010-12-21 Thread Kenneth Buckler
On Mon, Dec 20, 2010 at 8:53 PM, Craig Ringer
cr...@postnewspapers.com.au wrote:

 Do you have a trusted boot path from BIOS to bootloader to kernel to init
 core userspace, where everything is digitally signed (by you or someone
 else) and verified before execution? Do you disable kernel module loading?

 If not, you're wasting your time, because a compromise by malicious kernel
 module, modified init, modified md5 command, etc will render your
 precautions totally pointless.

 If your BIOS can't verify the bootloader, which is likely on an x86 / x64
 system, then you can still get some protection by signing your kernels and
 using a bootloader that checks signatures. If someone messes with the
 bootloader you lose, but it'll help protect you against obvious automated
 attacks. You might be able to use the Trusted Platform Module (TPM) on your
 machine to get a fully verified chain of trust, though, by using Trusted
 GRUB.

 http://trousers.sourceforge.net/grub.html

 If you can reasonably trust that the kernel you loaded is OK, you can have
 it verify signatures on binaries before executing them. There was a DigSig
 project for that (http://disec.sourceforge.net/) but it seems to have
 stopped recently. I'm not sure if there's any replacement.

 Without kernel-level signature verification, all you can really do is have a
 custom initrd/initramfs (signed and verified by grub during boot) that
 checks the signatures on init, md5, gpg, libc, etc etc (any binary root
 runs, including scripts) before switching to the real root FS during boot.
 Then you can have your Pg startup scripts (which you signed on a separate,
 trusted machine) verify GnuPG signatures of the Pg binaries before
 execution.

 All in all, it's a painful, clumsy way to do things, and AFAIK there's
 little support in mainline Linux systems for trusted boot and trusted-binary
 systems. You might find out more with a search for linux trusted
 computing, linux trusted boot, linux tpm, linux signed binaries, etc.

 Personally, I'd be using existing system- and network-level intrusion
 detection tools like tripwire and snort to try to spot intrusion if and when
 it happens. I'm not confident that a chain-of-trust approach is workable on
 Linux systems at present, though I'd love to be proved wrong by being
 pointed at existing support I've missed.

 --
 Craig Ringer


I find it very comforting that I am not the only one who finds this
requirement a bit out there.
Unfortunately, these requirements are set in stone, and no matter how
hard I try, they cannot be altered.
We live in a world where compliance is king.  Never mind if compliance
doesn't actually make the system more secure.

Unfortunately Tripwire does not meet the full requirement, as it does
not prevent the database from starting.

Ken



Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Grzegorz Jaśkiewicz
I don't think the planner should do things like creating an index. But it
might hint at doing it in the logs.
There was a discussion around that sort of feature on -hackers not so
long ago. I don't remember what the conclusion was, but probably
that it just isn't worth wasting the planner's cycles when you execute a
query.



Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread tv
 I don't think the planner should do things like creating an index. But it
 might hint at doing it in the logs.
 There was a discussion around that sort of feature on -hackers not so
 long ago. I don't remember what the conclusion was, but probably
 that it just isn't worth wasting the planner's cycles when you execute a
 query.

Yes, that would be a much better solution, definitely. Something like:
watch seq-scan nodes, and if the number of matching rows is much smaller
than the total number, put a HINT into the log.

I was thinking about doing something like this for correlated columns
(when the number of matching rows is severely underestimated). But that's
far in the future.

Tomas





Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Jeremy Harris

On 2010-12-21 14:26, t...@fuzzy.cz wrote:

Why not auto-create indices for some limited period after database load
(copy?  any large number of inserts from a single connection?), track
those
that actually get re-used and remove the rest?   Would this not provide
a better out-of-the-box experience for neophytes?


Say you have a table with several columns (A,B,C), and the query is using
some of them. What indexes would you create? One index on every column? A
multi-column index on all columns? Indexs for each combination of columns?


One single-column index, on the first index-worthy column appearing.
Keep it simple. Maybe, while you're doing that full-table-scan, gather
stats on all the indexable columns for later reference, to guide the choice of
which column to index later.


There really is no automatic way to solve this puzzle using a single
query. Indexing strategy is a very tough design discipline, and it
requires a complex knowledge of the workload. One slow query does not mean
the index should be created - what if that was just an ad-hoc query and
will not be executed ever again?


Then the index you just built gets automatically dropped, as I said above.


I really don't want to work with products that try to be smarter than me
(and I admit I'm pretty dumb from time to time) and act rather randomly
due to this 'artificial intelligence'. I've already been there and I don't
want to repeat this experience.


Then, since you're not a neophyte, leave the feature turned off.   But don't
deny actual neophytes the possibility of using it.



as in b): Creating an index is quite expensiv


How much more so than doing that full-table-scan plus sort, which your
query is doing anyway?


A lot. It consumes a lot of CPU, it may consume a lot of memory (up to
maintenance_work_mem) etc. So imagine a few users, building indices on a
big table simultaneously


Pffthht.   One simple trylock, used only by the auto-indexer.



Building an index is just one side of the problem - maintenance of the
indexes is another thing. Each index has an impact on write operations
(INSERT/UPDATE) and may cause that HOT actually does not work (thus
causing unnecessary bloat of the index).


This is worth discussing.    How much help does the DBMS currently give
the DBA in evaluating these tradeoffs?    Could we do better, given an
actual installation and workload?


And the indexes may need lot of
space on a disk.


By all means require limits as well as a 'don't do that' switch.



But the real show stopper is probably locking. Building an index takes a
write lock on a table, effectively blocking writes. Sure, you can use a
'CREATE INDEX ... CONCURRENTLY' introduced in 8.2, but there are some
disadvantages of that (see
http://www.postgresql.org/docs/9.0/static/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY).


The only disadvantages I see there are a) the leftover invalid index - which
feels like a bug; why is it not auto-dropped?   and b) the second scan, which
implies more total work and a wish to background that portion after completing
the query triggering the auto-index.

Don't forget I suggested doing this only for a limited time after DB creation.
Possibly another reason to turn it off should be any manual index creation, as
that implies that the DBA knows about indexing.

I see in another thread you suggest merely placing hints in the log.   That's
a fine first step - but I'll then be wanting to auto-parse that log to
auto-create ...

Cheers,
Jeremy




Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread bricklen
On Tue, Dec 21, 2010 at 7:34 AM, Jeremy Harris j...@wizmail.org wrote:
 [...]

This thread offers up some interesting possibilities. Expanding on
what has already been discussed, maybe a contrib module for query
tuning/index suggestions?
Some things that came to mind immediately that the module could do
(feasible or not):
- Look at the EXPLAIN ANALYZE
- Examine the index access methods and table scans + costs/rows
- Which indexes were used?
- What were the blocks/tuples hit & read?
- Look at join conditions and WHERE clause filters
- Data types in the joins (mismatched?)
- Churn rate of the tables, eg. the updates/deletes/inserts. This
might allow suggestion of other index types (eg. gist)

The tool then provides feedback on possibly helpful indexes to test, and
why (hypothetical indexes could be applied here). It could also provide
suggestions on ways to improve the query, eg. data types don't match
in the join, EXISTS vs IN, etc.



Re: [GENERAL] How to use pgbouncer

2010-12-21 Thread Filip Rembiałkowski
2010/12/21 Adarsh Sharma adarsh.sha...@orkash.com:
 Dear all,

 I am not able to find any useful document regarding Configuration and
 Running Pgbouncer with Postgres-8.4.2.

that's strange, there are several good pages on the web; there is also
my mini-howto:
http://filip.rembialkowski.net/pgbouncer-mini-howto-benchmark/


 How it helps and is it able to boost some performance ?

PgBouncer is a connection pooler. I wonder why you want to use it
if you don't know what it is?

it can help in many ways; the two most obvious points:
- reduce the number of postgres backends (more resources for the server)
- reduce the cost of a single connection from the application to postgres


 Or if there is another useful tool available for Connection Pooling. Please
 guide me for this.

yes there are some; see
http://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling

it depends on what you need. pgbouncer is the smallest and most
lightweight pooler available.
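
(Editor's note: a minimal pgbouncer.ini sketch; the database name, paths and
pool sizes are illustrative, not recommendations:)

; Map the name clients connect to onto the real server.
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt  ; lines of: "user" "password"
pool_mode = session          ; or transaction, for more aggressive reuse
max_client_conn = 100
default_pool_size = 20

; Then point the application at port 6432 instead of 5432.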


FR



Re: [GENERAL] How to use pgbouncer

2010-12-21 Thread Richard Broersma
2010/12/21 Filip Rembiałkowski filip.rembialkow...@gmail.com:
 Or if there is another useful tool available for Connection Pooling. Please
 guide me for this.

 yes there are some; see
 http://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling

 it depends on what you need. pgbouncer is the smallest and most
 lightweight pooler available.

Also, PgBouncer is the only connection pooler that works on MS Windows
and is released for production use.


-- 
Regards,
Richard Broersma Jr.



Re: [GENERAL] How to use pgbouncer

2010-12-21 Thread Andreas Kretschmer
Richard Broersma richard.broer...@gmail.com wrote:

 2010/12/21 Filip Rembiałkowski filip.rembialkow...@gmail.com:
  Or if there is another useful tool available for Connection Pooling. Please
  guide me for this.
 
  yes there are some; see
  http://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling
 
  it depends on what you need. pgbouncer is the smallest and most
  lightweight pooler available.
 
 Also, Pgbouncer is the only connection-pooler that work on MS-Windows
 that is released for production use.

I'm looking for a solution to split read and write access to different
servers (streaming replication, you know ...). Can I do that with
pgpool (setting backend_weightX = 0 or 1)? I have read the docs, but I'm
not sure if pgpool is the right solution; maybe you can enlighten me?
(or someone else ...)


Andreas
-- 
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.  (Linus Torvalds)
If I was god, I would recompile penguin with --enable-fly.   (unknown)
Kaufbach, Saxony, Germany, Europe.  N 51.05082°, E 13.56889°



[GENERAL] Constraining overlapping date ranges

2010-12-21 Thread McGehee, Robert
PostgreSQLers,
I'm hoping for some help creating a constraint/key on a table such that there 
are no overlapping ranges of dates for any id. 

Specifically: using PostgreSQL 9.0.1, I'm creating a name-value pair table 
like this:

CREATE TABLE tbl (id INTEGER, start_date DATE, stop_date DATE, value REAL);

For a given id, I'd like to enforce that there is only one valid value on a 
given date. For instance, this would be acceptable:

id  start_date  stop_date   value
2   2010-11-01  2010-12-01  3
2   2010-12-02  2010-12-15  4
3   2010-10-15  2010-12-15  -3

But this would not (notice the start_date of line 2 is before the stop_date of line 1):
id  start_date  stop_date   value
2   2010-11-01  2010-12-01  3
2   2010-11-30  2010-12-15  4
3   2010-10-15  2010-12-15  -3

I'd also appreciate it if anyone can provide any indexing hints on this table 
to optimize queries like:
SELECT value FROM tbl WHERE id=2 and '2010-12-02' BETWEEN start_date AND 
stop_date;

Thanks in advance, and sorry if I overlooked any obvious documentation!

Robert McGehee



Re: [GENERAL] Constraining overlapping date ranges

2010-12-21 Thread Richard Broersma
On Tue, Dec 21, 2010 at 7:49 AM, McGehee, Robert
robert.mcge...@geodecapital.com wrote:
 PostgreSQLers,
 I'm hoping for some help creating a constraint/key on a table such that there 
 are no overlapping ranges of dates for any id.

There is something you can try, but it is not exactly what you want
since it is based on timestamps rather than dates:

http://temporal.projects.postgresql.org/
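
(Editor's note: on PostgreSQL 9.2 and later the same rule can be stated
natively as an exclusion constraint over a date range; a sketch against the
tbl definition from the original post, plus an index for the lookup query:)

-- One value per id per date: ranges for the same id must not overlap.
-- daterange needs 9.2+; the = operator on id needs the btree_gist extension.
CREATE EXTENSION IF NOT EXISTS btree_gist;
ALTER TABLE tbl ADD CONSTRAINT tbl_no_overlap
  EXCLUDE USING gist (id WITH =,
                      daterange(start_date, stop_date, '[]') WITH &&);

-- Helps queries of the form: WHERE id = ? AND ? BETWEEN start_date AND stop_date
CREATE INDEX tbl_id_dates_idx ON tbl (id, start_date, stop_date);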


-- 
Regards,
Richard Broersma Jr.



[GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Kelly Burkhart
Hello, I'm sending a group of queries to the database with PQsendQuery
and using PQgetResult to return results similar to this:

PQsendQuery( conn, "select current_timestamp; select pg_sleep(1); select current_timestamp" );

while( (result = PQgetResult( conn )) )
    doSomethingWith( result );

I'm finding that PQgetResult() will not return the first result until
all three results are ready.  In real life I would be sending
consequential queries and would like the front end to be processing
one result while the backend is producing the next one.  The
documentation for PQgetResult in section 30.4 of the 8.4.5 manual
suggests that this is what should happen.

Can anyone explain if sendQuery/getResult is intended to work as
documented?  And if so what my problem may be?  (perhaps my pg_sleep
is preventing the first result from being written?)

Thanks,

-Kelly



Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Tomas Vondra
On 21.12.2010 16:34, Jeremy Harris wrote:
 There really is no automatic way to solve this puzzle using a single
 query. Indexing strategy is a very tough design discipline, and it
 requires a complex knowledge of the workload. One slow query does not
 mean
 the index should be created - what if that was just an ad-hoc query and
 will not be executed ever again?
 
 Then the index you just built gets automatically dropped, as I said above.

I'm a bit confused. Should the indexes be dropped automatically (as you
state here) or kept for the future? Because if they should be dropped,
then it does not make sense to do this magic just for a limited time
after the DB goes live.

 I really don't want to work with products that try to be smarter than me
 (and I admit I'm pretty dumb from time to time) and act rather randomly
 due to this 'artificial intelligence'. I've already been there and I
 don't
 want to repeat this experience.
 
 Then, since you're not a neophyte, leave the feature turned off. But
 don't deny the possibility of using it to actual neophytes.

This is not a question of whether I am a neophyte or not, this is a
question of good practices. My experience is that building indexes
should be done when designing the application, and tested at in a test
environment. At production, log slow queries, analyze the log and add
indexes when needed.

This 'neophyte approach' is a really bad idea from my point of view. It
inevitably leads to a database with a zillion of unnecessary indexes,
missing the right ones. And any limits don't prevent this. In the end
you'll get a slow database, and the neophytes would blame the database
although that state is inevitable.

I've been working with several database products over the years, and
AFAIK none of them does this. In most cases 'the whole industry is
wrong' is usually a really bad sign (does not hold if you're Steve Jobs).

Most of them do have 'advisors' though - that's a good idea, and I think
sooner or later that will be in PostgreSQL too.

 as in b): Creating an index is quite expensiv

 How much more so than doing that full-table-scan plus sort, which your
 query is doing anyway?

 A lot. It consumes a lot of CPU, it may consume a lot of memory (up to
 maintenance_work_mem) etc. So imagine a few users, building indices on a
 big table simultaneously
 
 Pffthht.   One simple trylock, used only by the auto-indexer.

Pffthht? Anyway trylock is not an answer. Think about a different table
for each user.

 Building an index is just one side of the problem - maintenance of the
 indexes is another thing. Each index has an impact on write operations
 (INSERT/UPDATE) and may prevent HOT updates from working (thus
 causing unnecessary bloat of the index).
 
 This is worth discussing.How much help does the DBMS currently give
 the DBA in evaluating these tradeoffs?Could we do better, given an
 actual installation and workload?

The database gives you explain plans, plus pg_stat_* and pg_statio_*
views. That's a lot of data, although does not provide a complete view
in many cases. And then there's a log_min_duration, which is the primary
weapon in fighting slow queries after going live.

 And the indexes may need a lot of
 space on disk.
 
 By all means require limits as well as a don't do that switch.

As I said, this is not a good approach from my POV. And it's really,
really hard to implement this in a way that those who don't want to use
it don't have to pay the price. Because what portion of users would
really use this feature? 0.0001%?

 But the real show stopper is probably locking. Building an index takes a
 write lock on a table, effectively blocking writes. Sure, you can use a
 'CREATE INDEX ... CONCURRENTLY' introduced in 8.2, but there are some
 disadvantages of that (see
 http://www.postgresql.org/docs/9.0/static/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY).

 
 The only disadvantages I see there are a) the leftover invalid index -
 which feels like a
 bug; why is it not auto-dropped?   and b) the second scan, which implies
 more total work
 and a wish to background that portion after completing the query triggering
 the auto-index.

Dropping an index automatically is a really bad idea. Actually, dropping
indexes in general (no matter whether the index was created automatically or
manually) is a very tricky thing.

How do you decide which indexes are really leftover from those that
are used only rarely? I've seen an application fail spectacularly
because a DBA dropped an index that was not used ... except by a batch
process that runs once a year, to close a fiscal year.

 Don't forget I suggested doing this only for a limited time after DB
 creation.  Possibly another
 reason to turn it off should be any manual index creation, as that
 implies that the DBA
 knows about indexing.
 
 I see in another thread you suggest merely placing hints in the log.
 That's a fine first step - but I'll then be wanting to auto-parse that
 log to auto-create ...


Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Jeremy Harris

On 2010-12-21 18:50, Tomas Vondra wrote:

Then the index you just built gets automatically dropped, as I said above.


I'm a bit confused. Should the indexes be dropped automatically (as you
state here) or kept for the future. Because if they should be dropped,
then it does not make sense to do this magic just for a limited time
after the DB goes live.


Here is what I said:
"track those that actually get re-used and remove the rest."

Which part is confusing?

- Jeremy



Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Peter Geoghegan
You can't concurrently execute queries from within a single
connection. Perhaps you should use multiple connections, while
understanding the implications of having each operate within a
separate snapshot.

Don't forget to free memory with PQclear(). I guess you omitted that
because it's just pseudo-code.

-- 
Regards,
Peter Geoghegan



Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Tomas Vondra
On 21.12.2010 20:03, Jeremy Harris wrote:
 On 2010-12-21 18:50, Tomas Vondra wrote:
 Then the index you just built gets automatically dropped, as I said
 above.

 I'm a bit confused. Should the indexes be dropped automatically (as you
 state here) or kept for the future. Because if they should be dropped,
 then it does not make sense to do this magic just for a limited time
 after the DB goes live.
 
 Here what I said:
 track those that actually get re-used and remove the rest.
 
 Which part is confusing?

As I described, identifying which indexes are actually used is a very
tricky task. And it's not difficult to come up with scenarios where this
causes significantly more harm than good.

Basically, the time to keep the indices needs to be long enough that the
indexes that are actually used are not dropped (and the resources spent
creating them actually pay off). But on the other hand it needs to be
short enough that resources are not wasted on unused indices.

Which are clearly contradictory requirements.

And the 'limits' you've proposed make that even worse: when the
unnecessary indices get created first and take most of the resources
(e.g. disk space), the indexes that are actually needed won't be
created because of those limits.

regards
Tomas



Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 2:21 PM, Peter Geoghegan
peter.geoghega...@gmail.com wrote:
 You can't concurrently execute queries from within a single
 connection. Perhaps you should use multiple connections, while
 understanding the implications of having each operate within a
 separate snapshot.

OP is not suggesting that queries run concurrently, but asking why
result sets can't be popped off as the queries resolve.  It's a good
question; it's probably either a bug in the database or the
documentation (if it does not turn out to be operator error).

Kelly, if you can produce small test case in C I'll double check it.

merlin



Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Kelly Burkhart
Yes, I omitted the PQclear for simplicity.

I'm not concurrently executing queries; I'm sending multiple queries
to be executed serially by the backend.  I'm expecting the server to
send me the PGresult objects as each query completes, rather than
sending them all *after* all of the queries have completed.

That will result in some amount of concurrency as my client will be
processing one result while the backend is processing the next query.

-K

On Tue, Dec 21, 2010 at 1:21 PM, Peter Geoghegan
peter.geoghega...@gmail.com wrote:
 You can't concurrently execute queries from within a single
 connection. Perhaps you should use multiple connections, while
 understanding the implications of having each operate within a
 separate snapshot.

 Don't forget to free memory with PQclear() . I guess you omitted that
 because it's just pseudo-code.

 --
 Regards,
 Peter Geoghegan




Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Kelly Burkhart
This should do it:

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

#define CONNINFO "your info here"
#define COMMANDS "select current_timestamp; select pg_sleep(5); select current_timestamp"

void fatal( const char *msg ) { fprintf( stderr, "%s\n", msg ); exit(1); }

int
main()
{
    PGresult *res = 0;
    PGconn *conn = PQconnectdb( CONNINFO );
    if (!conn) fatal("PQconnectdb returned null");

    if ( PQstatus(conn) != CONNECTION_OK ) {
        PQfinish( conn );
        fatal("PQconnectdb failed");
    }

    if (!PQsendQuery(conn, COMMANDS)) {
        PQfinish( conn );
        fatal("PQsendQuery failed");
    }

    while( (res = PQgetResult( conn )) != 0 ) {
        printf("retrieved result\n");
        PQclear(res);
    }

    PQfinish( conn );
    return 0;
}



On Tue, Dec 21, 2010 at 1:26 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Dec 21, 2010 at 2:21 PM, Peter Geoghegan
 peter.geoghega...@gmail.com wrote:
 You can't concurrently execute queries from within a single
 connection. Perhaps you should use multiple connections, while
 understanding the implications of having each operate within a
 separate snapshot.

 OP is not suggesting that queries run concurrently, but asking why
 result sets can't be popped off as the queries resolve.  It's a good
 question; it's probably either a bug in the database or the
 documentation (if it does not turn out to be operator error).

 Kelly, if you can produce small test case in C I'll double check it.

 merlin




Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Daniel Verite
Kelly Burkhart wrote:

 #define COMMANDS select current_timestamp; select pg_sleep(5); select
 current_timestamp

You should use clock_timestamp() instead of current_timestamp, because
current_timestamp returns a fixed value throughout a transaction.
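
(A quick illustration of the difference; both functions are called in the same
transaction, but only clock_timestamp() advances:)

-- current_timestamp is frozen at transaction start; clock_timestamp() is not.
BEGIN;
SELECT current_timestamp, clock_timestamp();
SELECT pg_sleep(1);
SELECT current_timestamp, clock_timestamp();  -- first column unchanged
COMMIT;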

Best regards,
-- 
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org



Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 3:07 PM, Daniel Verite dan...@manitou-mail.org wrote:
        Kelly Burkhart wrote:

 #define COMMANDS select current_timestamp; select pg_sleep(5); select
 current_timestamp

 You should use clock_timestamp() instead of current_timestamp, because
 current_timestamp returns a fixed value throughout a transaction.

Well, that's correct, but irrelevant -- Kelly's analysis is correct.
The documentation for PQgetResult states:

Using PQsendQuery and PQgetResult solves one of PQexec's problems: If
a command string contains multiple SQL commands, the results of those
commands can be obtained individually. (This allows a simple form of
overlapped processing, by the way: the client can be handling the
results of one command while the server is still working on later
queries in the same command string.) However, calling PQgetResult will
still cause the client to block until the server completes the next
SQL command. This can be avoided by proper use of two more functions:

but control is not returned until all three queries have resolved.
This is probably an issue with libpq.  Investigating...

merlin



Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 3:14 PM, Merlin Moncure mmonc...@gmail.com wrote:
 [...]

 but control is not returned until all three queries have resolved.
 this is probably an issue with libpq.  investigating...

hm, it looks like the backend is not flushing the CommandComplete message
for each command until finishing all the queries.  This message is what
signals libpq that a particular command has been executed.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 3:37 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Dec 21, 2010 at 3:14 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Dec 21, 2010 at 3:07 PM, Daniel Verite dan...@manitou-mail.org 
 wrote:
        Kelly Burkhart wrote:

 #define COMMANDS select current_timestamp; select pg_sleep(5); select
 current_timestamp

 You should use clock_timestamp() instead of current_timestamp, because
 current_timestamp returns a fixed value throughout a transaction.

 Well, that's correct, but irrelevant -- Kelly's analysis is correct.
 The documentation for PQgetResult states:

 Using PQsendQuery and PQgetResult solves one of PQexec's problems: If
 a command string contains multiple SQL commands, the results of those
 commands can be obtained individually. (This allows a simple form of
 overlapped processing, by the way: the client can be handling the
 results of one command while the server is still working on later
 queries in the same command string.) However, calling PQgetResult will
 still cause the client to block until the server completes the next
 SQL command. This can be avoided by proper use of two more functions:

 but control is not returned until all three queries have resolved.
 this is probably an issue with libpq.  investigating...

 hm, it looks like the backend is not flushing the CommandComplete message
 for each command until finishing all the queries.  This message is what
 signals libpq that a particular command has been executed.

to see this in action, you can interject a query between queries 1 & 2
that sends a lot of data. the 'lots of data' forces query one's protocol
messages to flush out, which the client handles properly.  this is likely
a backend bug -- it needs to force a flush upon command completion?
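
For example, padding the batch with a statement that returns a lot of rows
(an illustrative command string, not from the original report) makes the
first result come back right away:

select clock_timestamp();
select x from generate_series(1, 100000) as g(x);  -- large result forces a flush
select pg_sleep(5);
select clock_timestamp();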

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 3:40 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Dec 21, 2010 at 3:37 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Dec 21, 2010 at 3:14 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Dec 21, 2010 at 3:07 PM, Daniel Verite dan...@manitou-mail.org 
 wrote:
        Kelly Burkhart wrote:

 #define COMMANDS select current_timestamp; select pg_sleep(5); select
 current_timestamp

 You should use clock_timestamp() instead of current_timestamp, because
 current_timestamp returns a fixed value throughout a transaction.

 Well, that's correct, but irrelevant -- Kelly's analysis is correct.
 The documentation for PQgetResult states:

 Using PQsendQuery and PQgetResult solves one of PQexec's problems: If
 a command string contains multiple SQL commands, the results of those
 commands can be obtained individually. (This allows a simple form of
 overlapped processing, by the way: the client can be handling the
 results of one command while the server is still working on later
 queries in the same command string.) However, calling PQgetResult will
 still cause the client to block until the server completes the next
 SQL command. This can be avoided by proper use of two more functions:

 but control is not returned until all three queries have resolved.
 this is probably an issue with libpq.  investigating...

 hm, it looks like the backend is not flushing the CommandComplete message
 for each command until finishing all the queries.  This message is what
 signals libpq that a particular command has been executed.

 to see this in action, you can interject a query between queries 1 & 2
 that sends a lot of data. the 'lots of data' forces query one's protocol
 messages to flush out, which the client handles properly.  this is likely
 a backend bug -- it needs to force a flush upon command completion?

hm, a pq_flush() after the command-completion pq_putmessage() in
backend/tcop/dest.c seems to fix the problem.  I'll send a patch up to
-hackers.  They might backpatch it, unless there is a good reason not
to do this (I can't think of any).

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to use pgbouncer

2010-12-21 Thread Filip Rembiałkowski
2010/12/21 Andreas Kretschmer akretsch...@spamfence.net:

 I'm looking for a solution to split read and write access to different
 servers (streaming replication, you know ...). Can i do that with
 pgpool? (setting backend_weightX=0 or 1)? I have read the doc, but i'm
 not sure if pgpool the right solution, maybe you can enlighten me?
 (or someone else ...)

did you see 
http://pgpool.projects.postgresql.org/pgpool-II/doc/pgpool-en.html#master_slave_mode
? I think it answers your question :-)

warning:
it works by detecting which queries are read-only, which does not
always work out of the box.
you will sometimes have to modify driver settings or, worse, modify SQL
queries to include hints for pgpool.

see:
http://pgpool.projects.postgresql.org/pgpool-II/doc/pgpool-en.html#restriction
http://pgpool.projects.postgresql.org/pgpool-II/doc/pgpool-en.html#white_function_list
http://pgpool.projects.postgresql.org/pgpool-II/doc/pgpool-en.html#black_function_list
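
As a rough sketch, the relevant pgpool.conf settings for read/write
splitting over streaming replication look something like this (host names
are placeholders; check the pgpool-II docs above for your version):

master_slave_mode = true
master_slave_sub_mode = 'stream'   # streaming replication mode
load_balance_mode = true           # spread read-only queries across backends

backend_hostname0 = 'master.example.com'
backend_port0 = 5432
backend_weight0 = 0                # weight 0 keeps balanced reads off this node

backend_hostname1 = 'standby.example.com'
backend_port1 = 5432
backend_weight1 = 1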

greets,
  Filip

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] What Programs Do You Use For PG?

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 1:32 AM, Neil D'Souza
neil.xavier.dso...@gmail.com wrote:
 You can have a look at my project on SourceForge:
 http://sourceforge.net/projects/proghelp -- it builds applications with PG
 as a backend automatically, using a modified CREATE TABLE SQL grammar as
 input.

 1. It automatically generates stored procedures to insert and retrieve data
 2. a database API layer to access the data in C++
 3. a user interface using the Wt C++ library
     i.   search keys marked in the input file automatically become search
 keys in the user interface, which correctly invoke the generated stored
 procedures.
     ii.  Foreign key references automatically become dialog boxes.
     iii. Any search keys in the referenced tables will become search
 keys in the dialog box - again with correct invocations.
     iv. Any tables marked as 'references multi' in the input file
 automatically become master/details tables - using a tabview widget for
 details tables. The details tables automatically use the primary key from
 the master table to do their inserts/retrieval of data from the system.

 4. Random data for testing.
 5. scripts to create tables in topological order (when there are foreign key
 references the referenced tables are output first, for the master/details
 tables the master tables are output first), and drop tables and stored
 procedures ( when you want to prevent clutter). All the scripts are
 collected in a unified script generator to save you the trouble of running
 them individually.
 6. Automatic creation of authorization and authentication modules - with
 fine grained control on view, add, edit and view summary.

 The input file is about 200 lines; the generated code is about 15,000 lines
 of C++ and about 12,000 lines of stored procedures.

 What's not yet implemented:

 1. Paging of records.
 2. Automatic edit for records.
 3. Currently the search is properly implemented for varchar fields; it still
 needs fixing for dates and int/float data.
 4. A better random data generator.
 5. god knows what else


 You can see a sample application developed using the code generator here:

  http://173.230.133.34:8080/

 User logins and passwords are below.

  user_login_code | user_login_name | user_password | employee_code
 -----------------+-----------------+---------------+---------------
                1 | zenond          | zenond123     |             1
                2 | atuld           | atul123       |             2
                3 | michaeld        | michael123    |             3
                4 | nxd             | nxd123        |             4
                5 | veerad          | veerad123     |             5
                6 | chanchud        | chanchud123   |             6
                7 | wilburd         | wilburd123    |             7
                8 | abhishekc       | abhishekc123  |             8

 If it crashes - please mail me and I will restart it.

 Kind Regards,
 Neil

interesting project.  do you have any documentation describing how it works?

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] schemaverse!

2010-12-21 Thread Merlin Moncure
A PostgreSQL-based game that you can play from psql!  Written by Abstrct (Josh)

http://www.schemaverse.com/

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] schemaverse!

2010-12-21 Thread A.M.

On Dec 21, 2010, at 5:06 PM, Merlin Moncure wrote:

 A postgresql based game, that you can play from psql!  Written by Abstrct 
 (Josh)
 
 http://www.schemaverse.com/

Finally, a game which makes it look like I am doing work!


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Understanding PG9.0 streaming replication feature

2010-12-21 Thread Ben Carbery
FYI, not looking for a detailed how-to here.. I have read the manual twice
and just can't figure out which sections are relevant. The manual seems to be
trying to cover all uses simultaneously, which is always going to get
confusing :) For example, do I need WAL archiving or not?

On Tue, Dec 21, 2010 at 2:40 PM, Ben Carbery ben.carb...@gmail.com wrote:

 Hi,

 I am having some trouble trying to figure out how to configure this
 particular scenario..

 I have a pair of pg servers that I want to put in a Master/Standby
 configuration. Currently a script dumps the master db every hour, copies it
 to the standby, restores, and restarts the server. The aim is to replace the
 dumps/restores with streaming replication and ensure the standby is always
 up to date.

 In this case writes are infrequent, but reads are constant, and I only need
 high availability for reads. I would ideally like both master and standby to
 be available simultaneously to allow load-balancing.
 My confusion seems to be around the fact that I don't need failover - my
 applications will detect a master down and immediately start using the
 standby, so there is no need to allow writes on the standby; writes will
 just wait for the master to be available again. I am not sure what the
 minimum config needed for this scenario is..

 cheers,

 Ben





[GENERAL] plpython returns integer[] fails for multi-dimensional array

2010-12-21 Thread TJ O'Donnell
In postgresql-9.0.1 I have to modify my plpython functions that return arrays.
It seems one-dimensional arrays are handled properly, but not
2-dimensional arrays.

create or replace function atest() returns integer[] as $eopy$
 a = list()
 a.append(1)
 a.append(2)
 a.append(3)
 #return a works fine
 b = list()
 b.append(a)
 b.append(a)
 # error
 return b
$eopy$ language plpythonu


select atest() gives
obtest=# select atest();
ERROR:  invalid input syntax for integer: [1, 2, 3]
CONTEXT:  while creating return value
PL/Python function atest

How can I return multi-dimensional arrays in plpython?

TJ O'Donnell

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] plpython returns integer[] fails for multi-dimensional array

2010-12-21 Thread Thom Brown
On 21 December 2010 22:48, TJ O'Donnell t...@acm.org wrote:
 In postgresql-9.0.1 I have to modify my plpython functions that return arrays.
 It seems one-dimensional arrays are handled properly, but not
 2-dimensional arrays.

 create or replace function atest() returns integer[] as $eopy$
  a = list()
  a.append(1)
  a.append(2)
  a.append(3)
  #return a works fine
  b = list()
  b.append(a)
  b.append(a)
  # error
  return b
 $eopy$ language plpythonu


 select atest() gives
 obtest=# select atest();
 ERROR:  invalid input syntax for integer: [1, 2, 3]
 CONTEXT:  while creating return value
 PL/Python function atest

 How can I return multi-dimensional arrays in plpython?

Are you sure that a returns okay in that scenario?  You're using a
list.  Shouldn't you be using an array?  Like: a = []

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] plpython returns integer[] fails for multi-dimensional array

2010-12-21 Thread Peter Geoghegan
On 21 December 2010 23:17, Thom Brown t...@linux.com wrote:
 Are you sure that a returns okay in that scenario?  You're using a
 list.  Shouldn't you be using an array?  Like: a = []

a = [] actually declares an empty list in Python. You can return a list
or a tuple from a pl/python function in 9.0 and it will be interpreted
as an array at the SQL call site. You cannot in prior versions.
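
For instance, this sketch works on 9.0 (but not on earlier releases) for
the one-dimensional case:

create or replace function one_dim() returns integer[] as $$
  return [1, 2, 3]   # a tuple (1, 2, 3) would work too
$$ language plpythonu;

select one_dim();  -- {1,2,3}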

--
Regards,
Peter Geoghegan

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Understanding PG9.0 streaming replication feature

2010-12-21 Thread Satoshi Nagayasu

Hi Ben,

On 2010/12/22 7:46, Ben Carbery wrote:

FYI, not looking for a detailed how-to here.. I have read the manual twice and
just can't figure out which sections are relevant. The manual seems to be trying to
cover all uses simultaneously, which is always going to get confusing :) For
example, do I need WAL archiving or not?


My blog entry would be a good entry point for you.  :)

5 steps to implement a PostgreSQL replication system
http://pgsnaga.blogspot.com/2010/05/5-steps-to-implement-postgresql.html

It was written to be a guide for building a simple master-slave config.

Please take a look, including the comments.

Thanks,


--
NAGAYASU Satoshi satoshi.nagay...@gmail.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] plpython returns integer[] fails for multi-dimensional array

2010-12-21 Thread Adrian Klaver
On Tuesday 21 December 2010 2:48:16 pm TJ O'Donnell wrote:
 In postgresql-9.0.1 I have to modify my plpython functions that return
 arrays. It seems one-dimensional arrays are handled properly, but not
 2-dimensional arrays.

 create or replace function atest() returns integer[] as $eopy$
  a = list()
  a.append(1)
  a.append(2)
  a.append(3)
  #return a works fine
  b = list()
  b.append(a)
  b.append(a)
  # error
  return b
 $eopy$ language plpythonu


 select atest() gives
 obtest=# select atest();
 ERROR:  invalid input syntax for integer: [1, 2, 3]
 CONTEXT:  while creating return value
 PL/Python function atest

 How can I return multi-dimensional arrays in plpython?

 TJ O'Donnell

Maybe:
create or replace function atest() returns integer[][] 

-- 
Adrian Klaver
adrian.kla...@gmail.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 hm, a pq_flush() after command completion putmessage in
 backend/tcop/dest.c seems to fix the problem.  I'll send up a patch to
 -hackers.  They might backpatch it, unless there is a good reason not
 to do this (I can't think of any).

If you just unconditionally flush there, it will result in an extra
network message in the normal case where there's not another query
to do.  The current code is designed not to flush until it sends
ReadyForQuery.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] plpython returns integer[] fails for multi-dimensional array

2010-12-21 Thread Adrian Klaver
On Tuesday 21 December 2010 3:25:48 pm Peter Geoghegan wrote:
 On 21 December 2010 23:17, Thom Brown t...@linux.com wrote:
  Are you sure that a returns okay in that scenario?  You're using a
  list.  Shouldn't you be using an array?  Like: a = []

 a =[] actually declares an empty list in Python. You can return a list
 or a tuple from a pl/python function in 9.0 and it will be interpreted
 as an array at the SQL call site. You cannot in prior versions.

 --
 Regards,
 Peter Geoghegan

Digging into the source for plpython seems to show it only supports
one-dimensional arrays. When I tried my previous example on a 9.0.1 instance,
it kept changing integer[][] to integer[].

-- 
Adrian Klaver
adrian.kla...@gmail.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 6:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 hm, a pq_flush() after command completion putmessage in
 backend/tcop/dest.c seems to fix the problem.  I'll send up a patch to
 -hackers.  They might backpatch it, unless there is a good reason not
 to do this (I can't think of any).

 If you just unconditionally flush there, it will result in an extra
 network message in the normal case where there's not another query
 to do.  The current code is designed not to flush until it sends
 ReadyForQuery.

yeah, I was looking at that.  I don't see an easy way to test if there
is another query waiting to execute right there.  Maybe a
documentation patch is in order :-).

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 On Tue, Dec 21, 2010 at 6:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 If you just unconditionally flush there, it will result in an extra
 network message in the normal case where there's not another query
 to do.  The current code is designed not to flush until it sends
 ReadyForQuery.

 yeah, I was looking at that.  I don't see an easy way to test if there
 is another query waiting to execute right there.  Maybe a
 documentation patch is in order :-).

dest.c doesn't have the info available.  I think that to do this, we'd
need to move the responsibility for calling pq_flush out to postgres.c.
Not sure if it's worth it.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Trusted Startup

2010-12-21 Thread Craig Ringer

On 12/22/2010 02:05 AM, Kenneth Buckler wrote:


I find it very comforting that I am not the only one who finds this
requirement a bit out there.
Unfortunately, these requirements are set in stone, and no matter how
hard I try, they cannot be altered.
We live in a world where compliance is king.  Nevermind if compliance
doesn't actually make the system more secure.

Unfortunately Tripwire does not meet the full requirement, as it does
not prevent the database from starting.


In this case, here's what I'd do:

- Call Red Hat
- Say I'd like to buy RHEL, but have X weird requirement; can you
  help me?

--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Trusted Startup

2010-12-21 Thread Craig Ringer

We live in a world where compliance is king.  Nevermind if compliance
doesn't actually make the system more secure.


Er .. re my previous post, I don't mean lie to RH and claim to want to 
buy RHEL to get free support. I mean that you should consider going to 
management and getting approval for professional support and integration 
work from a specialist, because you're going to need it.


Alternately you could do the dodgy Trusted GRUB + signed kernel + signed 
initrd with scripted GnuPG verification hack. It'd be a lot better than 
nothing if your target server has a TPM you can enable and use for 
Trusted GRUB.


--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Understanding PG9.0 streaming replication feature

2010-12-21 Thread Matthias Müller
Hi Ben,

Load balancing is not possible with the tools that ship with the postgres 
installation. There is no automatic switch-over to a slave if the master 
fails: a trigger file needs to be created to promote a slave to master. This 
is not done automatically by postgres, but should be done by cluster 
software (e.g. Pacemaker). 

If you can live without load balancing, read Satoshi's page. 

But the slaves can handle read requests (SELECT). Maybe this helps a little 
bit. 
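
For reference, the trigger file is named in the standby's recovery.conf,
roughly like this (the path is just a placeholder):

standby_mode = 'on'
trigger_file = '/var/lib/pgsql/9.0/promote.trigger'  # promotion happens when this file appears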

Regards 
Matthias

On Wednesday, 22 December 2010, at 00:31:14, Satoshi Nagayasu wrote:
 Hi Ben,
 
 On 2010/12/22 7:46, Ben Carbery wrote:
  FYI, not looking for a detailed how-to here.. I have read the manual
  twice and just can't figure out which sections are relevant. The manual
  seems to be trying to cover all uses simultaneously, which is always
  going to get confusing :) For example, do I need WAL archiving or
  not?
 
 My blog entry would be a good entry point for you.  :)
 
 5 steps to implement a PostgreSQL replication system
 http://pgsnaga.blogspot.com/2010/05/5-steps-to-implement-postgresql.html
 
 It was written to be a guide for building a simple master-slave config.
 
 Please take a look, including the comments.
 
 Thanks,
 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Cannot unsubscribe

2010-12-21 Thread William Gordon Rutherdale (rutherw)
I attempted to unsubscribe from this list (for the holidays) without
success.

Could anyone please help me?  I am continuing to get messages from the
list.

I broke open the message header and did as it said for unsubscribing.

See below for what the majordomo sent back.

-Will


 unsub pgsql-general
 The unsubscribe command did not succeed.
 
 No e-mail addresses matching
   William Gordon Rutherdale (rutherw) ruth...@cisco.com
 are subscribed to the pgsql-general mailing list.
 

Valid commands processed: 1
0 succeeded, 0 stalled, and 1 failed.


Use the following command: 
  sessioninfo ea4ce02c75fbad19425a55ff2bd2bc20b17fd794
to see technical information about this session.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Understanding PG9.0 streaming replication feature

2010-12-21 Thread Fujii Masao
On Wed, Dec 22, 2010 at 8:31 AM, Satoshi Nagayasu
satoshi.nagay...@gmail.com wrote:
 My blog entry would be a good entry point for you.  :)

 5 steps to implement a PostgreSQL replication system
 http://pgsnaga.blogspot.com/2010/05/5-steps-to-implement-postgresql.html

Or
http://wiki.postgresql.org/wiki/Binary_Replication_Tutorial
http://wiki.postgresql.org/wiki/Streaming_Replication#How_to_Use
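
As a rough outline, the minimum for a read-only hot standby with no
automatic failover is something like the following (host names and
credentials are placeholders). With wal_keep_segments set high enough,
separate WAL archiving is optional, at the risk of the standby falling
too far behind to catch up:

# master: postgresql.conf
wal_level = hot_standby
max_wal_senders = 2
wal_keep_segments = 64      # retain enough WAL for the standby to catch up

# master: pg_hba.conf
host  replication  replicator  192.168.0.0/24  md5

# standby: postgresql.conf (after restoring a base backup of the master)
hot_standby = on            # serve read-only queries during recovery

# standby: recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=master.example.com user=replicator password=secret'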

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] plpython returns integer[] fails for multi-dimensional array

2010-12-21 Thread TJ O'Donnell
In previous versions (8.x), for a plpython fn returning integer[],
I created (had to create) a string in the proper SQL format
{ {1,2,3}, {4,5,6} } and returned that.  It worked fine.

I LIKE the ability to not have to do that in 9.0,
but I CAN'T return a string like { {1,2,3}, {4,5,6} } for a fn that
returns integer[],
AND I can't return a two-dimensional array.  Not a happy 9.0 camper.

Anyone know of any plans for 9.0 plpython to support multi-dimensional arrays?
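
In the meantime, one possible workaround (a sketch, not a tested recipe):
keep building the array literal by hand as under 8.x, declare the function
to return text, and cast at the call site:

create or replace function atest2() returns text as $$
  b = [[1, 2, 3], [4, 5, 6]]
  # build the '{{1,2,3},{4,5,6}}' literal by hand, as under 8.x
  return '{' + ','.join('{' + ','.join(str(i) for i in row) + '}'
                        for row in b) + '}'
$$ language plpythonu;

select atest2()::integer[];  -- the cast yields a genuine two-dimensional array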

TJ O'Donnell

On Tue, Dec 21, 2010 at 4:02 PM, Adrian Klaver adrian.kla...@gmail.com wrote:
 On Tuesday 21 December 2010 3:25:48 pm Peter Geoghegan wrote:
 On 21 December 2010 23:17, Thom Brown t...@linux.com wrote:
  Are you sure that a returns okay in that scenario.  You're using a
  list.  Shouldn't you be using an array?  Like: a = []

 a =[] actually declares an empty list in Python. You can return a list
 or a tuple from a pl/python function in 9.0 and it will be interpreted
 as an array at the SQL call site. You cannot in prior versions.

 --
 Regards,
 Peter Geoghegan

 Digging into the source for plpython seems to show it only supports one
 dimensional arrays. When I tried my previous example on a 9.0.1 instance it
 kept changing integer[][] to integer[].

 --
 Adrian Klaver
 adrian.kla...@gmail.com


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Database file copy

2010-12-21 Thread Srini Raghavan
Hello,

We are looking to distribute postgres databases to our customers along with
our application. We are currently evaluating postgres version 8.4.4. The
database can be around 25 GB (the compressed files fit on a few DVDs; the
product is distributed on DVDs). The pg_restore of this database takes several
hours on low-end machines running Windows. pg_restore is run during our
product install, and the current install-time projection is not acceptable.
Our customers can purchase different databases over a period of time, and the
application makes transactional updates to the databases after installation.
Hence, copying the entire data folder instead of using pg_restore is not an
option, as the transactional updates would be lost.

I have read the documentation and the few posts available that discourage
file-copy-based restore of individual databases, but I have found a way to do
this. I would appreciate it if the experts could read and advise whether the
approach will work, given our environment and usage boundaries.

Master Postgres instance (this is where we create the data; we have complete
control of this environment):
1. Create the database and populate data.
2. Set vacuum_freeze_table_age to 0 in postgresql.conf.
3. Run vacuum full - this will reset the row xids to the FrozenXid (see the
sketch after this list).
4. Shut down postgres and take a copy of the files for the given database.
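
For illustration, steps 2-3 can also be done without editing
postgresql.conf, since the FREEZE keyword makes VACUUM freeze every row it
touches (a sketch, 8.4 syntax):

-- master instance: freeze all row xids so the copied files carry no
-- transaction-id dependence on the source cluster
vacuum full freeze;

-- sanity check: relfrozenxid should now be recent for every table
select relname, relfrozenxid from pg_class where relkind = 'r';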

In the deploy instance at the customer site:
1. Create the new database.
2. Shut down the postgres instance and copy the database files created in the
master instance to the database-specific folder.
3. Start postgres instance.

We don't use table row OIDs. If the cluster-wide OID collides with an OID in
the copied database files during subsequent DDL operations, postgres resolves
this by skipping to the next available OID. There will be a delay to find the
next available OID, which is acceptable in our case, as DDL operations at the
customer site are rare. And the vacuum full with vacuum_freeze_table_age set
to 0 on the master instance takes care of the xmin, allowing transactions to
be visible, and further transactions at the customer site to continue without
colliding.


I have tested this and it works, and I am continuing to test it more. I would
like validation of this idea from the experts and the community to make sure
I haven't overlooked something obvious that might cause issues.

Thank you,
Srini


  

Re: [GENERAL] Cannot unsubscribe

2010-12-21 Thread Adrian Klaver
On Tuesday 21 December 2010 4:16:00 pm William Gordon Rutherdale (rutherw) 
wrote:
 I attempted to unsubscribe from this list (for the holidays) without
 success.

 Could anyone please help me.  I am continuing to get messages from the
 list.

 I broke open the message header and did as it said for unsubscribing.

 See below for what the majordomo sent back.

 -Will

  unsub pgsql-general

  The unsubscribe command did not succeed.
 
  No e-mail addresses matching
    William Gordon Rutherdale (rutherw) ruth...@cisco.com
  are subscribed to the pgsql-general mailing list.
 

 Valid commands processed: 1
 0 succeeded, 0 stalled, and 1 failed.


 Use the following command:
   sessioninfo ea4ce02c75fbad19425a55ff2bd2bc20b17fd794
 to see technical information about this session.

You may want to try the Web link at the bottom of the page and access your 
subscription from there. You will need to know the password you where issued 
when you joined though.

-- 
Adrian Klaver
adrian.kla...@gmail.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Table inheritance foreign key problem

2010-12-21 Thread Andy Chambers

Hi,

One of the caveats described in the documentation for table inheritance is  
that foreign key constraints cannot cover the case where you want to check  
that a value is found somewhere in a table or in that table's  
descendants.  It says there is no good workaround for this.


What about using check constraints?

So say you've got cities and capitals from the example and you had some  
other table that wanted to put a foreign key on cities (plus capitals).   
For example, let's keep guidebook info for the cities.  Some cities are  
worthy of guidebooks even though they're not capitals.  Rather than put a  
foreign key constraint on city, would the following work?  What are the  
drawbacks?


create table guidebooks (
  city text check (city in (select name
                            from cities)),
  isbn text,
  author text,
  publisher text);

insert into guidebooks values ('Barcelona', ...); -- not a capital
insert into guidebooks values ('Edinburgh', ...); -- a capital
insert into guidebooks values ('France', ...);    -- fail

--
Andy Chambers

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Table inheritance foreign key problem

2010-12-21 Thread Richard Broersma
On Tue, Dec 21, 2010 at 9:32 PM, Andy Chambers achamb...@mcna.net wrote:
 create table guidebooks (
  city text check (city in (select name
                            from cities)),
  isbn text,
  author text,
  publisher text);

This is a nice idea.  The only problem is that PostgreSQL doesn't
support sub-selects in a table's check constraints:
http://www.postgresql.org/docs/9.0/interactive/sql-createtable.html
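
One workaround that does work (a sketch, untested) is to move the check
into a trigger, since a trigger function can run arbitrary queries and a
select on a parent table also sees rows in its children:

create or replace function check_city_exists() returns trigger as $$
begin
  -- selecting from cities also scans its child tables (e.g. capitals)
  if not exists (select 1 from cities where name = new.city) then
    raise exception 'city % not found in cities or its descendants', new.city;
  end if;
  return new;
end;
$$ language plpgsql;

create trigger guidebooks_city_check
  before insert or update on guidebooks
  for each row execute procedure check_city_exists();

Note this only guards changes to guidebooks; a delete from cities or
capitals would need its own trigger for full foreign-key semantics.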


-- 
Regards,
Richard Broersma Jr.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general