Re: [GENERAL] libpq ASYNC with PQgetResult and PQisBusy

2010-12-21 Thread Raimon Fernandez
hi again,

On 20dic, 2010, at 18:48 , Tom Lane wrote:

 So, now I'm using the PQisBusy to check if postgres is still busy and I can 
 safely call the PQgetResult without blocking, or just wait *some time* 
 before sending a new PQisBusy.
 
 Your proposed code is still a busy-wait loop.  What you should be doing
 is waiting for some data to arrive on the socket.  Once you see
 read-ready on the socket, call PQconsumeInput, then check PQisBusy to
 see if the query is complete or not.  If not, go back to waiting on the
 socket.  Typically you'd use select() or poll() to watch for both data
 on libpq's socket and whatever other events your app is interested in.
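
(For reference, a minimal C sketch of the loop Tom describes; the function
name is illustrative, and connection setup, error reporting and the
surrounding event loop are left out:)

/* Wait on libpq's socket, then drain input until a whole result is buffered. */
#include <sys/select.h>
#include <libpq-fe.h>

static PGresult *wait_result(PGconn *conn)
{
    int sock = PQsocket(conn);
    if (sock < 0)
        return NULL;

    while (PQisBusy(conn)) {            /* result not fully buffered yet */
        fd_set readable;
        FD_ZERO(&readable);
        FD_SET(sock, &readable);
        if (select(sock + 1, &readable, NULL, NULL, NULL) < 0)
            return NULL;                /* select() failed */
        if (!PQconsumeInput(conn))      /* pull newly arrived bytes */
            return NULL;                /* connection trouble */
    }
    return PQgetResult(conn);           /* guaranteed not to block now */
}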

As I'm doing this as an exercise (and who knows), I'm still playing with my 
previous approach; it's the same question, but put a different way:

Why is PQisBusy telling me that it's still busy if, the moment I call 
PQgetResult, I obtain the result?

Now I'm not in a loop: I just send the PQsendQuery from a button, and then I 
click on another button that simply calls PQconsumeInput and PQisBusy. I'm 
clicking on it every second and it always returns busy, but if I call 
PQgetResult I get the result.

So why is libpq not updating its status?

thanks again,

r.



[GENERAL] Can the query planner create indexes?

2010-12-21 Thread Dario Beraldi

Hello,

This question is just for my curiosity...

When an index is available for a query, the planner decides whether to
use it or not depending on whether it would make the query perform
better, right? However, if an index which does not exist would make
the query run better, the planner is not able (allowed?) to create such
an index, use it, and drop it once the query is done. Why is it so?
Why is the query planner not allowed to create indexes, but only
allowed to use or not use what's available?


Am I misunderstanding the way the planner works? Any explanation /
clarification much appreciated!


All the best and Christmas wishes!
Dario

--

Dr. Dario Beraldi
Institute of Evolutionary Biology
University of Edinburgh
West Mains Road
Edinburgh EH9 3JT
Scotland, UK


--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.





Re: [GENERAL] How to guard PostgreSQL from overcharged and/or malicious (read-only) queries?

2010-12-21 Thread Stefan Keller
Hi Alban

Many thanks for your answers.

You answered:
 1. Filter out all SQL commands which are *not* read-only (no DROP
 Most people do this using permissions.

Oh, yes: forgot to mention that; that's obvious. What I was also looking
for was PL/pgSQL's EXECUTE command-string.

 2. Get the estimated time (units) from PostgreSQL planner in a
 reliable way (if possible standard/ANSI).

Ok; again, keep in mind that I have a read-only database. Therefore the
statistics should be up-to-date (after a VACUUM ANALYZE). What I have
in mind is exposing the database to the 'public' for exercising and
testing, in a way similar to the following (try a query like SELECT
ST_AsText(ST_GeometryFromText('POINT(8.5 47.7)', 4326));):
http://openlayers-buch.de/beispiele/chapter-09/postgis-terminal.html

See below my attempt to write such a function, which I called
secure_execute(text). It's still not functioning, and I have
indicated two problems there. What do you think?
I like the idea of abandoning the query if it's obviously(!) wrong
or if the planner alerts me about very high costs.
Or should I rather abandon the idea of such a function and simply rely
on read-only privileges and a session statement_timeout?

Yours, S.

--
-- Executes a query. Aborts if it contains ';' or takes too long.
-- Returns: TABLE
--
DROP FUNCTION secure_execute(text);
--
CREATE OR REPLACE FUNCTION secure_execute(text)
RETURNS SETOF real  --  PROBLEM 1: Want to return the resultset of the query here as table (SETOF RECORD?).
LANGUAGE 'plpgsql' STRICT
AS $$
DECLARE
  query text := $1;
  cost_estimate_txt text;
  max_cost_estimate integer;
  rec RECORD;
BEGIN
  -- Abort if ';' is in query
  -- tbd.

  -- Get max_cost_estimate:
  EXECUTE 'EXPLAIN' || ' ' || query INTO cost_estimate_txt;
  SET statement_timeout TO DEFAULT;
  -- txt example: 'Function Scan on generate_series id (cost=0.00..12.50 rows=1000 width=0)'
  max_cost_estimate := round(CAST(substring(cost_estimate_txt,50,6) AS numeric),0);

  -- Execute query (abort if too time consuming)!
  IF max_cost_estimate < 100 THEN  -- in units (production config.: depends on machine)
    SET statement_timeout TO 10; -- in ms (production config.: set to a minute = 60000 ms)
    EXECUTE $1 INTO rec;
    SET statement_timeout TO DEFAULT;
  END IF;
  RETURN; --  PROBLEM 2: want to return rec here.

  -- Error handling: Catch all
  EXCEPTION WHEN OTHERS THEN
SET statement_timeout TO DEFAULT;
RAISE NOTICE 'ERROR';
RETURN;
END;
$$;
-- Test (positive):
SELECT secure_execute('SELECT random() FROM generate_series(1, 100) AS id');
-- Test (not ok):
SELECT secure_execute('SELECT random() FROM generate_series(1, 10) AS id');  -- timeout
SELECT secure_execute('SELECT random() FROM generate_series(1, 100) AS id;SELECT * FROM generate_series(1, 100)'); -- two commands
SELECT secure_execute('DROP TABLE IF EXISTS dummy');  -- malicious!
SELECT secure_execute('SELECT random() FROM generate_series(1, 100) AS id;DROP TABLE IF EXISTS dummy'); -- two commands, one malicious
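
(An editor's aside: the fixed-offset substring(cost_estimate_txt,50,6) above
is fragile, since the cost's position depends on the plan's first line. A
regexp is one sturdier way to pull out the total cost; a sketch, assuming the
usual 'cost=start..total' format and standard_conforming_strings = on:)

-- Extracts the upper cost bound from the first line of EXPLAIN output.
SELECT round(CAST(
         substring('Function Scan on generate_series id  (cost=0.00..12.50 rows=1000 width=0)'
                   from '\.\.([0-9]+\.[0-9]+)') AS numeric)) AS max_cost_estimate;
-- returns 13 (12.50, rounded)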


2010/12/20 Alban Hertroys dal...@solfertje.student.utwente.nl:
 On 20 Dec 2010, at 10:05, Stefan Keller wrote:

 I'd like to guard postgres from overcharged and/or malicious queries.

 The queries are strictly read-only (from a SQL user's perspective).
 For doing this I'd like to code two functions (preferably pl/pgsql):

 1. Filter out all SQL commands which are *not* read-only (no
 DROP/DELETE/UPDATE/TRUNCATE).

 Most people do this using permissions.

 2. Get the estimated time (units) from PostgreSQL planner in a
 reliable way (if possible standard/ANSI).


 I don't think there's a way to do that directly, not without hacking the 
 source.

 What you can do is to have all users go through a SECURITY DEFINER type of 
 function that does this for them. That function can then read the output of 
 EXPLAIN query for its estimates. Those aren't exactly times, but cost 
 estimates. The actual time taken depends on your hardware, you would need to 
 do some measurements to see how planned costs and actual time relate.

 I'm not sure this is a good idea though.
 Firstly, testing the query plan adds a little overhead to every query coming 
 in. It's not a lot, but if someone fires a lot of small fast queries it could 
 become a problem. You would be hurting the people who're using your database 
 correctly, instead of the people who're abusing it.

 Secondly, you could achieve a similar effect by limiting the amount of time a 
 query is allowed to run. I'm pretty sure there are configuration options that 
 cause long-running queries to get killed after a set time.

 Thirdly... Reliable estimates??? Lol!
 Seriously, as hard as Postgres tries to keep statistics that make sense, I 
 don't think they can ever be considered entirely reliable. You may not be 
 vacuuming frequently enough, your statistics target may be too small or your 
 data might not be suitable for statistical analysis (random numbers and very 
 unbalanced distributions are good [...]

Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Gabriele Bartolini

Ciao Dario,

On Tue, 21 Dec 2010 09:14:36 +0000, Dario Beraldi 
dario.bera...@ed.ac.uk wrote:

 the query run better the planner is not able (allowed?) to create such
 index, use it, and drop it once the query is done. Why is it so?


Because it is not its responsibility. This is the simplest and most 
rational answer.


I would reply with some questions that will hopefully give you the 
answer. How can you tell a 'good' query from a 'bad' query? Consider 
the case where a user launches a wrong query and the planner you propose 
starts creating an index. I believe that would be a mess.


Then ... how could you project this scenario onto a concurrent context 
where multiple users launch queries that 'need' an index?


I suggest that you look at the documentation for more information. 
Otherwise, I strongly suggest that you read the chapter on the planner 
from Greg Smith's book on high performance (which you can find from here: 
http://www.postgresql.org/docs/books/)


Merry Christmas to you too!

Cheers,
Gabriele

--
 Gabriele Bartolini - 2ndQuadrant Italia
 PostgreSQL Training, Services and Support
 gabriele.bartol...@2ndquadrant.it - www.2ndQuadrant.it



Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Massa, Harald Armin
Hello Dario,

When an index is available for a query, the planner decides whether to use
 it or not depending on whether it would make the query perform better,
 right? However if an index, which does not exist, would make the query run
 better the planner is not able (allowed?) to create such index, use it, and
 drop it once the query is done. Why is it so?


From my knowledge there is more than one answer:

a) There is a proposal (and, at the time being, also some code on pgfoundry)
creating hypothetical indexes
http://postgresql.1045698.n5.nabble.com/Hypothetical-Indexes-PostgreSQL-extension-PGCON-2010-td3288085.html
http://archives.postgresql.org/pgsql-hackers/2010-12/msg00087.php

The idea is: play with what-would-be-if-there-were-an-index.

With the keywords 'hypothetical index', 'index advisor' and 'virtual index'
there is some research googleable

b) creating an index requires reading the data-to-be-indexed. So, to have an
index pointing at the interesting rows for your query, the table has to be
read ... which would be the perfect time to already select the interesting
rows. And after having the interesting rows, the index is worthless

c) PostgreSQL is in fact doing something quite similar to temporary
indexes during the query, only for the selected rows. The words to google
for are 'hash maps' and 'bitmap access'.

Why is the query planner not allowed to create indexes, but only allowed to
 use or not use what's available?


as in b): creating an index is quite expensive

additionally: having multiple possible plans also creates a new decision
problem: which of the possible plans will lead to the better result, again
with several meanings of 'better': faster result, less processor usage,
less memory usage or fewer disk accesses. So adding additional indices during
planning would worsen this problem, which has to be balanced against
possible gains.

So: hypothetical indexes are a good idea, BUT current limitations would most
likely force them to be used outside the live query process.

Best wishes,

Harald

-- 
GHUM GmbH
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607

Amtsgericht Stuttgart, HRB 734971
-
persuadere.
et programmare


Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Gabriele Bartolini

Hi Harald,

On Tue, 21 Dec 2010 11:42:40 +0100, Massa, Harald Armin 
c...@ghum.de wrote:

 a) There is a proposal (and, at the time being, also some code on
 pgfoundry) creating hypothetical indexes

http://postgresql.1045698.n5.nabble.com/Hypothetical-Indexes-PostgreSQL-extension-PGCON-2010-td3288085.html
http://archives.postgresql.org/pgsql-hackers/2010-12/msg00087.php


I totally missed this. Thanks for posting it.

Cheers,
Gabriele

--
 Gabriele Bartolini - 2ndQuadrant Italia
 PostgreSQL Training, Services and Support
 gabriele.bartol...@2ndquadrant.it - www.2ndQuadrant.it



Re: [GENERAL] How to guard PostgreSQL from overcharged and/or malicious (read-only) queries?

2010-12-21 Thread Alban Hertroys
On 21 Dec 2010, at 10:57, Stefan Keller wrote:

 You answered:
 1. Filter out all SQL commands which are *not* read-only (no DROP
 Most people do this using permissions.
 
 Oh, yes: forgot to mention that; that's obvious. What I also looked
 for was the PL/pgSQL's EXECUTE command-string.

I'm not sure what you're getting at here, but if you're saying that you have to 
catch SQL commands called from EXECUTE separately, I think you're wrong. I 
wouldn't expect Postgres not to apply permissions in such cases. A simple 
test case can prove that.

 2. Get the estimated time (units) from PostgreSQL planner in a
 reliable way (if possible standard/ANSI).
 
 Ok; again keep in mind that I have a read-only database. Therefore the
 statistics should be up-to-date (after a vacuum analyse).

Up-to-date? Probably. Correct or adequate? That depends, as I pointed out in my 
previous message.

 See below my attempt to write such a function I called
 secure_execute(text). It's still not functioning and I have
 indicated two problems there. What do you think?

Your problems seem to stem from a lack of experience with set-returning 
functions. You'd best look up the documentation for those; it explains them better 
than I could. Make sure you look at the docs for the version of Postgres that 
you're using or expect to use, as there's a relatively new feature in this 
domain: returning a set as a table.
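
(For reference, a minimal sketch of the set-returning pattern Alban points at;
the function name is illustrative, and RETURN QUERY EXECUTE needs PostgreSQL
8.4 or later:)

-- A set-returning wrapper around a dynamic statement.
CREATE OR REPLACE FUNCTION run_query(q text)
RETURNS SETOF record
LANGUAGE plpgsql STRICT AS $$
BEGIN
  RETURN QUERY EXECUTE q;  -- streams the statement's rows to the caller
END;
$$;

-- SETOF record requires a column definition list at the call site:
SELECT * FROM run_query('SELECT random() FROM generate_series(1, 5)') AS t(r double precision);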

 I like the idea letting abandon the query if it's obviously(!) wrong
 or if the planner alerts me about very high costs?
 Or should I rather abandon the idea of such a function and simply rely
 on read-only privileges and a session statement_timeout?


Although I don't think what you're attempting is wrong in any way, I'd first 
see whether it's necessary to do so. You can use the built-in features 
(permissions & statement_timeout) and see whether that's adequate for your 
use-case. If it's not, then it's time to look into tightening things up.
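
(A minimal sketch of that built-in route; the role, database and schema names
are illustrative, and GRANT ... ON ALL TABLES needs PostgreSQL 9.0+:)

-- A read-only login role with a per-statement time limit.
CREATE ROLE webuser LOGIN PASSWORD 'secret';
GRANT CONNECT ON DATABASE gis TO webuser;
GRANT USAGE ON SCHEMA public TO webuser;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO webuser;
ALTER ROLE webuser SET statement_timeout = '60s';  -- one minute per statement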

Of course, having some experience with the solution through experimentation 
can't hurt if you can afford to.

What you're doing would by many on this list be pointed out as premature 
optimisation, although that usually involves query performance ;)

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.







[GENERAL] How to use pgbouncer

2010-12-21 Thread Adarsh Sharma

Dear all,

I am not able to find any useful document regarding configuring and 
running PgBouncer with Postgres 8.4.2.


How does it help, and is it able to boost performance?

Or is there another useful tool available for connection pooling? 
Please guide me on this.



Thanks & Regards

Adarsh Sharma



[GENERAL] [feature request] left/right join + limit optimization

2010-12-21 Thread pasman pasmański
hello.

I think that left/right joins with LIMIT could be optimized.
When there are no WHERE conditions, this could be executed as below:

Limit N
Merge Left Join
  Sort Top N
Bitmap Heap Scan
  ...
  Sort
Bitmap Heap Scan
  ...
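
(Editor's note: a hypothetical query of the shape being described, where a
top-N sort of the outer side would be enough to satisfy the LIMIT:)

-- Hypothetical tables a and b; an outer join, ordered and limited, no WHERE.
SELECT *
FROM a
LEFT JOIN b ON b.a_id = a.id
ORDER BY a.id
LIMIT 10;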



pasman



Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Dario Beraldi
Ok, thanks a lot to all of you for your answers! (Always impressed by  
the prompt feedback you get on this list!)


Quoting Gabriele Bartolini gabriele.bartol...@2ndquadrant.it:

 [...]

--

Dr. Dario Beraldi
Institute of Evolutionary Biology
University of Edinburgh
West Mains Road
Edinburgh EH9 3JT
Scotland, UK


--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.





Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Jeremy Harris

On 2010-12-21 10:42, Massa, Harald Armin wrote:

b) creating an index requires reading the data-to-be-indexed. So, to have an
index pointing at the interesting rows for your query, the table has to be
read ... which would be the perfect time to already select the interesting
rows. And after having the interesting rows, the index is worthless


... until another similar query comes along, when suddenly it's a massive win.
Why not auto-create indices for some limited period after database load
(copy?  any large number of inserts from a single connection?), track those
that actually get re-used and remove the rest?   Would this not provide
a better out-of-the-box experience for neophytes?

[...]

Why is the query planner not allowed to create indexes, but only allowed to

use or not use what's available?



 as in b): creating an index is quite expensive


How much more so than doing that full-table-scan plus sort, which your
query is doing anyway?

Cheers,
Jeremy





[GENERAL] Trouble uninstalling old postgresql installs on osx

2010-12-21 Thread Geoff Bowers
Hi there,

Have been trying to uninstall old instances of Postgres from my Snow
Leopard install, preparing to install 9.0

Not sure how old these instances are (probably dates back to 7). I can
see them in the active process list, but I'm not sure how to
permanently stop them. Any old timers have some tips on nuking old
installs? Don't need any settings or data from them.. Just looking for
scorched earth.

Any help much appreciated,

GB



Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread tv
 On 2010-12-21 10:42, Massa, Harald Armin wrote:
 b) creating an index requires reading the data-to-be-indexed. So, to have an
 index pointing at the interesting rows for your query, the table has to be
 read ... which would be the perfect time to already select the interesting
 rows. And after having the interesting rows, the index is worthless

 ... until another similar query comes along, when suddenly it's a massive win.
 Why not auto-create indices for some limited period after database load
 (copy?  any large number of inserts from a single connection?), track those
 that actually get re-used and remove the rest?   Would this not provide
 a better out-of-the-box experience for neophytes?

Say you have a table with several columns (A,B,C), and the query is using
some of them. What indexes would you create? One index on every column? A
multi-column index on all columns? Indexes for each combination of columns?

There really is no automatic way to solve this puzzle using a single
query. Indexing strategy is a very tough design discipline, and it
requires a complex knowledge of the workload. One slow query does not mean
the index should be created - what if that was just an ad-hoc query and
will not be executed ever again?

So although this 'automatic index creation' seems nice, it really does not
work in practice.

I really don't want to work with products that try to be smarter than me
(and I admit I'm pretty dumb from time to time) and act rather randomly
due to this 'artificial intelligence'. I've already been there and I don't
want to repeat this experience.


 [...]
 Why is the query planner not allowed to create indexes, but only allowed
 to
 use or not use what's available?


 as in b): creating an index is quite expensive

 How much more so than doing that full-table-scan plus sort, which your
 query is doing anyway?

A lot. It consumes a lot of CPU, it may consume a lot of memory (up to
maintenance_work_mem) etc. So imagine a few users, building indices on a
big table simultaneously - no, I really don't want to see this on a
production server.

Building an index is just one side of the problem - maintenance of the
indexes is another thing. Each index has an impact on write operations
(INSERT/UPDATE) and may prevent HOT updates from working (thus
causing unnecessary bloat of the index). And the indexes may need a lot of
space on disk.

But the real show stopper is probably locking. Building an index takes a
write lock on the table, effectively blocking writes. Sure, you can use
'CREATE INDEX ... CONCURRENTLY', introduced in 8.2, but there are some
disadvantages to that (see
http://www.postgresql.org/docs/9.0/static/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY).
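
(For reference, the syntax in question; the table and column names are
hypothetical:)

-- Builds the index without blocking concurrent writes; the price is two
-- table scans and a longer total build time.
CREATE INDEX CONCURRENTLY idx_tbl_a ON tbl (a);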

cheers
Tomas




Re: [GENERAL] PostgreSQL Trusted Startup

2010-12-21 Thread Kenneth Buckler
On Mon, Dec 20, 2010 at 8:53 PM, Craig Ringer
cr...@postnewspapers.com.au wrote:

 Do you have a trusted boot path from BIOS to bootloader to kernel to init
 core userspace, where everything is digitally signed (by you or someone
 else) and verified before execution? Do you disable kernel module loading?

 If not, you're wasting your time, because a compromise by malicious kernel
 module, modified init, modified md5 command, etc will render your
 precautions totally pointless.

 If your BIOS can't verify the bootloader, which is likely on an x86 / x64
 system, then you can still get some protection by signing your kernels and
 using a bootloader that checks signatures. If someone messes with the
 bootloader you lose, but it'll help protect you against obvious automated
 attacks. You might be able to use the Trusted Platform Module (TPM) on your
 machine to get a fully verified chain of trust, though, by using Trusted
 GRUB.

 http://trousers.sourceforge.net/grub.html

 If you can reasonably trust that the kernel you loaded is OK, you can have
 it verify signatures on binaries before executing them. There was a DigSig
 project for that (http://disec.sourceforge.net/) but it seems to have
 stopped recently. I'm not sure if there's any replacement.

 Without kernel-level signature verification, all you can really do is have a
 custom initrd/initramfs (signed and verified by grub during boot) that
 checks the signatures on init, md5, gpg, libc, etc etc (any binary root
 runs, including scripts) before switching to the real root FS during boot.
 Then you can have your Pg startup scripts (which you signed on a separate,
 trusted machine) verify GnuPG signatures of the Pg binaries before
 execution.

 All in all, it's a painful, clumsy way to do things, and AFAIK there's
 little support in mainline Linux systems for trusted boot and trusted-binary
 systems. You might find out more with a search for linux trusted
 computing, linux trusted boot, linux tpm, linux signed binaries, etc.

 Personally, I'd be using existing system- and network-level intrusion
 detection tools like tripwire and snort to try to spot intrusion if and when
 it happens. I'm not confident that a chain-of-trust approach is workable on
 Linux systems at present, though I'd love to be proved wrong by being
 pointed at existing support I've missed.

 --
 Craig Ringer


I find it very comforting that I am not the only one who finds this
requirement a bit out there.
Unfortunately, these requirements are set in stone, and no matter how
hard I try, they cannot be altered.
We live in a world where compliance is king.  Never mind if compliance
doesn't actually make the system more secure.

Unfortunately Tripwire does not meet the full requirement, as it does
not prevent the database from starting.

Ken



Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Grzegorz Jaśkiewicz
I don't think the planner should do things like creating an index. But it
might hint at doing it in the logs.
There was a discussion around that sort of feature on -hackers not so
long ago. I don't remember what the conclusion was, but probably
that it just isn't worth wasting the planner's cycles when you execute a
query.



Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread tv
 I don't think the planner should do things like creating an index. But it
 might hint at doing it in the logs.
 There was a discussion around that sort of feature on -hackers not so
 long ago. I don't remember what the conclusion was, but probably
 that it just isn't worth wasting the planner's cycles when you execute a
 query.

Yes, that would be a much better solution, definitely. Something like:
watch seq-scan nodes, and if the number of matching rows is much smaller
than the total number, put a HINT into the log.

I was thinking about doing something like this for correlated columns
(when the number of matching rows is severely underestimated). But that's
far in the future.

Tomas





Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Jeremy Harris

On 2010-12-21 14:26, t...@fuzzy.cz wrote:

Why not auto-create indices for some limited period after database load
(copy?  any large number of inserts from a single connection?), track
those
that actually get re-used and remove the rest?   Would this not provide
a better out-of-the-box experience for neophytes?


Say you have a table with several columns (A,B,C), and the query is using
some of them. What indexes would you create? One index on every column? A
multi-column index on all columns? Indexs for each combination of columns?


One single-column index, on the first index-worthy column appearing.
Keep it simple. Maybe, while you're doing that full-table-scan, gather
stats on all the indexable columns for later reference, to guide the choice of
which column to index later.


There really is no automatic way to solve this puzzle using a single
query. Indexing strategy is a very tough design discipline, and it
requires a complex knowledge of the workload. One slow query does not mean
the index should be created - what if that was just an ad-hoc query and
will not be executed ever again?


Then the index you just built gets automatically dropped, as I said above.


I really don't want to work with products that try to be smarter than me
(and I admit I'm pretty dumb from time to time) and act rather randomly
due to this 'artificial intelligence'. I've already been there and I don't
want to repeat this experience.


Then, since you're not a neophyte, leave the feature turned off.   But don't
deny actual neophytes the possibility of using it.



as in b): Creating an index is quite expensiv


How much more so than doing that full-table-scan plus sort, which your
query is doing anyway?


A lot. It consumes a lot of CPU, it may consume a lot of memory (up to
maintenance_work_mem) etc. So imagine a few users, building indices on a
big table simultaneously


Pffthht.   One simple trylock, used only by the auto-indexer.



Building an index is just one side of the problem - maintenance of the
indexes is another thing. Each index has an impact on write operations
(INSERT/UPDATE) and may cause that HOT actually does not work (thus
causing unnecessary bloat of the index).


This is worth discussing.    How much help does the DBMS currently give
the DBA in evaluating these tradeoffs?    Could we do better, given an
actual installation and workload?


And the indexes may need lot of
space on a disk.


By all means require limits as well as a 'don't do that' switch.



But the real show stopper is probably locking. Building an index takes a
write lock on a table, effectively blocking writes. Sure, you can use a
'CREATE INDEX ... CONCURRENTLY' introduced in 8.2, but there are some
disadvantages of that (see
http://www.postgresql.org/docs/9.0/static/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY).


The only disadvantages I see there are a) the leftover invalid index - which
feels like a bug; why is it not auto-dropped?   and b) the second scan, which
implies more total work and a wish to background that portion after completing
the query triggering the auto-index.

Don't forget I suggested doing this only for a limited time after DB creation.
Possibly another reason to turn it off should be any manual index creation, as
that implies that the DBA knows about indexing.

I see in another thread you suggest merely placing hints in the log.   That's
a fine first step - but I'll then be wanting to auto-parse that log to
auto-create ...

Cheers,
Jeremy




Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread bricklen
On Tue, Dec 21, 2010 at 7:34 AM, Jeremy Harris j...@wizmail.org wrote:
 [...]

This thread offers up some interesting possibilities. Expanding on
what has already been discussed, maybe a contrib module for query
tuning/index suggestions?
Some things that came to mind immediately that the module could do
(feasible or not):
- Look at the EXPLAIN ANALYZE
- Examine the index access methods and table scans + costs/rows
- Which indexes were used?
- What were the blocks/tuples hit & read?
- Look at join conditions and WHERE clause filters
- Data types in the joins (mismatched?)
- Churn rate of the tables, eg. the updates/deletes/inserts. This
might allow suggestion of other index types (eg. gist)

The tool then provides feedback on possibly helpful indexes to test, and
why (hypothetical indexes could be applied here). It could also provide
suggestions on ways to improve the query, eg. data types don't match
in the join, EXISTS vs IN, etc.



Re: [GENERAL] How to use pgbouncer

2010-12-21 Thread Filip Rembiałkowski
2010/12/21 Adarsh Sharma adarsh.sha...@orkash.com:
 Dear all,

 I am not able to find any useful document regarding Configuration and
 Running Pgbouncer with Postgres-8.4.2.

that's strange, there are several good pages on the web; there is also
my mini-howto:
http://filip.rembialkowski.net/pgbouncer-mini-howto-benchmark/


 How it helps and is it able to boost some performance ?

PgBouncer is a connection pooler. I wonder why you want to use it
if you don't know what it is?

it can help in many ways; the two most obvious points:
- reduce the number of postgres backends (more resources for the server)
- reduce the cost of a single connection from the application to postgres


 Or if there is another useful tool available for Connection Pooling. Please
 guide me for this.

yes there are some; see
http://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling

it depends on what you need. pgbouncer is the smallest and most
lightweight pooler available.
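
(Editor's note: a minimal pgbouncer.ini sketch; the database name, paths and
pool sizes are illustrative, not recommendations:)

; Map the name clients connect to onto the real server.
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt  ; lines of: "user" "password"
pool_mode = session          ; or transaction, for more aggressive reuse
max_client_conn = 100
default_pool_size = 20

; Then point the application at port 6432 instead of 5432.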


FR



Re: [GENERAL] How to use pgbouncer

2010-12-21 Thread Richard Broersma
2010/12/21 Filip Rembiałkowski filip.rembialkow...@gmail.com:
 Or if there is another useful tool available for Connection Pooling. Please
 guide me for this.

 yes there are some; see
 http://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling

 it depends on what you need. pgbouncer is the smallest and most
 lightweight pooler available.

Also, PgBouncer is the only connection pooler that works on MS Windows
and is released for production use.


-- 
Regards,
Richard Broersma Jr.



Re: [GENERAL] How to use pgbouncer

2010-12-21 Thread Andreas Kretschmer
Richard Broersma richard.broer...@gmail.com wrote:

 2010/12/21 Filip Rembiałkowski filip.rembialkow...@gmail.com:
  Or if there is another useful tool available for Connection Pooling. Please
  guide me for this.
 
  yes there are some; see
  http://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling
 
  it depends on what you need. pgbouncer is the smallest and most
  lightweight pooler available.
 
 Also, Pgbouncer is the only connection-pooler that work on MS-Windows
 that is released for production use.

I'm looking for a solution to split read and write access to different
servers (streaming replication, you know ...). Can I do that with
pgpool (setting backend_weightX = 0 or 1)? I have read the docs, but I'm
not sure if pgpool is the right solution; maybe you can enlighten me?
(or someone else ...)


Andreas
-- 
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.  (Linus Torvalds)
If I was god, I would recompile penguin with --enable-fly.   (unknown)
Kaufbach, Saxony, Germany, Europe.  N 51.05082°, E 13.56889°



[GENERAL] Constraining overlapping date ranges

2010-12-21 Thread McGehee, Robert
PostgreSQLers,
I'm hoping for some help creating a constraint/key on a table such that there 
are no overlapping ranges of dates for any id. 

Specifically: using PostgreSQL 9.0.1, I'm creating a name-value pair table 
like this:

CREATE TABLE tbl (id INTEGER, start_date DATE, stop_date DATE, value REAL);

For a given id, I'd like to enforce that there is only one valid value on a 
given date. For instance, this would be acceptable:

id  start_date  stop_date   value
2   2010-11-01  2010-12-01  3
2   2010-12-02  2010-12-15  4
3   2010-10-15  2010-12-15  -3

But this would not (notice the start_date of line 2 is before the stop_date of line 1):
id  start_date  stop_date   value
2   2010-11-01  2010-12-01  3
2   2010-11-30  2010-12-15  4
3   2010-10-15  2010-12-15  -3

I'd also appreciate it if anyone can provide any indexing hints on this table 
to optimize queries like:
SELECT value FROM tbl WHERE id=2 and '2010-12-02' BETWEEN start_date AND 
stop_date;

Thanks in advance, and sorry if I overlooked any obvious documentation!

Robert McGehee



Re: [GENERAL] Constraining overlapping date ranges

2010-12-21 Thread Richard Broersma
On Tue, Dec 21, 2010 at 7:49 AM, McGehee, Robert
robert.mcge...@geodecapital.com wrote:
 PostgreSQLers,
 I'm hoping for some help creating a constraint/key on a table such that there 
 are no overlapping ranges of dates for any id.

There is something you can try, but it is not exactly what you want
since it is based on timestamps rather than dates:

http://temporal.projects.postgresql.org/
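
(Editor's note: on PostgreSQL 9.2 and later the same rule can be stated
natively as an exclusion constraint over a date range; a sketch against the
tbl definition from the original post, plus an index for the lookup query:)

-- One value per id per date: ranges for the same id must not overlap.
-- daterange needs 9.2+; the = operator on id needs the btree_gist extension.
CREATE EXTENSION IF NOT EXISTS btree_gist;
ALTER TABLE tbl ADD CONSTRAINT tbl_no_overlap
  EXCLUDE USING gist (id WITH =,
                      daterange(start_date, stop_date, '[]') WITH &&);

-- Helps queries of the form: WHERE id = ? AND ? BETWEEN start_date AND stop_date
CREATE INDEX tbl_id_dates_idx ON tbl (id, start_date, stop_date);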


-- 
Regards,
Richard Broersma Jr.



[GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Kelly Burkhart
Hello, I'm sending a group of queries to the database with PQsendQuery
and using PQgetResult to return results similar to this:

PQsendQuery( conn, "select current_timestamp; select pg_sleep(1); select current_timestamp" );

while( (result = PQgetResult( conn )) )
    doSomethingWith( result );

I'm finding that PQgetResult() will not return the first result until
all three results are ready.  In real life I would be sending
consequential queries and would like the front end to be processing
one result while the backend is producing the next one.  The
documentation for PQgetResult in section 30.4 of the 8.4.5 manual
suggests that this is what should happen.

Can anyone explain if sendQuery/getResult is intended to work as
documented?  And if so what my problem may be?  (perhaps my pg_sleep
is preventing the first result from being written?)

Thanks,

-Kelly



Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Tomas Vondra
On 21.12.2010 16:34, Jeremy Harris wrote:
 There really is no automatic way to solve this puzzle using a single
 query. Indexing strategy is a very tough design discipline, and it
 requires a complex knowledge of the workload. One slow query does not
 mean
 the index should be created - what if that was just an ad-hoc query and
 will not be executed ever again?
 
 Then the index you just built gets automatically dropped, as I said above.

I'm a bit confused. Should the indexes be dropped automatically (as you
state here) or kept for the future? Because if they should be dropped,
then it does not make sense to do this magic just for a limited time
after the DB goes live.

 I really don't want to work with products that try to be smarter than me
 (and I admit I'm pretty dumb from time to time) and act rather randomly
 due to this 'artificial intelligence'. I've already been there and I
 don't
 want to repeat this experience.
 
 Then, since you're not a neophyte, leave the feature turned off. But
 don't deny the possibility of using it to actual neophytes.

This is not a question of whether I am a neophyte or not, this is a
question of good practices. My experience is that building indexes
should be done when designing the application, and tested at in a test
environment. At production, log slow queries, analyze the log and add
indexes when needed.

This 'neophyte approach' is a really bad idea from my point of view. It
inevitably leads to a database with a zillion of unnecessary indexes,
missing the right ones. And any limits don't prevent this. In the end
you'll get a slow database, and the neophytes would blame the database
although that state is inevitable.

I've been working with several database products over the years, and
AFAIK none of them does this. In most cases 'the whole industry is
wrong' is usually a really bad sign (does not hold if you're Steve Jobs).

Most of them do have 'advisors' though - that's a good idea, and I think
sooner or later that will be in PostgreSQL too.

 as in b): Creating an index is quite expensiv

 How much more so than doing that full-table-scan plus sort, which your
 query is doing anyway?

 A lot. It consumes a lot of CPU, it may consume a lot of memory (up to
 maintenance_work_mem) etc. So imagine a few users, building indices on a
 big table simultaneously
 
 Pffthht.   One simple trylock, used only by the auto-indexer.

Pffthht? Anyway trylock is not an answer. Think about a different table
for each user.

 Building an index is just one side of the problem - maintenance of the
 indexes is another thing. Each index has an impact on write operations
 (INSERT/UPDATE) and may prevent HOT updates from working (thus
 causing unnecessary bloat of the index).
 
 This is worth discussing.How much help does the DBMS currently give
 the DBA in evaluating these tradeoffs?Could we do better, given an
 actual installation and workload?

The database gives you explain plans, plus pg_stat_* and pg_statio_*
views. That's a lot of data, although does not provide a complete view
in many cases. And then there's a log_min_duration, which is the primary
weapon in fighting slow queries after going live.

 And the indexes may need a lot of
 space on disk.
 
 By all means require limits as well as a don't do that switch.

As I said, this is not a good approach from my POV. And it's really,
really hard to implement this in a way that those who don't want to use
it don't have to pay the price. Because what portion of users would
really use this feature? 0.0001%?

 But the real show stopper is probably locking. Building an index takes a
 write lock on a table, effectively blocking writes. Sure, you can use a
 'CREATE INDEX ... CONCURRENTLY' introduced in 8.2, but there are some
 disadvantages of that (see
 http://www.postgresql.org/docs/9.0/static/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY).

 
 The only disadvantages I see there are a) the leftover invalid index -
 which feels like a
 bug; why is it not auto-dropped?   and b) the second scan, which implies
 more total work
 and a wish to background that portion after completing the query triggering
 the auto-index.

Dropping an index automatically is a really bad idea. Actually, dropping
indexes in general (no matter whether the index was created automatically or
manually) is a very tricky thing.

How do you decide which indexes are really leftover from those that
are used only rarely? I've seen an application fail spectacularly
because a DBA dropped an index that was not used ... except by a batch
process that runs once a year, to close a fiscal year.

 Don't forget I suggested doing this only for a limited time after DB
 creation.  Possibly another
 reason to turn it off should be any manual index creation, as that
 implies that the DBA
 knows about indexing.
 
 I see in another thread you suggest merely placing hints in the log.
 That's a fine first step - but I'll then be wanting to auto-parse that
 log to auto-create ...


Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Jeremy Harris

On 2010-12-21 18:50, Tomas Vondra wrote:

Then the index you just built gets automatically dropped, as I said above.


I'm a bit confused. Should the indexes be dropped automatically (as you
state here) or kept for the future. Because if they should be dropped,
then it does not make sense to do this magic just for a limited time
after the DB goes live.


Here is what I said:
"track those that actually get re-used and remove the rest."

Which part is confusing?

- Jeremy



Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Peter Geoghegan
You can't concurrently execute queries from within a single
connection. Perhaps you should use multiple connections, while
understanding the implications of having each operate within a
separate snapshot.

Don't forget to free memory with PQclear(). I guess you omitted that
because it's just pseudo-code.

-- 
Regards,
Peter Geoghegan



Re: [GENERAL] Can the query planner create indexes?

2010-12-21 Thread Tomas Vondra
On 21.12.2010 20:03, Jeremy Harris wrote:
 On 2010-12-21 18:50, Tomas Vondra wrote:
 Then the index you just built gets automatically dropped, as I said
 above.

 I'm a bit confused. Should the indexes be dropped automatically (as you
 state here) or kept for the future. Because if they should be dropped,
 then it does not make sense to do this magic just for a limited time
 after the DB goes live.
 
 Here what I said:
 track those that actually get re-used and remove the rest.
 
 Which part is confusing?

As I described, identifying which indexes are actually used is a very
tricky task. And it's not difficult to come up with scenarios where this
causes significantly more harm than good.

Basically, the time to keep the indices needs to be long enough that the
indexes that are actually used are not dropped (and the resources spent
creating them actually pay off). But on the other hand it needs to be
short enough that resources are not wasted on unused indices.

Which are clearly contradictory requirements.

And the 'limits' you've proposed make that even worse: when the
unnecessary indices get created first and take most of the resources
(e.g. disk space), the indexes that are actually needed won't be
created because of those limits.

regards
Tomas



Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 2:21 PM, Peter Geoghegan
peter.geoghega...@gmail.com wrote:
 You can't concurrently execute queries from within a single
 connection. Perhaps you should use multiple connections, while
 understanding the implications of having each operate within a
 separate snapshot.

OP is not suggesting that queries run concurrently, but asking why
result sets can't be popped off as the queries resolve.  It's a good
question; it's probably either a bug in the database or the
documentation (if it does not turn out to be operator error).

Kelly, if you can produce small test case in C I'll double check it.

merlin



Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Kelly Burkhart
Yes, I omitted the PQclear for simplicity.

I'm not concurrently executing queries; I'm sending multiple queries
to be executed serially by the backend.  I'm expecting the server to
send me the PGresult objects as each query completes, rather than
sending them all *after* all of the queries have completed.

That will result in some amount of concurrency as my client will be
processing one result while the backend is processing the next query.

-K

On Tue, Dec 21, 2010 at 1:21 PM, Peter Geoghegan
peter.geoghega...@gmail.com wrote:
 You can't concurrently execute queries from within a single
 connection. Perhaps you should use multiple connections, while
 understanding the implications of having each operate within a
 separate snapshot.

 Don't forget to free memory with PQclear() . I guess you omitted that
 because it's just pseudo-code.

 --
 Regards,
 Peter Geoghegan




Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Kelly Burkhart
This should do it:

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

#define CONNINFO "your info here"
#define COMMANDS "select current_timestamp; select pg_sleep(5); select current_timestamp"

void fatal( const char *msg ) { fprintf( stderr, "%s\n", msg ); exit(1); }

int
main()
{
    PGresult *res = 0;
    PGconn *conn = PQconnectdb( CONNINFO );
    if (!conn) fatal("PQconnectdb returned null");

    if ( PQstatus(conn) != CONNECTION_OK ) {
        PQfinish( conn );
        fatal("PQconnectdb failed");
    }

    if (!PQsendQuery(conn, COMMANDS)) {
        PQfinish( conn );
        fatal("PQsendQuery failed");
    }

    while( (res = PQgetResult( conn )) != 0 ) {
        printf("retrieved result\n");
        PQclear(res);
    }

    PQfinish( conn );
    return 0;
}



On Tue, Dec 21, 2010 at 1:26 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Dec 21, 2010 at 2:21 PM, Peter Geoghegan
 peter.geoghega...@gmail.com wrote:
 You can't concurrently execute queries from within a single
 connection. Perhaps you should use multiple connections, while
 understanding the implications of having each operate within a
 separate snapshot.

 OP is not suggesting that queries run concurrently, but asking why
 result sets can't be popped off as the queries resolve.  It's a good
 question; it's probably either a bug in the database or the
 documentation (if it does not turn out to be operator error).

 Kelly, if you can produce small test case in C I'll double check it.

 merlin




Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Daniel Verite
Kelly Burkhart wrote:

 #define COMMANDS select current_timestamp; select pg_sleep(5); select
 current_timestamp

You should use clock_timestamp() instead of current_timestamp, because
current_timestamp returns a fixed value throughout a transaction.
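
(A quick illustration of the difference; both functions are called in the same
transaction, but only clock_timestamp() advances:)

-- current_timestamp is frozen at transaction start; clock_timestamp() is not.
BEGIN;
SELECT current_timestamp, clock_timestamp();
SELECT pg_sleep(1);
SELECT current_timestamp, clock_timestamp();  -- first column unchanged
COMMIT;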

Best regards,
-- 
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org



Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 3:07 PM, Daniel Verite dan...@manitou-mail.org wrote:
        Kelly Burkhart wrote:

 #define COMMANDS select current_timestamp; select pg_sleep(5); select
 current_timestamp

 You should use clock_timestamp() instead of current_timestamp, because
 current_timestamp returns a fixed value throughout a transaction.

Well, that's correct, but irrelevant -- Kelly's analysis is correct.
The documentation for PQgetResult states:

Using PQsendQuery and PQgetResult solves one of PQexec's problems: If
a command string contains multiple SQL commands, the results of those
commands can be obtained individually. (This allows a simple form of
overlapped processing, by the way: the client can be handling the
results of one command while the server is still working on later
queries in the same command string.) However, calling PQgetResult will
still cause the client to block until the server completes the next
SQL command. This can be avoided by proper use of two more functions:

but control is not returned until all three queries have resolved.
This is probably an issue with libpq.  Investigating...

merlin



Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 3:14 PM, Merlin Moncure mmonc...@gmail.com wrote:
 [...]

 but control is not returned until all three queries have resolved.
 this is probably an issue with libpq.  investigating...

hm, it looks like the backend is not flushing the CommandComplete message
for each command until finishing all the queries.  This message is what
signals libpq that a particular command has been executed.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 3:37 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Dec 21, 2010 at 3:14 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Dec 21, 2010 at 3:07 PM, Daniel Verite dan...@manitou-mail.org 
 wrote:
        Kelly Burkhart wrote:

 #define COMMANDS select current_timestamp; select pg_sleep(5); select
 current_timestamp

 You should use clock_timestamp() instead of current_timestamp, because
 current_timestamp returns a fixed value throughout a transaction.

 Well, that's correct, but irrelevant -- Kelly's analysis is correct.
 The documentation for PQgetResult states:

 Using PQsendQuery and PQgetResult solves one of PQexec's problems: If
 a command string contains multiple SQL commands, the results of those
 commands can be obtained individually. (This allows a simple form of
 overlapped processing, by the way: the client can be handling the
 results of one command while the server is still working on later
 queries in the same command string.) However, calling PQgetResult will
 still cause the client to block until the server completes the next
 SQL command. This can be avoided by proper use of two more functions:

 but control is not returned until all three queries have resolved.
 this is probably an issue with libpq.  investigating...

 hm, it looks like the backend is not flushing the CommandComplete message
 for each command until finishing all the queries.  This message is what
 signals libpq that a particular command has been executed.

to see this in action, you can interject a query between queries 1 & 2
that sends a lot of data. the 'lots of data' forces query one's protocol
messages to flush out, which the client handles properly.  this is likely
a backend bug -- it needs to force a flush upon command completion?
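
For example, padding the batch with a statement that returns a lot of rows
(an illustrative command string, not from the original report) makes the
first result come back right away:

select clock_timestamp();
select x from generate_series(1, 100000) as g(x);  -- large result forces a flush
select pg_sleep(5);
select clock_timestamp();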

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 3:40 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Dec 21, 2010 at 3:37 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Dec 21, 2010 at 3:14 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Dec 21, 2010 at 3:07 PM, Daniel Verite dan...@manitou-mail.org 
 wrote:
        Kelly Burkhart wrote:

 #define COMMANDS select current_timestamp; select pg_sleep(5); select
 current_timestamp

 You should use clock_timestamp() instead of current_timestamp, because
 current_timestamp returns a fixed value throughout a transaction.

 Well, that's correct, but irrelevant -- Kelly's analysis is correct.
 The documentation for PQgetResult states:

 Using PQsendQuery and PQgetResult solves one of PQexec's problems: If
 a command string contains multiple SQL commands, the results of those
 commands can be obtained individually. (This allows a simple form of
 overlapped processing, by the way: the client can be handling the
 results of one command while the server is still working on later
 queries in the same command string.) However, calling PQgetResult will
 still cause the client to block until the server completes the next
 SQL command. This can be avoided by proper use of two more functions:

 but control is not returned until all three queries have resolved.
 this is probably an issue with libpq.  investigating...

 hm, it looks like the backend is not flushing the CommandComplete message
 for each command until finishing all the queries.  This message is what
 signals libpq that a particular command has been executed.

 to see this in action, you can interject a query between queries 1 & 2
 that sends a lot of data. the 'lots of data' forces query one's protocol
 messages to flush out, which the client handles properly.  this is likely
 a backend bug -- it needs to force a flush upon command completion?

hm, a pq_flush() after the command-completion pq_putmessage() in
backend/tcop/dest.c seems to fix the problem.  I'll send a patch up to
-hackers.  They might backpatch it, unless there is a good reason not
to do this (I can't think of any).

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to use pgbouncer

2010-12-21 Thread Filip Rembiałkowski
2010/12/21 Andreas Kretschmer akretsch...@spamfence.net:

 I'm looking for a solution to split read and write access to different
 servers (streaming replication, you know ...). Can i do that with
 pgpool? (setting backend_weightX=0 or 1)? I have read the doc, but i'm
 not sure if pgpool the right solution, maybe you can enlighten me?
 (or someone else ...)

did you see 
http://pgpool.projects.postgresql.org/pgpool-II/doc/pgpool-en.html#master_slave_mode
? I think it answers your question :-)

warning:
it works by detecting which queries are read-only, which does not
always work out of the box.
you will sometimes have to modify driver settings or, worse, modify SQL
queries to include hints for pgpool.

see:
http://pgpool.projects.postgresql.org/pgpool-II/doc/pgpool-en.html#restriction
http://pgpool.projects.postgresql.org/pgpool-II/doc/pgpool-en.html#white_function_list
http://pgpool.projects.postgresql.org/pgpool-II/doc/pgpool-en.html#black_function_list
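
As a rough sketch, the relevant pgpool.conf settings for read/write
splitting over streaming replication look something like this (host names
are placeholders; check the pgpool-II docs above for your version):

master_slave_mode = true
master_slave_sub_mode = 'stream'   # streaming replication mode
load_balance_mode = true           # spread read-only queries across backends

backend_hostname0 = 'master.example.com'
backend_port0 = 5432
backend_weight0 = 0                # weight 0 keeps balanced reads off this node

backend_hostname1 = 'standby.example.com'
backend_port1 = 5432
backend_weight1 = 1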

greets,
  Filip

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] What Programs Do You Use For PG?

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 1:32 AM, Neil D'Souza
neil.xavier.dso...@gmail.com wrote:
 You can have a look at my project on SourceForge:
 http://sourceforge.net/projects/proghelp -- it builds applications with PG
 as a backend automatically, using a modified CREATE TABLE SQL grammar as
 input.

 1. It automatically generates stored procedures to insert and retrieve data
 2. a database API layer to access the data in C++
 3. a user interface using the Wt C++ library
     i.   search keys marked in the input file automatically become search
 keys in the user interface, which correctly invoke the generated stored
 procedures.
     ii.  Foreign key references automatically become dialog boxes.
     iii. Any search keys in the referenced tables will become search
 keys in the dialog box - again with correct invocations.
     iv. Any tables marked as 'references multi' in the input file
 automatically become master/details tables - using a tabview widget for
 details tables. The details tables automatically use the primary key from
 the master table to do their inserts/retrieval of data from the system.

 4. Random data for testing.
 5. scripts to create tables in topological order (when there are foreign key
 references the referenced tables are output first, for the master/details
 tables the master tables are output first), and drop tables and stored
 procedures ( when you want to prevent clutter). All the scripts are
 collected in a unified script generator to save you the trouble of running
 them individually.
 6. Automatic creation of authorization and authentication modules - with
 fine grained control on view, add, edit and view summary.

 The input file is about 200 lines; the generated code is about 15,000 lines
 of C++ and about 12,000 lines of stored procedures.

 What's not yet implemented:

 1. Paging of records.
 2. Automatic edit for records.
 3. Currently the search is properly implemented for varchar fields; it still
 needs fixing for dates and int/float data.
 4. A better random data generator.
 5. god knows what else


 You can see a sample application developed using the code generator here:

  http://173.230.133.34:8080/

 User logins and passwords are below.

  user_login_code | user_login_name | user_password | employee_code
 -----------------+-----------------+---------------+---------------
                1 | zenond          | zenond123     |             1
                2 | atuld           | atul123       |             2
                3 | michaeld        | michael123    |             3
                4 | nxd             | nxd123        |             4
                5 | veerad          | veerad123     |             5
                6 | chanchud        | chanchud123   |             6
                7 | wilburd         | wilburd123    |             7
                8 | abhishekc       | abhishekc123  |             8

 If it crashes - please mail me and I will restart it.

 Kind Regards,
 Neil

interesting project.  do you have any documentation describing how it works?

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] schemaverse!

2010-12-21 Thread Merlin Moncure
A PostgreSQL-based game that you can play from psql!  Written by Abstrct (Josh)

http://www.schemaverse.com/

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] schemaverse!

2010-12-21 Thread A.M.

On Dec 21, 2010, at 5:06 PM, Merlin Moncure wrote:

 A postgresql based game, that you can play from psql!  Written by Abstrct 
 (Josh)
 
 http://www.schemaverse.com/

Finally, a game which makes it look like I am doing work!


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Understanding PG9.0 streaming replication feature

2010-12-21 Thread Ben Carbery
FYI, not looking for a detailed how-to here.. I have read the manual twice
and just can't figure out which sections are relevant. The manual seems to be
trying to cover all uses simultaneously, which is always going to get
confusing :) For example, do I need WAL archiving or not?

On Tue, Dec 21, 2010 at 2:40 PM, Ben Carbery ben.carb...@gmail.com wrote:

 Hi,

 I am having some trouble trying to figure out how to configure this
 particular scenario..

 I have a pair of pg servers that I want to put in a Master/Standby
 configuration. Currently a script dumps the master db every hour, copies it
 to the standby, restores, and restarts the server. The aim is to replace the
 dumps/restores with streaming replication and ensure the standby is always
 up to date.

 In this case writes are infrequent, but reads are constant, and I only need
 high availability for reads. I would ideally like both master and standby to
 be available simultaneously to allow load-balancing.
 My confusion seems to be around the fact that I don't need failover - my
 applications will detect a master down and immediately start using the
 standby, so there is no need to allow writes on the standby; writes will
 just wait for the master to be available again. I am not sure what the
 minimum config needed for this scenario is..

 cheers,

 Ben





[GENERAL] plpython returns integer[] fails for multi-dimensional array

2010-12-21 Thread TJ O'Donnell
In postgresql-9.0.1 I have to modify my plpython functions that return arrays.
It seems one-dimensional arrays are handled properly, but not
2-dimensional arrays.

create or replace function atest() returns integer[] as $eopy$
 a = list()
 a.append(1)
 a.append(2)
 a.append(3)
 #return a works fine
 b = list()
 b.append(a)
 b.append(a)
 # error
 return b
$eopy$ language plpythonu


select atest() gives
obtest=# select atest();
ERROR:  invalid input syntax for integer: [1, 2, 3]
CONTEXT:  while creating return value
PL/Python function atest

How can I return multi-dimensional arrays in plpython?

TJ O'Donnell

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] plpython returns integer[] fails for multi-dimensional array

2010-12-21 Thread Thom Brown
On 21 December 2010 22:48, TJ O'Donnell t...@acm.org wrote:
 In postgresql-9.0.1 I have to modify my plpython functions that return arrays.
 It seems one-dimensional arrays are handled properly, but not
 2-dimensional arrays.

 create or replace function atest() returns integer[] as $eopy$
  a = list()
  a.append(1)
  a.append(2)
  a.append(3)
  #return a works fine
  b = list()
  b.append(a)
  b.append(a)
  # error
  return b
 $eopy$ language plpythonu


 select atest() gives
 obtest=# select atest();
 ERROR:  invalid input syntax for integer: [1, 2, 3]
 CONTEXT:  while creating return value
 PL/Python function atest

 How can I return multi-dimensional arrays in plpython?

Are you sure that a returns okay in that scenario?  You're using a
list.  Shouldn't you be using an array?  Like: a = []

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] plpython returns integer[] fails for multi-dimensional array

2010-12-21 Thread Peter Geoghegan
On 21 December 2010 23:17, Thom Brown t...@linux.com wrote:
 Are you sure that a returns okay in that scenario?  You're using a
 list.  Shouldn't you be using an array?  Like: a = []

a = [] actually declares an empty list in Python. You can return a list
or a tuple from a pl/python function in 9.0 and it will be interpreted
as an array at the SQL call site. You cannot in prior versions.
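
For instance, this sketch works on 9.0 (but not on earlier releases) for
the one-dimensional case:

create or replace function one_dim() returns integer[] as $$
  return [1, 2, 3]   # a tuple (1, 2, 3) would work too
$$ language plpythonu;

select one_dim();  -- {1,2,3}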

--
Regards,
Peter Geoghegan

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Understanding PG9.0 streaming replication feature

2010-12-21 Thread Satoshi Nagayasu

Hi Ben,

On 2010/12/22 7:46, Ben Carbery wrote:

FYI, not looking for a detailed how-to here.. I have read the manual twice and
just can't figure out which sections are relevant. The manual seems to be trying to
cover all uses simultaneously, which is always going to get confusing :) For
example, do I need WAL archiving or not?


My blog entry would be a good entry point for you.  :)

5 steps to implement a PostgreSQL replication system
http://pgsnaga.blogspot.com/2010/05/5-steps-to-implement-postgresql.html

It was written to be a guide for building a simple master-slave config.

Please take a look, including the comments.

Thanks,


--
NAGAYASU Satoshi satoshi.nagay...@gmail.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] plpython returns integer[] fails for multi-dimensional array

2010-12-21 Thread Adrian Klaver
On Tuesday 21 December 2010 2:48:16 pm TJ O'Donnell wrote:
 In postgresql-9.0.1 I have to modify my plpython functions that return
 arrays. It seems one-dimensional arrays are handled properly, but not
 2-dimensional arrays.

 create or replace function atest() returns integer[] as $eopy$
  a = list()
  a.append(1)
  a.append(2)
  a.append(3)
  #return a works fine
  b = list()
  b.append(a)
  b.append(a)
  # error
  return b
 $eopy$ language plpythonu


 select atest() gives
 obtest=# select atest();
 ERROR:  invalid input syntax for integer: [1, 2, 3]
 CONTEXT:  while creating return value
 PL/Python function atest

 How can I return multi-dimensional arrays in plpython?

 TJ O'Donnell

Maybe:
create or replace function atest() returns integer[][] 

-- 
Adrian Klaver
adrian.kla...@gmail.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 hm, a pq_flush() after command completion putmessage in
 backend/tcop/dest.c seems to fix the problem.  I'll send up a patch to
 -hackers.  They might backpatch it, unless there is a good reason not
 to do this (I can't think of any).

If you just unconditionally flush there, it will result in an extra
network message in the normal case where there's not another query
to do.  The current code is designed not to flush until it sends
ReadyForQuery.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] plpython returns integer[] fails for multi-dimensional array

2010-12-21 Thread Adrian Klaver
On Tuesday 21 December 2010 3:25:48 pm Peter Geoghegan wrote:
 On 21 December 2010 23:17, Thom Brown t...@linux.com wrote:
  Are you sure that a returns okay in that scenario?  You're using a
  list.  Shouldn't you be using an array?  Like: a = []

 a =[] actually declares an empty list in Python. You can return a list
 or a tuple from a pl/python function in 9.0 and it will be interpreted
 as an array at the SQL call site. You cannot in prior versions.

 --
 Regards,
 Peter Geoghegan

Digging into the source for plpython seems to show it only supports
one-dimensional arrays. When I tried my previous example on a 9.0.1 instance,
it kept changing integer[][] to integer[].

-- 
Adrian Klaver
adrian.kla...@gmail.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 6:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 hm, a pq_flush() after command completion putmessage in
 backend/tcop/dest.c seems to fix the problem.  I'll send up a patch to
 -hackers.  They might backpatch it, unless there is a good reason not
 to do this (I can't think of any).

 If you just unconditionally flush there, it will result in an extra
 network message in the normal case where there's not another query
 to do.  The current code is designed not to flush until it sends
 ReadyForQuery.

yeah, I was looking at that.  I don't see an easy way to test if there
is another query waiting to execute right there.  Maybe a
documentation patch is in order :-).

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] libpq sendQuery -- getResult not returning until all queries complete

2010-12-21 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 On Tue, Dec 21, 2010 at 6:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 If you just unconditionally flush there, it will result in an extra
 network message in the normal case where there's not another query
 to do.  The current code is designed not to flush until it sends
 ReadyForQuery.

 yeah, I was looking at that.  I don't see an easy way to test if there
 is another query waiting to execute right there.  Maybe a
 documentation patch is in order :-).

dest.c doesn't have the info available.  I think that to do this, we'd
need to move the responsibility for calling pq_flush out to postgres.c.
Not sure if it's worth it.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Trusted Startup

2010-12-21 Thread Craig Ringer

On 12/22/2010 02:05 AM, Kenneth Buckler wrote:


I find it very comforting that I am not the only one who finds this
requirement a bit out there.
Unfortunately, these requirements are set in stone, and no matter how
hard I try, they cannot be altered.
We live in a world where compliance is king.  Nevermind if compliance
doesn't actually make the system more secure.

Unfortunately Tripwire does not meet the full requirement, as it does
not prevent the database from starting.


In this case, here's what I'd do:

- Call Red Hat
- Say I'd like to buy RHEL, but have X weird requirement; can you
  help me?

--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Trusted Startup

2010-12-21 Thread Craig Ringer

We live in a world where compliance is king.  Nevermind if compliance
doesn't actually make the system more secure.


Er .. re my previous post, I don't mean lie to RH and claim to want to 
buy RHEL to get free support. I mean that you should consider going to 
management and getting approval for professional support and integration 
work from a specialist, because you're going to need it.


Alternately you could do the dodgy Trusted GRUB + signed kernel + signed 
initrd with scripted GnuPG verification hack. It'd be a lot better than 
nothing if your target server has a TPM you can enable and use for 
Trusted GRUB.


--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Understanding PG9.0 streaming replication feature

2010-12-21 Thread Matthias Müller
Hi Ben,

Load balancing is not possible with the tools that ship with the postgres 
installation. There is no automatic switch-over to a slave if the master 
fails: a trigger file needs to be created to promote a slave to master. This 
is not done automatically by postgres, but should be done by cluster 
software (e.g. Pacemaker). 

If you can live without load balancing, read Satoshi's page. 

But the slaves can handle read requests (SELECT). Maybe this helps a little 
bit. 
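
For reference, the trigger file is named in the standby's recovery.conf,
roughly like this (the path is just a placeholder):

standby_mode = 'on'
trigger_file = '/var/lib/pgsql/9.0/promote.trigger'  # promotion happens when this file appears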

Regards 
Matthias

On Wednesday, 22 December 2010, at 00:31:14, Satoshi Nagayasu wrote:
 Hi Ben,
 
 On 2010/12/22 7:46, Ben Carbery wrote:
  FYI, not looking for a detailed how-to here.. I have read the manual
  twice and just can't figure out which sections are relevant. The manual
  seems to be trying to cover all uses simultaneously, which is always
  going to get confusing :) For example, do I need WAL archiving or
  not?
 
 My blog entry would be a good entry point for you.  :)
 
 5 steps to implement a PostgreSQL replication system
 http://pgsnaga.blogspot.com/2010/05/5-steps-to-implement-postgresql.html
 
 It was written to be a guide for building a simple master-slave config.
 
 Please take a look, including the comments.
 
 Thanks,
 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Cannot unsubscribe

2010-12-21 Thread William Gordon Rutherdale (rutherw)
I attempted to unsubscribe from this list (for the holidays) without
success.

Could anyone please help me?  I am continuing to get messages from the
list.

I broke open the message header and did as it said for unsubscribing.

See below for what the majordomo sent back.

-Will


 unsub pgsql-general
 The unsubscribe command did not succeed.
 
 No e-mail addresses matching
   William Gordon Rutherdale (rutherw) ruth...@cisco.com
 are subscribed to the pgsql-general mailing list.
 

Valid commands processed: 1
0 succeeded, 0 stalled, and 1 failed.


Use the following command: 
  sessioninfo ea4ce02c75fbad19425a55ff2bd2bc20b17fd794
to see technical information about this session.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Understanding PG9.0 streaming replication feature

2010-12-21 Thread Fujii Masao
On Wed, Dec 22, 2010 at 8:31 AM, Satoshi Nagayasu
satoshi.nagay...@gmail.com wrote:
 My blog entry would be a good entry point for you.  :)

 5 steps to implement a PostgreSQL replication system
 http://pgsnaga.blogspot.com/2010/05/5-steps-to-implement-postgresql.html

Or
http://wiki.postgresql.org/wiki/Binary_Replication_Tutorial
http://wiki.postgresql.org/wiki/Streaming_Replication#How_to_Use
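
As a rough outline, the minimum for a read-only hot standby with no
automatic failover is something like the following (host names and
credentials are placeholders). With wal_keep_segments set high enough,
separate WAL archiving is optional, at the risk of the standby falling
too far behind to catch up:

# master: postgresql.conf
wal_level = hot_standby
max_wal_senders = 2
wal_keep_segments = 64      # retain enough WAL for the standby to catch up

# master: pg_hba.conf
host  replication  replicator  192.168.0.0/24  md5

# standby: postgresql.conf (after restoring a base backup of the master)
hot_standby = on            # serve read-only queries during recovery

# standby: recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=master.example.com user=replicator password=secret'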

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] plpython returns integer[] fails for multi-dimensional array

2010-12-21 Thread TJ O'Donnell
In previous versions (8.x), for a plpython fn returning integer[],
I created (had to create) a string in the proper SQL format
{ {1,2,3}, {4,5,6} } and returned that.  It worked fine.

I LIKE the ability to not have to do that in 9.0,
but I CAN'T return a string like { {1,2,3}, {4,5,6} } for a fn that
returns integer[],
AND I can't return a two-dimensional array.  Not a happy 9.0 camper.

Anyone know of any plans for 9.0 plpython to support multi-dimensional arrays?
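
In the meantime, one possible workaround (a sketch, not a tested recipe):
keep building the array literal by hand as under 8.x, declare the function
to return text, and cast at the call site:

create or replace function atest2() returns text as $$
  b = [[1, 2, 3], [4, 5, 6]]
  # build the '{{1,2,3},{4,5,6}}' literal by hand, as under 8.x
  return '{' + ','.join('{' + ','.join(str(i) for i in row) + '}'
                        for row in b) + '}'
$$ language plpythonu;

select atest2()::integer[];  -- the cast yields a genuine two-dimensional array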

TJ O'Donnell

On Tue, Dec 21, 2010 at 4:02 PM, Adrian Klaver adrian.kla...@gmail.com wrote:
 On Tuesday 21 December 2010 3:25:48 pm Peter Geoghegan wrote:
 On 21 December 2010 23:17, Thom Brown t...@linux.com wrote:
  Are you sure that a returns okay in that scenario.  You're using a
  list.  Shouldn't you be using an array?  Like: a = []

 a =[] actually declares an empty list in Python. You can return a list
 or a tuple from a pl/python function in 9.0 and it will be interpreted
 as an array at the SQL call site. You cannot in prior versions.

 --
 Regards,
 Peter Geoghegan

 Digging into the source for plpython seems to show it only supports one
 dimensional arrays. When I tried my previous example on a 9.0.1 instance it
 kept changing integer[][] to integer[].

 --
 Adrian Klaver
 adrian.kla...@gmail.com


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Database file copy

2010-12-21 Thread Srini Raghavan
Hello,

We are looking to distribute postgres databases to our customers along with
our application. We are currently evaluating postgres version 8.4.4. The
database can be around 25 GB (the compressed files fit on a few DVDs; the
product is distributed on DVDs). The pg_restore of this database takes several
hours on low-end machines running Windows. pg_restore is run during our
product install, and the current install-time projection is not acceptable.
Our customers can purchase different databases over a period of time, and the
application makes transactional updates to the databases after installation.
Hence, copying the entire data folder instead of using pg_restore is not an
option, as the transactional updates would be lost.

I have read the documentation and the few posts available that discourage
file-copy-based restore of individual databases, but I have found a way to do
this. I would appreciate it if the experts could read and advise whether the
approach will work, given our environment and usage boundaries.

Master Postgres instance (this is where we create the data; we have complete
control of this environment):
1. Create the database and populate data.
2. Set vacuum_freeze_table_age to 0 in postgresql.conf.
3. Run vacuum full - this will reset the row xids to the FrozenXid (see the
sketch after this list).
4. Shut down postgres and take a copy of the files for the given database.
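
For illustration, steps 2-3 can also be done without editing
postgresql.conf, since the FREEZE keyword makes VACUUM freeze every row it
touches (a sketch, 8.4 syntax):

-- master instance: freeze all row xids so the copied files carry no
-- transaction-id dependence on the source cluster
vacuum full freeze;

-- sanity check: relfrozenxid should now be recent for every table
select relname, relfrozenxid from pg_class where relkind = 'r';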

In the deploy instance at the customer site:
1. Create the new database.
2. Shut down the postgres instance and copy the database files created in the
master instance to the database-specific folder.
3. Start postgres instance.

We don't use table row OIDs. If the cluster-wide OID collides with an OID in
the copied database files during subsequent DDL operations, postgres resolves
this by skipping to the next available OID. There will be a delay to find the
next available OID, which is acceptable in our case, as DDL operations at the
customer site are rare. And the vacuum full with vacuum_freeze_table_age set
to 0 on the master instance takes care of the xmin, allowing transactions to
be visible, and further transactions at the customer site to continue without
colliding.


I have tested this and it works, and I am continuing to test it more. I would
like validation of this idea from the experts and the community to make sure
I haven't overlooked something obvious that might cause issues.

Thank you,
Srini


  

Re: [GENERAL] Cannot unsubscribe

2010-12-21 Thread Adrian Klaver
On Tuesday 21 December 2010 4:16:00 pm William Gordon Rutherdale (rutherw) 
wrote:
 I attempted to unsubscribe from this list (for the holidays) without
 success.

 Could anyone please help me.  I am continuing to get messages from the
 list.

 I broke open the message header and did as it said for unsubscribing.

 See below for what the majordomo sent back.

 -Will

  unsub pgsql-general

  The unsubscribe command did not succeed.
 
  No e-mail addresses matching
    William Gordon Rutherdale (rutherw) ruth...@cisco.com
  are subscribed to the pgsql-general mailing list.
 

 Valid commands processed: 1
 0 succeeded, 0 stalled, and 1 failed.


 Use the following command:
   sessioninfo ea4ce02c75fbad19425a55ff2bd2bc20b17fd794
 to see technical information about this session.

You may want to try the Web link at the bottom of the page and access your 
subscription from there. You will need to know the password you where issued 
when you joined though.

-- 
Adrian Klaver
adrian.kla...@gmail.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Table inheritance foreign key problem

2010-12-21 Thread Andy Chambers

Hi,

One of the caveats described in the documentation for table inheritance is  
that foreign key constraints cannot cover the case where you want to check  
that a value is found somewhere in a table or in that table's  
descendants.  It says there is no good workaround for this.


What about using check constraints?

So say you've got cities and capitals from the example and you had some  
other table that wanted to put a foreign key on cities (plus capitals).   
For example, let's keep guidebook info for the cities.  Some cities are  
worthy of guidebooks even though they're not capitals.  Rather than put a  
foreign key constraint on city, would the following work?  What are the  
drawbacks?


create table guidebooks (
  city text check (city in (select name
                            from cities)),
  isbn text,
  author text,
  publisher text);

insert into guidebooks values ('Barcelona', ...); -- not a capital
insert into guidebooks values ('Edinburgh', ...); -- a capital
insert into guidebooks values ('France', ...);    -- fail

--
Andy Chambers

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Table inheritance foreign key problem

2010-12-21 Thread Richard Broersma
On Tue, Dec 21, 2010 at 9:32 PM, Andy Chambers achamb...@mcna.net wrote:
 create table guidebooks (
  city text check (city in (select name
                            from cities)),
  isbn text,
  author text,
  publisher text);

This is a nice idea.  The only problem is that PostgreSQL doesn't
support sub-selects in a table's check constraints:
http://www.postgresql.org/docs/9.0/interactive/sql-createtable.html
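
One workaround that does work (a sketch, untested) is to move the check
into a trigger, since a trigger function can run arbitrary queries and a
select on a parent table also sees rows in its children:

create or replace function check_city_exists() returns trigger as $$
begin
  -- selecting from cities also scans its child tables (e.g. capitals)
  if not exists (select 1 from cities where name = new.city) then
    raise exception 'city % not found in cities or its descendants', new.city;
  end if;
  return new;
end;
$$ language plpgsql;

create trigger guidebooks_city_check
  before insert or update on guidebooks
  for each row execute procedure check_city_exists();

Note this only guards changes to guidebooks; a delete from cities or
capitals would need its own trigger for full foreign-key semantics.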


-- 
Regards,
Richard Broersma Jr.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general