Re: [GENERAL] Lengthy deletion

2011-11-29 Thread Herouth Maoz

On 29/11/2011, at 09:13, Tom Lane wrote:

 Herouth Maoz hero...@unicell.co.il writes:
 I was instructed to delete old records from one of the tables in our 
 production system. The deletion took hours and I had to stop it in 
 mid-operation and reschedule it as a night job. But then I had to do the 
 same when I got up in the morning and it was still running.
 
 I got an interesting clue, though, when I canceled the deletion the second 
 time around. I got the following error message:
 
 Cancel request sent
 ERROR:  canceling statement due to user request
 CONTEXT:  SQL statement "SELECT 1 FROM ONLY public.sent_messages x WHERE 
 $1 OPERATOR(pg_catalog.=) subscription_id FOR SHARE OF x"
 
 Yup, that's a clue all right.  I'll bet a nickel that you don't
 have an index on the foreign key's referencing column (ie,
 sent_messages.subscription_id).  That means each delete in
 the referenced table has to seqscan the referencing table to
 see if the delete would result in an FK violation.


Makes sense. But shouldn't that be figured into the EXPLAIN plan?

--
Herouth Maoz
Unicell Advanced Cellular Solutions
☎ 03-5181717 ext. 742



Re: [GENERAL] Lengthy deletion

2011-11-29 Thread Tom Lane
Herouth Maoz hero...@unicell.co.il writes:
 On 29/11/2011, at 09:13, Tom Lane wrote:
 Yup, that's a clue all right.  I'll bet a nickel that you don't
 have an index on the foreign key's referencing column (ie,
 sent_messages.subscription_id).  That means each delete in
 the referenced table has to seqscan the referencing table to
 see if the delete would result in an FK violation.

 Makes sense. But shouldn't that be figured into the EXPLAIN plan?

If you run EXPLAIN ANALYZE, on a reasonably recent release, you'll see
the foreign-key trigger eating a lot of time.  Plain EXPLAIN doesn't
show triggers because it has no idea how much time they'll take.

regards, tom lane
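
A minimal sketch of the fix Tom describes, assuming the table and column
named in the error message above (the index name is illustrative):

CREATE INDEX sent_messages_subscription_id_idx
    ON sent_messages (subscription_id);

With that index in place, each FK check becomes an index probe instead of a
sequential scan of sent_messages, and the trigger time that EXPLAIN ANALYZE
reports for the DELETE should drop accordingly.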



Re: [GENERAL] odbc_fdw

2011-11-29 Thread Albe Laurenz
fschwend at hsr.ch wrote:
 I built the current PostgreSQL 9.1.1 sources under Ubuntu 11.04 (in a
VMware under Win7).
 I followed the steps in this guide:

www.thegeekstuff.com/2009/04/linux-postgresql-install-and-configure-from-source
 
 It seems to work (I can run the server and connect to it with
PgAdmin).
 
 Now I'd like to integrate the ODBC_FDW extension in my installation.
However, I don't really
 understand the steps described on the download page:
 pgxn.org/dist/odbc_fdw/0.1.0
 
 Can anybody tell me how to build it? I'm a software developer myself
but a Linux newbie...

I haven't tried it, but looking at the instructions I would say:

1) Install UnixODBC. If you use packages, make sure you install the headers
   too (on my RHEL system, the packages are called unixODBC and
   unixODBC-devel).

2) Get odbc_fdw and unzip the source.

3) Change into the source directory.

4) Run make and make install as the web page describes (substituting
   the path to your configured PostgreSQL source tree).

If something is still unclear, ask for details.

Yours,
Laurenz Albe
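
A sketch of steps 2-4 as shell commands, assuming the 0.1.0 zip from
pgxn.org/dist/odbc_fdw/0.1.0 (file and directory names are illustrative,
and the exact make invocation is the one shown on that download page):

unzip odbc_fdw-0.1.0.zip
cd odbc_fdw-0.1.0
# build and install, substituting the path to your configured
# PostgreSQL source tree as the download page describes
make
make install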



Re: [GENERAL] psql query gets stuck indefinitely

2011-11-29 Thread tamanna madaan
Well, one question: is TCP keepalive enabled by default in postgres-8.1.2?

I am using Postgres on a Linux platform.



On Tue, Nov 29, 2011 at 8:51 AM, tamanna madaan 
tamanna.mad...@globallogic.com wrote:

 Hi Craig

 Thanks for your reply. But unfortunately I don't have that process running
 right now; I have already killed it. But I have seen this problem sometimes
 on my setup.
 It generally happens when the remote system is running slow for some reason
 (high CPU utilization etc.). But whatever the reason, I would assume that
 the query should return with some error in case the system the query is
 running on is rebooted. But it doesn't return and remains stuck. Moreover,
 the same query sometimes hangs even if it is run on the local postgres
 database, so I don't think network issues have any role in that. Please help.

 Thanks

 Regards
 Tamanna


 On Tue, Nov 29, 2011 at 7:58 AM, Craig Ringer ring...@ringerc.id.auwrote:

 On 11/28/2011 05:30 PM, tamanna madaan wrote:

 Hi All
 I have postgres installed in cluster setup. My system has a script
 which executes the below query on remote system in cluster.
 psql -t -q -Uslon -hhostip -ddbname -c "select 1;"
 But somehow this query got stuck. It didn't return even after the remote
 system (on which this query was supposed to execute) was rebooted. What
 could be the reason?


 I realised just after sending my last message:

 You should use ps to find out what exactly psql is doing and which system
 call it's blocked in in the kernel (if it's waiting on a syscall). As you
 didn't mention your OS I'll assume you're on Linux, where you'd use:

  ps -C psql -o wchan:80=

 or

  ps -p 1234 -o wchan:80=

 ... where 1234 is the pid of the stuck psql process. In a psql waiting
 for command line input I see it blocked in the kernel routine n_tty_read
 for example.


 If you really want to know what it's doing you can also attach gdb and
 get a backtrace to see what code it's paused in inside psql:

 gdb -q -p 1234 <<__END__
 bt
 q
 __END__

 If you get a message about missing debuginfos, lots of lines reading
 "no debugging symbols found", or lots of lines ending in "?? ()", then you
 need to install debug symbols. How to do that depends on your OS/distro so
 I won't go into that; it's documented on the PostgreSQL wiki under "how to
 get a stack trace", but you probably won't want to bother if this is just
 for curiosity's sake.

 You're looking for output that looks like:

 #1  0x00369d22a131 in rl_getc () from /lib64/libreadline.so.6
 #2  0x00369d22a8e9 in rl_read_key () from /lib64/libreadline.so.6
 #3  0x00369d215b11 in readline_internal_char () from
 /lib64/libreadline.so.6
 #4  0x00369d216065 in readline () from /lib64/libreadline.so.6

 ... etc ...


 --
 Craig Ringer




 --
 Tamanna Madaan | Associate Consultant | GlobalLogic Inc.
 Leaders in Software R&D Services
 ARGENTINA | CHILE | CHINA | GERMANY | INDIA | ISRAEL | UKRAINE | UK | USA

 Office: +0-120-406-2000 x 2971

 www.globallogic.com





-- 
Tamanna Madaan | Associate Consultant | GlobalLogic Inc.
Leaders in Software R&D Services
ARGENTINA | CHILE | CHINA | GERMANY | INDIA | ISRAEL | UKRAINE | UK | USA

Office: +0-120-406-2000 x 2971

www.globallogic.com


[GENERAL] Re: Any experiences with Foreign Data Wrappers (FDW) like mysql_fdw, odbc_fdw, www_fdw or odbc_fdw?

2011-11-29 Thread Ronan Dunklau

Hello.

We use (and develop) the multicorn foreign data wrapper
(http://multicorn.org).
It is very experimental, but works well for our needs.

We use it to perform IMAP searches in our in-house CRM solution, as
well as to list files on a file system.

We never tested it under Windows, so try it at your own risk :)


-- 
Ronan Dunklau



[GENERAL] immutable functions

2011-11-29 Thread Andy Chambers
The documentation has this to say about immutable functions:

 "or otherwise use information not directly present in its argument list"

If the arguments are row variables, does this allow access to the
data in the row?  For example, is it safe to make the following
function definition immutable?

CREATE OR REPLACE FUNCTION distance(geocodes, geocodes)
  RETURNS double precision AS
$BODY$
  select case $1.zip = $2.zip
   when true then 0
   else acos(sin($1.lat * (pi()/180)) *
  sin($2.lat * (pi()/180)) + cos($1.lat * (pi()/180)) *
  cos($2.lat * (pi()/180)) * cos(($1.lon - $2.lon) *
  (pi()/180))) * (180/pi()) * 60 * 1.1515
  end;
$BODY$
LANGUAGE sql immutable
COST 100;

Cheers,
Andy



[GENERAL] Limiting number of connections to PostgreSQL per IP (not per DB/user)?

2011-11-29 Thread Heiko Wundram

Hello!

Sorry for the subscribe post I've just sent; that was bad reading on my 
part (of the subscribe info on the homepage).


Anyway, the title says it all: is there any way to limit the number of 
connections that a client can have open concurrently with a 
PostgreSQL server, using on-board means? I can't influence which 
user/database the clients use; rather, the clients mostly all use the 
same user/database, and I want to make sure that a single client which 
runs amok doesn't kill connectivity for the other clients. I could surely 
implement this with a proxy sitting in front of the server, but I'd 
rather implement it in PostgreSQL directly.


I'm using (and need to stick with) PostgreSQL 8.3 because of the 
frontend software in question.


Thanks for any hints!

--
--- Heiko.



[GENERAL] PLPGSQL: How can I get the affected rows when using the execute command in a function

2011-11-29 Thread Muiz
Dear all,

   Can I get the affected rows after executing SQL in a function?
e.g.:

CREATE OR REPLACE FUNCTION execsqls(sqls character varying)
  RETURNS integer AS
$BODY$
DECLARE
BEGIN
   EXECUTE sqls;
   -- TODO-1: I want to know how many records the input sqls affect
   RETURN affectedRows;
END;
$BODY$
  LANGUAGE plpgsql;

test: select execsqls('update mytable set name = name where name like ''%abc''')

-- 
Regards,
*Muiz*


Re: [GENERAL] PLPGSQL: How can I get the affected rows when using the execute command in a function

2011-11-29 Thread Pavel Stehule
Hello

Yes, you can. Look at the GET DIAGNOSTICS statement:

http://www.postgresql.org/docs/9.1/interactive/plpgsql-statements.html#PLPGSQL-STATEMENTS-DIAGNOSTICS

Regards

Pavel Stehule

2011/11/29 Muiz work.m...@gmail.com:
 Dear all,

    Can I get the affected rows after executing SQL in a function?
 e.g.:

 CREATE OR REPLACE FUNCTION execsqls(sqls character varying)
   RETURNS integer AS
 $BODY$
 DECLARE
 BEGIN
            EXECUTE sqls;
            -- TODO-1: I want to know how many records the input sqls affect
            RETURN affectedRows;
 END;
 $BODY$
   LANGUAGE plpgsql;

 test: select execsqls('update mytable set name = name where name like ''%abc''')

 --
 Regards,
 Muiz
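
A minimal sketch of Muiz's function using GET DIAGNOSTICS, per the page
Pavel links (only the local variable is new):

CREATE OR REPLACE FUNCTION execsqls(sqls character varying)
  RETURNS integer AS
$BODY$
DECLARE
   affected_rows integer;
BEGIN
   EXECUTE sqls;
   GET DIAGNOSTICS affected_rows = ROW_COUNT;
   RETURN affected_rows;
END;
$BODY$
  LANGUAGE plpgsql;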






Re: [GENERAL] PLPGSQL: How can I get the affected rows when using the execute command in a function

2011-11-29 Thread Ernesto Quiniones
If you are doing an INSERT, UPDATE or DELETE you can use the RETURNING clause in the
query, and work with a cursor to get the rows.


- Original message -
 Dear all,
 
       Can I get the affected rows after executing SQL in a function?
 e.g.:
 
 CREATE OR REPLACE FUNCTION execsqls(sqls character varying)
     RETURNS integer AS
 $BODY$
 DECLARE
 BEGIN
                       EXECUTE sqls;
                       -- TODO-1: I want to know how many records the input sqls affect
                       RETURN affectedRows;
 END;
 $BODY$
     LANGUAGE plpgsql;
 
 test: select execsqls('update mytable set name = name where name like ''%abc''')
 
 -- 
 Regards,
 *Muiz*
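
A sketch of Ernesto's RETURNING approach for this case, hedged: in plpgsql,
FOR ... IN EXECUTE iterates over the rows that a DML statement with a
RETURNING clause produces (the function name and the appended RETURNING *
are illustrative):

CREATE OR REPLACE FUNCTION execsqls_returning(sqls character varying)
  RETURNS integer AS
$BODY$
DECLARE
   rec record;
   n integer := 0;
BEGIN
   -- the caller appends RETURNING * to the DML statement
   FOR rec IN EXECUTE sqls LOOP
      n := n + 1;   -- each returned row can also be inspected here
   END LOOP;
   RETURN n;
END;
$BODY$
  LANGUAGE plpgsql;

test: select execsqls_returning('update mytable set name = name where name like ''%abc'' returning *')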




[GENERAL] initdb failure on Windows 2003

2011-11-29 Thread Mike Wylde
I’m trying to install Postgres 8.4.9.1 on a Windows 2003 SP2 64bit
operating system. The download has installed previously (to a Windows 7
box) so I know it should be OK, but under W2003 the initdb command seems to
be failing. It doesn’t return an error message, but none of the actions are
performed; the data directory remains empty.



Any ideas, or extra logging that I can switch on to try and tie this down?
I can’t reproduce it on a W2003 box running the 32bit OS, and getting hold
of another 64bit box is complicated.



The install log looks like:



Called CreateDirectory(D:\Mingle\PostgreSQL\8.4\data)...

Called CreateDirectory(D:\Mingle\PostgreSQL\8.4)...

WScript.Network initialized...

Called IsVistaOrNewer()...

'winmgmts' object initialized...

Version:5.2

MajorVersion:5

Ensuring we can write to the data directory (using cacls):

Executing batch file 'rad38351.bat'...

Executing batch file 'rad38351.bat'...

Reading:objConfFile is nothing...

Writing:objConfFile is nothing...



We’ve also tried with 8.3 and 9.1 installers and get exactly the same
result.



Any help or hints would be most appreciated.



Regards,

Mike Wylde


Re: [GENERAL] immutable functions

2011-11-29 Thread Tom Lane
Andy Chambers achamb...@mcna.net writes:
 The documentation has this to say about immutable functions...
  "or otherwise use information not directly present in its argument list"

 If the arguments are row variables, does this allow access to the
 data in the row?

Sure.

regards, tom lane



[GENERAL] Sporadic query not returning anything..how to diagnose?

2011-11-29 Thread Phoenix Kiula
Hi.

(My pgbouncer is finally working and has resulted in at least a 3-fold
site speed increase! YAY! Thanks to everyone who helped.)

Now, a new small problem.

In my PHP code I have a condition that checks for the existence of a
record, and if not found, it INSERTs a new one.

Here's the first SQL to check existence:



# SELECT ip FROM links WHERE ip = 1585119341 AND url_md5 =
'cd4866fa5fca31dfdb07c29d8d80731c' LIMIT 1

 QUERY PLAN
---
 Limit  (cost=0.00..26.20 rows=1 width=8) (actual time=0.054..0.054
rows=1 loops=1)
  ->  Index Scan using idx_links_ip_url on links  (cost=0.00..26.20
rows=1 width=8) (actual time=0.053..0.053 rows=1 loops=1)
Index Cond: ((ip = 1585119341) AND (url_md5 =
'cd4866fa5fca31dfdb07c29d8d80731c'::text))
 Total runtime: 0.078 ms
(4 rows)



About 5% of the time (in situations of high traffic), this is not
returning a value in my PHP code. Because it's not found, the code
tries to INSERT a new record and there's a duplicate key error, which
is in the logs. The traffic to the site is much higher than the number
of these entries in my log, which means it's only happening
sometimes -- my guess is for 5% of all queries, which is still quite
significant (about 60,000 a day).

I began logging these missed SELECT queries, and when I manually go
into the postgresql terminal and execute those queries, the record is
indeed found. No problem.

So my question: is this related to some timeout or something with
pgbouncer, where I suppose the connection is held for a split second
longer than it should be, and therefore the query doesn't return anything?
Probably an inane guess. Just wondering aloud.

Welcome any thoughts on how to debug this. Btw, the logging is
happening in the postgresql usual log file, the pgbouncer log just has
hordes of one-liners stating how many requests per minute...

Thanks!



Re: [GENERAL] Sporadic query not returning anything..how to diagnose?

2011-11-29 Thread Heiko Wundram

Am 29.11.2011 16:46, schrieb Phoenix Kiula:

About 5% of the times (in situations of high traffic), this is not
returning a value in my PHP code. Because it's not found, the code
tries to INSERT a new record and there's a duplicate key error, which
is in the logs. The traffic to the site is much higher than the number
of these entries in my log, which means it's only happening
sometimes--my guess is for 5% of all queries, which is still quite
significant (about 60,000 a day).


The two statements aren't atomic: after the SELECT, how do you 
make sure that no other process accessing the DB has a chance to 
select/insert before the [delayed] INSERT of the primary process gets 
executed? This is a common form of race condition.


In the spirit of "it's easier to ask forgiveness than permission", just 
do the INSERT and evaluate the result (i.e., whether a record was 
actually inserted) to get the same information and the same effect as 
with the two statements you're currently executing.


--
--- Heiko.
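
A sketch of what Heiko suggests, as a plpgsql helper; the table and column
names are taken from the SELECT in the thread, the function name is
illustrative:

CREATE OR REPLACE FUNCTION insert_link_if_absent(p_ip bigint, p_url_md5 text)
  RETURNS boolean AS
$BODY$
BEGIN
   INSERT INTO links (ip, url_md5) VALUES (p_ip, p_url_md5);
   RETURN true;                      -- row was inserted
EXCEPTION WHEN unique_violation THEN
   RETURN false;                     -- row already existed
END;
$BODY$
  LANGUAGE plpgsql;

The same pattern works client-side: issue the INSERT and treat SQLSTATE
23505 (unique_violation) as "already there"; there is no window for a
concurrent insert to slip in between a check and the insert.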



Re: [GENERAL] stored function data structures - difficulty

2011-11-29 Thread J.V.

1)
What is a1 ?

2)
Where did the queries below come from?

3)
What information does each query below provide?

On 11/21/2011 9:14 PM, Pavel Stehule wrote:

Hello

2011/11/22 J.V. jvsr...@gmail.com:

I cannot find a way to programmatically:
1.  Given a table name, find all foreign key fields in the given table by
field name (column name)
2.  Given a single foreign key field name, programmatically look up the
corresponding referenced table name and its referenced primary key field

so I have thought of simply hard coding this (for 100 tables).

What kind of data structure that is available to me could I use that would
hold the table name as the key, and from there let me iterate through all
foreign keys by field name, and from there an inner loop that would give me
another key/value pair of the referenced table name (key) and the primary
key (value) that corresponds to that foreign key?

I want to hard code all of this information into a data structure and
iterate through it at some point in another function.

Instead of discovering this programmatically, I can manually look at each
table / schema diagram and hard code it, but I really need one super
structure that will hold as keys every table in the schema, and be able to
drill down to that table's foreign keys, and from there further drill down
to get the referenced table name and the primary key field in that table.

I have seen a number of structures that might work, but cannot find an
example of how to actually use one for what I need to do.  If you do have an
idea of a structure, it would be great and awesome if I could be pointed to
an actual working example that I could test in a sandbox first to understand
how it works.


psql has a nice feature that can help with orientation in the system catalog.

If I need a query that describes some database object, I need to
know the adequate psql meta-statement. You have to run psql with the -E
param, and then psql shows the queries that were necessary for processing
a statement:

[pavel@nemesis ~]$ psql -E postgres
psql (9.2devel)
Type "help" for help.

postgres=# \d a1
* QUERY **
SELECT c.oid,
   n.nspname,
   c.relname
FROM pg_catalog.pg_class c
  LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname ~ '^(a1)$'
   AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 2, 3;
**

* QUERY **
SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules,
c.relhastriggers, c.relhasoids, '', c.reltablespace, CASE WHEN
c.reloftype = 0 THEN '' ELSE
c.reloftype::pg_catalog.regtype::pg_catalog.text END, c.relpersistence
FROM pg_catalog.pg_class c
  LEFT JOIN pg_catalog.pg_class tc ON (c.reltoastrelid = tc.oid)
WHERE c.oid = '146989';
**

* QUERY **
SELECT a.attname,
   pg_catalog.format_type(a.atttypid, a.atttypmod),
   (SELECT substring(pg_catalog.pg_get_expr(d.adbin, d.adrelid) for 128)
FROM pg_catalog.pg_attrdef d
WHERE d.adrelid = a.attrelid AND d.adnum = a.attnum AND a.atthasdef),
   a.attnotnull, a.attnum,
   (SELECT c.collname FROM pg_catalog.pg_collation c, pg_catalog.pg_type t
WHERE c.oid = a.attcollation AND t.oid = a.atttypid AND
a.attcollation <> t.typcollation) AS attcollation,
   NULL AS indexdef,
   NULL AS attfdwoptions
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = '146989' AND a.attnum > 0 AND NOT a.attisdropped
ORDER BY a.attnum;
**

* QUERY **
SELECT c2.relname, i.indisprimary, i.indisunique, i.indisclustered,
i.indisvalid, pg_catalog.pg_get_indexdef(i.indexrelid, 0, true),
   pg_catalog.pg_get_constraintdef(con.oid, true), contype,
condeferrable, condeferred, c2.reltablespace
FROM pg_catalog.pg_class c, pg_catalog.pg_class c2, pg_catalog.pg_index i
   LEFT JOIN pg_catalog.pg_constraint con ON (conrelid = i.indrelid AND
conindid = i.indexrelid AND contype IN ('p','u','x'))
WHERE c.oid = '146989' AND c.oid = i.indrelid AND i.indexrelid = c2.oid
ORDER BY i.indisprimary DESC, i.indisunique DESC, c2.relname;
**

* QUERY **
SELECT conname,
   pg_catalog.pg_get_constraintdef(r.oid, true) as condef
FROM pg_catalog.pg_constraint r
WHERE r.conrelid = '146989' AND r.contype = 'f' ORDER BY 1;
**

* QUERY **
SELECT conname, conrelid::pg_catalog.regclass,
   pg_catalog.pg_get_constraintdef(c.oid, true) as condef
FROM pg_catalog.pg_constraint c
WHERE c.confrelid = '146989' AND c.contype = 'f' ORDER BY 1;
**

* QUERY **
SELECT t.tgname, pg_catalog.pg_get_triggerdef(t.oid, true), t.tgenabled
FROM pg_catalog.pg_trigger t
WHERE t.tgrelid = '146989' AND NOT t.tgisinternal
ORDER BY 1;
**

* QUERY **
SELECT c.oid::pg_catalog.regclass FROM pg_catalog.pg_class c,
pg_catalog.pg_inherits i WHERE c.oid=i.inhparent AND i.inhrelid =
'146989' ORDER BY inhseqno;
**

* QUERY **
SELECT 

Re: [GENERAL] stored function data structures - difficulty

2011-11-29 Thread Pavel Stehule
Hello

2011/11/29 J.V. jvsr...@gmail.com:
 1)
 What is a1 ?

a1 is a table name

 2)
 Where did the queries below come from?

These queries are the ones psql runs under the hood to implement the
backslash commands in the psql console

 3)
 What information does each query below provide?

column names, column types and other metadata

Regards

Pavel Stehule
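
Distilling the foreign-key queries from the psql -E dump above into one
standalone sketch that lists, for every FK in the current database, the
referencing table, the constraint name, the referenced table, and the full
definition (which spells out the columns on both sides):

SELECT conrelid::pg_catalog.regclass  AS referencing_table,
       conname,
       confrelid::pg_catalog.regclass AS referenced_table,
       pg_catalog.pg_get_constraintdef(c.oid, true) AS definition
FROM   pg_catalog.pg_constraint c
WHERE  c.contype = 'f'
ORDER  BY 1, 2;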




[GENERAL] Query Optimizer makes a poor choice

2011-11-29 Thread Tyler Hains
Hi,

 

We’ve got a strange situation where two queries get dramatically
different performance because of how the Query Optimizer handles LIMIT.

 

# explain analyze select * from cards where card_set_id=2850 order by
card_id;
  QUERY
PLAN

--
 Sort  (cost=86686.36..86755.40 rows=27616 width=40) (actual
time=22.504..22.852 rows=5000 loops=1)
   Sort Key: card_id
   Sort Method:  quicksort  Memory: 583kB
   ->  Bitmap Heap Scan on cards  (cost=755.41..84649.24 rows=27616
width=40) (actual time=0.416..1.051 rows=5000 loops=1)
 Recheck Cond: (card_set_id = 2850)
         ->  Bitmap Index Scan on cards_card_set_id_indx
(cost=0.00..748.50 rows=27616 width=0) (actual time=0.399..0.399
rows=5000 loops=1)
   Index Cond: (card_set_id = 2850)
 Total runtime: 23.233 ms
(8 rows)

# explain analyze select * from cards where card_set_id=2850 order by
card_id limit 1;
   QUERY
PLAN

-
 Limit  (cost=0.00..105.19 rows=1 width=40) (actual
time=6026.947..6026.948 rows=1 loops=1)
   ->  Index Scan using cards_pkey on cards  (cost=0.00..2904875.38
rows=27616 width=40) (actual time=6026.945..6026.945 rows=1 loops=1)
 Filter: (card_set_id = 2850)
 Total runtime: 6026.985 ms
(4 rows)

The only way we’ve found to get around the use of the PK index in the
second query is by invalidating it -- sorting it on a cast version of
the PK. This doesn’t work terribly well with our dataset. Is there a
better way around this?

Tyler Hains

IT Director

ProfitPoint, Inc.

www.profitpointinc.com
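
One workaround worth trying here, though it is not from the thread itself: a
two-column index that matches both the filter and the sort, so the planner
does not have to walk cards_pkey looking for the first matching row (the
index name is illustrative):

CREATE INDEX cards_set_id_card_id_idx ON cards (card_set_id, card_id);

The LIMIT 1 query can then read the first entry for card_set_id = 2850
directly from this index, already in card_id order.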

 




Re: [GENERAL] Query Optimizer makes a poor choice

2011-11-29 Thread Filip Rembiałkowski
2011/11/29 Tyler Hains tha...@profitpointinc.com:
 # explain analyze select * from cards where card_set_id=2850 order by
 card_id limit 1;
    QUERY PLAN
 -
  Limit  (cost=0.00..105.19 rows=1 width=40) (actual time=6026.947..6026.948
 rows=1 loops=1)
    ->  Index Scan using cards_pkey on cards  (cost=0.00..2904875.38
 rows=27616 width=40) (actual time=6026.945..6026.945 rows=1 loops=1)
  Filter: (card_set_id = 2850)
  Total runtime: 6026.985 ms

do you have autovacuum enabled?

does the plan change when you push stats target for this column?
ALTER TABLE cards ALTER card_set_id SET STATISTICS 500;
ANALYZE cards ( card_set_id );

what happens when you do:
select * from ( select * from cards where card_set_id=2850 ) x order by
card_id limit 1
?



Re: [GENERAL] initdb failure on Windows 2003

2011-11-29 Thread Filip Rembiałkowski
what is the output when you run this in command line:

...\path\to\initdb.exe c:\testpgdata

?



2011/11/29 Mike Wylde mwy...@thoughtworks.com:
 I’m trying to install Postgres 8.4.9.1 on a Windows 2003 SP2 64bit operating
 system. The download has installed previously (to a windows 7 box) so I know
 it should be OK but under W2003 the initdb command seems to be failing. It
 doesn’t return an error message, but none of the actions are performed; the
 data directory remains empty.



 Any ideas, or extra logging that I can switch on to try and tie this down. I
 can’t reproduce it on a W2003 box running the 32bit OS and getting hold of
 another 64 bit box is complicated.



 The install log looks like:



 Called CreateDirectory(D:\Mingle\PostgreSQL\8.4\data)...

 Called CreateDirectory(D:\Mingle\PostgreSQL\8.4)...

 WScript.Network initialized...

 Called IsVistaOrNewer()...

     'winmgmts' object initialized...

     Version:5.2

     MajorVersion:5

 Ensuring we can write to the data directory (using cacls):

     Executing batch file 'rad38351.bat'...

     Executing batch file 'rad38351.bat'...

 Reading:    objConfFile is nothing...

 Writing:    objConfFile is nothing...



 We’ve also tried with 8.3 and 9.1 installers and get exactly the same
 result.



 Any help of hints would be most appreciated.



 Regards,

 Mike Wylde








Re: [GENERAL] Extending the volume size of the data directory volume

2011-11-29 Thread Filip Rembiałkowski
here's what I would do to analyze this first:

- vmstat 1

- iostat -k -x 3

- look into system logs, maybe something actually happened there...

- look at the process list. find out whether some of the Pg processes are in D state

- strace -f -v -p PID (of the hanging writer process)



2011/11/29 panam pa...@gmx.net:
 Hi,

 as I am importing gigabytes of data and the space on the volume where the
 data directory resides became too small during that process, I resized it
 dynamically (it is an LVM volume) according to this procedure:
 http://www.techrepublic.com/blog/opensource/how-to-use-logical-volume-manager-lvm-to-grow-etx4-file-systems-online/3016
 Everything went without any problems and the import continued. Now, it is
 suddenly stuck (pgAdmin shows it as idle (piped connection)) and there is a
 good chance (as estimated from the space used) it just started using one of
 the added LE-Blocks (HDD space that was added to the volume). The db
 imported so far can be accessed just fine.
 So from the postmaster architecture, is there something that would explain
 this behaviour based on the hypothesis that newly added space was used? Any
 chance to revive the import somehow?

 Thanks



Re: [GENERAL] Query Optimizer makes a poor choice

2011-11-29 Thread Scott Marlowe
On Tue, Nov 29, 2011 at 11:21 AM, Tyler Hains tha...@profitpointinc.com wrote:
 # explain analyze select * from cards where card_set_id=2850 order by
 card_id limit 1;
    QUERY PLAN
 -
  Limit  (cost=0.00..105.19 rows=1 width=40) (actual time=6026.947..6026.948
 rows=1 loops=1)
    ->  Index Scan using cards_pkey on cards  (cost=0.00..2904875.38
 rows=27616 width=40) (actual time=6026.945..6026.945 rows=1 loops=1)

There's a huge disconnect here between what the query planner expects
(27k rows) and how many there are (1).  Also, getting a single row
from an index should be faster than this, even if the table and index
are quite large.  Have you checked for bloat on this index?



[GENERAL] DDL DML Logging doesn't work for calling functions

2011-11-29 Thread MURAT KOÇ
 Hi list,

Version is PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu, compiled by GCC
gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-51), 64-bit.

We set logging parameters as below for DDL & DML logging:
logging_collector = on
log_statement = mod
log_line_prefix = '%t--%d--%u--%h--%a--%i--%e'

Server log file consists as below sample log information:
2011-11-28 16:35:23 EET--postgres--postgres--136.10.195.51--pgAdmin III -
Query Tool--idle--0LOG:  statement: update test set t=10 where t=3;
2011-11-28 16:35:34 EET--postgres--postgres--136.10.195.51--pgAdmin III -
Query Tool--idle--0LOG:  statement: update test set t=10 where t=5;

Logging works successfully while we run direct DML commands like insert,
update, delete.

But when I call a function that does DML, logging doesn't work and the server
log file has no information about the function call.

I call function like this: SELECT p_dummy_insert();

###This is sample insert function###
CREATE OR REPLACE FUNCTION p_dummy_insert ()
  RETURNS void AS
$BODY$
BEGIN
 INSERT INTO employee values ('dummy', 'test');
END$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;

Is not logging function calls expected behavior, or a bug? We have no
information in the server logs about who called the function, when it was
called, or what the called function did.

Regards,
Murat KOC


Re: [GENERAL] Query Optimizer makes a poor choice

2011-11-29 Thread Tyler Hains
2011/11/29 Tyler Hains tha...@profitpointinc.com:
 # explain analyze select * from cards where card_set_id=2850 order by
 card_id limit 1;
    QUERY 
PLAN
 

-
  Limit  (cost=0.00..105.19 rows=1 width=40) (actual 
time=6026.947..6026.948
 rows=1 loops=1)
    ->  Index Scan using cards_pkey on cards  (cost=0.00..2904875.38
 rows=27616 width=40) (actual time=6026.945..6026.945 rows=1 loops=1)
  Filter: (card_set_id = 2850)
  Total runtime: 6026.985 ms

do you have autovacum enabled?

does the plan change when you push stats target for this column?
ALTER TABLE cards ALTER card_set_id SET STATISTICS 500;
ANALYZE cards ( card_set_id );

what happens when you do:
select * from ( select * from cards where card_set_id=2850 ) order by
card_id limit 1
?

--

Yes, I'm pretty sure autovacuum is enabled. Changing the query as shown 
there uses the sub-optimal index.

I haven't had a chance to experiment with the SET STATISTICS, but that 
got me going on something interesting...

Do these statistics look right? 

# SELECT attname, n_distinct, most_common_vals, histogram_bounds FROM 
pg_stats WHERE tablename = 'cards';

 attname              | n_distinct | most_common_vals                                    | histogram_bounds
----------------------+------------+-----------------------------------------------------+------------------
 initial_set_sequence |      31224 | {291,169,334,380,488,599,1752,2293,12584,4}         | {5,806,2485,5394,9106,14071,18566,26521,41407,89905,534617}
 initial_card_set_id  |        901 | {5201,3203,3169,5679,5143,5204,5231,5655,4322,5236} | {4,3079,3896,4349,4677,5149,5445,5707,6003,6361,6784}
 status               |          5 | {Inventory,Activated}                               | {Closed,Expired,Suspended}
 demo                 |          1 | {f}                                                 |
 card_set_id          |        905 | {5201,3203,3169,5679,5143,5204,5655,4322,5236,4513} | {4,3080,3896,4349,4701,5179,5445,5706,6003,6361,6784}
 external_id          |          1 | {}                                                  |
 card_id              |         -1 |                                                     | {0267xx46,1000xx86,1000xx85,1000xx69,1000xx04,1000xx11,1000xx84,1000xx65,600xxxxxx4,6006279,998xx40}
 pin                  |       9654 | {1234,1643,2392,6577,0085,0515,0729,1125,1801,1960} | {,1029,2012,2983,3965,4903,5878,6828,7821,8920,9992}




Re: [GENERAL] Query Optimizer makes a poor choice

2011-11-29 Thread Tyler Hains
On Tue, Nov 29, 2011 at 11:21 AM, Tyler Hains 
tha...@profitpointinc.com wrote:
 # explain analyze select * from cards where card_set_id=2850 order by
 card_id limit 1;
    QUERY 
PLAN
 

-
  Limit  (cost=0.00..105.19 rows=1 width=40) (actual 
time=6026.947..6026.948
 rows=1 loops=1)
    ->  Index Scan using cards_pkey on cards  (cost=0.00..2904875.38
 rows=27616 width=40) (actual time=6026.945..6026.945 rows=1 loops=1)

There's a huge disconnect here between what the query planner expects
(27k rows) and how many there are (1).  Also, getting a single row
from an index should be faster than this, even if the table and index
are quite large.  Have you checked for bloat on this index?
-

There are actually more like 27 million rows in the table. That's why it 
really should be filtering the rows using the index on the other column 
before ordering for the limit.

The documentation does not seem to give a clear reason for changing the 
value used in default_statistics_target, or why you would override it 
with ALTER TABLE ... SET STATISTICS. My gut is telling me that this may be 
our answer if we can figure out how to tweak it.




Re: [GENERAL] Query Optimizer makes a poor choice

2011-11-29 Thread Filip Rembiałkowski
2011/11/29 Tyler Hains tha...@profitpointinc.com:


 I haven't had a chance to experiment with the SET STATISTICS, but that
 got me going on something interesting...

 Do these statistics look right?

 # SELECT attname, n_distinct, most_common_vals, histogram_bounds FROM
 pg_stats WHERE tablename = 'cards';

...
 card_set_id   905
 {5201,3203,3169,5679,5143,5204,5655,4322,5236,4513}
 {4,3080,3896,4349,4701,5179,5445,5706,6003,6361,6784}

This looks promising, because n_distinct is low enough that you can
cover almost all values with statistics.
Raise the statistics target and ANALYZE; that should help.
(NOTE NOTE NOTE: assuming that the distribution is even.)


...
but one thing we can see for sure is that you have not tuned your
PostgreSQL instance :-)
I would recommend pgtune - pgfoundry.org/projects/pgtune/ -
it covers the most important stuff, *including* default_statistics_target.



Filip



Re: [GENERAL] Limiting number of connections to PostgreSQL per IP (not per DB/user)?

2011-11-29 Thread Heiko Wundram

Am 29.11.2011 20:44, schrieb Filip Rembiałkowski:

no easy, standard way of doing this in postgres.
before we go into workarounds - what's the underlying OS?


Okay, that's too bad that there's no standard way for this. The 
underlying OS is Linux (Gentoo, to be exact), and I'd already thought 
about setting up some form of iptables firewalling, but there's no real 
framework in iptables for this (i.e., counting the number of connected 
TCP sockets that originate from a single client), only for connection 
throttling from the same source. That won't cut it, as there are 
spikes in connection setup where many connections are created almost 
at once, meaning that hashlimit or recent and the like are simply not 
suited to the task at hand. I just need/want to put a hard upper 
limit on the number of simultaneous connections from a single client as 
an anti-DoS measure - the clients aren't hostile, but their programming 
is broken...


Is there (meaning do you know of) any form of generic TCP socket proxy 
that can achieve this? I've looked through portage (the Gentoo package 
set) to find something applicable, but none of the socket proxy packages 
I found were able to connection-limit based on source IP out of the box, 
either...


Anyway, thanks for your feedback!

--
--- Heiko.



Re: [GENERAL] Query Optimizer makes a poor choice

2011-11-29 Thread Tomas Vondra
Hi,

what PostgreSQL version is this? That's the first thing we need to know.

On 29.11.2011 22:28, Tyler Hains wrote:
 Yes, I'm pretty sure autovacuum is enabled. Changing the query as shown 
 there uses the sub-optimal index.

That doesn't mean the table / index is not bloated.

 I haven't had a chance to experiment with the SET STATISTICS, but that 
 got me going on something interesting...

If you execute this

SELECT count(*) FROM cards WHERE card_set_id=2850;

what number do you get? How far is that from 27616, expected by the planner?

 Do these statistics look right? 

No idea, that depends on your data set. And you've missed the
most_common_freqs so it's almost impossible to analyze the stats.

Tomas



Re: [GENERAL] Query Optimizer makes a poor choice

2011-11-29 Thread Tomas Vondra
On 29.11.2011 21:34, Scott Marlowe wrote:
 On Tue, Nov 29, 2011 at 11:21 AM, Tyler Hains tha...@profitpointinc.com 
 wrote:
 # explain analyze select * from cards where card_set_id=2850 order by
 card_id limit 1;
QUERY PLAN
 -
  Limit  (cost=0.00..105.19 rows=1 width=40) (actual time=6026.947..6026.948
 rows=1 loops=1)
    ->  Index Scan using cards_pkey on cards  (cost=0.00..2904875.38
 rows=27616 width=40) (actual time=6026.945..6026.945 rows=1 loops=1)
 
 There's a huge disconnect here between what the query planner expects
 (27k rows) and how many there are (1).  Also, getting a single row
 from an index should be faster than this, even if the table and index
 are quite large.  Have you checked for bloat on this index?

No there isn't - the rows=1 is actually caused by the LIMIT clause. Once
the first row is returned, it does not fetch the following ones.

The bloat might be the cause, though. Tyler, run this and let us know
the results:

1) SELECT relpages, reltuples FROM pg_class WHERE relname = 'cards';

2) SELECT n_live_tup, n_dead_tup FROM pg_stat_all_tables
WHERE relname = 'cards';

3) SELECT n_tup_ins, n_tup_upd, n_tup_del, n_tup_hot_upd
 FROM pg_stat_all_tables WHERE relname = 'cards';

If the table / indexes are bloated due to heavy modifications or
(auto)vacuum not aggressive enough, you may try to cluster the table.
But it obtains exclusive lock on the table, preventing writes.

Tomas



Re: [GENERAL] Limiting number of connections to PostgreSQL per IP (not per DB/user)?

2011-11-29 Thread Filip Rembiałkowski
On 29 November 2011 23:18, Heiko Wundram
modeln...@modelnine.org wrote:

 Okay, that's too bad that there's no standard way for this. The underlying
 OS is Linux (Gentoo, to be exact), and I'd already thought about setting up
 some form of iptables firewalling, but there's no real framework in iptables
 for this (i.e., counting the number of connected TCP sockets that originate
 from a single client), only for connection throttling from the same source.
 That won't cut it, as there are spikes in connection setup where many
 connections are created almost at once, meaning that hashlimit or recent
 and the like are simply not suited to the task at hand. I just need/want
 to put a hard upper limit on the number of simultaneous connections from a
 single client as an anti-DoS measure - the clients aren't hostile, but
 their programming is broken...


did you look at connlimit?
http://www.netfilter.org/projects/patch-o-matic/pom-external.html#pom-external-connlimit
AFAIK, it applies only to ESTABLISHED state, so maybe it suits you.

I'm not sure how you want to both allow many connections being created
almost at once and limit the number of connections from the same IP at the
same time?

anyway, we are going offtopic here...

regards
Filip
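
A sketch of the connlimit approach, assuming an iptables build that includes
the connlimit match (the limit of 10 is arbitrary):

# reject new TCP connections to port 5432 from any single source IP
# that already holds 10 established connections
iptables -I INPUT -p tcp --syn --dport 5432 \
    -m connlimit --connlimit-above 10 -j REJECT --reject-with tcp-reset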



Re: [GENERAL] Limiting number of connections to PostgreSQL per IP (not per DB/user)?

2011-11-29 Thread Tom Lane
Filip Rembiałkowski plk.zu...@gmail.com writes:
 On 29 November 2011 23:18, Heiko Wundram
 modeln...@modelnine.org wrote:
 Okay, that's too bad that there's no standard way for this.

 did you look at connlimit?
 http://www.netfilter.org/projects/patch-o-matic/pom-external.html#pom-external-connlimit

Another way that we've sometimes recommended people handle custom login
restrictions is
(1) use PAM for authentication
(2) find or write a PAM plugin that makes the kind of check you want

I think that there may well be a plugin out there already that does
this, or something close enough; but you'll have to do your own
research...

regards, tom lane
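
A sketch of the pg_hba.conf side of step (1); the PAM service name is
illustrative, and the PAM module enforcing a per-IP limit is the part you
would have to find or write:

# pg_hba.conf: authenticate via the PAM service "postgresql"
host    all    all    0.0.0.0/0    pam    pamservice=postgresql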



Re: [GENERAL] Query Optimizer makes a poor choice

2011-11-29 Thread Tomas Vondra
On 29.11.2011 23:06, Filip Rembiałkowski wrote:
 2011/11/29 Tyler Hains tha...@profitpointinc.com:
 
 
 I haven't had a chance to experiment with the SET STATISTICS, but that
 got me going on something interesting...

 Do these statistics look right?

 # SELECT attname, n_distinct, most_common_vals, histogram_bounds FROM
 pg_stats WHERE tablename = 'cards';

 ...
 card_set_id   905
 {5201,3203,3169,5679,5143,5204,5655,4322,5236,4513}
 {4,3080,3896,4349,4701,5179,5445,5706,6003,6361,6784}
 
 This looks promising, because n_distinct is low enough that you can
 cover almost all values with statistics.
 raise the statistics and ANALYZE. should help.
 (NOTE NOTE NOTE: assuming that the distribution is even)

Estimating ndistinct is very tricky, there are well known fail cases
(skewed distributions etc.)

 ...
 but one thing we see for sure is that you have not tuned your
 PostgreSQL instance :-)
 I would recommend pgtune, - pgfoundry.org/projects/pgtune/
 it covers most important stuff, *including* default_statistics_target.

How do we see that? The only thing you can derive from the above info is
that he's probably running 8.3 (or older), because the number of MCVs is
10 and newer releases use 100 by default.

But the statistics target is modified rather rarely, only when it's
actually needed - the default is usually enough and increasing it just
adds overhead to planning.

And pgtune can't reliably suggest a good value, because it's very
dependent on the data. It can merely recommend some reasonable values
(and it recommends 10 for most workloads anyway, except for DWH and
mixed). Don't touch default_statistics_target unless you're sure it
helps and set it only for those columns that need it.

Tomas



Re: [GENERAL] Query Optimizer makes a poor choice

2011-11-29 Thread Tomas Vondra
On 29.11.2011 22:43, Tyler Hains wrote:
 There are actually more like 27 million rows in the table. That's why it 
 really should be filtering the rows using the index on the other column 
 before ordering for the limit.

Well, the problem is that the PostgreSQL MVCC model is based on keeping
copies of the row. When you delete a row, it's actually marked as
deleted so that running transactions can still see it. An update is just
a delete+insert, so the consequences are the same.

This means there may be a lot of dead rows - easily orders of magnitude
more than there should be. So instead of 27 million rows the table may
actually contain 270 million.

That's what (auto)vacuum is for - it reclaims the space occupied by dead
rows, because there are no transaction that can see them. This space is
then used for new rows (either created by INSERT or UPDATE).

But if the autovacuum can't keep pace with the changes, e.g. because
you've repeatedly run a full-table update or because the table is
updated heavily and the autovacuum is not aggressive enough, you got a
problem.

And this affects indexes too - each new row (or a copy of a row) needs a
new record in the index. Unless it's a HOT update, but let's not
complicate that. And this space is not reclaimed by plain (auto)vacuum,
so you may have a perfectly healthy table and bloated index.

Check the size of your table and indexes, see if it matches your
expectations. E.g. create a small table with 10000 rows and compute how
large the table would be with 27 million rows (just multiply by 2700).
Does that match the current size? Same thing for the index.

And run the three queries I've posted in my previous post - that should
give you more details.

You may also use pgstattuple contrib module - run this

  select * from pgstattuple('cards');
  select * from pgstatindex('cards_pkey');

High values of dead_tuple_percent/free_percent (for a table) or
leaf_fragmentation (index) and a low avg_leaf_density (index) usually mean
there's bloat.

But be careful - this actually reads the whole table / index.

 The documentation does not seem to give a clear reason for changing the 
 value used in default_statistics_target or why you would override it 
 with ALTER TABLE SET STATISTICS. My gut is telling me that this may be 
 our answer if we can figure out how to tweak it.

That affects the estimates - when the distribution is skewed the default
detail may not be sufficient for estimate precise enough, so the
optimizer chooses bad plans. Increasing the statistics target means
collect more detailed statistics and that often helps to fix the
issues. But I think this is not the case. I'd guess the bloat.

Tomas
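
A sketch of the size check Tomas suggests, using the built-in size functions
(the table and index names are the ones from the thread):

SELECT pg_size_pretty(pg_relation_size('cards'))       AS table_size,
       pg_size_pretty(pg_relation_size('cards_pkey'))  AS pkey_size,
       pg_size_pretty(pg_total_relation_size('cards')) AS total_size;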



Re: [GENERAL] Query Optimizer makes a poor choice

2011-11-29 Thread Tomas Vondra
On 29.11.2011 23:19, Tomas Vondra wrote:
 Hi,
 
 what PostgreSQL version is this? That's the first thing we need to know.
 
 On 29.11.2011 22:28, Tyler Hains wrote:
 Yes, I'm pretty sure autovacuum is enabled. Changing the query as shown 
 there uses the sub-optimal index.
 
 That doesn't mean

Sorry, deleted this part by accident. It should be "That doesn't mean
the table / index is not bloated." See my other posts for more details.

Tomas



Re: [GENERAL] Extending the volume size of the data directory volume

2011-11-29 Thread panam
Hi Filip,

Thanks for the suggestions.
There is no indication of disk activity triggered by postgres.
iostat reports "Cannot find disk data" (maybe because the system is hosted
in an OpenVZ environment?).
Systemlogs (syslog and postgresql-9.1-main.log) do not indicate something
unusual.
All pg threads are sleeping (S state).

I will try to reproduce this, this time with a smaller initial disk size...

Regards
panam




Re: [GENERAL] Extending the volume size of the data directory volume

2011-11-29 Thread Scott Marlowe
On Tue, Nov 29, 2011 at 6:32 PM, panam pa...@gmx.net wrote:
 Hi Filip,

 Thanks for the suggestions.
 There is no indication of disk activity triggered by postgres.
 iostat reports: Cannot find disk data (maybe because the system is hosted in
 a OpenVZ environment?)
 Systemlogs (syslog and postgresql-9.1-main.log) do not indicate something
 unusual.
 All pg threads are sleeping (S state).

 I will try to reproduce this, this time with a smaller initial disk size...

Have you tried doing something like stopping postgres and rebooting
the server, in case there's some volume info that didn't get updated
when you grew the partition?



Re: [GENERAL] Extending the volume size of the data directory volume

2011-11-29 Thread Craig Ringer

On 11/30/2011 09:32 AM, panam wrote:

Hi Filip,

Thanks for the suggestions.
There is no indication of disk activity triggered by postgres.
iostat reports: Cannot find disk data (maybe because the system is hosted in
a OpenVZ environment?)
Systemlogs (syslog and postgresql-9.1-main.log) do not indicate something
unusual.
All pg threads are sleeping (S state).

I will try to reproduce this, this time with a smaller initial disk size...


Try checking where the postgres processes are waiting, too:

  ps -C postgres -o wchan=

--
Craig Ringer



Re: [GENERAL] Extending the volume size of the data directory volume

2011-11-29 Thread panam
Hi, output is





Re: [GENERAL] Extending the volume size of the data directory volume

2011-11-29 Thread panam
No, but will try this first, thanks for the suggestion.



Re: [GENERAL] tricking EXPLAIN?

2011-11-29 Thread Shigeru Hanada
(2011/11/28 20:55), Wim Bertels wrote:
 If we look at the output of EXPLAIN ANALYZE,
 then according to the COST the second query is the best one,
 but according to the ACTUAL TIME the first query is best
 (which seems logical intuitively).
 
 So EXPLAIN is being tricked,
 and the reason for this seems to be the number of rows in the nested loop,
 which is reduced to 1 for EXPLAIN because of the join.
 http://www.postgresql.org/docs/8.4/static/using-explain.html
 
 Suggestions, comments are always welcome.

Interesting.  I tried a modified version of the second query, and got the
same EXPLAIN output as the first query.

SELECT  amproc, amprocnum - average AS difference
FROM    pg_amproc INNER JOIN
(SELECT avg(amprocnum) AS average
FROM    pg_amproc) AS tmp
ON true;  -- semantically same as amproc = amproc

So, I think that the point of this issue is that somehow PG wrongly thinks
that amproc = amproc filters the result down to just one row, though such a
condition never reduces the result.  I also tried a simplified query, and
got another result, which shows that PG estimates that the same condition
reduces the result to half.

postgres=# EXPLAIN ANALYZE SELECT * FROM pg_amproc WHERE (amproc = amproc);
   QUERY PLAN
-
 Seq Scan on pg_amproc  (cost=0.00..67.52 rows=126 width=18) (actual
time=0.039..1.356 rows=252 loops=1)
   Filter: (amproc = amproc)
 Total runtime: 1.445 ms
(3 rows)

postgres=# EXPLAIN ANALYZE SELECT * FROM pg_amproc WHERE (true);
   QUERY PLAN

 Seq Scan on pg_amproc  (cost=0.00..4.52 rows=252 width=18) (actual
time=0.008..0.045 rows=252 loops=1)
 Total runtime: 0.089 ms
(2 rows)

IMHO the planner should be modified so that it can estimate result rows
accurately in this case.

-- 
Shigeru Hanada
