Re: [GENERAL] Getting Table Names in a Particular Database

2011-08-31 Thread Scott Marlowe
On Tue, Aug 30, 2011 at 11:50 PM, Adarsh Sharma
adarsh.sha...@orkash.com wrote:
 I understand, so there is no way to fetch the table names in a single query. The only
 way is:

 1. Connect to demo
 2. Execute the query: SELECT n.nspname as "Schema", c.relname as "Name",
    CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'i' THEN
    'index' WHEN 'S' THEN 'sequence' WHEN 's' THEN 'special' END as "Type",
    pg_catalog.pg_get_userbyid(c.relowner) as "Owner"
    FROM pg_catalog.pg_class c
    LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
    WHERE c.relkind IN ('r','v','S','') AND n.nspname <> 'pg_catalog'
    AND n.nspname <> 'information_schema' AND n.nspname !~ '^pg_toast'
    ORDER BY 1,2;

 As in MySQL, where we can view all tables in the test database with the command below:

  select table_name from information_schema.tables where table_schema = 'test';

Have you tried it in pgsql? Because that works too.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Getting Table Names in a Particular Database

2011-08-31 Thread Adarsh Sharma
pdc_uima=# select table_name from information_schema.tables where 
table_schema='pdc_uima';

 table_name
------------
(0 rows)

But filtering on 'public', it gives the result:

pdc_uima=# select * from information_schema.tables where 
table_schema='public';
 table_catalog | table_schema |    table_name    | table_type | self_referencing_column_name | reference_generation | user_defined_type_catalog | user_defined_type_schema | user_defined_type_name | is_insertable_into | is_typed | commit_action
---------------+--------------+------------------+------------+------------------------------+----------------------+---------------------------+--------------------------+------------------------+--------------------+----------+---------------
 pdc_uima      | public       | spatial_ref_sys  | BASE TABLE |                              |                      |                           |                          |                        | YES                | NO       |
 pdc_uima      | public       | geometry_columns | BASE TABLE |                              |                      |                           |                          |                        | YES                | NO       |
 pdc_uima      | public       | adarsh           | BASE TABLE |                              |                      |                           |                          |                        | YES                | NO       |
(3 rows)

Coming back to the original problem: I have 10 databases with different 
names, and you have to go into each database with the \c command to fetch 
the table names.



Thanks

Scott Marlowe wrote:

On Tue, Aug 30, 2011 at 11:50 PM, Adarsh Sharma
adarsh.sha...@orkash.com wrote:
  

 I understand, so there is no way to fetch the table names in a single query. The only
way is:

1. Connect to demo
2. Execute the query: SELECT n.nspname as "Schema", c.relname as "Name",
CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'i' THEN
'index' WHEN 'S' THEN 'sequence' WHEN 's' THEN 'special' END as "Type",
pg_catalog.pg_get_userbyid(c.relowner) as "Owner" FROM pg_catalog.pg_class
c    LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace WHERE
c.relkind IN ('r','v','S','')   AND n.nspname <> 'pg_catalog'    AND
n.nspname <> 'information_schema' AND n.nspname !~ '^pg_toast'
ORDER BY 1,2;

As in MySQL, where we can view all tables in the test database with the command below:

 select table_name from information_schema.tables where table_schema = 'test';



Have you tried it in pgsql? Because that works too.
  




Re: [GENERAL] Getting Table Names in a Particular Database

2011-08-31 Thread Scott Marlowe
On Wed, Aug 31, 2011 at 12:10 AM, Adarsh Sharma
adarsh.sha...@orkash.com wrote:
 Coming back to the original problem: I have 10 databases with different names,
 and you have to go into each database with the \c command to fetch the table names.

Again, in PostgreSQL databases are very separate objects.  In MySQL
they are closer to schemas than to separate entities.  If you want to
examine a database in pg, you need to connect to it.  Period.
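
If you really need one listing across all of them, a minimal sketch from the
shell (a hypothetical script, assuming psql can log in to each database
without a password prompt):

    # Loop over every non-template database and list its tables,
    # connecting to each one in turn.
    for db in $(psql -At -c "SELECT datname FROM pg_database WHERE NOT datistemplate"); do
        echo "== $db =="
        psql -At -d "$db" -c "SELECT table_schema || '.' || table_name
                              FROM information_schema.tables
                              WHERE table_schema NOT IN ('pg_catalog', 'information_schema')"
    done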

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql-9.0.1 Recovery

2011-08-31 Thread Venkat Balaji
Thanks Craig !

Below is what I did:

1. pg_start_backup()
2. rsync the data dir
3. pg_stop_backup()

I believe the backup is valid because I was able to bring up the cluster
without any issues (of course, with data loss).

+ve signs -

I am able to bring up the cluster with the online backup, but only with some
loss of data.

-ve signs and things to be strongly foreseen while testing backups -

   - pg_clog files were not synced. I suspect they were being written at the
   time of the backup; I might have tried to sync the data dir while the pg_clog
   files were only half written.
   - Though the WAL archives are there, Postgres is not trying to recover
   beyond the timestamp at which pg_clog was missing.
   - Even if I replace the missing pg_clog files (which I did), Postgres
   asks for the corresponding WAL archive files.

Yes. What I learnt is that we need to ensure that all the pg_clog files are
fully copied as of the backup time. We cannot afford to miss any of them.
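
For reference, a minimal sketch of the procedure described above (paths, the
backup label and the destination host are placeholders, not from the original
setup; WAL archiving must already be configured for the backup to be
recoverable):

    # 1. Tell the server a base backup is starting
    psql -c "SELECT pg_start_backup('nightly_base_backup');"

    # 2. Copy the whole data directory in one pass -- pg_clog and every other
    #    subdirectory must come from the same rsync run to stay consistent.
    rsync -a --delete /var/lib/pgsql/data/ backuphost:/backups/pgdata/

    # 3. Tell the server the backup is finished
    psql -c "SELECT pg_stop_backup();"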

Thanks
Venkat
On Wed, Aug 31, 2011 at 5:46 AM, Craig Ringer ring...@ringerc.id.au wrote:

 On 30/08/2011 6:59 PM, Venkat Balaji wrote:

 Hello Everyone,

 I have a situation here -

 I am trying to restore the production online backup and recover the same.

  - I had initially rsynced the data directory (excluding pg_log) and then
 tarred and zipped the same


 Did you do that after pg_start_backup() or on a stopped database server?

 If you did it on a running database server without first running
 pg_start_backup(), your backup is invalid.

 Personally I like to take my base backups from an LVM snapshot of the
 datadir just to be extra safe. That isn't necessary, though, and a regular
 rsync or tar or whatever of a datadir after pg_start_backup() is fine.

 Remember to run pg_stop_backup() afterwards.


   - I got an error unable to read filename from pg_clog location
 (file size is around 160K)


 ... from PostgreSQL, when you tried to start it?

 What emitted that error message?


  What I understood is that rsync somehow missed syncing the
 files in pg_clog, so I manually copied the missing pg_clog file
 from production and tried recovery.


 That won't work. You need a consistent snapshot of all the files in the
 data dir. You cannot just mix and match copies taken at different times.

 For efficiency reasons PostgreSQL will recycle used clog files. You can't
 just copy a file over and hope that because it has the same name, it still
 contains the data you want.

 Your backup *failed* at the point where you got an incomplete copy of the
 data directory.


  Do i need to get that particular wal archive which is before online
 backup time ?


 No, you need to get the missing clog files. If you cannot do that, try
 using pg_resetxlog, but be aware that that may lose transactions and can
 potentially cause corruption of tables and indexes.


  By this experience what I understand is that PostgreSQL stores committed
 and uncommitted transactions in pg_xlog / WAL archive files, and
 information (not the transaction data) about transaction commit status
 is stored in pg_clog. Am I correct?


 That sounds right to me, but I don't know as much about how Pg stores
 things as I should.


  I am in the process of designing a disaster recovery planner for our
 productions systems.


 Congratulations!

 Be extremely glad this didn't happen in a real recovery scenario. This is a
 marvellous example of why you should always test your backups - you actually
 did, and found a problem that would've been a critical issue if the backup
 were actually needed.

 --
 Craig Ringer



[GENERAL] row is too big

2011-08-31 Thread Sim Zacks

select version()
PostgreSQL 8.2.19 on i686-pc-linux-gnu, compiled by GCC gcc (Debian 
4.3.2-1.1) 4.3.2


Before you jump down my throat about bad design, this is a reporting 
table that is generated based on what the users want to see in their 
rows and columns. (I'm basically generating a spreadsheet for the user 
to copy and paste and then play with graphing functions)


I have a table with 952 fields: one text, one int, and all the rest numeric 
(no precision or scale defined).
The value of one row is the sum of all the other rows (insert into table 
select sum(a), sum(b), sum(c)...).
The values in this case all fit into integer range; the int column is the 
sum of all the other columns and is 300,000.
The text value is blank ('', not null; when I made it null the row took up 
16 more bytes).

I am getting an error on inserting the total row:
ERROR:  row is too big: size 11436, maximum size 8136

The manual doesn't say exactly how much storage a numeric type uses, but 
it seems to me that it is taking up about 12 bytes per field. Does this 
make any sense?
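
As a side note, one quick way to check the per-value storage is the built-in
pg_column_size() function -- a sketch, using the example value from above:

    SELECT pg_column_size(300000::numeric) AS numeric_bytes,
           pg_column_size(300000::integer) AS int_bytes,
           pg_column_size(''::text)        AS empty_text_bytes;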


Thanks
Sim


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] regression between 8.4.8 and 8.4.2?

2011-08-31 Thread Peter Eisentraut
I don't have an answer for you, but this report looks suspiciously
similar to the one I posted the other day at
http://archives.postgresql.org/pgsql-hackers/2011-08/msg01224.php,
which, now that I think about it, also manifested itself after the
upgrade to 8.4.8.

On tis, 2011-08-30 at 15:24 -0700, Ben Chobot wrote:
 We recently took a copy of our production data (running on 8.4.2),
 scrubbed many data fields, and then loaded it onto a qa server
 (running 8.4.8). We're seeing some odd planner performance that I
 think might be a bug, though I'm hoping it's just idiocy on my part.
 I've analyzed things and looked into pg_stats and it seems as if the
 relevant columns have about the same statistics. 
 
 
 I've managed to simplify the query, but if I make it any simpler, then
 the two servers end up with the same good plan. The query is down to:
 
 
 SELECT machines.quota_purchased 
 FROM machines
 WHERE NOT deleted AND machines.user_id IN (
 SELECT id FROM users WHERE user_group_id IN (
  select 607547 offset 0
 ) OFFSET 0
   );
 
 
 
 
 (Those offset 0 are in there to protect us from planner regressions
 we saw when moving to 8.4. When we move to 9, they can hopefully go
 away.)
 
 
 On the production server, this returns a fairly accurate plan:
 
 
 
 QUERY PLAN
 ------------------------------------------------------------------------------------------------------------------------
  Nested Loop  (cost=843.59..1447.90 rows=243 width=8) (actual time=0.044..0.045 rows=1 loops=1)
    ->  HashAggregate  (cost=843.59..845.59 rows=200 width=4) (actual time=0.027..0.027 rows=1 loops=1)
          ->  Limit  (cost=0.02..823.90 rows=1575 width=4) (actual time=0.024..0.025 rows=1 loops=1)
                ->  Nested Loop  (cost=0.02..823.90 rows=1575 width=4) (actual time=0.023..0.024 rows=1 loops=1)
                      ->  HashAggregate  (cost=0.02..0.03 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=1)
                            ->  Limit  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1)
                                  ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=1)
                      ->  Index Scan using users_user_groups_idx on users  (cost=0.00..804.18 rows=1575 width=8) (actual time=0.017..0.018 rows=1 loops=1)
                            Index Cond: (users.user_group_id = (607547))
    ->  Index Scan using machines_sid_un on machines  (cost=0.00..3.00 rows=1 width=12) (actual time=0.015..0.015 rows=1 loops=1)
          Index Cond: (machines.user_id = users.id)
  Total runtime: 0.121 ms
 (12 rows)
 
 
 
 
 On the QA server, things are not so accurate. It doesn't hurt the
 timing of this simplified query much, but when put into the actual
 query, the row estimation being off by 6 orders of magnitude really
 throws the planning in the wrong direction. The plan on the QA server
 is:
 
 
 
 QUERY PLAN
 ------------------------------------------------------------------------------------------------------------------------
  Nested Loop  (cost=1887.16..3671.20 rows=1192462 width=8) (actual time=0.049..0.051 rows=1 loops=1)
    ->  HashAggregate  (cost=1887.16..1889.16 rows=200 width=4) (actual time=0.032..0.033 rows=1 loops=1)
          ->  Limit  (cost=0.02..1868.20 rows=1517 width=4) (actual time=0.027..0.029 rows=1 loops=1)
                ->  Nested Loop  (cost=0.02..1868.20 rows=1517 width=4) (actual time=0.027..0.028 rows=1 loops=1)
                      ->  HashAggregate  (cost=0.02..0.03 rows=1 width=4) (actual time=0.008..0.008 rows=1 loops=1)
                            ->  Limit  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
                                  ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
                      ->  Index Scan using users_user_groups_idx on users  (cost=0.00..1849.20 rows=1517 width=8) (actual time=0.015..0.016 rows=1 loops=1)
                            Index Cond: (users.user_group_id = (607547))
    ->  Index Scan using machines_sid_un on machines  (cost=0.00..8.90 rows=1 width=12) (actual time=0.013..0.013 rows=1 loops=1)
          Index Cond: (machines.user_id = users.id)
  Total runtime: 0.148 ms
 (12 rows)
 
 
 
 
 
 
 The problem here (I think) seems to be that the QA server believes
 that running a nested loop over 200 users.id values and joining that
 against machines.user_id will result in 1M rows. The production
 server sees this more accurately as the nearly 1:1 relationship that
 it is.
 
 
 The reason I wonder if this might be a bug is because if I change the
 obtuse clause "WHERE user_group_id IN (select 607547 offset 0)" to
 simply "where user_group_id in (607547)" then the plan collapses to
 the same plan on both servers:
 
 
 explain analyze SELECT 

Re: [GENERAL] regression between 8.4.8 and 8.4.2?

2011-08-31 Thread Peter Eisentraut
On ons, 2011-08-31 at 10:42 +0300, Peter Eisentraut wrote:
 I don't have an answer for you, but this report looks suspiciously
 similar to the one I posted the other day at
 http://archives.postgresql.org/pgsql-hackers/2011-08/msg01224.php,
 which, now that I think about it, also manifested itself after the
 upgrade to 8.4.8.

See this thread:
http://archives.postgresql.org/pgsql-performance/2011-08/msg00248.php

It looks like there are a number of users affected by this.

 
 On tis, 2011-08-30 at 15:24 -0700, Ben Chobot wrote:
  We recently took a copy of our production data (running on 8.4.2),
  scrubbed many data fields, and then loaded it onto a qa server
  (running 8.4.8). We're seeing some odd planner performance that I
  think might be a bug, though I'm hoping it's just idiocy on my part.
  I've analyzed things and looked into pg_stats and it seems as if the
  relevant columns have about the same statistics. 
  
  
  I've managed to simplify the query, but if I make it any simpler, then
  the two servers end up with the same good plan. The query is down to:
  
  
  SELECT machines.quota_purchased 
  FROM machines
  WHERE NOT deleted AND machines.user_id IN (
  SELECT id FROM users WHERE user_group_id IN (
   select 607547 offset 0
  ) OFFSET 0
);
  
  
  
  
  (Those offset 0 are in there to protect us from planner regressions
  we saw when moving to 8.4. When we move to 9, they can hopefully go
  away.)
  
  
  On the production server, this returns a fairly accurate plan:
  
  
  
  QUERY PLAN
  ------------------------------------------------------------------------------------------------------------------------
   Nested Loop  (cost=843.59..1447.90 rows=243 width=8) (actual time=0.044..0.045 rows=1 loops=1)
     ->  HashAggregate  (cost=843.59..845.59 rows=200 width=4) (actual time=0.027..0.027 rows=1 loops=1)
           ->  Limit  (cost=0.02..823.90 rows=1575 width=4) (actual time=0.024..0.025 rows=1 loops=1)
                 ->  Nested Loop  (cost=0.02..823.90 rows=1575 width=4) (actual time=0.023..0.024 rows=1 loops=1)
                       ->  HashAggregate  (cost=0.02..0.03 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=1)
                             ->  Limit  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1)
                                   ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=1)
                       ->  Index Scan using users_user_groups_idx on users  (cost=0.00..804.18 rows=1575 width=8) (actual time=0.017..0.018 rows=1 loops=1)
                             Index Cond: (users.user_group_id = (607547))
     ->  Index Scan using machines_sid_un on machines  (cost=0.00..3.00 rows=1 width=12) (actual time=0.015..0.015 rows=1 loops=1)
           Index Cond: (machines.user_id = users.id)
   Total runtime: 0.121 ms
  (12 rows)
  
  
  
  
  On the QA server, things are not so accurate. It doesn't hurt the
  timing of this simplified query much, but when put into the actual
  query, the row estimation being off by 6 orders of magnitude really
  throws the planning in the wrong direction. The plan on the QA server
  is:
  
  
  
  QUERY PLAN
  ------------------------------------------------------------------------------------------------------------------------
   Nested Loop  (cost=1887.16..3671.20 rows=1192462 width=8) (actual time=0.049..0.051 rows=1 loops=1)
     ->  HashAggregate  (cost=1887.16..1889.16 rows=200 width=4) (actual time=0.032..0.033 rows=1 loops=1)
           ->  Limit  (cost=0.02..1868.20 rows=1517 width=4) (actual time=0.027..0.029 rows=1 loops=1)
                 ->  Nested Loop  (cost=0.02..1868.20 rows=1517 width=4) (actual time=0.027..0.028 rows=1 loops=1)
                       ->  HashAggregate  (cost=0.02..0.03 rows=1 width=4) (actual time=0.008..0.008 rows=1 loops=1)
                             ->  Limit  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
                                   ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
                       ->  Index Scan using users_user_groups_idx on users  (cost=0.00..1849.20 rows=1517 width=8) (actual time=0.015..0.016 rows=1 loops=1)
                             Index Cond: (users.user_group_id = (607547))
     ->  Index Scan using machines_sid_un on machines  (cost=0.00..8.90 rows=1 width=12) (actual time=0.013..0.013 rows=1 loops=1)
           Index Cond: (machines.user_id = users.id)
   Total runtime: 0.148 ms
  (12 rows)
  
  
  
  
  
  
   The problem here (I think) seems to be that the QA server believes
   that running a nested loop over 200 users.id values and joining that
   against machines.user_id will result in 1M rows. The production
   server sees this more 

Re: [GENERAL] IDLE queries taking up space

2011-08-31 Thread Craig Ringer

On 31/08/2011 12:03 AM, JD Wong wrote:

Hi,

When I run "select datname, procpid, current_query from 
pg_stat_activity;" I get 26 rows of IDLE queries.  How can I set 
postgres to automatically close connections that have finished their 
queries and now sit idle?


If they're not idle in transaction, they don't matter much.

Try filtering the list based on last activity, so you only see those 
connections that have been idle for a while. Short periods of idle are 
normal for many applications because they perform a query then process 
its results and issue other queries based on the results, or because 
they keep a connection around between requests from users.
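
A minimal sketch of that kind of filter (assuming the pre-9.2 layout of
pg_stat_activity used in your query, with procpid/current_query/query_start):

    -- show only connections that have been idle for more than five minutes
    SELECT datname, procpid, usename, query_start
    FROM pg_stat_activity
    WHERE current_query = '<IDLE>'
      AND query_start < now() - interval '5 minutes'
    ORDER BY query_start;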


If the idle connections are actually causing an issue - for example, if 
they're preventing the release of non-trivial amounts of backend private 
memory back to the OS - you can tweak the client to disconnect after a 
certain idle time, or you can use a connection pool. Connection pools 
may be inside the client (for example, in Java EE application servers) 
or between the client and the server using tools like pgbouncer and 
PgPool-II.
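
For the pgbouncer route, a minimal illustrative configuration sketch (file
paths, the database name and the pool sizes here are examples, not
recommendations):

    ; /etc/pgbouncer/pgbouncer.ini
    [databases]
    mydb = host=127.0.0.1 port=5432 dbname=mydb

    [pgbouncer]
    listen_addr = 127.0.0.1
    listen_port = 6432
    auth_type = md5
    auth_file = /etc/pgbouncer/userlist.txt
    pool_mode = transaction
    default_pool_size = 20
    server_idle_timeout = 60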


--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Parameterized prepared statements

2011-08-31 Thread Craig Ringer

Hi folks

While replying to another query, something struck me as interesting that 
I thought I'd raise.


People here often raise issues where preparing a parameterised query and 
executing the prepared query results in a plan that's sub-optimal for 
the particular values substituted into it. The cause is well understood 
- the planner has to pick a plan that looks good without knowledge of 
what the value to be substituted in will be.


Things like pre-parsed prepared statements that're re-planned on every 
execution are often proposed as solutions to this. This has me 
wondering: rather than expensively re-planning from scratch, would it be 
possible to adjust the planning process so that *multiple* alternative 
plans would be cached for a query, using placeholders for unknown 
rowcounts and costs? At execution, the unknown costs would be filled in 
and the plans compared then the best plan picked for this execution. Is 
this crazy talk, or could it significantly reduce the cost of 
re-planning parameterized prepared statements to the point where it'd be 
worth doing by default?


On an unrelated note, does Pg do any kind of smart searching on `IN' 
lists, or just a linear scan? Would it be worth sorting longer IN list 
results so each iteration could do a binary search of the list?


--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] ERD Tool

2011-08-31 Thread Adarsh Sharma

Dear all,

Is there any open source ERD tool for the PostgreSQL database?
I found some paid tools but am looking for free ones.


Thanks

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ERD Tool

2011-08-31 Thread Thomas Kellerer

Adarsh Sharma, 31.08.2011 13:54:

Dear all,

Is there any open source ERD tool for the PostgreSQL database?
I found some paid tools but am looking for free ones.



Have a look at Power*Architect: http://www.sqlpower.ca/page/architect

It's not perfect, but it's quite OK.
As it is a multi-DBMS tool, it does not support any Postgres-specific features 
or datatypes.

To avoid the nasty registration that is required on their homepage you can also 
download the binaries directly from the Google code project homepage:

http://code.google.com/p/power-architect/downloads/list

Regards
Thomas



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SELECT Query on DB table preventing inserts

2011-08-31 Thread Dan Scott
On Tue, Aug 30, 2011 at 13:52, Daniel Verite dan...@manitou-mail.org wrote:
        Dan Scott wrote:

 the insert process is unable to insert new rows into the database

 You should probably provide the error message on insert or otherwise describe
 how it's not working. Normally reading does not unintentionally prevent
 writing in a concurrent session.

I've investigated a little further and it seems that they don't really
'fail'. They just start taking significantly longer to insert, and the
messages start backing up and eventually stop being sent because
they're not being acknowledged. I can see a few "WARNING: pgstat wait
timeout" messages around the time that this is happening in the
syslog.

Thanks,

Dan

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SELECT Query on DB table preventing inserts

2011-08-31 Thread Dan Scott
On Tue, Aug 30, 2011 at 13:21, Scott Ribe scott_r...@elevated-dev.com wrote:
 On Aug 30, 2011, at 8:22 AM, Dan Scott wrote:

 Perhaps because I'm locking the table with my query?

 Do you mean you're explicitly locking the table? If so, why???

No, not explicitly. I just thought of it as a possible explanation. If
reading from the table does not lock the table for writing, then
that's not the problem.

Thanks,

Dan


 --
 Scott Ribe
 scott_r...@elevated-dev.com
 http://www.elevated-dev.com/
 (303) 722-0567 voice






-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SELECT Query on DB table preventing inserts

2011-08-31 Thread Tomas Vondra
On 31 Srpen 2011, 1:07, Dan Scott wrote:
 On Tue, Aug 30, 2011 at 13:52, Daniel Verite dan...@manitou-mail.org
 wrote:
        Dan Scott wrote:

 the insert process is unable to insert new rows into the database

 You should probably provide the error message on insert or otherwise
 describe
 how it's not working. Normally reading does not unintentionally prevent
 writing in a concurrent session.

 I've investigated a little further and it seems that they don't really
 'fail'. Just start taking significantly longer to insert, and the
 messages start backing up and eventually stop being sent because

What messages are you talking about?

 they're not being acknowledged. I can see a few WARNING:  pgstat wait
 timeout messages around the time that this is happening in the
 syslog.

The pgstat messages are a typical symptom of an I/O bottleneck - it just
means you're doing a lot of writes, more than the drives can take. Enable
checkpoint logging (log_checkpoints=on) and watch the system stats (e.g.
using 'iostat -x' or vmstat); my bet is this is a checkpoint or pdflush
issue.

Anyway we need more info about your system - Pg version, amount of RAM,
shared buffers, checkpoint settings (segments, completion) and page cache
config (/proc/sys/vm/). A few lines of vmstat/iostat output would help
too.
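
A minimal sketch of that kind of check (assuming Linux and access to
postgresql.conf; the intervals are just examples):

    # in postgresql.conf, then reload: log every checkpoint with timings
    log_checkpoints = on

    # per-device I/O utilization, 5-second samples
    iostat -x 5

    # memory, swap and block I/O, 5-second samples
    vmstat 5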

Tomas


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] heavy swapping, not sure why

2011-08-31 Thread Merlin Moncure
On Tue, Aug 30, 2011 at 10:05 PM, Scott Marlowe scott.marl...@gmail.com wrote:
 On Tue, Aug 30, 2011 at 8:36 PM, mark dvlh...@gmail.com wrote:
 To the broader list, regarding troubles with kswap: I am curious as to what
 others are seeing from /proc/zoneinfo for DMA pages (not dma32 or normal) -
 basically whether it sits at 1 or not.  Setting swappiness to 0 did not have any
 effect for us on kswap issues. Another thing I have not had time and
 resources to go work on... interested in what kernel they are running and
 what storage drivers they might be using.

 Well, we had zone reclaim mode autoset to 1, and we had to turn it off
 to get decent performance with postgresql.  Machine was a quad
 dodecacore Magny Cours, so 48 cores with 128G RAM.  RAID controller is
 an Areca 1680 with BBU, 34 15kRPM 147G SAS Seagate 15k6 drives in two
 16 drive external enclosures and 2 drives in the server.

 The only solution we could find for kswapd going crazy was to just
 turn off swap.  Pretty sure I used a large swap file to test larger
 swaps, but all that did was put off the eventual kswapd storm. It took
 anywhere from one to two weeks, maybe more, and then one day you check
 and your servers maxed out by kswapd.

hm, that's an interesting counterpoint to what I've been saying.  I've
never seen that, I wonder what the underlying trigger was?  I
typically set shared_buffers fairly low (even to the default, raising
only when I think it might help) -- I wonder if that plays in.

As to setting 1000 connections: some applications rely on database
session features (like advisory locks or listen/notify) and retooling
the client is more trouble than it's worth.  This is definitely on
the upper bound of what's reasonable, though... these days I code with
the assumption that pgbouncer is going to be put in even if I don't
need it right away.

merlin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] row is too big

2011-08-31 Thread Tom Lane
Sim Zacks s...@compulab.co.il writes:
 The manual doesn't say exactly how much storage a numeric type uses, but 
 it seems to me that it is taking up about 12 bytes per field. Does this 
 make any sense?

Yeah, that would be the minimum size of a nonzero numeric value in 8.2.
(More recent versions can pack them a bit tighter in many cases.)

You might consider whether you can put all those numeric fields into an
array.
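
A sketch of that idea (table and column names here are made up for
illustration, not from the original schema):

    -- Instead of ~950 separate numeric columns, keep one array column;
    -- large arrays are TOASTed, so the row stays under the page limit.
    CREATE TABLE report_rows (
        label    text,
        total    integer,
        measures numeric[]
    );

    -- The "total" row then becomes an element-by-element sum:
    INSERT INTO report_rows (label, total, measures)
    SELECT 'TOTAL',
           sum(total),
           ARRAY(SELECT sum(measures[i])
                 FROM report_rows, generate_series(1, 950) AS i
                 GROUP BY i
                 ORDER BY i)
    FROM report_rows;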

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Index usage on OR queries

2011-08-31 Thread Tore Halvorsen
Hi,

I'm trying to optimize a query where I have two tables that both have a
timestamp column. I want the result where either of the timestamps is after
a specified time. In a reduced form, like this:


CREATE TABLE a
(
  id serial NOT NULL PRIMARY KEY,
  time timestamp without time zone NOT NULL DEFAULT now()
);

CREATE INDEX a_time_idx ON a USING btree (time DESC NULLS LAST);

CREATE TABLE b
(
  id serial NOT NULL PRIMARY KEY,
  time timestamp without time zone NOT NULL DEFAULT now()
);

CREATE INDEX b_time_idx ON b USING btree (time DESC NULLS LAST);

--- generate some data
insert into a(time)
select now() - '10 year'::interval * random() from generate_series(1,
1000000, 1);

insert into b(time)
select now() - '10 year'::interval * random() from generate_series(1,
1000000, 1);

-- Using constraint works as expected, and uses the time index.
select * from a join b using(id)
where a.time >= '2011-08-15';

-- ... both ways...
select * from a join b using(id)
where b.time >= '2011-08-15';

-- However, if I'm trying to do this for both times at once, the time index
is not used at all
select * from a join b using(id)
where a.time >= '2011-08-15' OR b.time >= '2011-08-01'

-- This can be optimized by using CTEs
with am as (
  select * from a where time >= '2011-08-15'
)
, bm as (
  select * from b where time >= '2011-08-15'
)
select * from am join bm using(id)

-- end

I'm just wondering why the optimizer does things the way it does - and if
the CTE version is the best way to go...

The actual case is slightly more complex and uses more tables - this is
mostly a way to find updated data.

-- 
Eld på åren og sol på eng gjer mannen fegen og fjåg. [Jøtul]
demo 2011 Tore Halvorsen || +052 0553034554


Re: [GENERAL] Index usage on OR queries

2011-08-31 Thread Tore Halvorsen

 -- This can be optimized by using CTEs
 with am as (
   select * from a where time >= '2011-08-15'
 )
 , bm as (
   select * from b where time >= '2011-08-15'
 )
 select * from am join bm using(id)


Disregard this, it doesn't do the same thing at all.

Now I'm more confused as to how I can optimize the query.


-- 
Eld på åren og sol på eng gjer mannen fegen og fjåg. [Jøtul]
demo 2011 Tore Halvorsen || +052 0553034554


Re: [GENERAL] FATAL: terminating connection due to conflict with recovery

2011-08-31 Thread Jeff Ross

On 08/30/11 18:03, Fujii Masao wrote:

On Wed, Aug 31, 2011 at 5:51 AM, Jeff Rossjr...@wykids.org  wrote:

Is there a setting in this or something else that I should tweak so this
query can complete against the replica?  Google turned up some threads on
the error code associated with the error but I didn't find much else that
seems applicable.


Increasing max_standby_archive_delay and max_standby_streaming_delay
would be helpful to make the query complete. Please see the following manual
for details.
http://www.postgresql.org/docs/9.0/interactive/hot-standby.html#HOT-STANDBY-CONFLICT
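
A minimal sketch of what that looks like in the standby's postgresql.conf
(the 300s values are only examples; -1 means wait forever for conflicting
queries to finish):

    max_standby_archive_delay = 300s
    max_standby_streaming_delay = 300s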

Regards,



Thank you!

And now the error message is linked to the solution for the next person.

Jeff

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Index usage on OR queries

2011-08-31 Thread Andy Colson

On 8/31/2011 9:35 AM, Tore Halvorsen wrote:

Hi,

I'm trying to optimize a query where I have two tables that both have a
timestamp column. I want the result where either of the timestamps is
after a specified time. In a reduced form, like this:


CREATE TABLE a
(
   id serial NOT NULL PRIMARY KEY,
   time timestamp without time zone NOT NULL DEFAULT now()
);

CREATE INDEX a_time_idx ON a USING btree (time DESC NULLS LAST);

CREATE TABLE b
(
   id serial NOT NULL PRIMARY KEY,
   time timestamp without time zone NOT NULL DEFAULT now()
);

CREATE INDEX b_time_idx ON b USING btree (time DESC NULLS LAST);

--- generate some data
insert into a(time)
select now() - '10 year'::interval * random() from generate_series(1,
1000000, 1);

insert into b(time)
select now() - '10 year'::interval * random() from generate_series(1,
1000000, 1);

-- Using constraint works as expected, and uses the time index.
select * from a join b using(id)
where a.time >= '2011-08-15';

-- ... both ways...
select * from a join b using(id)
where b.time >= '2011-08-15';

-- However, if I'm trying to do this for both times at once, the time
index is not used at all
select * from a join b using(id)
where a.time >= '2011-08-15' OR b.time >= '2011-08-01'

-- This can be optimized by using CTEs
with am as (
   select * from a where time >= '2011-08-15'
)
, bm as (
   select * from b where time >= '2011-08-15'
)
select * from am join bm using(id)

-- end

I'm just wondering why the optimizer does things the way it does - and
if the CTE version is the best way to go...

The actual case is slightly more complex and uses more tables - this is
mostly a way to find updated data.

--
Eld på åren og sol på eng gjer mannen fegen og fjåg. [Jøtul]
demo 2011 Tore Halvorsen || +052 0553034554




On PG 9, after I ANALYZED the tables, it used indexes:


                                                    QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=1.59..82778.35 rows=13171 width=20) (actual time=0.066..1076.616 rows=12966 loops=1)
   Merge Cond: (a.id = b.id)
   Join Filter: ((a.time >= '2011-08-15 00:00:00'::timestamp without time zone) OR (b.time >= '2011-08-01 0
   ->  Index Scan using a_pkey on a  (cost=0.00..31389.36 rows=1000000 width=12) (actual time=0.007..204.856 ro
   ->  Index Scan using b_pkey on b  (cost=0.00..31389.36 rows=1000000 width=12) (actual time=0.006..224.189 ro



ANALYZE is the magic.

-Andy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Index usage on OR queries

2011-08-31 Thread Tore Halvorsen
On Wed, Aug 31, 2011 at 4:49 PM, Andy Colson a...@squeakycode.net wrote:

 On PG 9, after I ANALYZED the tables, it used indexes:


                                                     QUERY PLAN
 ---------------------------------------------------------------------------------------------------------------
  Merge Join  (cost=1.59..82778.35 rows=13171 width=20) (actual time=0.066..1076.616 rows=12966 loops=1)
    Merge Cond: (a.id = b.id)
    Join Filter: ((a.time >= '2011-08-15 00:00:00'::timestamp without time zone) OR (b.time >= '2011-08-01 0
    ->  Index Scan using a_pkey on a  (cost=0.00..31389.36 rows=1000000 width=12) (actual time=0.007..204.856 ro
    ->  Index Scan using b_pkey on b  (cost=0.00..31389.36 rows=1000000 width=12) (actual time=0.006..224.189 ro


 ANALYZE is the magic.


You are, of course, right, but it doesn't use the TIME index.
Hmmm, maybe my example isn't large enough to produce the issue.

-- 
Eld på åren og sol på eng gjer mannen fegen og fjåg. [Jøtul]
demo 2011 Tore Halvorsen || +052 0553034554


Re: [GENERAL] Index usage on OR queries

2011-08-31 Thread Andy Colson

On 8/31/2011 9:53 AM, Tore Halvorsen wrote:

On Wed, Aug 31, 2011 at 4:49 PM, Andy Colson a...@squeakycode.net
mailto:a...@squeakycode.net wrote:

On PG 9, after I ANALYZED the tables, it used indexes:


                                                    QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=1.59..82778.35 rows=13171 width=20) (actual time=0.066..1076.616 rows=12966 loops=1)
   Merge Cond: (a.id = b.id)
   Join Filter: ((a.time >= '2011-08-15 00:00:00'::timestamp without time zone) OR (b.time >= '2011-08-01 0
   ->  Index Scan using a_pkey on a  (cost=0.00..31389.36 rows=1000000 width=12) (actual time=0.007..204.856 ro
   ->  Index Scan using b_pkey on b  (cost=0.00..31389.36 rows=1000000 width=12) (actual time=0.006..224.189 ro


ANALYZE is the magic.


You are, of course, right, but it doesn't use the TIME index.
Hmmm, may be my example isn't large enough to produce the issue.

--
Eld på åren og sol på eng gjer mannen fegen og fjåg. [Jøtul]
demo 2011 Tore Halvorsen || +052 0553034554


wow, yea.. I saw index and just assumed.  Didn't even notice, sorry 
about that.


-Andy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Index usage on OR queries

2011-08-31 Thread Tomas Vondra
On 31 Srpen 2011, 16:49, Andy Colson wrote:
 On 8/31/2011 9:35 AM, Tore Halvorsen wrote:
 Hi,

 I'm trying to optimize a query where I have two tables that both have a
 timestamp column. I want the result where either of the timestamps is
 after a specified time. In a reduced form, like this:


 CREATE TABLE a
 (
id serial NOT NULL PRIMARY KEY,
time timestamp without time zone NOT NULL DEFAULT now()
 );

 CREATE INDEX a_time_idx ON a USING btree (time DESC NULLS LAST);

 CREATE TABLE b
 (
id serial NOT NULL PRIMARY KEY,
time timestamp without time zone NOT NULL DEFAULT now()
 );

 CREATE INDEX b_time_idx ON b USING btree (time DESC NULLS LAST);

 --- generate some data
 insert into a(time)
 select now() - '10 year'::interval * random() from generate_series(1,
 1000000, 1);

 insert into b(time)
 select now() - '10 year'::interval * random() from generate_series(1,
 1000000, 1);

 -- Using constraint works as expected, and uses the time index.
 select * from a join b using(id)
 where a.time >= '2011-08-15';

 -- ... both ways...
 select * from a join b using(id)
 where b.time >= '2011-08-15';

 -- However, if I'm trying to do this for both times at once, the time
 index is not used at all
 select * from a join b using(id)
 where a.time >= '2011-08-15' OR b.time >= '2011-08-01'

 -- This can be optimized by using CTEs
 with am as (
    select * from a where time >= '2011-08-15'
 )
 , bm as (
    select * from b where time >= '2011-08-15'
 )
 select * from am join bm using(id)

 -- end

 I'm just wondering why the optimizer does things the way it does - and
 if the CTE version is the best way to go...

 The actual case is slightly more complex and uses more tables - this is
 mostly a way to find updated data.

 --
 Eld på åren og sol på eng gjer mannen fegen og fjåg. [Jøtul]
 demo 2011 Tore Halvorsen || +052 0553034554



 On PG 9, after I ANALYZED the tables, it used indexes:


                                                      QUERY PLAN
 ---------------------------------------------------------------------------------------------------------------
  Merge Join  (cost=1.59..82778.35 rows=13171 width=20) (actual time=0.066..1076.616 rows=12966 loops=1)
    Merge Cond: (a.id = b.id)
    Join Filter: ((a.time >= '2011-08-15 00:00:00'::timestamp without time zone) OR (b.time >= '2011-08-01 0
    ->  Index Scan using a_pkey on a  (cost=0.00..31389.36 rows=1000000 width=12) (actual time=0.007..204.856 ro
    ->  Index Scan using b_pkey on b  (cost=0.00..31389.36 rows=1000000 width=12) (actual time=0.006..224.189 ro


 ANALYZE is the magic.

Yes ;-) Who says we don't have a magical fairy dust?

Anyway you could try to postpone the join a bit - determine the IDs first
and then join. Something like this

WITH t AS (
  SELECT id FROM a WHERE time >= '2011-08-15'
  UNION
  SELECT id FROM b WHERE time >= '2011-08-15'
)
SELECT * FROM a JOIN b ON (a.id = b.id) WHERE a.id IN (SELECT id FROM t);

or something like that. It's not as clean as your query, but in some cases
it's faster.

Tomas


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] heavy swapping, not sure why

2011-08-31 Thread Scott Marlowe
On Tue, Aug 30, 2011 at 9:05 PM, Scott Marlowe scott.marl...@gmail.com wrote:
 Well, we had zone reclaim mode autoset to 1, and we had to turn it off
 to get decent performance with postgresql.  Machine was a quad
 dodecacore Magny Cours, so 48 cores with 128G RAM.  RAID controller is
 an Areca 1680 with BBU, 34 15kRPM 147G SAS Seagate 15k6 drives in two
 16 drive external enclosures and 2 drives in the server.

Also, Ubuntu 10.04 64 bit with all updates last fall right after it came out.
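
For anyone wanting to check their own box, a minimal sketch (standard Linux
sysctl commands, not specific to that machine):

    # 1 means zone reclaim is enabled
    cat /proc/sys/vm/zone_reclaim_mode

    # disable it for the running system
    sudo sysctl -w vm.zone_reclaim_mode=0

    # make it persistent across reboots
    echo 'vm.zone_reclaim_mode = 0' | sudo tee -a /etc/sysctl.conf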

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] out of memory - no sort

2011-08-31 Thread Don

Pavel...

Thanks for the reply...

This still did not solve the issue.  It seems odd that a simple select 
command in psql accessing 32M records should cause a problem.  I 
have tables much larger than this and may want to access them the same way.


I have 24 GB RAM on the sever and 32GB RAM on the client machine. Both 
machines are 64bit.


Thanks Don


On 8/30/2011 10:25 AM, Pavel Stehule wrote:

Hello

if table is large, then client can raise this exception too

try to set FETCH_COUNT to 1000

http://www.postgresql.org/docs/8.4/interactive/app-psql.html
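
In psql that is just a variable, e.g. (the 1000 is only a starting point):

    \set FETCH_COUNT 1000
    select * from ens_memb;

With FETCH_COUNT set, psql retrieves and prints the result in batches of that
many rows via a cursor instead of loading the whole result set into client
memory at once.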

Regards

Pavel Stehule

2011/8/30 Dondonald.laur...@noaa.gov:

I am trying a simple access of a table and get an out of memory error.  How
do I avoid this issue.  It seems I have some configuration set wrong.

Our system has 24GB of memory and is dedicated to the postgres database.

Back ground information

aquarec=> explain analyze verbose select * from ens_memb;
 QUERY
PLAN
--
  Seq Scan on ens_memb  (cost=0.00..719893.12 rows=32216212 width=62) (actual
time=4.954..37513.377 rows=32216154 loops=1)
Output: id, shefpc, bwmon, ewmon, pb, tb, cdate, vdate, source, tyr, val
  Total runtime: 39588.386 ms


#--
# RESOURCE USAGE (except WAL)
#--

# - Memory -

shared_buffers = 6144MB # min 128kB
 # (change requires restart)
#temp_buffers = 8MB # min 800kB
max_prepared_transactions = 5   # zero disables the feature
 # (change requires restart)
# Note:  Increasing max_prepared_transactions costs ~600 bytes of shared
memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
# It is not advisable to set max_prepared_transactions nonzero unless you
# actively intend to use prepared transactions.
work_mem = 48MB # min 64kB
maintenance_work_mem = 256MB# min 1MB
#max_stack_depth = 2MB  # min 100kB

# - Kernel Resource Usage -

#max_files_per_process = 1000   # min 25
 # (change requires restart)
#shared_preload_libraries = ''  # (change requires restart)

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0ms# 0-100 milliseconds
#vacuum_cost_page_hit = 1       # 0-10000 credits
#vacuum_cost_page_miss = 10     # 0-10000 credits
#vacuum_cost_page_dirty = 20    # 0-10000 credits
#vacuum_cost_limit = 200        # 1-10000 credits

# - Background Writer -

#bgwriter_delay = 200ms # 10-10000ms between rounds
#bgwriter_lru_maxpages = 100# 0-1000 max buffers written/round
#bgwriter_lru_multiplier = 2.0  # 0-10.0 multipler on buffers
scanned/round

# - Asynchronous Behavior -

#effective_io_concurrency = 1   # 1-1000. 0 disables prefetching


#--
# WRITE AHEAD LOG
#--

# - Settings -

#fsync = on # turns forced synchronization on or
off
#synchronous_commit = on# immediate fsync at commit
#wal_sync_method = fsync# the default is the first option
 # supported by the operating system:
 #   open_datasync
 #   fdatasync
 #   fsync
 #   fsync_writethrough
 #   open_sync
#full_page_writes = on  # recover from partial page writes
wal_buffers = 8MB   # min 32kB
 # (change requires restart)
#wal_writer_delay = 200ms   # 1-10000 milliseconds

#commit_delay = 0   # range 0-100000, in microseconds
#commit_siblings = 5# range 1-1000

# - Checkpoints -

checkpoint_segments = 32# in logfile segments, min 1, 16MB
each
#checkpoint_timeout = 5min  # range 30s-1h
#checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 -
1.0
#checkpoint_warning = 30s   # 0 disables

# - Archiving -

#archive_mode = off # allows archiving to be done
 # (change requires restart)
#archive_command = ''   # command to use to archive a logfile
segment
#archive_timeout = 0# force a logfile segment switch after this
 # number of 

[GENERAL] Using a function in different schemas

2011-08-31 Thread Juan Manuel Alvarez
Hello everyone! This is the first time I post on this list. Until now,
I was able to get a solution to all the problems I had, but now I am
facing an issue I can't resolve, so I hope you can help me.

The problem goes like this: I am using the same distribution of
tables/functions into different schemas, so I have the following
layout
- schema1 has tables myTable1 and myTable2, and function myFunction
- schema2 has tables myTable1 and myTable2, and function myFunction

Until now I used the full names to distinguish between schemas, like
this: schema1.myTable1 and schema2.myTable1.
But now I have to add an update trigger on both myTable1 tables that calls
a function that makes a select statement on myTable2.
The problem is that even if the trigger is inside mySchema1, it
can't find myTable2 and I don't know how to get the schema that the
function is in.

Things I have tried so far:
- current_schema(), but, according to the documentation, it returns
the name of the schema that is first in the search path, and that is
neither mySchema1 nor mySchema2.
- Passing a parameter to the trigger: since it is an update trigger I
haven't found a way to pass a parameter to it.

Can anyone please point me in the right direction?
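
For illustration, the kind of thing I am after would look roughly like this
(a sketch only, assuming something like the PL/pgSQL TG_TABLE_SCHEMA automatic
variable can be used to find the schema of the table that fired the trigger;
the names are the hypothetical ones from above):

    CREATE OR REPLACE FUNCTION myFunction() RETURNS trigger AS $$
    DECLARE
        v_count integer;
    BEGIN
        -- query myTable2 in the same schema as the table that fired the trigger
        EXECUTE 'SELECT count(*) FROM '
                || quote_ident(TG_TABLE_SCHEMA) || '.myTable2'
        INTO v_count;
        -- ... do something with v_count ...
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;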

Thanks in advance!
Juan M.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] out of memory - no sort

2011-08-31 Thread Scott Ribe
On Aug 31, 2011, at 9:51 AM, Don wrote:

 Both machines are 64bit.

Are all your server & client builds 64-bit? 

32M rows, unless the rows are < 50 bytes each, you'll never be able to 
manipulate that selection in memory with a 32-bit app.
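
For scale: the EXPLAIN output earlier in the thread reports rows=32216154 and
width=62, so the raw row data alone is roughly 32,216,154 x 62 bytes, about
2 GB, before any client-side per-row overhead is counted.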

-- 
Scott Ribe
scott_r...@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] invalid input syntax for type bytea

2011-08-31 Thread Alan Millington
I am running Postgres 8.4.1 on Windows XP Professional Service Pack 3. My 
database is UTF8. My program code is written in Python, and to interface to 
Postgres I use mxODBC 3.0 and the PostgreSQL Unicode driver PSQLODBCW.DLL 
version 8.01.02.00 dated 31/01/2006.
I recently hit an error when trying to insert into a bytea column. The 
following is the first part of what appears in the Postgres log:
2011-08-25 14:42:40 BST HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.
2011-08-25 14:42:40 BST ERROR:  invalid input syntax for type bytea at character 75
2011-08-25 14:42:40 BST STATEMENT:  insert into FAMILY_DATA (family_id, seq_num, family_pod) values (177, 20,
 
'\\200\\002\\135q\\001\\050cperson\\012Person\\012q\\002\\051\\201q\\003\\175q\\004\\050U\\006\\137namesq\\005cperson\\012\\137Names\\012q\\006\\051\\201q\\007cperson\\012\\137Name\\012q\\010\\051\\201q\\011\\050\\135q\\012X\\003\\000\\000\\00017\\052q\\013a\\135q\\014X\\002\\000\\000\\000ABq\\015ae\\175q\\016U\\006parentq\\017h\\007sba\\175q\\020U\\006personq\\021h\\003sbU\\021nationality\\137eventq\\022cperson\\012Dictof\\137Nationality\\137Event\\012q\\023\\051\\201q\\024X\\004\\000\\000\\000UKBSq\\025\\175q\\026\\051cperson\\012Listof\\137Nationality\\137Event\\012q\\027\\051\\201q\\030\\050cperson\\012Nationality\\137Event\\012q\\031\\051\\201q\\032\\175q\\033\\050U\\015contra\\137eventsq\\034\\051U\\006parentq\\035h\\030U\\004bvosq\\036c\\137\\137builtin\\137\\137\\012set\\012q\\037\\135q
 
U\\002ATq\\041a\\205Rq\\042U\\007cfflagsq\\043\\051U\\006effectq\\044U\\001AU\\005cinfoq\\045ccinfo\\012BCInfo
(I omit the rest, but there is a close quote and a close parenthesis at the 
end.)
If I have counted right, character 75 is the '7' in '\\175', which looks to be 
a valid octal value.
The statement as shown in the log is generated by mxODBC, not by me, so if 
there is something wrong with it I shall have to take the matter up with 
eGenix. Before I do so it would be useful to know what is wrong with the 
statement.
I had to tweak my own code when I upgraded from Postgres 8.1 to 8.4, but since 
then I have inserted 5092 rows with an average binary data length of 40,000 
bytes. This is the only insert that has failed.


Re: [GENERAL] how do I disable automatic start on mac os x?

2011-08-31 Thread Scott Ribe
On Aug 31, 2011, at 8:46 AM, edwardIshaq wrote:

 if you do:
 open   
 OS X will open the file in the plist editor.
 I tried doing that but didn't get a way with saving though :)

Probably a privileges violation, right? In a prior message on this thread I 
suggested:

sudo launchctl unload -w blahblahblah.plist

I also said: The -w option causes it to not only unload the item, but also 
write a 
disabled key into it which will stop it from loading at launch.

That is no longer true. Instead of the disabled key, 10.6 & up (or was it 10.5 
& up?) now keep track of enabled/disabled elsewhere, so you really need to use 
launchctl instead of editing the plist.


-- 
Scott Ribe
scott_r...@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] out of memory - no sort

2011-08-31 Thread Pavel Stehule
Hello

2011/8/31 Don donald.laur...@noaa.gov:
 Pavel...

 Thanks for the reply...

 This still did not solve the issue.  It seems odd that a simple select
 command in psql accessing 32MB of records should cause a problem.  I have
 tables much larger than this and may want to access them the same way.


so there are two possibilities

a) broken datafiles
b) PostgreSQL's bug

Pavel

 I have 24 GB RAM on the sever and 32GB RAM on the client machine. Both
 machines are 64bit.

 Thanks Don


 On 8/30/2011 10:25 AM, Pavel Stehule wrote:

 Hello

 if table is large, then client can raise this exception too

 try to set FETCH_COUNT to 1000

 http://www.postgresql.org/docs/8.4/interactive/app-psql.html

 Regards

 Pavel Stehule

 2011/8/30 Dondonald.laur...@noaa.gov:

 I am trying a simple access of a table and get an out of memory error.
  How
 do I avoid this issue.  It seems I have some configuration set wrong.

 Our system has 24GB of memory and is dedicated to the postgres database.

 Back ground information

 aquarec=> explain analyze verbose select * from ens_memb;
                                                         QUERY
 PLAN

 --
  Seq Scan on ens_memb  (cost=0.00..719893.12 rows=32216212 width=62)
 (actual
 time=4.954..37513.377 rows=32216154 loops=1)
    Output: id, shefpc, bwmon, ewmon, pb, tb, cdate, vdate, source, tyr,
 val
  Total runtime: 39588.386 ms



 #--
 # RESOURCE USAGE (except WAL)

 #--

 # - Memory -

 shared_buffers = 6144MB                 # min 128kB
                                         # (change requires restart)
 #temp_buffers = 8MB                     # min 800kB
 max_prepared_transactions = 5           # zero disables the feature
                                         # (change requires restart)
 # Note:  Increasing max_prepared_transactions costs ~600 bytes of shared
 memory
 # per transaction slot, plus lock space (see max_locks_per_transaction).
 # It is not advisable to set max_prepared_transactions nonzero unless you
 # actively intend to use prepared transactions.
 work_mem = 48MB                         # min 64kB
 maintenance_work_mem = 256MB            # min 1MB
 #max_stack_depth = 2MB                  # min 100kB

 # - Kernel Resource Usage -

 #max_files_per_process = 1000           # min 25
                                         # (change requires restart)
 #shared_preload_libraries = ''          # (change requires restart)

 # - Cost-Based Vacuum Delay -

 #vacuum_cost_delay = 0ms                # 0-100 milliseconds
 #vacuum_cost_page_hit = 1               # 0-10000 credits
 #vacuum_cost_page_miss = 10             # 0-10000 credits
 #vacuum_cost_page_dirty = 20            # 0-10000 credits
 #vacuum_cost_limit = 200                # 1-10000 credits

 # - Background Writer -

 #bgwriter_delay = 200ms                 # 10-10000ms between rounds
 #bgwriter_lru_maxpages = 100            # 0-1000 max buffers
 written/round
 #bgwriter_lru_multiplier = 2.0          # 0-10.0 multipler on buffers
 scanned/round

 # - Asynchronous Behavior -

 #effective_io_concurrency = 1           # 1-1000. 0 disables prefetching



 #--
 # WRITE AHEAD LOG

 #--

 # - Settings -

 #fsync = on                             # turns forced synchronization on
 or
 off
 #synchronous_commit = on                # immediate fsync at commit
 #wal_sync_method = fsync                # the default is the first option
                                         # supported by the operating
 system:
                                         #   open_datasync
                                         #   fdatasync
                                         #   fsync
                                         #   fsync_writethrough
                                         #   open_sync
 #full_page_writes = on                  # recover from partial page
 writes
 wal_buffers = 8MB                       # min 32kB
                                         # (change requires restart)
 #wal_writer_delay = 200ms               # 1-10000 milliseconds

 #commit_delay = 0                       # range 0-100000, in microseconds
 #commit_siblings = 5                    # range 1-1000

 # - Checkpoints -

 checkpoint_segments = 32                # in logfile segments, min 1,
 16MB
 each
 #checkpoint_timeout = 5min              # range 30s-1h
 #checkpoint_completion_target = 0.5     # checkpoint target duration, 0.0
 -
 1.0
 #checkpoint_warning = 30s               # 0 disables

 # - Archiving -

 #archive_mode = off             # allows archiving to be done
                                 # (change 

Re: [GENERAL] out of memory - no sort

2011-08-31 Thread Don

  
  
The server is 64-bit and the client is 32-bit... I tried the select * from
table on the server and the query worked, but I am puzzled why it does not
work on the 32-bit machine. I had always thought that a 32-bit machine could
access up to 4GB. So what is the limiting factor?



On 8/31/2011 8:57 AM, Scott Ribe wrote:

  On Aug 31, 2011, at 9:51 AM, Don wrote:

  Both machines are 64bit.

  Are all your server & client builds 64-bit?

  32M rows, unless the rows are < 50 bytes each, you'll never be able to manipulate that selection in memory with a 32-bit app.




  



Re: [GENERAL] out of memory - no sort

2011-08-31 Thread Scott Ribe
On Aug 31, 2011, at 10:52 AM, Don wrote:

 I had always thought that a 32bit machine could access up to 4GB.
 So what is the limiting factor ?

- Half of your memory space may be given over to memory-mapped I/O. Now you're 
down to 2GB.

- Your process's executable, plus any libraries it uses, plus all the system 
libraries that they touch, recursively all the way down, are mapped into this 
space. Now you're likely down to 1.5GB or less free.

- Then of course your process allocates various data structures for each row, 
even if it's just a huge array of pointers to each row, that would be overhead. 
And of course the overhead is not nearly that simple--there will be allocations 
for  pointers to varchars, and info about columns and data types, and heap 
data structures to keep track of allocated vs free blocks.

- Memory will be fragmented of course, so you can't even use all of what's left.

So no, you can't manipulate 32M of anything except plain numbers or very simple 
structs in RAM in a 32-bit process.

-- 
Scott Ribe
scott_r...@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] invalid input syntax for type bytea

2011-08-31 Thread Tom Lane
Alan Millington admilling...@yahoo.co.uk writes:
 I recently hit an error when trying to insert into a bytea column. The 
 following is the first part of what appears in the Postgres log:
 2011-08-25 14:42:40 BST HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.
 2011-08-25 14:42:40 BST ERROR:  invalid input syntax for type bytea at character 75
 2011-08-25 14:42:40 BST STATEMENT:  insert into FAMILY_DATA (family_id, seq_num, family_pod) values (177, 20,
  
 '\\200\\002\\135q\\001\\050cperson\\012Person\\012q\\002\\051\\201q\\003\\175q\\004\\050U\\006\\137namesq\\005cperson\\012\\137Names\\012q\\006\\051\\201q\\007cperson\\012\\137Name\\012q\\010\\051\\201q\\011\\050\\135q\\012X\\003\\000\\000\\00017\\052q\\013a\\135q\\014X\\002\\000\\000\\000ABq\\015ae\\175q\\016U\\006parentq\\017h\\007sba\\175q\\020U\\006personq\\021h\\003sbU\\021nationality\\137eventq\\022cperson\\012Dictof\\137Nationality\\137Event\\012q\\023\\051\\201q\\024X\\004\\000\\000\\000UKBSq\\025\\175q\\026\\051cperson\\012Listof\\137Nationality\\137Event\\012q\\027\\051\\201q\\030\\050cperson\\012Nationality\\137Event\\012q\\031\\051\\201q\\032\\175q\\033\\050U\\015contra\\137eventsq\\034\\051U\\006parentq\\035h\\030U\\004bvosq\\036c\\137\\137builtin\\137\\137\\012set\\012q\\037\\135q
  
 U\\002ATq\\041a\\205Rq\\042U\\007cfflagsq\\043\\051U\\006effectq\\044U\\001AU\\005cinfoq\\045ccinfo\\012BCInfo
 (I omit the rest, but there is a close quote and a close parenthesis at the 
 end.)

Well, the part you showed us is perfectly valid bytea data, so the
problem is somewhere in what you omitted.

 If I have counted right, character 75 is the '7' in '\\175', which looks to 
 be a valid octal value.

Unfortunately, that number is just a pointer to the start of the bytea
literal within the statement, so it's not very helpful here :-(.  What
you need to look for is a backslash that's not followed by either a
second backslash or octal digits.

regards, tom lane
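
To see concretely what to look for, a small sketch with made-up values, using
the escape string syntax for an escape-format bytea literal:

SELECT E'\\200\\002\\135'::bytea;  -- fine: every backslash is doubled or starts a \nnn octal escape
-- SELECT E'\\200\\x41'::bytea;    -- fails with "invalid input syntax for type bytea":
--                                 -- the \x is a backslash followed by neither a backslash nor octal digits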



Re: [GENERAL] invalid input syntax for type bytea

2011-08-31 Thread John R Pierce

On 08/31/11 9:02 AM, Alan Millington wrote:

I am running Postgres 8.4.1 on Windows XP Pro...


8.4 is currently up to 8.4.8 and you really should upgrade. 8.4.1 is
about 2 years old, and there's a long list of bugs fixed in the later
8.4 updates; see the release notes for 8.4.2 through 8.4.8 for the gory
details:

http://www.postgresql.org/docs/8.4/static/release.html


Any particular reason you're using ODBC and not a PostgreSQL Python
binding like Psycopg?  I find that the more layers of abstraction between
your code and the outside world, the more obscure the bugs are, and the
harder they are to track down.





--
john r pierce                            N 37, W 122
santa cruz ca                            mid-left coast




Re: [GENERAL] regression between 8.4.8 and 8.4.2?

2011-08-31 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 I don't have an answer for you, but this report looks suspiciously
 similar to the one I posted the other day at
 http://archives.postgresql.org/pgsql-hackers/2011-08/msg01224.php,
 which, now that I think about it, also manifested itself after the
 upgrade to 8.4.8.

I think there is more than one thing going on here.  I've identified a
logic error in this 8.4 change:
http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=7f3eba30
which is that it is relying on vardata[12]->rel->rows to provide the
number of rows coming in to the semijoin, but that's only accurate for a
single level of join.  With two nested semijoins you get a pretty wacko
answer --- unless you prevent them from being folded by inserting OFFSET
0.  So that's definitely a bug, but it doesn't explain Ben's complaint
because he's griping about a case where he did have OFFSET 0.  (I wonder
though if this is the 8.4 planner regression that he put in the OFFSET
for originally.  Sure would be nice if people reported such things
instead of hacking around them and imagining that they'll get fixed
magically.)  It also doesn't explain Mark Kirkwood's complaint, since
he's showing test cases that involve only one join.

The only relevant-looking change I can find between 8.4.7 and 8.4.8 is
http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=0ae8b300388c2a3eaf90e6e6f13d6be1f4d4ac2d
which again should not have caused the amount of excitement we're seeing
on this thread, since it should represent no worse than a reversion to
8.3 behavior.  Possibly what we're after is some earlier 8.4.x patch.

regards, tom lane



Re: [GENERAL] regression between 8.4.8 and 8.4.2?

2011-08-31 Thread Ben Chobot
On Aug 31, 2011, at 10:47 AM, Tom Lane wrote:

 Peter Eisentraut pete...@gmx.net writes:
 I don't have an answer for you, but this report looks suspiciously
 similar to the one I posted the other day at
 http://archives.postgresql.org/pgsql-hackers/2011-08/msg01224.php,
 which, now that I think about it, also manifested itself after the
 upgrade to 8.4.8.
 
 I think there is more than one thing going on here.  I've identified a
 logic error in this 8.4 change:
 http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=7f3eba30
 which is that it is relying on vardata[12]->rel->rows to provide the
 number of rows coming in to the semijoin, but that's only accurate for a
 single level of join.  With two nested semijoins you get a pretty wacko
 answer --- unless you prevent them from being folded by inserting OFFSET
 0.  So that's definitely a bug, but it doesn't explain Ben's complaint
 because he's griping about a case where he did have OFFSET 0.  (I wonder
 though if this is the 8.4 planner regression that he put in the OFFSET
 for originally.  Sure would be nice if people reported such things
 instead of hacking around them and imagining that they'll get fixed
 magically.)  It also doesn't explain Mark Kirkwood's complaint, since
 he's showing test cases that involve only one join.
 
 The only relevant-looking change I can find between 8.4.7 and 8.4.8 is
 http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=0ae8b300388c2a3eaf90e6e6f13d6be1f4d4ac2d
 which again should not have caused the amount of excitement we're seeing
 on this thread, since it should represent no worse than a reversion to
 8.3 behavior.  Possibly what we're after is some earlier 8.4.x patch.

Tom, if there's anything else we can provide that might help you out, let me know.
We're currently about to install an earlier 8.4 version to see if the problem 
goes away. Is there a particular version you'd be interested to know about?


Re: [GENERAL] regression between 8.4.8 and 8.4.2?

2011-08-31 Thread Tom Lane
Ben Chobot be...@silentmedia.com writes:
 Tom, if there's anything else we can provide that might help you out, let me know.

If you could extract a self-contained test case for the bad estimation,
that would be useful.

regards, tom lane



Re: [GENERAL] Using a function in different schemas

2011-08-31 Thread Adrian Klaver

On 08/31/2011 08:38 AM, Juan Manuel Alvarez wrote:

Hello everyone! This is the first time I post on this list. Until now,
I was able to get a solution to all the problems I had, but now I am
facing an issue I can't resolve, so I hope you can help me.

The problem goes like this: I am using the same distribution of
tables/functions into different schemas, so I have the following
layout
- schema1 has tables myTable1 and myTable2, and function myFunction
- schema2 has tables myTable1 and myTable2, and function myFunction

Until now I used the full names to distinguish between schemas, like
this: schema1.myTable1 and schema2.myTable1.
But now I have to add an update trigger in both myTable1 that calls
to a function that makes a select statement on myTable2.
The problem is that even if the trigger is inside mySchema1, it
can't find myTable2 and I don't know how to get the schema that the
function is into.

Things I have tried so far:
- current_schema() but, according to the documentation, it returns
the name of the schema that is first in the search path and that is
neither mySchema1 or mySchema2.
- Passing a parameter to the trigger: since it is an update trigger I
haven't found a way to pass parameter to it.

Can anyone please point me in the right direction?


Assuming using plpgsql look here:
http://www.postgresql.org/docs/9.0/interactive/plpgsql-trigger.html

In particular:


TG_TABLE_SCHEMA

Data type name; the name of the schema of the table that caused the 
trigger invocation.
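
For example, a minimal PL/pgSQL sketch (the count(*) is only a stand-in for
whatever SELECT the real myFunction runs against myTable2):

CREATE OR REPLACE FUNCTION myFunction() RETURNS trigger AS $$
DECLARE
    v_rows bigint;
BEGIN
    -- Look up myTable2 in whichever schema owns the table that fired the trigger.
    EXECUTE 'SELECT count(*) FROM '
            || quote_ident(TG_TABLE_SCHEMA) || '.myTable2'
    INTO v_rows;
    -- ... act on v_rows here ...
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;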





Thanks in advance!
Juan M.




--
Adrian Klaver
adrian.kla...@gmail.com



Re: [GENERAL] Using a function in different schemas

2011-08-31 Thread Juan Manuel Alvarez
Thanks Adrian! That is exactly what I was looking for! =o)

On Wed, Aug 31, 2011 at 3:35 PM, Adrian Klaver adrian.kla...@gmail.com wrote:
 On 08/31/2011 08:38 AM, Juan Manuel Alvarez wrote:

 Hello everyone! This is the first time I post on this list. Until now,
 I was able to get a solution to all the problems I had, but now I am
 facing an issue I can't resolve, so I hope you can help me.

 The problem goes like this: I am using the same distribution of
 tables/functions into different schemas, so I have the following
 layout
 - schema1 has tables myTable1 and myTable2, and function
 myFunction
 - schema2 has tables myTable1 and myTable2, and function
 myFunction

 Until now I used the full names to distinguish between schemas, like
 this: schema1.myTable1 and schema2.myTable1.
 But now I have to add an update trigger in both myTable1 that calls
 to a function that makes a select statement on myTable2.
 The problem is that even if the trigger is inside mySchema1, it
 can't find myTable2 and I don't know how to get the schema that the
 function is into.

 Things I have tried so far:
 - current_schema() but, according to the documentation, it returns
 the name of the schema that is first in the search path and that is
 neither mySchema1 or mySchema2.
 - Passing a parameter to the trigger: since it is an update trigger I
 haven't found a way to pass parameter to it.

 Can anyone please point me in the right direction?

 Assuming using plpgsql look here:
 http://www.postgresql.org/docs/9.0/interactive/plpgsql-trigger.html

 In particular:

 
 TG_TABLE_SCHEMA

    Data type name; the name of the schema of the table that caused the
 trigger invocation.
 


 Thanks in advance!
 Juan M.



 --
 Adrian Klaver
 adrian.kla...@gmail.com




Re: [GENERAL] regression between 8.4.8 and 8.4.2?

2011-08-31 Thread Ben Chobot
On Aug 31, 2011, at 11:10 AM, Tom Lane wrote:

 Ben Chobot be...@silentmedia.com writes:
 Tom, if there's anything else we can provide that might help you out, let me know.
 
 If you could extract a self-contained test case for the bad estimation,
 that would be useful.

OK, we'll pull something together. In the meantime, we can confirm that 
reverting from 8.4.8 to 8.4.3 fixes things.

Re: [GENERAL] how do I disable automatic start on mac os x?

2011-08-31 Thread edwardIshaq
if you do:
open   
OS X will open the file in the plist editor.
I tried doing that but didn't get anywhere with saving it, though :)





Re: [GENERAL] regression between 8.4.8 and 8.4.2?

2011-08-31 Thread Ben Chobot
On Aug 31, 2011, at 11:53 AM, Ben Chobot wrote:

 On Aug 31, 2011, at 11:10 AM, Tom Lane wrote:
 
 Ben Chobot be...@silentmedia.com writes:
 Tom, if there's anything else we can provide that might help you out, let me
 know.
 
 If you could extract a self-contained test case for the bad estimation,
 that would be useful.
 
 OK, we'll pull something together. In the meantime, we can confirm that 
 reverting from 8.4.8 to 8.4.3 fixes things.

 and FWIW 9.0.3 does the right thing as well.

[GENERAL] ARRAY_AGG(DISTINCT a ORDER BY b) Error

2011-08-31 Thread David Johnston
in an aggregate with DISTINCT, ORDER BY expressions must appear in argument
list

Why?

If I add the fields of the ORDER BY expression to the DISTINCT clause I can
no longer use DISTINCT since the ORDER BY values are not unique. Nor do I
want the contents of the final ARRAY to contain the ORDER BY column.

I presuppose this is a technical limitation since my particular use-case
(and I've come across a few cases where this would be useful) doesn't seem
that obscure.

My specific sample query (use-case) is as follows:

SELECT control, ARRAY_AGG(DISTINCT accountnumber ORDER BY amount DESC)
FROM (VALUES ('A', '1000', 100), ('A', '1000', -50), ('A', '2000',200))
accountdetail (control, accountnumber, amount)
GROUP BY control

I want to create an array of the unique account numbers associated with a
control with the ordering of the array matching the order of the amounts.
In this case I would want the output to be:

(A, {'2000','1000'})

David J.









Re: [GENERAL] ARRAY_AGG(DISTINCT a ORDER BY b) Error

2011-08-31 Thread Tom Lane
David Johnston pol...@yahoo.com writes:
 in an aggregate with DISTINCT, ORDER BY expressions must appear in argument
 list

 Why?

Because the results are ill-defined otherwise.  In your example,

   ... ARRAY_AGG(DISTINCT accountnumber ORDER BY amount DESC) ...

there may be many rows with the same accountnumber and yet different
amount values.  Which of those amounts should be used as the sort key
for the aggregated row?

regards, tom lane



Re: [GENERAL] ARRAY_AGG(DISTINCT a ORDER BY b) Error

2011-08-31 Thread Thom Brown
On 31 August 2011 23:54, David Johnston pol...@yahoo.com wrote:
 in an aggregate with DISTINCT, ORDER BY expressions must appear in argument
 list

 Why?

 If I add the fields of the ORDER BY expression to the DISTINCT clause I can
 no longer use DISTINCT since the ORDER BY values are not unique. Nor do I
 want the contents of the final ARRAY to contain the ORDER BY column.

 I presuppose this is a technical limitation since my particular use-case
 (and I've come across a few cases where this would be useful) doesn't seem
 that obscure.

 My specific sample query (use-case) is as follows:

 SELECT control, ARRAY_AGG(DISTINCT accountnumber ORDER BY amount DESC)
 FROM (VALUES ('A', '1000', 100), ('A', '1000', -50), ('A', '2000',200))
 accountdetail (control, accountnumber, amount)
 GROUP BY control

 I want to create an array of the unique account numbers associated with a
 control with the ordering of the array matching the order of the amounts.
 In this case I would want the output to be:

 (A, {'2000','1000'})

I'm not sure that makes sense.  If you're aggregating accountnumber as
an array of distinct values, what do you expect your query to output
if, say, you had the following:

accountnumber, amount
1000,100
2000,200
1000,300

You've ordered by amount, but accountnumber has 2 identical values,
where the amount is less than the amount corresponding to
accountnumber 2000 in one instance, but greater in another.  Where
does 1000 appear?  Before or after 2000?

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [GENERAL] ARRAY_AGG(DISTINCT a ORDER BY b) Error

2011-08-31 Thread David Johnston
-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: Wednesday, August 31, 2011 7:10 PM
To: David Johnston
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] ARRAY_AGG(DISTINCT a ORDER BY b) Error 

David Johnston pol...@yahoo.com writes:
 in an aggregate with DISTINCT, ORDER BY expressions must appear in 
 argument list

 Why?

Because the results are ill-defined otherwise.  In your example,

   ... ARRAY_AGG(DISTINCT accountnumber ORDER BY amount DESC) ...

there may be many rows with the same accountnumber and yet different
amount values.  Which of those amounts should be used as the sort key for
the aggregated row?

regards, tom lane


--

My take is that you look at the non-aggregated data using the same ORDER BY
clause and then add the values in the order they appear in the detail.

So (account, amount, [Action]):

1000, 150, [Add]
1000, 130, [Skip]
2000, 120, [Add]
2000, 100, [Skip]
1000, 50, [Skip]
2000, 0, [Skip]
3000, -10, [Add]
1000, -50, [Skip]

This is basically how a plain DISTINCT would have to work (taking the first
encountered value and skipping any subsequent repeats).  I want the same
general behavior but have the opportunity to order the input frame so that
I can manipulate the encounter order.

In a 2-phase process you would do the following (I will remain specific for
the time being):

Phase 1: Select the representative record for each DISTINCT (accountnumber);
The representative would be the FIRST record as determined via an ORDER BY
of all candidate records.
Phase 2: Order the DISTINCT representatives based upon the same ORDER BY
clause

So:
Phase 1: (3000, -10), (1000, 150), (2000, 120) [Not Ordered] 
Phase 2: (1000, 150), (2000, 120), (3000, -10) [ORDER BY amount DESC]
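
That two-phase process can already be written with a plain subquery. A sketch
using the VALUES list from the original query (DISTINCT ON picks the
representative row per account number, the outer array_agg orders the
survivors):

SELECT control,
       array_agg(accountnumber ORDER BY amount DESC) AS accountnumbers
FROM (SELECT DISTINCT ON (control, accountnumber)
             control, accountnumber, amount
      FROM (VALUES ('A', '1000', 100),
                   ('A', '1000', -50),
                   ('A', '2000', 200)
           ) AS accountdetail (control, accountnumber, amount)
      ORDER BY control, accountnumber, amount DESC
     ) AS representatives
GROUP BY control;
-- => (A, {2000,1000})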

The behavior/concept of [ARRAY_AGG(DISTINCT ON (accountnumber) .?. ORDER BY
amount] matches here but I do not want to output an amount column at all.

The other view is that you can create the ARRAY using just the ORDER BY and
then immediately DISTINCTify the array so that there are no duplicates.
This is basically the 2-phase process described above.  In this view you
basically keep the DISTINCT value that has the lowest array index.

I would be interested in other possible interpretations/algorithms that
would then cause ambiguity in deciding which algorithm to implement.

I know that I am only considering ARRAY_AGG in my examples but my first
reaction is that other aggregates would behave acceptably under the
algorithm described; and since the current behavior is FAIL at least some
well-defined behavior would exist for the specified syntax.

David J.







[GENERAL] function on trigger

2011-08-31 Thread Marcos Hercules Santos
hi guys

I'm a newbie in Psql and I'm trying to build a function in order to
count the products for each supplier, so I'm going to put it quite
simply through this example.


Please consider a table called books with the following fields:

bookid, title, price, idPublisher


and another table called publisher:

Idpublisher, name, city, Books


In that last field of publisher, called Books, I need to keep the
number of published books for each publisher.



Re: [GENERAL] function on trigger

2011-08-31 Thread David Johnston
On Aug 31, 2011, at 18:39, Marcos Hercules Santos mhe...@gmail.com wrote:

 hi guys
 
 I'm a newbie in Psql and I'm trying to build a function in order to
 count the products for each supplier, so I'm going to put it quite
 simply through this example.


 Please consider a table called books with the following fields:

 bookid, title, price, idPublisher


 and another table called publisher:

 Idpublisher, name, city, Books


 In that last field of publisher, called Books, I need to keep the
 number of published books for each publisher.

Is there a question somewhere?


 
Do not be tempted by the dark side, Marcos.  Create a view that uses a join and 
a count.  Only if you have significant performance issues would you then 
consider materializing that view.

If you insist, or simply want to learn, read the sections on UPDATE, CREATE 
TRIGGER, and CREATE FUNCTION.  Try doing it following those examples.  Ask more 
specific questions if something specific stumps you.

Do it incrementally.  Get the raw SQL UPDATE working then create the trigger 
and function and get those working.  Then combine the two.
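
If you do go the trigger route, a rough sketch of the moving parts (assuming
publisher has an integer column books and that books.idpublisher is the
foreign key) could look like this:

CREATE OR REPLACE FUNCTION update_publisher_book_count() RETURNS trigger AS $$
BEGIN
    -- take the book away from its old publisher ...
    IF TG_OP IN ('UPDATE', 'DELETE') THEN
        UPDATE publisher SET books = books - 1 WHERE idpublisher = OLD.idpublisher;
    END IF;
    -- ... and count it for its new one
    IF TG_OP IN ('INSERT', 'UPDATE') THEN
        UPDATE publisher SET books = books + 1 WHERE idpublisher = NEW.idpublisher;
    END IF;
    RETURN NULL;  -- return value is ignored for an AFTER trigger
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER books_count_trg
    AFTER INSERT OR UPDATE OR DELETE ON books
    FOR EACH ROW EXECUTE PROCEDURE update_publisher_book_count();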

David J.




Re: [GENERAL] function on trigger

2011-08-31 Thread John R Pierce

On 08/31/11 3:39 PM, Marcos Hercules Santos wrote:

I'm a newbie in Psql and I'm trying to build a function in order to
count the products for each supplier, so I'm going to put it quite
simply through this example.


Please consider a table called books with the following fields:

bookid, title, price, idPublisher


and another table called publisher:

Idpublisher, name, city, Books


In that last field of publisher, called Books, I need to keep the
number of published books for each publisher.


get rid of the books field on your publisher table; that's dynamic and
changes as you add/remove books from the books table.  to get that data,
try...


SELECT p.idPublisher, p.name, p.city, COUNT(b.bookid) AS books
FROM publisher p
JOIN books b USING (idPublisher)
GROUP BY p.idPublisher, p.name, p.city;


you could make this a view if it's too cumbersome.

CREATE VIEW publisher_books AS
SELECT p.idPublisher, p.name, p.city, COUNT(b.bookid) AS books
FROM publisher p
JOIN books b USING (idPublisher)
GROUP BY p.idPublisher, p.name, p.city;


SELECT * from publisher_books;

and of course, add other WHERE conditions...

SELECT books FROM publisher_books WHERE name=?;



--
john r pierce                            N 37, W 122
santa cruz ca                            mid-left coast

