Re: [GENERAL] Getting Table Names in a Particular Database
On Tue, Aug 30, 2011 at 11:50 PM, Adarsh Sharma adarsh.sha...@orkash.com wrote:
> I understand, so there is no way to fetch the tables in a single query. The only way is:
> 1. Connect to demo
> 2. Execute the query:
>
> SELECT n.nspname as "Schema",
>        c.relname as "Name",
>        CASE c.relkind WHEN 'r' THEN 'table'
>                       WHEN 'v' THEN 'view'
>                       WHEN 'i' THEN 'index'
>                       WHEN 'S' THEN 'sequence'
>                       WHEN 's' THEN 'special' END as "Type",
>        pg_catalog.pg_get_userbyid(c.relowner) as "Owner"
> FROM pg_catalog.pg_class c
>      LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
> WHERE c.relkind IN ('r','v','S','')
>   AND n.nspname <> 'pg_catalog'
>   AND n.nspname <> 'information_schema'
>   AND n.nspname !~ '^pg_toast'
> ORDER BY 1,2;
>
> As in MySQL, where we can view all tables in the test database with:
>
> select table_name from information_schema.tables where table_schema = 'test';

Have you tried it in pgsql? That works too.

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Getting Table Names in a Particular Database
pdc_uima=# select table_name from information_schema.tables where table_schema='pdc_uima';
 table_name
------------
(0 rows)

But filtering on 'public' gives the result:

pdc_uima=# select * from information_schema.tables where table_schema='public';
 table_catalog | table_schema |    table_name    | table_type | is_insertable_into | is_typed
---------------+--------------+------------------+------------+--------------------+----------
 pdc_uima      | public       | spatial_ref_sys  | BASE TABLE | YES                | NO
 pdc_uima      | public       | geometry_columns | BASE TABLE | YES                | NO
 pdc_uima      | public       | adarsh           | BASE TABLE | YES                | NO
(3 rows)

(the remaining columns - self_referencing_column_name, reference_generation, user_defined_type_catalog, user_defined_type_schema, user_defined_type_name and commit_action - were all empty)

Coming back to the original problem: I have 10 databases with different names, and you have to go into each database with the \c command to fetch the table names.

Thanks

Scott Marlowe wrote:
> Have you tried it in pgsql? That works too.
Re: [GENERAL] Getting Table Names in a Particular Database
On Wed, Aug 31, 2011 at 12:10 AM, Adarsh Sharma adarsh.sha...@orkash.com wrote:
> Coming back to the original problem: I have 10 databases with different names, and you have to go into each database with the \c command to fetch the table names.

Again, in PostgreSQL databases are very separate objects; in MySQL they are closer to schemas than to separate entities. If you want to examine a database in pg, you need to connect to it. Period.
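If the goal is still one command that covers every database, the connect-per-database step can be scripted from the shell (a sketch only, assuming `psql` is on the PATH and can authenticate without a password prompt):

```sh
#!/bin/sh
# List user tables in every non-template database in the cluster,
# connecting to each database in turn, as psql requires.
for db in $(psql -At -c "SELECT datname FROM pg_database WHERE NOT datistemplate"); do
    echo "== $db =="
    psql -At -d "$db" -c "
        SELECT table_schema || '.' || table_name
        FROM information_schema.tables
        WHERE table_schema NOT IN ('pg_catalog', 'information_schema')"
done
```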
Re: [GENERAL] Postgresql-9.0.1 Recovery
Thanks Craig!

Below is what I did:
1. pg_start_backup()
2. rsync the data dir
3. pg_stop_backup()

I believe the backup is valid because I was able to bring up the cluster without any issues (of course, with data loss).

Positive signs:
- I am able to bring up the cluster with the online backup, but only with the loss of data.

Negative signs, and things to watch for strongly while testing backups:
- pg_clog files were not synced. I suspect they were being written at the time of backup; I might have tried to sync the data dir while pg_clog files were half filled.
- Though the WAL archives are there, Postgres does not try to recover beyond the timestamp at which pg_clog was missing.
- Even if I replace the missing pg_clog files (which I did), Postgres asks for the corresponding WAL archive files.

Yes. What I learnt is that we need to ensure that all the pg_clog files are fully copied as of the backup time. We cannot afford to miss any of them.

Thanks
Venkat

On Wed, Aug 31, 2011 at 5:46 AM, Craig Ringer ring...@ringerc.id.au wrote:
> On 30/08/2011 6:59 PM, Venkat Balaji wrote:
>> Hello Everyone,
>> I have a situation here - I am trying to restore the production online backup and recover the same.
>> - I had initially rsynced (excluding pg_log) the data directory and then tarred and zipped the same.
>
> Did you do that after pg_start_backup() or on a stopped database server? If you did it on a running database server without first running pg_start_backup(), your backup is invalid.
>
> Personally I like to take my base backups from an LVM snapshot of the datadir just to be extra safe. That isn't necessary, though, and a regular rsync or tar or whatever of a datadir after pg_start_backup() is fine. Remember to run pg_stop_backup() afterwards.
>
>> - I got an error "unable to read filename from pg_clog location" (file size is around 160K)
>
> ... from PostgreSQL, when you tried to start it? What emitted that error message?
> What I understood is that rsync somehow missed syncing the files in pg_clog, so I manually copied the missing pg_clog file from production and tried recovery.

That won't work. You need a consistent snapshot of all the files in the data dir. You cannot just mix and match copies taken at different times. For efficiency reasons PostgreSQL will recycle used clog files. You can't just copy a file over and hope that, because it has the same name, it still contains the data you want. Your backup *failed* at the point where you got an incomplete copy of the data directory.

> Do I need to get that particular WAL archive from before the online backup time?

No, you need to get the missing clog files. If you cannot do that, try using pg_resetxlog, but be aware that that may lose transactions and can potentially cause corruption of tables and indexes.

> By this experience, what I understand is that PostgreSQL stores committed and uncommitted transactions in pg_xlog / WAL archive files, and information (not the transaction data) about transaction commit status is stored in pg_clog. Am I correct?

That sounds right to me, but I don't know as much about how Pg stores things as I should.

> I am in the process of designing a disaster recovery planner for our production systems.

Congratulations! Be extremely glad this didn't happen in a real recovery scenario. This is a marvellous example of why you should always test your backups - you actually did, and found a problem that would've been a critical issue if the backup were actually needed.

-- Craig Ringer
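The backup procedure under discussion can be sketched as a script (an illustration only: the paths are hypothetical, and it assumes the 8.x/9.0-style pg_start_backup()/pg_stop_backup() functions):

```sh
#!/bin/sh
PGDATA=/var/lib/postgresql/data   # hypothetical data directory
BACKUP=/backups/base              # hypothetical backup target

# Tell PostgreSQL a base backup is starting. The file-level copy below
# may be internally inconsistent - WAL replay repairs that at restore
# time - but only if *every* file is present in the copy.
psql -c "SELECT pg_start_backup('base_backup');"

# Copy the whole data directory; pg_clog/ must not be skipped or partial.
rsync -a --exclude pg_xlog "$PGDATA/" "$BACKUP/"

psql -c "SELECT pg_stop_backup();"
```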
[GENERAL] row is too big
select version()
PostgreSQL 8.2.19 on i686-pc-linux-gnu, compiled by GCC gcc (Debian 4.3.2-1.1) 4.3.2

Before you jump down my throat about bad design: this is a reporting table that is generated based on what the users want to see in their rows and columns. (I'm basically generating a spreadsheet for the user to copy and paste and then play with graphing functions.)

I have a table with 952 fields - one text, one int, and all the rest numeric (no precision or scale defined). The value of one row is the sum of all the other rows (insert into table select sum(a), sum(b), sum(c) ...). The values in this case all fit into an integer size; the int column is a sum of all the other columns and is 300,000. The text value is blank ('', not null; when I made it null it took up 16 more bytes).

I am getting an error on inserting the total row:

ERROR: row is too big: size 11436, maximum size 8136

The manual doesn't say exactly how much storage a numeric type uses, but it seems to me that it is taking up about 12 bytes per field. Does this make any sense?

Thanks
Sim
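Sim's estimate of roughly 12 bytes per numeric can be checked against the error message with simple arithmetic (a sketch; the exact split of the remaining bytes between tuple header, null bitmap, the empty text datum and alignment padding is an assumption):

```python
# 952 columns: 950 numerics, 1 int4, 1 empty text.
numeric_cols = 950
bytes_per_numeric = 12        # apparent minimum size of a nonzero numeric in 8.2
int4_bytes = 4

payload = numeric_cols * bytes_per_numeric + int4_bytes
observed = 11436              # "row is too big: size 11436" from the error
overhead = observed - payload # tuple header, null bitmap, empty text, padding

print(payload)   # 11404
print(overhead)  # 32 - a plausible fixed overhead, so ~12 bytes/numeric checks out
```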
Re: [GENERAL] regression between 8.4.8 and 8.4.2?
I don't have an answer for you, but this report looks suspiciously similar to the one I posted the other day at http://archives.postgresql.org/pgsql-hackers/2011-08/msg01224.php which, now that I think about it, also manifested itself after the upgrade to 8.4.8.

On tis, 2011-08-30 at 15:24 -0700, Ben Chobot wrote:

We recently took a copy of our production data (running on 8.4.2), scrubbed many data fields, and then loaded it onto a qa server (running 8.4.8). We're seeing some odd planner performance that I think might be a bug, though I'm hoping it's just idiocy on my part. I've analyzed things and looked into pg_stats, and it seems as if the relevant columns have about the same statistics.

I've managed to simplify the query, but if I make it any simpler, then the two servers end up with the same good plan. The query is down to:

SELECT machines.quota_purchased
FROM machines
WHERE NOT deleted
  AND machines.user_id IN (
        SELECT id FROM users
        WHERE user_group_id IN (select 607547 offset 0)
        OFFSET 0
      );

(Those "offset 0" are in there to protect us from planner regressions we saw when moving to 8.4. When we move to 9, they can hopefully go away.)
On the production server, this returns a fairly accurate plan:

 Nested Loop  (cost=843.59..1447.90 rows=243 width=8) (actual time=0.044..0.045 rows=1 loops=1)
   ->  HashAggregate  (cost=843.59..845.59 rows=200 width=4) (actual time=0.027..0.027 rows=1 loops=1)
         ->  Limit  (cost=0.02..823.90 rows=1575 width=4) (actual time=0.024..0.025 rows=1 loops=1)
               ->  Nested Loop  (cost=0.02..823.90 rows=1575 width=4) (actual time=0.023..0.024 rows=1 loops=1)
                     ->  HashAggregate  (cost=0.02..0.03 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=1)
                           ->  Limit  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1)
                                 ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=1)
                     ->  Index Scan using users_user_groups_idx on users  (cost=0.00..804.18 rows=1575 width=8) (actual time=0.017..0.018 rows=1 loops=1)
                           Index Cond: (users.user_group_id = (607547))
   ->  Index Scan using machines_sid_un on machines  (cost=0.00..3.00 rows=1 width=12) (actual time=0.015..0.015 rows=1 loops=1)
         Index Cond: (machines.user_id = users.id)
 Total runtime: 0.121 ms
(12 rows)

On the QA server, things are not so accurate. It doesn't hurt the timing of this simplified query much, but when put into the actual query, the row estimation being off by six orders of magnitude really throws the planning in the wrong direction.
The plan on the QA server is:

 Nested Loop  (cost=1887.16..3671.20 rows=1192462 width=8) (actual time=0.049..0.051 rows=1 loops=1)
   ->  HashAggregate  (cost=1887.16..1889.16 rows=200 width=4) (actual time=0.032..0.033 rows=1 loops=1)
         ->  Limit  (cost=0.02..1868.20 rows=1517 width=4) (actual time=0.027..0.029 rows=1 loops=1)
               ->  Nested Loop  (cost=0.02..1868.20 rows=1517 width=4) (actual time=0.027..0.028 rows=1 loops=1)
                     ->  HashAggregate  (cost=0.02..0.03 rows=1 width=4) (actual time=0.008..0.008 rows=1 loops=1)
                           ->  Limit  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
                                 ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
                     ->  Index Scan using users_user_groups_idx on users  (cost=0.00..1849.20 rows=1517 width=8) (actual time=0.015..0.016 rows=1 loops=1)
                           Index Cond: (users.user_group_id = (607547))
   ->  Index Scan using machines_sid_un on machines  (cost=0.00..8.90 rows=1 width=12) (actual time=0.013..0.013 rows=1 loops=1)
         Index Cond: (machines.user_id = users.id)
 Total runtime: 0.148 ms
(12 rows)

The problem here (I think) seems to be that the QA server believes that running a nested loop over 200 users.id values and joining that against machines.user_id will result in 1M rows. The production server sees this more accurately as the nearly 1:1 relationship that it is.

The reason I wonder if this might be a bug is that if I change the obtuse clause

WHERE user_group_id IN (select 607547 offset 0)

to simply

where user_group_id in (607547)

then the plan collapses to the same plan on both servers: explain analyze SELECT
Re: [GENERAL] regression between 8.4.8 and 8.4.2?
On ons, 2011-08-31 at 10:42 +0300, Peter Eisentraut wrote:
> I don't have an answer for you, but this report looks suspiciously similar to the one I posted the other day at http://archives.postgresql.org/pgsql-hackers/2011-08/msg01224.php which, now that I think about it, also manifested itself after the upgrade to 8.4.8.

See this thread: http://archives.postgresql.org/pgsql-performance/2011-08/msg00248.php

It looks like there are a number of users affected by this.
Re: [GENERAL] IDLE queries taking up space
On 31/08/2011 12:03 AM, JD Wong wrote:
> Hi, when I run "select datname, procpid, current_query from pg_stat_activity;" I get 26 rows of <IDLE> queries. How can I set postgres to automatically close connections that have finished their queries and now sit idle?

If they're not "idle in transaction", they don't matter much. Try filtering the list based on last activity, so you only see those connections that have been idle for a while. Short periods of idle are normal for many applications, because they perform a query, then process its results and issue further queries based on the results, or because they keep a connection around between requests from users.

If the idle connections are actually causing an issue - for example, if they're preventing the release of non-trivial amounts of backend private memory back to the OS - you can tweak the client to disconnect after a certain idle time, or you can use a connection pool. Connection pools may be inside the client (for example, in Java EE application servers) or between the client and the server, using tools like PgBouncer and PgPool-II.

-- Craig Ringer
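Filtering on last activity, as suggested above, can be done directly in SQL (a sketch for the 9.0-era pg_stat_activity layout, where idle sessions report <IDLE> in current_query and query_start marks the last query's start):

```sql
-- Connections idle for more than 10 minutes, longest-idle first.
SELECT datname, procpid, usename,
       now() - query_start AS idle_for
FROM pg_stat_activity
WHERE current_query = '<IDLE>'
  AND now() - query_start > interval '10 minutes'
ORDER BY idle_for DESC;
```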
[GENERAL] Parameterized prepared statements
Hi folks

While replying to another query, something struck me as interesting that I thought I'd raise.

People here often raise issues where preparing a parameterised query and executing the prepared query results in a plan that's sub-optimal for the particular values substituted into it. The cause is well understood: the planner has to pick a plan that looks good without knowledge of the value that will be substituted in. Things like pre-parsed prepared statements that are re-planned on every execution are often proposed as solutions to this.

This has me wondering: rather than expensively re-planning from scratch, would it be possible to adjust the planning process so that *multiple* alternative plans would be cached for a query, using placeholders for unknown rowcounts and costs? At execution, the unknown costs would be filled in, the plans compared, and the best plan picked for that execution.

Is this crazy talk, or could it significantly reduce the cost of re-planning parameterized prepared statements to the point where it'd be worth doing by default?

On an unrelated note, does Pg do any kind of smart searching on `IN' lists, or just a linear scan? Would it be worth sorting longer IN lists so each iteration could do a binary search of the list?

-- Craig Ringer
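The planning problem described above is easy to reproduce (a sketch; the table and the value distribution are made-up assumptions):

```sql
-- One generic plan must serve every possible $1, so the planner cannot
-- exploit a skewed distribution of customer_id values.
PREPARE get_orders(int) AS
    SELECT * FROM orders WHERE customer_id = $1;

EXECUTE get_orders(42);  -- a rare customer: an index scan would be ideal
EXECUTE get_orders(7);   -- a customer owning half the table: a seq scan would be ideal
```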
[GENERAL] ERD Tool
Dear all,

Is there any open source ERD tool for a PostgreSQL database? I found some paid tools, but I am looking for free ones.

Thanks
Re: [GENERAL] ERD Tool
Adarsh Sharma, 31.08.2011 13:54:
> Is there any open source ERD tool for a PostgreSQL database? I found some paid tools, but I am looking for free ones.

Have a look at Power*Architect: http://www.sqlpower.ca/page/architect

It's not perfect, but it's quite OK. As it is a multi-DBMS tool, it does not support any Postgres-specific features or datatypes.

To avoid the nasty registration that is required on their homepage, you can also download the binaries directly from the Google Code project homepage: http://code.google.com/p/power-architect/downloads/list

Regards
Thomas
Re: [GENERAL] SELECT Query on DB table preventing inserts
On Tue, Aug 30, 2011 at 13:52, Daniel Verite dan...@manitou-mail.org wrote:
> Dan Scott wrote:
>> the insert process is unable to insert new rows into the database
>
> You should probably provide the error message on insert or otherwise describe how it's not working. Normally reading does not unintentionally prevent writing in a concurrent session.

I've investigated a little further, and it seems that they don't really 'fail' - they just start taking significantly longer to insert, and the messages start backing up and eventually stop being sent because they're not being acknowledged. I can see a few "WARNING: pgstat wait timeout" messages around the time that this is happening in the syslog.

Thanks,

Dan
Re: [GENERAL] SELECT Query on DB table preventing inserts
On Tue, Aug 30, 2011 at 13:21, Scott Ribe scott_r...@elevated-dev.com wrote:
> On Aug 30, 2011, at 8:22 AM, Dan Scott wrote:
>> Perhaps because I'm locking the table with my query?
>
> Do you mean you're explicitly locking the table? If so, why???

No, not explicitly. I just thought of it as a possible explanation. If reading from the table does not lock the table for writing, then that's not the problem.

Thanks,

Dan
Re: [GENERAL] SELECT Query on DB table preventing inserts
On 31 Srpen 2011, 1:07, Dan Scott wrote:
> I've investigated a little further, and it seems that they don't really 'fail' - they just start taking significantly longer to insert, and the messages start backing up and eventually stop being sent because

What messages are you talking about?

> they're not being acknowledged. I can see a few "WARNING: pgstat wait timeout" messages around the time that this is happening in the syslog.

The pgstat messages are a typical symptom of an I/O bottleneck - it just means you're doing a lot of writes, more than the drives can take. Enable checkpoint logging (log_checkpoints=on) and watch the system stats (e.g. using 'iostat -x' or vmstat); my bet is this is a checkpoint or pdflush issue.

Anyway, we need more info about your system: Pg version, amount of RAM, shared buffers, checkpoint settings (segments, completion) and page cache config (/proc/sys/vm/). A few lines of vmstat/iostat output would help too.

Tomas
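Tomas's suggestion maps to a few postgresql.conf settings (the values shown are illustrative assumptions, not recommendations):

```
# postgresql.conf
log_checkpoints = on                 # log each checkpoint's start, end and duration
checkpoint_segments = 16             # WAL segments between checkpoints (8.x/9.x setting)
checkpoint_completion_target = 0.7   # spread checkpoint writes across the interval
```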
Re: [GENERAL] heavy swapping, not sure why
On Tue, Aug 30, 2011 at 10:05 PM, Scott Marlowe scott.marl...@gmail.com wrote:
> On Tue, Aug 30, 2011 at 8:36 PM, mark dvlh...@gmail.com wrote:
>> To the broader list, regarding troubles with kswap: I am curious what others are seeing from /proc/zoneinfo for DMA pages (not dma32 or normal) - basically whether it sits at 1 or not. Setting swappiness to 0 did not have any effect on our kswap issues. Another thing I have not had time and resources to work on: I am interested in what kernel they are running and what storage drivers they might be using.
>
> Well, we had zone reclaim mode autoset to 1, and we had to turn it off to get decent performance with postgresql. Machine was a quad dodecacore Magny Cours, so 48 cores with 128G RAM. RAID controller is an Areca 1680 with BBU, 34 15kRPM 147G SAS Seagate 15k6 drives in two 16-drive external enclosures and 2 drives in the server. The only solution we could find for kswapd going crazy was to just turn off swap. Pretty sure I used a large swap file to test larger swaps, but all that did was put off the eventual kswapd storm. It took anywhere from one to two weeks, maybe more, and then one day you check and your server's maxed out by kswapd.

hm, that's an interesting counterpoint to what I've been saying. I've never seen that; I wonder what the underlying trigger was? I typically set shared_buffers fairly low (even to the default, raising only when I think it might help) - I wonder if that plays in.

On setting 1000 connections: some applications rely on database session features (like advisory locks or listen/notify) and retooling the client is more trouble than it's worth. This is definitely on the upper bound of what's reasonable, though... these days I code with the assumption that pgbouncer is going to be put in even if I don't need it right away.

merlin
Re: [GENERAL] row is too big
Sim Zacks s...@compulab.co.il writes:
> The manual doesn't say exactly how much storage a numeric type uses, but it seems to me that it is taking up about 12 bytes per field. Does this make any sense?

Yeah, that would be the minimum size of a nonzero numeric value in 8.2. (More recent versions can pack them a bit tighter in many cases.) You might consider whether you can put all those numeric fields into an array.

regards, tom lane
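Tom's array suggestion would collapse the numeric columns into a single datum (a sketch with hypothetical table and column names): since an array is a single varlena value, it can be TOASTed - compressed or stored out of line - which sidesteps the in-page row-size limit that 950 separate numeric columns run into.

```sql
-- One numeric[] column instead of ~950 numeric columns.
CREATE TABLE report (
    label text,
    total integer,
    cells numeric[]
);

INSERT INTO report
SELECT '', 300000, ARRAY[sum(a), sum(b), sum(c)]   -- ...extended to all columns
FROM source_data;
```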
[GENERAL] Index usage on OR queries
Hi,

I'm trying to optimize a query where I have two tables that both have a timestamp column. I want the rows where either of the timestamps is after a specified time. In a reduced form, like this:

CREATE TABLE a (
  id serial NOT NULL PRIMARY KEY,
  time timestamp without time zone NOT NULL DEFAULT now()
);
CREATE INDEX a_time_idx ON a USING btree (time DESC NULLS LAST);

CREATE TABLE b (
  id serial NOT NULL PRIMARY KEY,
  time timestamp without time zone NOT NULL DEFAULT now()
);
CREATE INDEX b_time_idx ON b USING btree (time DESC NULLS LAST);

-- generate some data
insert into a(time) select now() - '10 year'::interval * random() from generate_series(1, 100, 1);
insert into b(time) select now() - '10 year'::interval * random() from generate_series(1, 100, 1);

-- Using the constraint works as expected, and uses the time index.
select * from a join b using(id) where a.time >= '2011-08-15';
-- ... both ways...
select * from a join b using(id) where b.time >= '2011-08-15';

-- However, if I try to do this for both times at once, the time index is not used at all:
select * from a join b using(id) where a.time >= '2011-08-15' OR b.time >= '2011-08-01';

-- This can be optimized by using CTEs:
with am as (select * from a where time >= '2011-08-15')
   , bm as (select * from b where time >= '2011-08-15')
select * from am join bm using(id);

I'm just wondering why the optimizer does things the way it does - and if the CTE version is the best way to go... The actual case is slightly more complex and uses more tables - this is mostly a way to find updated data.

--
Eld på åren og sol på eng gjer mannen fegen og fjåg. [Jøtul]
demo 2011 Tore Halvorsen || +052 0553034554
Re: [GENERAL] Index usage on OR queries
> -- This can be optimized by using CTEs:
> with am as (select * from a where time >= '2011-08-15')
>    , bm as (select * from b where time >= '2011-08-15')
> select * from am join bm using(id)

Disregard this - it doesn't do the same thing at all. Now I'm more confused as to how I can optimize the query.

-- Tore Halvorsen
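One standard rewrite for an OR condition spanning two tables is to split the query into a UNION, so each branch can use its own time index (a sketch against the tables defined upthread; whether the planner actually picks the indexes still depends on statistics and selectivity):

```sql
-- Same rows as the OR query: UNION (not UNION ALL) removes the
-- duplicates for ids that satisfy both conditions.
SELECT * FROM a JOIN b USING (id) WHERE a.time >= '2011-08-15'
UNION
SELECT * FROM a JOIN b USING (id) WHERE b.time >= '2011-08-01';
```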
Re: [GENERAL] FATAL: terminating connection due to conflict with recovery
On 08/30/11 18:03, Fujii Masao wrote:
> On Wed, Aug 31, 2011 at 5:51 AM, Jeff Ross jr...@wykids.org wrote:
>> Is there a setting in this or something else that I should tweak so this query can complete against the replica? Google turned up some threads on the error code associated with the error, but I didn't find much else that seems applicable.
>
> Increasing max_standby_archive_delay and max_standby_streaming_delay would be helpful to make the query complete. Please see the following manual for details.
> http://www.postgresql.org/docs/9.0/interactive/hot-standby.html#HOT-STANDBY-CONFLICT
>
> Regards,

Thank you! And now the error message is linked to the solution for the next person.

Jeff
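For reference, the two settings Fujii mentions go in the standby server's postgresql.conf (the values below are illustrative assumptions, not recommendations):

```
# How long WAL replay may be delayed in favour of running standby queries.
max_standby_archive_delay = 300s     # when recovering from archived WAL
max_standby_streaming_delay = 300s   # when streaming from the master
# -1 waits forever; 0 cancels conflicting queries immediately.
```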
Re: [GENERAL] Index usage on OR queries
On 8/31/2011 9:35 AM, Tore Halvorsen wrote:
> However, if I try to do this for both times at once, the time index is not used at all:
> select * from a join b using(id) where a.time >= '2011-08-15' OR b.time >= '2011-08-01'

On PG 9, after I ANALYZED the tables, it used indexes:

 Merge Join  (cost=1.59..82778.35 rows=13171 width=20) (actual time=0.066..1076.616 rows=12966 loops=1)
   Merge Cond: (a.id = b.id)
   Join Filter: ((a.time >= '2011-08-15 00:00:00'::timestamp without time zone) OR (b.time >= '2011-08-01 0
   ->  Index Scan using a_pkey on a  (cost=0.00..31389.36 rows=100 width=12) (actual time=0.007..204.856 ro
   ->  Index Scan using b_pkey on b  (cost=0.00..31389.36 rows=100 width=12) (actual time=0.006..224.189 ro

ANALYZE is the magic.

-Andy
Re: [GENERAL] Index usage on OR queries
On Wed, Aug 31, 2011 at 4:49 PM, Andy Colson a...@squeakycode.net wrote:
> On PG 9, after I ANALYZED the tables, it used indexes:
> [...]
> ANALYZE is the magic.

You are, of course, right, but it doesn't use the TIME index. Hmmm, maybe my example isn't large enough to produce the issue.

-- Tore Halvorsen
Re: [GENERAL] Index usage on OR queries
On 8/31/2011 9:53 AM, Tore Halvorsen wrote: On Wed, Aug 31, 2011 at 4:49 PM, Andy Colson a...@squeakycode.net wrote: On PG 9, after I ANALYZED the tables, it used indexes: QUERY PLAN --- Merge Join (cost=1.59..82778.35 rows=13171 width=20) (actual time=0.066..1076.616 rows=12966 loops=1) Merge Cond: (a.id = b.id) Join Filter: ((a.time >= '2011-08-15 00:00:00'::timestamp without time zone) OR (b.time >= '2011-08-01 0 -> Index Scan using a_pkey on a (cost=0.00..31389.36 rows=1000000 width=12) (actual time=0.007..204.856 ro -> Index Scan using b_pkey on b (cost=0.00..31389.36 rows=1000000 width=12) (actual time=0.006..224.189 ro ANALYZE is the magic. You are, of course, right, but it doesn't use the TIME index. Hmmm, maybe my example isn't large enough to produce the issue. -- Eld på åren og sol på eng gjer mannen fegen og fjåg. [Jøtul] demo 2011 Tore Halvorsen || +052 0553034554 wow, yea.. I saw index and just assumed. Didn't even notice, sorry about that. -Andy
Re: [GENERAL] Index usage on OR queries
On 31 August 2011, 16:49, Andy Colson wrote: On 8/31/2011 9:35 AM, Tore Halvorsen wrote: Hi, I'm trying to optimize a query where I have two tables that both have a timestamp column. I want the result where either of the timestamps is after a specified time. In a reduced form, like this: CREATE TABLE a ( id serial NOT NULL PRIMARY KEY, time timestamp without time zone NOT NULL DEFAULT now() ); CREATE INDEX a_time_idx ON a USING btree (time DESC NULLS LAST); CREATE TABLE b ( id serial NOT NULL PRIMARY KEY, time timestamp without time zone NOT NULL DEFAULT now() ); CREATE INDEX b_time_idx ON b USING btree (time DESC NULLS LAST); --- generate some data insert into a(time) select now() - '10 year'::interval * random() from generate_series(1, 1000000, 1); insert into b(time) select now() - '10 year'::interval * random() from generate_series(1, 1000000, 1); -- Using constraint works as expected, and uses the time index. select * from a join b using(id) where a.time >= '2011-08-15'; -- ... both ways... select * from a join b using(id) where b.time >= '2011-08-15'; -- However, if I'm trying to do this for both times at once, the time index is not used at all select * from a join b using(id) where a.time >= '2011-08-15' OR b.time >= '2011-08-01' -- This can be optimized by using CTEs with am as ( select * from a where time >= '2011-08-15' ) , bm as ( select * from b where time >= '2011-08-15' ) select * from am join bm using(id) -- end I'm just wondering why the optimizer does things the way it does - and if the CTE version is the best way to go... The actual case is slightly more complex and uses more tables - this is mostly a way to find updated data. -- Eld på åren og sol på eng gjer mannen fegen og fjåg. 
[Jøtul] demo 2011 Tore Halvorsen || +052 0553034554 On PG 9, after I ANALYZED the tables, it used indexes: QUERY PLAN --- Merge Join (cost=1.59..82778.35 rows=13171 width=20) (actual time=0.066..1076.616 rows=12966 loops=1) Merge Cond: (a.id = b.id) Join Filter: ((a.time >= '2011-08-15 00:00:00'::timestamp without time zone) OR (b.time >= '2011-08-01 0 -> Index Scan using a_pkey on a (cost=0.00..31389.36 rows=1000000 width=12) (actual time=0.007..204.856 ro -> Index Scan using b_pkey on b (cost=0.00..31389.36 rows=1000000 width=12) (actual time=0.006..224.189 ro ANALYZE is the magic. Yes ;-) Who says we don't have a magical fairy dust? Anyway you could try to postpone the join a bit - determine the IDs first and then join. Something like this WITH t AS ( SELECT id FROM a WHERE time >= '2011-08-15' UNION SELECT id FROM b WHERE time >= '2011-08-15' ) SELECT * FROM a JOIN b ON (a.id = b.id) WHERE id IN (SELECT id FROM t); or something like that. It's not as clean as your query, but in some cases it's faster. Tomas
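A variant of Tomas's idea that skips the extra IN subquery: join back to both tables directly from the UNION of matching ids. This is only a sketch against the hypothetical a/b tables from this thread, not something tested against it:

```sql
-- Collect the ids that satisfy either time condition first (UNION also
-- removes duplicate ids), then join the small id set back to both tables.
SELECT a.time AS a_time, b.time AS b_time, t.id
FROM (SELECT id FROM a WHERE time >= '2011-08-15'
      UNION
      SELECT id FROM b WHERE time >= '2011-08-15') t
JOIN a USING (id)
JOIN b USING (id);
```

Each UNION branch can use its table's time index, and the planner only has to join the (hopefully small) id set back to a and b, instead of evaluating the OR across the full merge join.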
Re: [GENERAL] heavy swapping, not sure why
On Tue, Aug 30, 2011 at 9:05 PM, Scott Marlowe scott.marl...@gmail.com wrote: Well, we had zone reclaim mode autoset to 1, and we had to turn it off to get decent performance with postgresql. Machine was a quad dodecacore Magny Cours, so 48 cores with 128G RAM. RAID controller is an Areca 1680 with BBU, 34 15kRPM 147G SAS Seagate 15k6 drives in two 16 drive external enclosures and 2 drives in the server. Also, Ubuntu 10.04 64 bit with all updates last fall right after it came out.
Re: [GENERAL] out of memory - no sort
Pavel... Thanks for the reply... This still did not solve the issue. It seems odd that a simple select command in psql accessing 32MB of records should cause a problem. I have tables much larger than this and may want to access them the same way. I have 24 GB RAM on the server and 32GB RAM on the client machine. Both machines are 64bit. Thanks Don On 8/30/2011 10:25 AM, Pavel Stehule wrote: Hello if table is large, then client can raise this exception too try to set FETCH_COUNT to 1000 http://www.postgresql.org/docs/8.4/interactive/app-psql.html Regards Pavel Stehule 2011/8/30 Don donald.laur...@noaa.gov: I am trying a simple access of a table and get an out of memory error. How do I avoid this issue. It seems I have some configuration set wrong. Our system has 24GB of memory and is dedicated to the postgres database. Background information aquarec=# explain analyze verbose select * from ens_memb; QUERY PLAN -- Seq Scan on ens_memb (cost=0.00..719893.12 rows=32216212 width=62) (actual time=4.954..37513.377 rows=32216154 loops=1) Output: id, shefpc, bwmon, ewmon, pb, tb, cdate, vdate, source, tyr, val Total runtime: 39588.386 ms #-- # RESOURCE USAGE (except WAL) #-- # - Memory - shared_buffers = 6144MB # min 128kB # (change requires restart) #temp_buffers = 8MB # min 800kB max_prepared_transactions = 5 # zero disables the feature # (change requires restart) # Note: Increasing max_prepared_transactions costs ~600 bytes of shared memory # per transaction slot, plus lock space (see max_locks_per_transaction). # It is not advisable to set max_prepared_transactions nonzero unless you # actively intend to use prepared transactions. 
work_mem = 48MB # min 64kB maintenance_work_mem = 256MB # min 1MB #max_stack_depth = 2MB # min 100kB # - Kernel Resource Usage - #max_files_per_process = 1000 # min 25 # (change requires restart) #shared_preload_libraries = '' # (change requires restart) # - Cost-Based Vacuum Delay - #vacuum_cost_delay = 0ms # 0-100 milliseconds #vacuum_cost_page_hit = 1 # 0-10000 credits #vacuum_cost_page_miss = 10 # 0-10000 credits #vacuum_cost_page_dirty = 20 # 0-10000 credits #vacuum_cost_limit = 200 # 1-10000 credits # - Background Writer - #bgwriter_delay = 200ms # 10-10000ms between rounds #bgwriter_lru_maxpages = 100 # 0-1000 max buffers written/round #bgwriter_lru_multiplier = 2.0 # 0-10.0 multipler on buffers scanned/round # - Asynchronous Behavior - #effective_io_concurrency = 1 # 1-1000; 0 disables prefetching #-- # WRITE AHEAD LOG #-- # - Settings - #fsync = on # turns forced synchronization on or off #synchronous_commit = on # immediate fsync at commit #wal_sync_method = fsync # the default is the first option # supported by the operating system: # open_datasync # fdatasync # fsync # fsync_writethrough # open_sync #full_page_writes = on # recover from partial page writes wal_buffers = 8MB # min 32kB # (change requires restart) #wal_writer_delay = 200ms # 1-10000 milliseconds #commit_delay = 0 # range 0-100000, in microseconds #commit_siblings = 5 # range 1-1000 # - Checkpoints - checkpoint_segments = 32 # in logfile segments, min 1, 16MB each #checkpoint_timeout = 5min # range 30s-1h #checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0 #checkpoint_warning = 30s # 0 disables # - Archiving - #archive_mode = off # allows archiving to be done # (change requires restart) #archive_command = '' # command to use to archive a logfile segment #archive_timeout = 0 # force a logfile segment switch after this # number of
[GENERAL] Using a function in different schemas
Hello everyone! This is the first time I post on this list. Until now, I was able to get a solution to all the problems I had, but now I am facing an issue I can't resolve, so I hope you can help me. The problem goes like this: I am using the same distribution of tables/functions in different schemas, so I have the following layout - schema1 has tables myTable1 and myTable2, and function myFunction - schema2 has tables myTable1 and myTable2, and function myFunction Until now I used the full names to distinguish between schemas, like this: schema1.myTable1 and schema2.myTable1. But now I have to add an update trigger in both myTable1 that calls a function that makes a select statement on myTable2. The problem is that even if the trigger is inside mySchema1, it can't find myTable2, and I don't know how to get the schema that the function is in. Things I have tried so far: - current_schema(), but, according to the documentation, it returns the name of the schema that is first in the search path, and that is neither mySchema1 nor mySchema2. - Passing a parameter to the trigger: since it is an update trigger I haven't found a way to pass a parameter to it. Can anyone please point me in the right direction? Thanks in advance! Juan M.
Re: [GENERAL] out of memory - no sort
On Aug 31, 2011, at 9:51 AM, Don wrote: Both machines are 64bit. Are all your server and client builds 64-bit? 32M rows, unless the rows are < 50 bytes each, you'll never be able to manipulate that selection in memory with a 32-bit app. -- Scott Ribe scott_r...@elevated-dev.com http://www.elevated-dev.com/ (303) 722-0567 voice
[GENERAL] invalid input syntax for type bytea
I am running Postgres 8.4.1 on Windows XP Professional Service Pack 3. My database is UTF8. My program code is written in Python, and to interface to Postgres I use mxODBC 3.0 and the PostgreSQL Unicode driver PSQLODBCW.DLL version 8.01.02.00 dated 31/01/2006. I recently hit an error when trying to insert into a bytea column. The following is the first part of what appears in the Postgres log: 2011-08-25 14:42:40 BST HINT: Use the escape string syntax for backslashes, e.g., E'\\'.2011-08-25 14:42:40 BST ERROR: invalid input syntax for type bytea at character 752011-08-25 14:42:40 BST STATEMENT: insert into FAMILY_DATA (family_id, seq_num, family_pod) values (177, 20, '\\200\\002\\135q\\001\\050cperson\\012Person\\012q\\002\\051\\201q\\003\\175q\\004\\050U\\006\\137namesq\\005cperson\\012\\137Names\\012q\\006\\051\\201q\\007cperson\\012\\137Name\\012q\\010\\051\\201q\\011\\050\\135q\\012X\\003\\000\\000\\00017\\052q\\013a\\135q\\014X\\002\\000\\000\\000ABq\\015ae\\175q\\016U\\006parentq\\017h\\007sba\\175q\\020U\\006personq\\021h\\003sbU\\021nationality\\137eventq\\022cperson\\012Dictof\\137Nationality\\137Event\\012q\\023\\051\\201q\\024X\\004\\000\\000\\000UKBSq\\025\\175q\\026\\051cperson\\012Listof\\137Nationality\\137Event\\012q\\027\\051\\201q\\030\\050cperson\\012Nationality\\137Event\\012q\\031\\051\\201q\\032\\175q\\033\\050U\\015contra\\137eventsq\\034\\051U\\006parentq\\035h\\030U\\004bvosq\\036c\\137\\137builtin\\137\\137\\012set\\012q\\037\\135q U\\002ATq\\041a\\205Rq\\042U\\007cfflagsq\\043\\051U\\006effectq\\044U\\001AU\\005cinfoq\\045ccinfo\\012BCInfo (I omit the rest, but there is a close quote and a close parenthesis at the end.) If I have counted right, character 75 is the '7' in '\\175', which looks to be a valid octal value. The statement as shown in the log is generated by mxODBC, not by me, so if there is something wrong with it I shall have to take the matter up with eGenix. 
Before I do so it would be useful to know what is wrong with the statement. I had to tweak my own code when I upgraded from Postgres 8.1 to 8.4, but since then I have inserted 5092 rows with an average binary data length of 40,000 bytes. This is the only insert that has failed.
Re: [GENERAL] how do I disable automatic start on mac os x?
On Aug 31, 2011, at 8:46 AM, edwardIshaq wrote: if you do: open OS X will open the file in the plist editor. I tried doing that but didn't get a way with saving though :) Probably a privileges violation, right? In a prior message on this thread I suggested: sudo launchctl unload -w blahblahblah.plist I also said: The -w option causes it to not only unload the item, but also write a disabled key into it which will stop it from loading at launch. That is no longer true. Instead of the disabled key, 10.6 up (or was it 10.5 up?) now keep track of enabled/disabled elsewhere, so you really need to use launchctl instead of editing the plist. -- Scott Ribe scott_r...@elevated-dev.com http://www.elevated-dev.com/ (303) 722-0567 voice
Re: [GENERAL] out of memory - no sort
Hello 2011/8/31 Don donald.laur...@noaa.gov: Pavel... Thanks for the reply... This still did not solve the issue. It seems odd that a simple select command in psql accessing 32MB of records should cause a problem. I have tables much larger than this and may want to access them the same way. so there are two possibilities a) broken datafiles b) PostgreSQL's bug Pavel I have 24 GB RAM on the sever and 32GB RAM on the client machine. Both machines are 64bit. Thanks Don On 8/30/2011 10:25 AM, Pavel Stehule wrote: Hello if table is large, then client can raise this exception too try to set FETCH_COUNT to 1000 http://www.postgresql.org/docs/8.4/interactive/app-psql.html Regards Pavel Stehule 2011/8/30 Dondonald.laur...@noaa.gov: I am trying a simple access of a table and get an out of memory error. How do I avoid this issue. It seems I have some configuration set wrong. Our system has 24GB of memory and is dedicated to the postgres database. Back ground information aquarec= explain analyze verbose select * from ens_memb; QUERY PLAN -- Seq Scan on ens_memb (cost=0.00..719893.12 rows=32216212 width=62) (actual time=4.954..37513.377 rows=32216154 loops=1) Output: id, shefpc, bwmon, ewmon, pb, tb, cdate, vdate, source, tyr, val Total runtime: 39588.386 ms #-- # RESOURCE USAGE (except WAL) #-- # - Memory - shared_buffers = 6144MB # min 128kB # (change requires restart) #temp_buffers = 8MB # min 800kB max_prepared_transactions = 5 # zero disables the feature # (change requires restart) # Note: Increasing max_prepared_transactions costs ~600 bytes of shared memory # per transaction slot, plus lock space (see max_locks_per_transaction). # It is not advisable to set max_prepared_transactions nonzero unless you # actively intend to use prepared transactions. 
work_mem = 48MB # min 64kB maintenance_work_mem = 256MB # min 1MB #max_stack_depth = 2MB # min 100kB # - Kernel Resource Usage - #max_files_per_process = 1000 # min 25 # (change requires restart) #shared_preload_libraries = '' # (change requires restart) # - Cost-Based Vacuum Delay - #vacuum_cost_delay = 0ms # 0-100 milliseconds #vacuum_cost_page_hit = 1 # 0-1 credits #vacuum_cost_page_miss = 10 # 0-1 credits #vacuum_cost_page_dirty = 20 # 0-1 credits #vacuum_cost_limit = 200 # 1-1 credits # - Background Writer - #bgwriter_delay = 200ms # 10-1ms between rounds #bgwriter_lru_maxpages = 100 # 0-1000 max buffers written/round #bgwriter_lru_multiplier = 2.0 # 0-10.0 multipler on buffers scanned/round # - Asynchronous Behavior - #effective_io_concurrency = 1 # 1-1000. 0 disables prefetching #-- # WRITE AHEAD LOG #-- # - Settings - #fsync = on # turns forced synchronization on or off #synchronous_commit = on # immediate fsync at commit #wal_sync_method = fsync # the default is the first option # supported by the operating system: # open_datasync # fdatasync # fsync # fsync_writethrough # open_sync #full_page_writes = on # recover from partial page writes wal_buffers = 8MB # min 32kB # (change requires restart) #wal_writer_delay = 200ms # 1-1 milliseconds #commit_delay = 0 # range 0-10, in microseconds #commit_siblings = 5 # range 1-1000 # - Checkpoints - checkpoint_segments = 32 # in logfile segments, min 1, 16MB each #checkpoint_timeout = 5min # range 30s-1h #checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0 #checkpoint_warning = 30s # 0 disables # - Archiving - #archive_mode = off # allows archiving to be done # (change
Re: [GENERAL] out of memory - no sort
The server is 64 bit and client is 32 bit... I tried the select * from table on the server and the query worked... but I am puzzled why it does not work on the 32bit machine. I had always thought that a 32bit machine could access up to 4GB. So what is the limiting factor ? On 8/31/2011 8:57 AM, Scott Ribe wrote: On Aug 31, 2011, at 9:51 AM, Don wrote: Both machines are 64bit. Are all your server client builds 64-bit? 32M rows, unless the rows are 50 bytes each, you'll never be able to manipulate that selection in memory with a 32-bit app.
Re: [GENERAL] out of memory - no sort
On Aug 31, 2011, at 10:52 AM, Don wrote: I had always thought that a 32bit machine could access up to 4GB. So what is the limiting factor ? - Half of your memory space may be given over to memory-mapped I/O. Now you're down to 2GB. - Your process's executable, plus any libraries it uses, plus all the system libraries that they touch, recursively all the way down, are mapped into this space. Now you're likely down to 1.5GB or less free. - Then of course your process allocates various data structures for each row, even if it's just a huge array of pointers to each row, that would be overhead. And of course the overhead is not nearly that simple--there will be allocations for pointers to varchars, and info about columns and data types, and heap data structures to keep track of allocated vs free blocks. - Memory will be fragmented of course, so you can't even use all of what's left. So no, you can't manipulate 32M of anything except plain numbers or very simple structs in RAM in a 32-bit process. -- Scott Ribe scott_r...@elevated-dev.com http://www.elevated-dev.com/ (303) 722-0567 voice
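Given the 32-bit client limits above, Pavel's earlier FETCH_COUNT suggestion is the practical workaround, and it is worth spelling out concretely. A sketch of a psql session (the table name is the one from this thread):

```sql
-- In psql: with FETCH_COUNT set, psql retrieves the result through a
-- cursor in batches of this many rows instead of buffering the whole
-- result set, so client memory use stays roughly constant.
\set FETCH_COUNT 1000
SELECT * FROM ens_memb;
```

Note this only affects the psql client's memory use for plain SELECTs that psql can run via a cursor; it does not change anything server-side.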
Re: [GENERAL] invalid input syntax for type bytea
Alan Millington admilling...@yahoo.co.uk writes: I recently hit an error when trying to insert into a bytea column. The following is the first part of what appears in the Postgres log: 2011-08-25 14:42:40 BST HINT: Use the escape string syntax for backslashes, e.g., E'\\'.2011-08-25 14:42:40 BST ERROR: invalid input syntax for type bytea at character 752011-08-25 14:42:40 BST STATEMENT: insert into FAMILY_DATA (family_id, seq_num, family_pod) values (177, 20, '\\200\\002\\135q\\001\\050cperson\\012Person\\012q\\002\\051\\201q\\003\\175q\\004\\050U\\006\\137namesq\\005cperson\\012\\137Names\\012q\\006\\051\\201q\\007cperson\\012\\137Name\\012q\\010\\051\\201q\\011\\050\\135q\\012X\\003\\000\\000\\00017\\052q\\013a\\135q\\014X\\002\\000\\000\\000ABq\\015ae\\175q\\016U\\006parentq\\017h\\007sba\\175q\\020U\\006personq\\021h\\003sbU\\021nationality\\137eventq\\022cperson\\012Dictof\\137Nationality\\137Event\\012q\\023\\051\\201q\\024X\\004\\000\\000\\000UKBSq\\025\\175q\\026\\051cperson\\012Listof\\137Nationality\\137Event\\012q\\027\\051\\201q\\030\\050cperson\\012Nationality\\137Event\\012q\\031\\051\\201q\\032\\175q\\033\\050U\\015contra\\137eventsq\\034\\051U\\006parentq\\035h\\030U\\004bvosq\\036c\\137\\137builtin\\137\\137\\012set\\012q\\037\\135q U\\002ATq\\041a\\205Rq\\042U\\007cfflagsq\\043\\051U\\006effectq\\044U\\001AU\\005cinfoq\\045ccinfo\\012BCInfo (I omit the rest, but there is a close quote and a close parenthesis at the end.) Well, the part you showed us is perfectly valid bytea data, so the problem is somewhere in what you omitted. If I have counted right, character 75 is the '7' in '\\175', which looks to be a valid octal value. Unfortunately, that number is just a pointer to the start of the bytea literal within the statement, so it's not very helpful here :-(. What you need to look for is a backslash that's not followed by either a second backslash or octal digits. 
regards, tom lane
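Tom's rule can be illustrated with a minimal sketch (assuming 8.4-style escape-format bytea input and the 8.4 default standard_conforming_strings = off, so each data-level backslash is doubled in the literal):

```sql
-- Valid: each \\ in the literal reaches the bytea input routine as one
-- backslash, which must be followed by another backslash or by exactly
-- three octal digits.
SELECT E'\\200\\137abc'::bytea;   -- octal 200, octal 137, then 'abc'

-- Invalid: \\20q reaches bytea input as \20q -- a backslash followed by
-- only two octal digits, which raises "invalid input syntax for type bytea".
-- SELECT E'\\20q'::bytea;
```

So the failing statement must contain, somewhere past character 75, a backslash sequence that does not reduce to `\\` or `\nnn` at the bytea level.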
Re: [GENERAL] invalid input syntax for type bytea
On 08/31/11 9:02 AM, Alan Millington wrote: I am running Postgres 8.4.1 on Windows XP Pro... 8.4 is currently up to 8.4.8 and you really should upgrade. 8.4.1 is about 2 years old, and there's a long list of bugs fixed in the later 8.4 updates; see the release notes for 8.4.2 through 8.4.8 for the gory details, http://www.postgresql.org/docs/8.4/static/release.html Any particular reason you're using ODBC and not a PostgreSQL python binding like Psycopg? I find the more layers of abstraction between your code and the outside world, the more obscure the bugs are, and the harder they are to track down. -- john r pierce N 37, W 122 santa cruz ca mid-left coast
Re: [GENERAL] regression between 8.4.8 and 8.4.2?
Peter Eisentraut pete...@gmx.net writes: I don't have an answer for you, but this report looks suspiciously similar to the one I posted the other day at http://archives.postgresql.org/pgsql-hackers/2011-08/msg01224.php, which, now that I think about it, also manifested itself after the upgrade to 8.4.8. I think there is more than one thing going on here. I've identified a logic error in this 8.4 change: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=7f3eba30 which is that it is relying on vardata[12]->rel->rows to provide the number of rows coming in to the semijoin, but that's only accurate for a single level of join. With two nested semijoins you get a pretty wacko answer --- unless you prevent them from being folded by inserting OFFSET 0. So that's definitely a bug, but it doesn't explain Ben's complaint because he's griping about a case where he did have OFFSET 0. (I wonder though if this is the 8.4 planner regression that he put in the OFFSET for originally. Sure would be nice if people reported such things instead of hacking around them and imagining that they'll get fixed magically.) It also doesn't explain Mark Kirkwood's complaint, since he's showing test cases that involve only one join. The only relevant-looking change I can find between 8.4.7 and 8.4.8 is http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=0ae8b300388c2a3eaf90e6e6f13d6be1f4d4ac2d which again should not have caused the amount of excitement we're seeing on this thread, since it should represent no worse than a reversion to 8.3 behavior. Possibly what we're after is some earlier 8.4.x patch. regards, tom lane
Re: [GENERAL] regression between 8.4.8 and 8.4.2?
On Aug 31, 2011, at 10:47 AM, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: I don't have an answer for you, but this report looks suspiciously similar to the one I posted the other day at http://archives.postgresql.org/pgsql-hackers/2011-08/msg01224.php, which, now that I think about it, also manifested itself after the upgrade to 8.4.8. I think there is more than one thing going on here. I've identified a logic error in this 8.4 change: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=7f3eba30 which is that it is relying on vardata[12]->rel->rows to provide the number of rows coming in to the semijoin, but that's only accurate for a single level of join. With two nested semijoins you get a pretty wacko answer --- unless you prevent them from being folded by inserting OFFSET 0. So that's definitely a bug, but it doesn't explain Ben's complaint because he's griping about a case where he did have OFFSET 0. (I wonder though if this is the 8.4 planner regression that he put in the OFFSET for originally. Sure would be nice if people reported such things instead of hacking around them and imagining that they'll get fixed magically.) It also doesn't explain Mark Kirkwood's complaint, since he's showing test cases that involve only one join. The only relevant-looking change I can find between 8.4.7 and 8.4.8 is http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=0ae8b300388c2a3eaf90e6e6f13d6be1f4d4ac2d which again should not have caused the amount of excitement we're seeing on this thread, since it should represent no worse than a reversion to 8.3 behavior. Possibly what we're after is some earlier 8.4.x patch. Tom, if there's anything else we can provide that might help you out, let me know. We're currently about to install an earlier 8.4 version to see if the problem goes away. Is there a particular version you'd be interested to know about? 
Re: [GENERAL] regression between 8.4.8 and 8.4.2?
Ben Chobot be...@silentmedia.com writes: Tom, if there's anything else we can provide that might help you out, let me know. If you could extract a self-contained test case for the bad estimation, that would be useful. regards, tom lane
Re: [GENERAL] Using a function in different schemas
On 08/31/2011 08:38 AM, Juan Manuel Alvarez wrote: Hello everyone! This is the first time I post on this list. Until now, I was able to get a solution to all the problems I had, but now I am facing an issue I can't resolve, so I hope you can help me. The problem goes like this: I am using the same distribution of tables/functions into different schemas, so I have the following layout - schema1 has tables myTable1 and myTable2, and function myFunction - schema2 has tables myTable1 and myTable2, and function myFunction Until now I used the full names to distinguish between schemas, like this: schema1.myTable1 and schema2.myTable1. But now I have to add an update trigger in both myTable1 that calls to a function that makes a select statement on myTable2. The problem is that even if the trigger is inside mySchema1, it can't find myTable2 and I don't know how to get the schema that the function is into. Things I have tried so far: - current_schema() but, according to the documentation, it returns the name of the schema that is first in the search path and that is neither mySchema1 or mySchema2. - Passing a parameter to the trigger: since it is an update trigger I haven't found a way to pass parameter to it. Can anyone please point me in the right direction? Assuming using plpgsql look here: http://www.postgresql.org/docs/9.0/interactive/plpgsql-trigger.html In particular: TG_TABLE_SCHEMA Data type name; the name of the schema of the table that caused the trigger invocation. Thanks in advance! Juan M. -- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
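To make Adrian's pointer concrete, here is a minimal sketch of a trigger function built around TG_TABLE_SCHEMA. The table and function names are the hypothetical ones from Juan's description, and the body just counts rows in the sibling table to show the mechanism:

```sql
CREATE OR REPLACE FUNCTION myFunction() RETURNS trigger AS $$
DECLARE
    v_count integer;
BEGIN
    -- Query myTable2 in the same schema as the table that fired the trigger,
    -- using dynamic SQL since the schema name is only known at run time.
    EXECUTE 'SELECT count(*) FROM ' || quote_ident(TG_TABLE_SCHEMA) || '.myTable2'
        INTO v_count;
    RAISE NOTICE 'myTable2 in schema % has % rows', TG_TABLE_SCHEMA, v_count;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- One trigger per schema, each attached to that schema's myTable1:
CREATE TRIGGER myTable1_upd AFTER UPDATE ON schema1.myTable1
    FOR EACH ROW EXECUTE PROCEDURE myFunction();
```

Because TG_TABLE_SCHEMA reflects the triggering table, the same function body works unchanged for the trigger on schema2.myTable1.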
Re: [GENERAL] Using a function in different schemas
Thanks Adrian! That is exactly what I was looking for! =o) On Wed, Aug 31, 2011 at 3:35 PM, Adrian Klaver adrian.kla...@gmail.com wrote: On 08/31/2011 08:38 AM, Juan Manuel Alvarez wrote: Hello everyone! This is the first time I post on this list. Until now, I was able to get a solution to all the problems I had, but now I am facing an issue I can't resolve, so I hope you can help me. The problem goes like this: I am using the same distribution of tables/functions into different schemas, so I have the following layout - schema1 has tables myTable1 and myTable2, and function myFunction - schema2 has tables myTable1 and myTable2, and function myFunction Until now I used the full names to distinguish between schemas, like this: schema1.myTable1 and schema2.myTable1. But now I have to add an update trigger in both myTable1 that calls to a function that makes a select statement on myTable2. The problem is that even if the trigger is inside mySchema1, it can't find myTable2 and I don't know how to get the schema that the function is into. Things I have tried so far: - current_schema() but, according to the documentation, it returns the name of the schema that is first in the search path and that is neither mySchema1 or mySchema2. - Passing a parameter to the trigger: since it is an update trigger I haven't found a way to pass parameter to it. Can anyone please point me in the right direction? Assuming using plpgsql look here: http://www.postgresql.org/docs/9.0/interactive/plpgsql-trigger.html In particular: TG_TABLE_SCHEMA Data type name; the name of the schema of the table that caused the trigger invocation. Thanks in advance! Juan M. -- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] regression between 8.4.8 and 8.4.2?
On Aug 31, 2011, at 11:10 AM, Tom Lane wrote: Ben Chobot be...@silentmedia.com writes: Tom, if there's anything else we can provide that might help you out, let me know. If you could extract a self-contained test case for the bad estimation, that would be useful. OK, we'll pull something together. In the meantime, we can confirm that reverting from 8.4.8 to 8.4.3 fixes things.
Re: [GENERAL] how do I disable automatic start on mac os x?
if you do: open OS X will open the file in the plist editor. I tried doing that but didn't get a way with saving though :)
Re: [GENERAL] regression between 8.4.8 and 8.4.2?
On Aug 31, 2011, at 11:53 AM, Ben Chobot wrote: On Aug 31, 2011, at 11:10 AM, Tom Lane wrote: Ben Chobot be...@silentmedia.com writes: Tom, if there's anything else we can provide that might you out, let me know. If you could extract a self-contained test case for the bad estimation, that would be useful. OK, we'll pull something together. In the meantime, we can confirm that reverting from 8.4.8 to 8.4.3 fixes things. and FWIW 9.0.3 does the right thing as well.
[GENERAL] ARRAY_AGG(DISTINCT a ORDER BY b) Error
in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list Why? If I add the fields of the ORDER BY expression to the DISTINCT clause I can no longer use DISTINCT, since the ORDER BY values are not unique. Nor do I want the contents of the final ARRAY to contain the ORDER BY column. I presume this is a technical limitation, since my particular use-case (and I've come across a few cases where this would be useful) doesn't seem that obscure. My specific sample query (use-case) is as follows: SELECT control, ARRAY_AGG(DISTINCT accountnumber ORDER BY amount DESC) FROM (VALUES ('A', '1000', 100), ('A', '1000', -50), ('A', '2000', 200)) accountdetail (control, accountnumber, amount) GROUP BY control I want to create an array of the unique account numbers associated with a control, with the ordering of the array matching the order of the amounts. In this case I would want the output to be: (A, {'2000','1000'}) David J.
Re: [GENERAL] ARRAY_AGG(DISTINCT a ORDER BY b) Error
David Johnston pol...@yahoo.com writes:
> in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
> Why?

Because the results are ill-defined otherwise. In your example,

... ARRAY_AGG(DISTINCT accountnumber ORDER BY amount DESC) ...

there may be many rows with the same accountnumber and yet different amount values. Which of those amounts should be used as the sort key for the aggregated row?

regards, tom lane
Re: [GENERAL] ARRAY_AGG(DISTINCT a ORDER BY b) Error
On 31 August 2011 23:54, David Johnston pol...@yahoo.com wrote:
> in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
>
> Why? [...] I want to create an array of the unique account numbers associated
> with a control with the ordering of the array matching the order of the amounts.
> In this case I would want the output to be: (A, {'2000','1000'})

I'm not sure that makes sense. If you're aggregating accountnumber as an array of distinct values, what do you expect your query to output if, say, you had the following:

accountnumber | amount
1000          | 100
2000          | 200
1000          | 300

You've ordered by amount, but accountnumber has 2 identical values, where the amount is less than the amount corresponding to accountnumber 2000 in one instance, but greater in another. Where does 1000 appear? Before or after 2000?

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [GENERAL] ARRAY_AGG(DISTINCT a ORDER BY b) Error
-----Original Message-----
From: Tom Lane [mailto:t...@sss.pgh.pa.us]
Sent: Wednesday, August 31, 2011 7:10 PM
To: David Johnston
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] ARRAY_AGG(DISTINCT a ORDER BY b) Error

David Johnston pol...@yahoo.com writes:
> in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
> Why?

Because the results are ill-defined otherwise. In your example,

... ARRAY_AGG(DISTINCT accountnumber ORDER BY amount DESC) ...

there may be many rows with the same accountnumber and yet different amount values. Which of those amounts should be used as the sort key for the aggregated row?

regards, tom lane

--

My take is that you look at the non-aggregated data using the same ORDER BY clause and then add the values in the order they appear in the detail. So (account, amount, [action]):

1000, 150  [Add]
1000, 130  [Skip]
2000, 120  [Add]
2000, 100  [Skip]
1000,  50  [Skip]
2000,   0  [Skip]
3000, -10  [Add]
1000, -50  [Skip]

This is basically how a plain DISTINCT would have to work (taking the first encountered value and skipping any subsequent repeats). I want the same general behavior, but with the opportunity to order the input frame so that I can manipulate the encounter order.

In a 2-phase process you would do the following (I will remain specific for the time being):

Phase 1: Select the representative record for each DISTINCT (accountnumber); the representative would be the FIRST record as determined via an ORDER BY of all candidate records.
Phase 2: Order the DISTINCT representatives based upon the same ORDER BY clause.

So:

Phase 1: (3000, -10), (1000, 150), (2000, 120)  [not ordered]
Phase 2: (1000, 150), (2000, 120), (3000, -10)  [ORDER BY amount DESC]

The behavior/concept of [ARRAY_AGG(DISTINCT ON (accountnumber) .?. ORDER BY amount)] matches here, but I do not want to output an amount column at all.
The other view is that you can create the ARRAY using just the ORDER BY and then immediately DISTINCTify the array so that there are no duplicates. This is basically the 2-phase process described above; in this view you basically keep the DISTINCT value that has the lowest array index.

I would be interested in other possible interpretations/algorithms that would then cause ambiguity in deciding which algorithm to implement. I know that I am only considering ARRAY_AGG in my examples, but my first reaction is that other aggregates would behave acceptably under the algorithm described; and since the current behavior is FAIL, at least some well-defined behavior would exist for the specified syntax.

David J.
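The two-phase process described here can be expressed today with a DISTINCT ON subquery: the inner query picks each account's representative row (the first under ORDER BY amount DESC), and the outer ARRAY_AGG orders the survivors by that same amount. A minimal sketch, reusing the table and column names from the sample query earlier in the thread:

```sql
-- Phase 1 (inner query): one representative row per (control, accountnumber),
-- chosen as the first row under ORDER BY amount DESC.
-- Phase 2 (outer query): aggregate the representatives ordered by amount.
SELECT control,
       ARRAY_AGG(accountnumber ORDER BY amount DESC)
FROM (
    SELECT DISTINCT ON (control, accountnumber)
           control, accountnumber, amount
    FROM (VALUES ('A', '1000', 100),
                 ('A', '1000', -50),
                 ('A', '2000', 200))
         AS accountdetail (control, accountnumber, amount)
    ORDER BY control, accountnumber, amount DESC
) rep
GROUP BY control;
-- yields: A | {2000,1000}
```

Since the DISTINCT happens before aggregation, no ambiguity arises over which amount acts as the sort key for a duplicated accountnumber.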
[GENERAL] function on trigger
hi guys,

I'm a newbie in PostgreSQL and I'm trying to build one function in order to count the products for each supplier, so I'm going to put it quite simply through this example.

Please consider a table called books with the following fields: bookid, title, price, idpublisher; and another table called publisher: idpublisher, name, city, books.

In that last field from publisher, called books, I've got to keep the number of published books for each publisher.
Re: [GENERAL] function on trigger
On Aug 31, 2011, at 18:39, Marcos Hercules Santos mhe...@gmail.com wrote:
> I'm a newbie in PostgreSQL and I'm trying to build one function in order to
> count the products for each supplier. [...] In that last field from publisher,
> called books, I've got to keep the number of published books for each publisher.

Is there a question somewhere?

Do not be tempted by the dark side, Marcos. Create a view that uses a join and a count. Only if you have significant performance issues would you then consider materializing that view.

If you insist, or simply want to learn, read the sections on UPDATE, CREATE TRIGGER, and CREATE FUNCTION. Try doing it following those examples, and ask more specific questions if something specific stumps you. Do it incrementally: get the raw SQL UPDATE working, then create the trigger and function and get those working, then combine the two.

David J.
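For anyone who does insist on the trigger route, a minimal sketch might look like the following. This is an illustration, not a drop-in solution: it assumes the column names from the example above (publisher.books, books.idpublisher), and it does not handle an UPDATE that moves a book between publishers.

```sql
-- Hypothetical trigger function: keep publisher.books in sync with
-- the number of rows in books that reference that publisher.
CREATE OR REPLACE FUNCTION update_book_count() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE publisher SET books = books + 1
         WHERE idpublisher = NEW.idpublisher;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE publisher SET books = books - 1
         WHERE idpublisher = OLD.idpublisher;
    END IF;
    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER book_count_trg
AFTER INSERT OR DELETE ON books
FOR EACH ROW EXECUTE PROCEDURE update_book_count();
```

Even so, the view-based approach below stays correct with no maintenance code at all, which is why it is usually the better first choice.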
Re: [GENERAL] function on trigger
On 08/31/11 3:39 PM, Marcos Hercules Santos wrote:
> I'm a newbie in PostgreSQL and I'm trying to build one function in order to
> count the products for each supplier. [...]

get rid of the books field on your publisher table; that's dynamic and changes as you add/remove books from the books table. to get that data, try...

SELECT p.idpublisher, p.name, p.city, COUNT(b.bookid) AS books
FROM publisher p
JOIN books b USING (idpublisher)
GROUP BY p.idpublisher, p.name, p.city;

you could make this a view if it's too cumbersome.

CREATE VIEW publisher_books AS
SELECT p.idpublisher, p.name, p.city, COUNT(b.bookid) AS books
FROM publisher p
JOIN books b USING (idpublisher)
GROUP BY p.idpublisher, p.name, p.city;

SELECT * FROM publisher_books;

and of course, add other WHERE conditions...

SELECT books FROM publisher_books WHERE name = ?;

--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast
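One hedged refinement to the view above: with a plain inner JOIN, publishers that have no books at all disappear from the result. A LEFT JOIN keeps them with a count of 0, because COUNT(b.bookid) only counts non-NULL values (same table and column names assumed as above):

```sql
CREATE VIEW publisher_books AS
SELECT p.idpublisher, p.name, p.city,
       COUNT(b.bookid) AS books   -- 0 for publishers with no matching books
FROM publisher p
LEFT JOIN books b USING (idpublisher)
GROUP BY p.idpublisher, p.name, p.city;
```

Whether that matters depends on whether "publisher with zero books" is a row you want to see.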