Re: [GENERAL] Out of memory error
On Mon, Jul 6, 2009 at 7:26 PM, Paul Ramsey wrote:
> If you are on PostGIS < 1.3.4 there are substantial memory leaks in
> intersects() for point/polygon cases. Upgrading to 1.3.6 is
> recommended.

Thank you, that fixed it.

--
Paul Smith
http://www.pauladamsmith.com/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Out of memory error
If you are on PostGIS < 1.3.4 there are substantial memory leaks in intersects() for point/polygon cases. Upgrading to 1.3.6 is recommended.

P

On Mon, Jul 6, 2009 at 1:39 PM, Paul Smith wrote:
> On Mon, Jul 6, 2009 at 3:34 PM, Tom Lane wrote:
>> Clearly a memory leak, but it's not so clear exactly what's causing it.
>> What's that intersects() function? Can you put together a
>> self-contained test case?
>
> It's actually ST_Intersects from PostGIS (some of the PostGIS function
> names are still recognized without the leading "ST_").
>
> http://postgis.refractions.net/documentation/manual-1.3/ch06.html#id2574404
>
> # select postgis_version();
>             postgis_version
> ---------------------------------------
>  1.3 USE_GEOS=1 USE_PROJ=1 USE_STATS=1
>
> -Paul
>
> --
> Paul Smith
> http://www.pauladamsmith.com/
Re: [GENERAL] combine multiple row values in to one row
On Tue, 7 Jul 2009 01:59:35 +0430 Lee Harr wrote:

> Hi;
>
> I'm looking for a way to do this:
>
> # select idn, magic() as codes FROM tbl;
>  idn | codes
> -----+---------
>    1 | A
>    2 | B, C
>    3 | A, C, E
> (3 rows)
>
> Right now, I use plpgsql functions, but each time I do it
> I have to rewrite the function to customize it.
>
> Is there a generic way to do this? An aggregate maybe?

array_accum
http://www.postgresql.org/docs/8.2/static/xaggr.html
?

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
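For the archives: the array_accum aggregate from that docs page, combined with array_to_string, covers the magic() example above. A sketch (table and column names taken from the original post):

```sql
-- Generic "accumulate rows into an array" aggregate, per the 8.2 docs
CREATE AGGREGATE array_accum (anyelement)
(
    sfunc = array_append,
    stype = anyarray,
    initcond = '{}'
);

-- Collapse the codes for each idn into one comma-separated value
SELECT idn,
       array_to_string(array_accum(code), ', ') AS codes
FROM tbl
GROUP BY idn
ORDER BY idn;
```

Note that on 8.x the order of elements inside the array is not guaranteed; aggregating over an ordered subselect is the usual workaround if order matters.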
Re: [GENERAL] Problem search on text arrays, using the overlaps (&&) operator
- "nha" wrote: > From: "nha" > To: "John Cheng" > Cc: pgsql-general@postgresql.org > Sent: Monday, July 6, 2009 9:12:22 AM GMT -08:00 US/Canada Pacific > Subject: Re: [GENERAL] Problem search on text arrays, using the overlaps (&&) > operator > > Hello, > > With your exhaustive example statements based on table foo and cars, I > > performed some measures on my side (PostgreSQL 8.3.1 server). Here are > > some statistical results: > [ ... snipped ... ] > > In my opinion, analysis and optimization may be deepen over table > indexes used for join planning. As your reported query plans show, the > > Where clauses are performed independantly from the table ml_lead; the > > reason is that all the attributes of the clauses belong to the table > lead_reporting_data. Time may be reduced on join condition > achievements. > > Hoping this observation will contribute a little to your opinion. > > Without any claim, I attached a document to this email for details on > > the measures I took with the overlap operator -- OpenDocument > Spreadsheet (ODS) v2 formatted file, 24 kiB. The 3rd sheet "various" > presents the detailed measures related to the data reported in this > email. > > Regards. > > -- > nha / Lyon / France. Hi nha, I had not expected anyone to go to such lengths to evaluate my situation, thank you so much! After looking at your analysis, I realized that the test case I created isn't close enough to the queries running in our prod environment. For one, table 'foo' does not join to another table; The other thing is that the amount of data isn't the same; Finally, these tables have been ANALYZED. So I took some time to update the test case. On our server, running 8.3.6, I was able to reproduce the difference between the two styles: "arr&&{f,b}" and "arr&&{f} or arr&&{b}". 
First, the setup:

-- Begin test case

-- Sets up 'bar'
SELECT id INTO TEMP TABLE bar
FROM (SELECT generate_series(1,30) AS id) AS bar;
CREATE INDEX bar_idx ON bar (id);
ANALYZE bar;

-- Sets up 'foo'
CREATE TEMP SEQUENCE foo_bar_id_seq;
CREATE TEMP TABLE foo (
    bar_id numeric DEFAULT NEXTVAL('foo_bar_id_seq'),
    keywords text[]
);
CREATE INDEX foo_idx ON foo USING gin (keywords);
INSERT INTO foo (keywords) VALUES ('{ford}'::text[]);
INSERT INTO foo (keywords) VALUES ('{toyota}'::text[]);
INSERT INTO foo (keywords) VALUES ('{volkswagen}'::text[]);
INSERT INTO foo (keywords) VALUES ('{saturn}'::text[]);
INSERT INTO foo (keywords) VALUES ('{honda}'::text[]);
INSERT INTO foo (keywords) VALUES ('{porsche}'::text[]);
INSERT INTO foo (keywords) VALUES ('{porsche, audi, chrysler}'::text[]);
INSERT INTO foo (keywords) VALUES ('{honda, hummer, ferrari}'::text[]);
-- Double the row count 18 times
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
INSERT INTO foo (keywords) (SELECT keywords FROM foo);
ANALYZE foo;

-- End test case

Query for the form "arr&&{f,b}":

SELECT count(*)
FROM foo
INNER JOIN bar ON foo.bar_id = bar.id
WHERE foo.keywords && '{ford, toyota, volkswagen, saturn, honda, porsche, hummer, ferrari}'::text[];

Query for the form "arr&&{f} or arr&&{b}":

SELECT count(*)
FROM foo, bar
WHERE foo.bar_id = bar.id
  AND ( keywords && '{ford}'::text[]
     OR keywords && '{toyota}'::text[]
     OR keywords && '{volkswagen}'::text[]
     OR keywords && '{saturn}'::text[]
     OR keywords && '{honda}'::text[]
     OR keywords && '{porsche}'::text[]
     OR keywords && '{hummer}'::text[]
     OR keywords && '{ferrari}'::text[] );

For the first form, "arr&&{f,b}", the query takes about 15 seconds. For the second form, "arr&&{f} or arr&&{b}", we get about 8 seconds. The difference is around 90-100%, which is what I am seeing on our real-life queries. The query plans also become similar to the real-life query plans, but I am having a hard time learning from them. The only interesting thing I see is that the estimated cost differs from the actual run time: the second form has a higher estimated cost than the first form, but a lower run time. In this test case, the query filters by any of 8 keywords.
[GENERAL] combine multiple row values in to one row
Hi;

I'm looking for a way to do this:

# \d tbl
     Table "public.tbl"
 Column |  Type   | Modifiers
--------+---------+-----------
 idn    | integer |
 code   | text    |

# SELECT * FROM tbl;
 idn | code
-----+------
   1 | A
   2 | B
   2 | C
   3 | A
   3 | C
   3 | E
(6 rows)

# select idn, magic() as codes FROM tbl;
 idn | codes
-----+---------
   1 | A
   2 | B, C
   3 | A, C, E
(3 rows)

Right now, I use plpgsql functions, but each time I do it I have to rewrite the function to customize it.

Is there a generic way to do this? An aggregate maybe?

Thanks for any help.
Re: [GENERAL] Out of memory error
On Mon, Jul 6, 2009 at 3:34 PM, Tom Lane wrote:
> Clearly a memory leak, but it's not so clear exactly what's causing it.
> What's that intersects() function? Can you put together a
> self-contained test case?

It's actually ST_Intersects from PostGIS (some of the PostGIS function names are still recognized without the leading "ST_").

http://postgis.refractions.net/documentation/manual-1.3/ch06.html#id2574404

# select postgis_version();
            postgis_version
---------------------------------------
 1.3 USE_GEOS=1 USE_PROJ=1 USE_STATS=1

-Paul

--
Paul Smith
http://www.pauladamsmith.com/
Re: [GENERAL] Out of memory error
Paul Smith writes:
> We have a query that's producing an "out of memory" error
> consistently. The detail of the error message is "Failed on request of
> size 16." We have 16 GB of RAM in our database server running 32-bit
> Debian lenny. Here's the query:
> ...
>       ExecutorState: 460390096 total in 431 blocks; 8400 free (2
> chunks); 460381696 used

Clearly a memory leak, but it's not so clear exactly what's causing it. What's that intersects() function? Can you put together a self-contained test case?

			regards, tom lane
[GENERAL] Efficiently move data from one table to another, with FK constraints?
I am using COPY to bulk load large volumes (i.e. multi-GB range) of data to a staging table in PostgreSQL 8.3. For performance, the staging table has no constraints, no primary key, etc. I want to move that data into the "real" tables, but need some advice on how to do that efficiently.

Here's a simple, abbreviated example of the tables and relations I'm working with (in reality there are a lot more columns and foreign keys):

/* The raw bulk-loaded data. No indexes or constraints. */
CREATE TABLE log_entry (
    req_time TIMESTAMP NOT NULL,
    url TEXT NOT NULL,
    bytes INTEGER NOT NULL
);

/* Where the data will be moved to. Will have indexes, etc. */
CREATE TABLE url (
    id SERIAL PRIMARY KEY,
    path TEXT UNIQUE NOT NULL
);

CREATE TABLE request (
    id BIGSERIAL PRIMARY KEY,
    req_time TIMESTAMP WITH TIME ZONE NOT NULL,
    bytes INTEGER NOT NULL,
    fk_url INTEGER REFERENCES url NOT NULL
);

Is there a way to move this data in bulk efficiently? Specifically, I'm wondering how to handle the foreign keys. The naive approach is:

1) For each column that is a foreign key in the target table, do INSERT ... SELECT DISTINCT ... to copy all the values into the appropriate child tables.
2) For each row in log_entry, do a similar insert to insert the data with the appropriate foreign keys.
3) Delete the contents of table log_entry using TRUNCATE.

Obviously, this would be very slow when handling tens of millions of records. Are there faster approaches to solving this problem?
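For what it's worth, steps 1 and 2 can each be done as a single set-based statement rather than row by row. A sketch against the example tables above (not a tuned solution):

```sql
-- Step 1: add any URLs not yet in the lookup table
INSERT INTO url (path)
SELECT DISTINCT l.url
FROM log_entry l
WHERE NOT EXISTS (SELECT 1 FROM url u WHERE u.path = l.url);

-- Step 2: resolve the foreign key with a join, one pass over the staging data
INSERT INTO request (req_time, bytes, fk_url)
SELECT l.req_time, l.bytes, u.id
FROM log_entry l
JOIN url u ON u.path = l.url;

-- Step 3: empty the staging table
TRUNCATE log_entry;
```

At tens of millions of rows, dropping the indexes on the target table before the big INSERT and rebuilding them afterwards usually pays for itself.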
Re: [GENERAL] Windows installer for pg-8.4 confusion
Stuart McGraw wrote:
> First, thanks to everyone who contributed
> to 8.4 -- the new features list looks great!
>
> In the past I have always installed the
> Windows binary installer from postgresql.org.
> For pg-8.4 I see that the download directory
> now has a pointer to the EnterpriseDB "one
> click" installer.
>
> Has the EnterpriseDB installer now become the
> "official" Windows distribution? Or will the
> standard pginstaller appear some time in the
> future?

The EnterpriseDB installer is now the standard and the MSI one will not be returning. The MSI one was just too hard to build and didn't have any advantages over the EnterpriseDB one. We do have all the specs of how to build the EnterpriseDB one, so anyone else can reproduce it if necessary.

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
[GENERAL] Out of memory error
We have a query that's producing an "out of memory" error consistently. The detail of the error message is "Failed on request of size 16." We have 16 GB of RAM in our database server running 32-bit Debian lenny. Here's the query:

INSERT INTO db_newsitemlocation (news_item_id, location_id)
SELECT ni.id, loc.id
FROM db_newsitem ni, db_location loc
WHERE NOT EXISTS
      (SELECT 1 FROM db_newsitemlocation
       WHERE news_item_id=ni.id AND location_id=loc.id)
  AND intersects(loc.location, ni.location)
  AND ni.pub_date > '2009-6-25' AND ni.pub_date < '2009-7-3';

db_newsitem and db_newsitemlocation are both on the order of a few million rows. db_location has a few hundred.

Details about the server:

| uname -a       | Linux lincoln 2.6.26-2-686-bigmem #1 SMP Sun Jun 21 05:40:38 UTC 2009 i686 GNU/Linux |
| psql -V        | psql (PostgreSQL) 8.3.7                 |
| lsb_release -d | Description: Debian GNU/Linux 5.0.1 (lenny) |
| free -m        | 16244 (mem) 7632 (swap)                 |

Relevant postgresql.conf settings:

max_connections = 300
shared_buffers = 2560MB
work_mem = 2MB
maintenance_work_mem = 128MB
effective_cache_size = 8122MB

Here's the EXPLAIN on the main SELECT part of the statement. EXPLAIN ANALYZE is impossible because of the "out of memory" error.
                                   QUERY PLAN
---------------------------------------------------------------------------
 Nested Loop  (cost=0.00..3448789.73 rows=65642 width=8)
   Join Filter: (intersects(loc.location, ni.location) AND (NOT (subplan)))
   ->  Index Scan using db_newsitem_pub_date on db_newsitem ni  (cost=0.00..5479.14 rows=1815 width=104)
         Index Cond: ((pub_date > '2009-06-25 00:00:00-07'::timestamp with time zone) AND (pub_date < '2009-07-03 00:00:00-07'::timestamp with time zone))
   ->  Seq Scan on db_location loc  (cost=0.00..17.17 rows=217 width=47281)
   SubPlan
     ->  Index Scan using db_newsitemlocation_news_item_id_key on db_newsitemlocation  (cost=0.00..8.65 rows=1 width=0)
           Index Cond: ((news_item_id = $0) AND (location_id = $1))
(8 rows)

And here's the error from stderr:

TopMemoryContext: 82576 total in 9 blocks; 9016 free (10 chunks); 73560 used
CFuncHash: 8192 total in 1 blocks; 4936 free (0 chunks); 3256 used
Operator lookup cache: 24576 total in 2 blocks; 14072 free (6 chunks); 10504 used
Operator class cache: 8192 total in 1 blocks; 3848 free (0 chunks); 4344 used
TopTransactionContext: 8192 total in 1 blocks; 7856 free (1 chunks); 336 used
MessageContext: 65536 total in 4 blocks; 13768 free (4 chunks); 51768 used
smgr relation table: 8192 total in 1 blocks; 2808 free (0 chunks); 5384 used
TransactionAbortContext: 32768 total in 1 blocks; 32752 free (0 chunks); 16 used
Portal hash: 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used
PortalMemory: 8192 total in 1 blocks; 8040 free (0 chunks); 152 used
PortalHeapMemory: 1024 total in 1 blocks; 896 free (0 chunks); 128 used
ExecutorState: 460390096 total in 431 blocks; 8400 free (2 chunks); 460381696 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 240972 total in 2 blocks; 7984 free (3 chunks); 232988 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
Relcache by OID: 8192 total in 1 blocks; 2856 free (0 chunks); 5336 used
CacheMemoryContext: 667472 total in 20 blocks; 236496 free (1 chunks); 430976 used
pg_toast_157541_index: 1024 total in 1 blocks; 240 free (0 chunks); 784 used
db_location_slug: 1024 total in 1 blocks; 344 free (0 chunks); 680 used
db_location_normalized_name: 1024 total in 1 blocks; 344 free (0 chunks); 680 used
db_location_location_type_id: 1024 total in 1 blocks; 344 free (0 chunks); 680 used
db_location_slug_key: 1024 total in 1 blocks; 280 free (0 chunks); 744 used
db_location_pkey: 1024 total in 1 blocks; 344 free (0 chunks); 680 used
db_newsitem_schema_id: 1024 total in 1 blocks; 344 free (0 chunks); 680 used
db_newsitem_pub_date: 1024 total in 1 blocks; 304 free (0 chunks); 720 used
db_newsitem_location_object_id: 1024 total in 1 blocks; 344 free (0 chunks); 680 used
db_newsitem_location: 1024 total in 1 blocks; 136 free (0 chunks); 888 used
db_newsitem_item_date: 1024 total in 1 blocks; 344 free (0 chunks); 680 used
db_newsitem_block_id: 1024 total in 1 blocks; 344 free (0 chunks); 680 used
db_newsitem_pkey: 1024 total in 1 blocks; 344 free (0 chunks); 680 used
db_newsitemlocation_news_item_id: 1024 total in 1 blocks; 344 free (0 chunks); 680 used
db_newsitemlocation_location_id: 1024 total in 1 blocks; 344 free (0 chunks); 680 used
db_newsitemlocation_pkey: 1024 total in 1 blocks; 344 free (0 chunk
Re: [GENERAL] Schema search_path and $user
Thanks for the link, I wasn't reading the right page(s) in the documentation.

On Mon, Jul 6, 2009 at 12:19 PM, Tom Lane wrote:
> Postgres User writes:
>> In the docs, I see repeated references to $user in the postgresql.conf
>> schema search_path setting such as:
>>
>> search_path = '"$user",public'
>>
>> But I don't see any info on the meaning of '$user' here.
>
> I guess you didn't read the actual documentation of search_path:
> http://www.postgresql.org/docs/8.3/static/runtime-config-client.html#GUC-SEARCH-PATH
> It says
>
>     The value for search_path has to be a comma-separated list of schema
>     names. If one of the list items is the special value $user, then the
>     schema having the name returned by SESSION_USER is substituted, if there
>     is such a schema. (If not, $user is ignored.)
>
> 			regards, tom lane
Re: [GENERAL] Schema search_path and $user
Postgres User writes:
> In the docs, I see repeated references to $user in the postgresql.conf
> schema search_path setting such as:
>
> search_path = '"$user",public'
>
> But I don't see any info on the meaning of '$user' here.

I guess you didn't read the actual documentation of search_path:
http://www.postgresql.org/docs/8.3/static/runtime-config-client.html#GUC-SEARCH-PATH
It says

    The value for search_path has to be a comma-separated list of schema
    names. If one of the list items is the special value $user, then the
    schema having the name returned by SESSION_USER is substituted, if there
    is such a schema. (If not, $user is ignored.)

			regards, tom lane
[GENERAL] Schema search_path and $user
In the docs, I see repeated references to $user in the postgresql.conf schema search_path setting, such as:

search_path = '"$user",public'

But I don't see any info on the meaning of '$user' here. Is $user some kind of variable within postgresql.conf that refers to the current user? Can it be replaced with a real group name to control schema search_paths for specific groups/users?

Thanks.
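For later readers: $user expands to the session user's name at lookup time, and per-role search paths can also be set explicitly. A sketch (the role and schema names here are hypothetical, for illustration only):

```sql
-- With search_path = '"$user",public', unqualified names resolve to
-- schema "alice" whenever role alice is the session user
CREATE ROLE alice LOGIN;
CREATE SCHEMA alice AUTHORIZATION alice;

-- A fixed, per-role search path (no $user involved):
CREATE SCHEMA sales;
ALTER ROLE alice SET search_path = sales, public;
```

If no schema matches the session user's name, $user is simply ignored, per the documentation quoted elsewhere in this thread.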
Re: [GENERAL] Added parameter for CREATE ROLE
On 06/07/2009 19:32, Michael Gould wrote:
> It would be nice if during create role we could have a parameter to set the
> number of days that a password is valid instead of just a timestamp.

Would (current_timestamp + interval '365 days') work? Dunno myself - just thinking out loud... :-)

Ray.

--
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
r...@iol.ie
Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals
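The catch is that CREATE ROLE's VALID UNTIL clause takes a literal timestamp, not an expression, so the interval arithmetic has to happen outside the statement. One workaround is dynamic SQL; a sketch (the wrapper function name and the LOGIN option are assumptions, not anything built in):

```sql
-- Hypothetical helper: create a role whose password expires N days from now
CREATE OR REPLACE FUNCTION create_role_valid_days(rolename text, days int)
RETURNS void AS $$
BEGIN
    EXECUTE 'CREATE ROLE ' || quote_ident(rolename)
         || ' LOGIN VALID UNTIL '
         || quote_literal((current_timestamp + days * interval '1 day')::text);
END;
$$ LANGUAGE plpgsql;

-- A role whose password stops working 365 days from now:
SELECT create_role_valid_days('app_user', 365);
```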
[GENERAL] Added parameter for CREATE ROLE
It would be nice if during create role we could have a parameter to set the number of days that a password is valid instead of just a timestamp.

Best Regards

--
Michael Gould, Managing Partner
Intermodal Software Solutions, LLC
904.226.0978
904.592.5250 fax
Re: [GENERAL] Performance problem with low correlation data
On Mon, Jul 6, 2009 at 6:32 PM, Scara Maccai wrote:
> The "best" way to read the table would still be a nested loop, but a loop on the
> "t" values, not on the ne_id values, since data for the same timestamp is
> "close".

But that would be a different query -- there are no restrictions on the t values in this one.

> How can I tell to PG to use an algorithm such as:
>
> fetch the heap
> for each quarter
>     for each id found where groupname='a group name'
>         fetch all the indexes

Have you tried something using IN or EXISTS instead of a join? The algorithm you describe doesn't work for the join because it has to produce a record which includes the matching group columns. A bitmap scan would return the various groups (I know in your case there's only one, but the optimizer can't be sure) mixed together. That might work better in 8.4 than 8.3, as the IN and EXISTS handling is improved.

Actually, I wonder if doing a sequential scan with a hash join against the group list wouldn't be a better option. That would scan more of the heap, but if they're really that spread out it might not make much difference, and it would avoid having to do all the index scans. You could experiment with turning enable_nestloop off and see how fast the hash join plan is.

--
greg
http://mit.edu/~gsstark/resume.pdf
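Both experiments suggested here can be sketched against the testinsert/idtable schema from Scara's post (the $a_date/$another_date placeholders are kept from the thread, so substitute real bounds before running):

```sql
-- EXISTS instead of a join: the result no longer has to carry idtable's
-- columns, which gives the planner more freedom in how it fetches rows
SELECT t.*
FROM testinsert t
WHERE t.t BETWEEN $a_date AND $another_date
  AND EXISTS (SELECT 1 FROM idtable i
              WHERE i.id = t.ne_id
                AND i.groupname = 'a group name');

-- Force consideration of the hash join plan, for comparison only
SET enable_nestloop = off;
EXPLAIN ANALYZE
SELECT *
FROM idtable i
JOIN testinsert t ON i.id = t.ne_id
WHERE i.groupname = 'a group name'
  AND t.t BETWEEN $a_date AND $another_date;
RESET enable_nestloop;
```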
Re: [GENERAL] Windows installer for pg-8.4 confusion
2009/7/6 Stuart McGraw:
>>> Has the EnterpriseDB installer now become the
>>> "official" Windows distribution? Or will the
>>> standard pginstaller appear some time in the
>>> future?

No, it won't be produced in the future. It's been deprecated due to the high maintenance overhead and how difficult it is to debug problems, amongst other reasons.

> [aside to the postgresql.org maintainers: Perhaps
> the readme in the v8.4.0/win32/ directory should
> also include that statement?]

I've updated the wording, and removed the old beta/rc directories. The changes should show up in the next few hours.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
Re: [GENERAL] Windows installer for pg-8.4 confusion
Andreas Wenk wrote:
> Stuart McGraw schrieb:
>> First, thanks to everyone who contributed
>> to 8.4 -- the new features list looks great!
>>
>> In the past I have always installed the
>> Windows binary installer from postgresql.org.
>> For pg-8.4 I see that the download directory
>> now has a pointer to the EnterpriseDB "one
>> click" installer.
>>
>> Has the EnterpriseDB installer now become the
>> "official" Windows distribution? Or will the
>> standard pginstaller appear some time in the
>> future?
>
> Hi,
>
> having a look at
>
> http://www.postgresql.org/ftp/binary/v8.4rc1/win32/

Thanks, it did. :-)

For future mail list searchers, it says, "As of PostgreSQL 8.4, the MSI installer is no longer being maintained. Please use the one-click installer instead, which may be downloaded from [...]"

[aside to the postgresql.org maintainers: Perhaps the readme in the v8.4.0/win32/ directory should also include that statement?]
[GENERAL] Performance problem with low correlation data
I have a problem with the method that PG uses to access my data.

Data is inserted into testinsert every 15 minutes. ne_id varies from 1 to 2.

CREATE TABLE testinsert (
    ne_id integer NOT NULL,
    t timestamp without time zone NOT NULL,
    v integer[],
    CONSTRAINT testinsert_pk PRIMARY KEY (ne_id, t)
);

CREATE UNIQUE INDEX testinsert_time_key ON testinsert USING btree (t, ne_id);

This table has, then, a t correlation of 1, and a ne_id correlation close to 0.

I query this table using another table, where each id is associated with a group:

CREATE TABLE idtable (
    id integer NOT NULL,
    groupname varchar(50),
    CONSTRAINT idtable_pk PRIMARY KEY (id, groupname)
);

CREATE INDEX idtable_group_idx ON idtable USING btree (groupname);

select * from idtable left outer join testinsert on id=ne_id
where groupname='a group name' and t between $a_date and $another_date

PG usually chooses a nested loop join over all the ne_ids found for groupname='a group name'. BUT, given the correlation in the table, this is a horrible data access: the table (15GB) gets read randomly, since data for one ne_id is scattered all over the table. The "best" way to read the table would still be a nested loop, but a loop on the "t" values, not on the ne_id values, since data for the same timestamp is "close". Or, even better, something like this would be very nice:

Bitmap Heap Scan
    for each id found in idtable where groupname='a group name'
        BitmapOr
            BitmapIndexScan using ne_id and t between $a_date and $another_date

That is: I understand why PG is using that access method to fetch the indexes, but I would like it to fetch the heap only after ALL the indexes have been read, so that it could reorder them...
So, given that: how can I tell PG to use an algorithm such as:

fetch the heap
for each quarter
    for each id found where groupname='a group name'
        fetch all the indexes

instead of:

for each id found where groupname='a group name'
    fetch the heap
    fetch all the indexes where ne_id=id and t between $a_date and $another_date

(Some other info:

1) results clustering the table are x10-x20 faster, but I can't cluster the table (it gets written every 15 minutes and read pretty often)

2) I know all the "t" values that I'm going to query, since there won't be more than 1 t per ne_id per 15 minutes; so I could use a generate_series($a_date, $another_date, 15 minutes) if that could help somehow:

select * from idtable
cross join generate_series($a_date, $another_date, 15 minutes) as myt
left outer join testinsert on id=ne_id and myt=t
where groupname='a group name'

but it doesn't help...
)
Re: [GENERAL] Windows installer for pg-8.4 confusion
Stuart McGraw schrieb:
> First, thanks to everyone who contributed
> to 8.4 -- the new features list looks great!
>
> In the past I have always installed the
> Windows binary installer from postgresql.org.
> For pg-8.4 I see that the download directory
> now has a pointer to the EnterpriseDB "one
> click" installer.
>
> Has the EnterpriseDB installer now become the
> "official" Windows distribution? Or will the
> standard pginstaller appear some time in the
> future?

Hi,

having a look at

http://www.postgresql.org/ftp/binary/v8.4rc1/win32/

will answer your question ;-)

Cheers

Andy
[GENERAL] Windows installer for pg-8.4 confusion
First, thanks to everyone who contributed to 8.4 -- the new features list looks great!

In the past I have always installed the Windows binary installer from postgresql.org. For pg-8.4 I see that the download directory now has a pointer to the EnterpriseDB "one click" installer.

Has the EnterpriseDB installer now become the "official" Windows distribution? Or will the standard pginstaller appear some time in the future?
[GENERAL] Feistel cipher, shorter string and hex to int
On Sat, 02 May 2009 11:26:28 +0200 "Daniel Verite" wrote:

> Note that it returns a bigint because we don't have unsigned
> integers in PG. If you're OK with getting negative values, the
> return type can be changed to int.
> Otherwise if you need a positive result that fits in 32 bits, it's
> possible to tweak the code to use 15 bits blocks instead of 16,
> but then the input will have to be less than 2^30.

I need shorter values (because they should be easier to type). To be sure I modify the function in a sensible way, I really would appreciate some pointers.

To further shrink the length of the result I was planning to use to_hex (actually it would be nice to have a fast base-36 [0-9a-z] conversion)... but I wasn't able to find a way to convert a hex string back to an int. x'fff' seems to work just for literals.

CREATE OR REPLACE FUNCTION pseudo_encrypt(value int) returns bigint AS $$
DECLARE
    l1 int;
    l2 int;
    r1 int;
    r2 int;
    i int := 0;
BEGIN
    l1 := (value >> 16) & 65535;  -- modifying here seems trivial
    r1 := value & 65535;
    -- l1 := (value >> 15) & B'111'::int;
    -- r1 := value & B'111'::int;
    WHILE i < 3 LOOP
        l2 := r1;
        r2 := l1 # ((((1366.0*r1+150889)%714025)/714025.0)*32767)::int;
        -- but what about this? where does it come from?
        /* r2 := l1 # ((((1366.0*r1+150889)%714025)/714025.0)*B'11'::int)::int; */ -- ??
        l1 := l2;
        r1 := r2;
        i := i + 1;
    END LOOP;
    return ((l1::bigint << 16) + r1);  -- modifying here seems trivial
END;
$$ LANGUAGE plpgsql strict immutable;

Anything else to suggest or copy from?

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
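On the hex-to-int question specifically: there is no direct cast from a hex text string to integer, but a commonly used workaround goes through bit(32). A sketch:

```sql
-- to_hex goes int -> hex text
SELECT to_hex(4095);                                 -- 'fff'

-- going back: left-pad to 8 hex digits, prefix 'x', cast via bit(32)
SELECT ('x' || lpad('fff', 8, '0'))::bit(32)::int;   -- 4095
```

Values with the high bit set come back negative, which matches the signed int range the pseudo_encrypt discussion is already wrestling with.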
Re: [GENERAL] Problem search on text arrays, using the overlaps (&&) operator
Hello,

On 2/07/09 2:07, John Cheng wrote:
> We use text[] on one of our tables. This text[] column allows us to
> search for records that match a keyword in a set of keywords. For
> example, if we want to find records that have a keyword of "foo" or
> "bar", we can use the condition:
>
> keywords && '{foo, bar}'::text[]
>
> Another way is to do this:
>
> (keywords && '{foo}'::text[] OR keywords && '{bar}'::text[])
>
> I am noticing a big difference between the two ways. I'm trying to
> find out if we need to re-write our queries to speed them up, or
> perhaps I am just missing something about how to use text[].
> [...]
> For some reason, I am seeing a big difference in our real database. I
> don't want to just rewrite all of our queries yet. I'm guessing the
> data makes a big difference. What would be a good way to examine the
> data to figure out what's the best way to write our queries? Is there
> any feature in PostgreSQL that can help me improve the performance?
> Any advice would be greatly appreciated!

With your exhaustive example statements based on table foo and cars, I performed some measures on my side (PostgreSQL 8.3.1 server).
Here are some statistical results:

seq     rtm    delta    ratio   deco
-----   ----   ------   -----   --------
s1cc     873   -1,74%   2//9    1+1
s1or     889    1,71%   2//9    1+1
s2cc    1322    8,53%   3//9    1+2
s2or    1209   -9,32%   3//9    1+2
s3cc     892   -2,61%   2//9    1+(.5*2)
s3or     915    2,54%   2//9    1+(.5*2)
s4cc     511   -3,00%   1//9    (.5*2)
s4or     526    2,91%   1//9    (.5*2)
s5cc    1635    2,13%   4//9    1+1+2
s5or    1600   -2,17%   4//9    1+1+2

seq     where clauses
-----   -------------
s1cc    keywords && '{ford, toyota}'::text[]
s1or    keywords && '{ford}'::text[] OR keywords && '{toyota}'::text[]
s2cc    keywords && '{ford, honda}'::text[]
s2or    keywords && '{ford}'::text[] OR keywords && '{honda}'::text[]
s3cc    keywords && '{honda, ferrari}'::text[]
s3or    keywords && '{honda}'::text[] OR keywords && '{ferrari}'::text[]
s4cc    keywords && '{ferrari, hummer}'::text[]
s4or    keywords && '{ferrari}'::text[] OR keywords && '{hummer}'::text[]
s5cc    keywords && '{ford, toyota, porsche}'::text[]
s5or    keywords && '{ford}'::text[] OR keywords && '{toyota}'::text[] OR keywords && '{porsche}'::text[]

legend
------
seq     sequence of 10 subsequent explain analyze per row
rtm     runtime mean (in milliseconds) of 10 subsequent measures
delta   difference percentage between cc and or sequences
cc      unique where clause with >1-size table (eg. {foo,bar})
or      multiple where clauses with 1-size text table (eg. {foo})
ratio   ratio between # of result rows and # of table rows
deco    result row partition between constant text table values in where clause

In the following, I refer to your condition forms as:
- arr&&{f,b}
- arr&&{f} or arr&&{b}

I noticed first that, contrarily to your observation, for the "ford or toyota" case (sequence s1, developed into 2 subcases s1cc and s1or for both forms of condition), the runtime mean is shorter for s1cc (arr&&{f,t}) than for s1or (arr&&{f} or arr&&{t}). But the difference percentage is only about 1,7% (ie. not relevant enough). This empirical advantage of form arr&&{f,t} over form (arr&&{f} or arr&&{t}) is also observed for 2 cases (s3 and s4) out of 4 (s2 up to s5). The difference percentage looks more relevant (about 3%).
The cases s3 and s4 differ from the others (s1, s2, and s5) in that the sets of rows matching each compared array value intersect: all the rows matched by ferrari also match honda (strict inclusion, not equality); all the rows matched by ferrari also match hummer and conversely (double inclusion here, i.e. equality). In the other 3 cases, each compared array value matches a set of rows that does not intersect the matching row sets of the other value(s). We may then assume that the form arr&&{f,t} would fit better when many rows are matched by several terms; however, this cannot be generalized at this stage.

The reported data also suggest a linear relationship between runtime and # of result rows. Here this relationship seems to hold equally for both forms arr&&{f,t} and (arr&&{f} or arr&&{t}).

Beyond these measures, I notice that, regarding the query plan explanation of your queries over real data, the difference between the actual runtimes reported for the two condition forms is not very significant with respect to the overall runtime of the queries. At the bitmap heap scan on the table over which the conditions are performed, when the last row is retrieved, the actual runtime is respectively:
- for arr&&{f,b}: 1276.990 ms;
- for (arr&&{f} or arr&&{b}): 1211.535 ms.

While quite close (a difference of about 5%), the difference is not really harmful with respect to the overall runtimes (resp. 13197 ms and 7768 ms), i.e. in terms of part of the overall runtimes resp. (1276/13197
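For reference, the two condition forms under comparison can be reproduced like this (the table, sample data, and GIN index are assumed for illustration, not taken from the thread):

```
-- assumed schema: rows tagged with a text[] column,
-- indexed with GIN so the && (overlap) operator can use the index
CREATE TABLE cars (id serial PRIMARY KEY, keywords text[]);
CREATE INDEX cars_keywords_gin ON cars USING gin (keywords);

-- form arr&&{f,t}: a single overlap test against a two-element array
EXPLAIN ANALYZE
SELECT * FROM cars WHERE keywords && '{ford, toyota}'::text[];

-- form (arr&&{f} or arr&&{t}): two one-element overlap tests ORed together
EXPLAIN ANALYZE
SELECT * FROM cars
WHERE keywords && '{ford}'::text[] OR keywords && '{toyota}'::text[];
```

The two predicates are logically equivalent; the measurements above compare only how the planner and executor handle each spelling.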
Re: [GENERAL] Upgrading 8.3 to 8.4 on Windows.
What I could suggest would be to use an ETL tool. There are open source tools available for free. Talend Open Studio is an open source ETL tool for data integration and migration experts. It's easy to learn for a non-technical user. What distinguishes Talend, when it comes to business users, is the tMap component. It allows the user to get a graphical and functional view of integration processes. For more information: http://www.talend.com/ Richard Huxton wrote: > > Hartman, Matthew wrote: >> Good morning. >> >> >> >> I am itching to upgrade my 8.3 development database to 8.4 before I move >> to production. Pg_migrator is listed as beta so I'd like to avoid that. >> Has anyone made the leap yet? > > Just dump/restore with the 8.4 pg_dump/restore if you're still in > development. > > -- >Richard Huxton >Archonet Ltd > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > > -- View this message in context: http://www.nabble.com/Upgrading-8.3-to-8.4-on-Windows.-tp24308206p24357819.html Sent from the PostgreSQL - general mailing list archive at Nabble.com. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Postgres 8.3.7 Server Crash: failed to add item to the right sibling in index
CM J writes: >I am running Postgres 8.3.7 on Windows 2003 with my java > application.Off late, the server crashes with the following trace: > *2009-07-01 14:47:07.250 ISTPANIC: failed to add item to the right sibling > in index "mevservices2_ndx"* Sounds like a data corruption problem ... try reindexing that index. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
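As a concrete command, Tom Lane's suggestion amounts to the following (index name taken from the PANIC message; note that REINDEX locks the indexed table while it runs):

```
REINDEX INDEX mevservices2_ndx;
```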
Re: [GENERAL] COALESCE not filtering well.
On Mon, Jul 6, 2009 at 6:37 AM, Mohan Raj B wrote: > CREATE OR REPLACE FUNCTION sp_item(itemid integer, itemname character > varying) > WHERE ( ( COALESCE($1,0)=0 OR itemid=$1) AND (COALESCE($2, '')='' OR > itemname LIKE '%'||$2||'%') ) LOOP itemid and itemname are your parameters, they're being substituted in the query so you're getting 10=10 and NULL LIKE '%'||NULL||'%' -- greg http://mit.edu/~gsstark/resume.pdf -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Normalize INTERVAL ouput format in a db driver
Further, a little libpq question:

When using INTERVAL types, can I rely on PQfmod() / PQfsize() to determine the exact definition of the INTERVAL precision, i.e. which YEAR/MONTH/DAY/HOUR/MINUTE/SECOND fields were used to create the column?

I get different values for the type modifier, but how do I interpret them? Can someone point me to the documentation or source code where I can find more details about this?

I found this:
http://www.postgresql.org/docs/8.4/static/libpq-exec.html#LIBPQ-EXEC-SELECT-INFO

But there is not much detail in the PQfmod() description...

Thanks!
Seb

Sebastien FLAESCH wrote:
> Hi all,
>
> Just testing 8.4rc2 INTERVALs...
>
> According to the doc, INTERVAL output format is controlled by SET intervalstyle.
>
> I am writing an interface/driver and need a solution to fetch/convert interval
> values independently from the current format settings...
>
> I could force my driver to implicitly set the intervalstyle to iso_8601, but I
> would prefer to leave this in the hands of the programmer...
>
> Imagine you have to write an ODBC interface/driver with libpq that must support
> the SQLINTERVAL C structure. How would you deal with PostgreSQL intervals?
>
> Is it possible to query the current intervalstyle?
>
> Thanks!
> Seb
Re: [GENERAL] Postgres Plus Advanced Server and general Postgres compatibility?
On Sun, Jul 5, 2009 at 4:49 PM, Andre Lopes wrote:
> Hi,
>
> I have installed "Postgres Plus Advanced Server" on a development machine.
> At this time I don't know the production server environment. Is it
> problematic to use "Postgres Plus Advanced Server" as a development server?
> Could I have some compatibility problems?
>
> Best Regards,
> André.

It won't be problematic to use it as a dev server, or as a production server either. You won't face any compatibility problems.

Abbas.
[GENERAL] COALESCE not filtering well.
G'Day!

I have issues with filtering the data based on the criteria. Please take a look at the way I use COALESCE, especially the WHERE part of my function. The function is not returning a filtered result. For example, if I try to execute the function as follows:

SELECT * FROM sp_item(10, NULL);

[It returns all the rows, which is not what I am expecting... I'm expecting only the row with itemid=10]

Please advise.

Thanks & Regards,
Mohan

CREATE OR REPLACE FUNCTION sp_item(itemid integer, itemname character varying)
  RETURNS SETOF item AS
$BODY$
declare
  ret_row record;
BEGIN
  FOR ret_row IN
    --SELECT itemid, itemcode, itemname, itemdescription, archived FROM item
    SELECT * FROM item
    WHERE ( (COALESCE($1, 0) = 0 OR itemid = $1)
        AND (COALESCE($2, '') = '' OR itemname LIKE '%' || $2 || '%') )
  LOOP
    return next ret_row;
  END LOOP;
  return;
END;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE
  COST 100
  ROWS 1000;
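The cause (as diagnosed elsewhere in this thread) is that the parameter names itemid and itemname shadow the identically named columns, so itemid=$1 degenerates into $1=$1. A minimal sketch of a fix, renaming the parameters (p_itemid / p_itemname are illustrative names, not from the original post):

```
CREATE OR REPLACE FUNCTION sp_item(p_itemid integer, p_itemname varchar)
  RETURNS SETOF item AS
$BODY$
BEGIN
  -- With distinct parameter names, itemid/itemname now refer to the columns
  RETURN QUERY
    SELECT * FROM item
    WHERE (COALESCE(p_itemid, 0) = 0 OR itemid = p_itemid)
      AND (COALESCE(p_itemname, '') = '' OR itemname LIKE '%' || p_itemname || '%');
END;
$BODY$
LANGUAGE plpgsql;
```

RETURN QUERY (available since 8.3) also replaces the explicit FOR ... RETURN NEXT loop.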
[GENERAL] Passing a table to function
In SQL Server 2008, we could pass tables into stored procedures.

CREATE TABLE members -- Only username is required
(
    mem_username VARCHAR(25) NOT NULL PRIMARY KEY,
    mem_email    VARCHAR(255),
    mem_fname    VARCHAR(25),
    mem_lname    VARCHAR(25)
);

CREATE TYPE member_table_type AS TABLE
(
    mem_username VARCHAR(25)
);

CREATE PROCEDURE CreateMembers
    @members member_table_type READONLY
AS
    INSERT INTO members (mem_username)
    SELECT mem_username FROM @members;

To execute this stored procedure, you would do:

DECLARE @members member_table_type;

INSERT INTO @members (mem_username)
VALUES ('mem1'), ('mem2'), ('mem3');

EXECUTE CreateMembers @members;

How would you accomplish this in Postgres 8.4? I know you can pass an entire row to a function, but that is not what I want. Notice that even though the table has many (nullable) columns, I'm only passing in the username. With the ROW datatype in Postgres, you have to pass in all the columns (NULL if no value). This is what I have so far in Postgres:

CREATE FUNCTION create_members(IN var_members members) RETURNS void AS $$
BEGIN
  INSERT INTO members
  SELECT var_members.mem_username,
         var_members.mem_email,
         var_members.mem_fname,
         var_members.mem_lname;
END;
$$ LANGUAGE plpgsql;

SELECT create_members(ROW('mem1', NULL, NULL, NULL));
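PostgreSQL 8.4 has no table-valued parameters, but for this single-column case an array parameter plus unnest() (new in 8.4) covers the same ground. A sketch, reusing the table and names from the post:

```
-- Pass just the usernames as a text[]; the other columns default to NULL
CREATE FUNCTION create_members(usernames text[]) RETURNS void AS $$
  INSERT INTO members (mem_username)
  SELECT unnest($1);
$$ LANGUAGE sql;

SELECT create_members(ARRAY['mem1', 'mem2', 'mem3']);
```

For multi-column payloads, an array of a composite type unnested the same way is the usual workaround.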
[GENERAL] Normalize INTERVAL ouput format in a db driver
Hi all,

Just testing 8.4rc2 INTERVALs...

According to the doc, INTERVAL output format is controlled by SET intervalstyle.

I am writing an interface/driver and need a solution to fetch/convert interval values independently from the current format settings...

I could force my driver to implicitly set the intervalstyle to iso_8601, but I would prefer to leave this in the hands of the programmer...

Imagine you have to write an ODBC interface/driver with libpq that must support the SQLINTERVAL C structure. How would you deal with PostgreSQL intervals?

Is it possible to query the current intervalstyle?

Thanks!
Seb
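On the last question: yes, intervalstyle is an ordinary run-time setting, so it can be inspected like any other GUC. A sketch:

```
SHOW intervalstyle;

-- or, usable inside a larger query:
SELECT current_setting('intervalstyle');

-- a driver that needs a predictable format can set it per session:
SET intervalstyle = 'iso_8601';
```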
Re: [GENERAL] Documentation - PgAdmin
On Sun, Jul 5, 2009 at 9:00 PM, Guillaume Lelarge wrote:
> If I remember well, the reason was the installer would be much bigger if we
> still included it. And we would also have to carry the pgAdmin CHM, and the
> Slony one, and the EnterpriseDB one and now the Greenplum one. And what about
> translations? If we do this, 90% of the installer size would be for docs. Kind
> of strange if you want my opinion.

Yeah, that's a major part of it. The manual hasn't been in pgAdmin for something like 15 months. This is because:

- It's difficult and error-prone to merge the manuals into one CHM file.
- We (== I) had to update the CHM file whenever PostgreSQL or Slony was updated.
- Bundling the manual bloated the distribution size.
- Bundling the manual pretty much prevented the manual being translated.
- The bundled manual was no use unless you used the latest version of PostgreSQL.
- The bundled manual was of little use to EnterpriseDB or Greenplum users.

> The way we do it now allows you to read the docs on the Internet. But you can
> grab a copy of those, put them on your hard disk, and change the preferences
> to show them instead of the internet ones.

Yes - in fact it's even more flexible than that:

PG help path - This option is used to specify the path to the PostgreSQL help files. This may be a URL or directory containing the HTML format help files (note that some browsers may require file:///path/to/local/files/ notation, whilst others require /path/to/local/files/ notation), or the path to a Compiled HTML Help (.chm) file (on Windows), an HTML Help project (.hhp) file, or a Zip archive containing the HTML files and the HTML Help project file. The PostgreSQL Help is used when selected from the Help menu, or when working with many database objects on PostgreSQL servers.

Similar options are present for EnterpriseDB, Greenplum and Slony help.
-- Dave Page EnterpriseDB UK: http://www.enterprisedb.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] howto determine rows count to be returned by DECLARE ... SELECT ...
Dear Community,

I'm working on an implementation of a virtual grid using DECLARE ... SELECT. The advantage of a virtual grid is that it loads only the rows a user wants to see (with FETCH). However, it is not clear how to determine the maximum number of rows the cursor can return. The count is necessary for two purposes: rendering the scrollbar and supporting jumps to the last rows in the grid.

The only obvious solution is to execute SELECT COUNT(1) ... before declaring the cursor. Is there a better solution? Can the cursor return the total row count, or is there a way to position the cursor on the last row? (Then the number of round trips to the server could be reduced by 1, and the virtual grid could render the last rows in reverse order.)

Thank you,
K.
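One possible approach, sketched here with an assumed table and cursor name: declare the cursor WITH SCROLL, then MOVE FORWARD ALL; the command tag of that MOVE ("MOVE n") reports how many rows were skipped, which is the total row count, after which the cursor can be rewound. This avoids a separate SELECT COUNT(*) round trip, at the cost of the server materializing the whole result.

```
BEGIN;

DECLARE grid_cur SCROLL CURSOR FOR
  SELECT * FROM some_table ORDER BY id;   -- some_table is hypothetical

MOVE FORWARD ALL IN grid_cur;  -- command tag "MOVE n" carries the total row count
MOVE ABSOLUTE 0 IN grid_cur;   -- rewind to before the first row

FETCH 25 FROM grid_cur;        -- first page for the grid

-- FETCH LAST / FETCH BACKWARD also work on a SCROLL cursor,
-- so the grid can render the final rows directly.

COMMIT;
```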