Re: [GENERAL] Where clause limited to 8 items?
Henry Combrinck [EMAIL PROTECTED] writes: The above works fine - the index is used. However, extend the where clause with an extra line (say, col1 = 9) and the index is no longer used. Do explain analyze select ... with both versions and send the results (preferably without line wrapping it). I'm a bit skeptical about your description since I don't see how either query could possibly be using an index here. -- greg ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] SOLVED: Where clause limited to 8 items?
Check the estimated number of rows returned. It's presumably believing that a sequential scan will be cheaper for the estimated number of rows. If the estimated number of rows is significantly off, you may wish to change the statistics target (see ALTER TABLE) for col1 and analyze the table again. If it is still choosing a sequential scan over an index scan and the number of rows is similar, you may want to look at the random_page_cost variable. You have to be careful not to lower it so far that other queries are pessimized in the other direction, but some experimentation comparing the real times and estimated costs of queries with and without enable_seqscan=off may help. Thanks for the detailed response! Your suggestion was spot-on. Regards Henry This message was sent using MetroWEB's AirMail service. http://www.metroweb.co.za/ - full access for only R73. Free Web Accelerator, WebMail, Calendar, Anti-Virus, Anti-Spam, 10 emails, 100MB personal webspace, and more! Phone Now! 086 11 11 440
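The knobs mentioned above can be exercised like this (table and column names are placeholders, and the values are illustrative rather than recommendations):

```sql
-- Raise the planner's statistics sample for the skewed column, then re-analyze:
ALTER TABLE mytable ALTER COLUMN col1 SET STATISTICS 100;
ANALYZE mytable;

-- Compare estimated vs. real costs with and without sequential scans
-- (enable_seqscan = off is a testing aid only, not a production setting):
SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT * FROM mytable WHERE col1 IN (1, 2, 3);
SET enable_seqscan = on;

-- If the index plan really is cheaper, random_page_cost can be lowered
-- cautiously (the default is 4):
SET random_page_cost = 3;
```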
Re: [GENERAL] how much ram do i give postgres?
On 19 Oct 2004 at 17:35, Josh Close wrote: Well, I didn't find a whole lot in the list archives, so I emailed that list with a few more questions. My postgres server is just crawling right now :( Unlike many other database engines, the shared buffers of Postgres are not a private cache of the database data. It is a working area shared between all the backend processes. This needs to be tuned for the number of connections and overall workload, *not* the amount of your database that you want to keep in memory. There is still lots of debate about what the sweet spot is. Maybe there isn't one, but it's not normally 75% of RAM. If anything, effective_cache_size needs to be 75% of (available) RAM, as this is telling Postgres the amount of your database the *OS* is likely to cache in memory. Having said that, I think you will need to define crawling. Is it updates/inserts that are slow? This may be triggers/rules/referential integrity checking etc. that is slowing it. If it is selects that are slow, this may be incorrect indexes or sub-optimal queries. You need to show us what you are trying to do and what the results are. Regards, Gary.
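As a rough sketch of the distinction Gary is drawing (values are illustrative for a machine with roughly 1 GB of RAM; in the 7.4/8.0 era both settings are expressed in 8 kB pages):

```
# postgresql.conf
shared_buffers = 10000         # shared working area (~80 MB), tuned to
                               # connections and workload, not database size
effective_cache_size = 96000   # ~75% of available RAM (~750 MB): a hint about
                               # how much of the database the OS cache will hold
```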
[GENERAL] SQL update function failed in Webmin Interface
Hello All, I am a newbie to PostgreSQL. I am using PostgreSQL 7.4.5 on HP-UX 11.11 PA and 11.23 PA. I have a problem with the PostgreSQL Webmin module (Webmin version 1.070) when testing the update function. This problem exists only when I create a new database through the Webmin interface, insert some values, and modify those values afterwards. For example: I created a database called "test" and created a table called "one" in that DB, which contains a field named "Name" with varchar(10) as its type, allowing NULL values. I inserted values for two rows as "first" and "second"; they can be viewed perfectly. But when I select the second row (which contains the string "second" as its value) to edit, and change its value to "second1" instead of "second", it throws the following error when I save it: ~~~ "SQL update "one" set Name = 'Second1' where oid = 25349 failed : column "name" of relation "one" does not exist". ~~~ But when I create the database without using the Webmin interface ($ echo "create table one(Name varchar(10))" | psql test ), and then edit with the Webmin interface, it works well and gets updated. Is there a problem with PostgreSQL or with the Webmin interface? Any idea how to solve this issue? Thanks in Advance, Kathir
Re: [GENERAL] Where clause limited to 8 items?
Henry Combrinck [EMAIL PROTECTED] writes: The above works fine - the index is used. However, extend the where clause with an extra line (say, col1 = 9) and the index is no longer used. Do explain analyze select ... with both versions and send the results (preferably without line wrapping it). I'm a bit skeptical about your description since I don't see how either query could possibly be using an index here. Why? Either it uses an index, or it doesn't. Being skeptical doesn't change the reality of what is in fact happening. Anyway, the suggestion from Stephan Szabo was the right one. Just in case you're still feeling skeptical:

DB=# set enable_seqscan=on;
SET
DB=# explain analyse select count(*) from test1 where a=1 or a=2 or a=3 or a=4 or a=5 or a=6 or a=7 or a=8;
                          QUERY PLAN
--------------------------------------------------------------
 Aggregate  (cost=38.75..38.75 rows=1 width=0) (actual time=0.291..0.292 rows=1 loops=1)
   ->  Index Scan using test1_pkey, test1_pkey, test1_pkey, test1_pkey, test1_pkey, test1_pkey, test1_pkey, test1_pkey on test1  (cost=0.00..38.72 rows=8 width=0) (actual time=0.089..0.228 rows=8 loops=1)
         Index Cond: ((a = 1) OR (a = 2) OR (a = 3) OR (a = 4) OR (a = 5) OR (a = 6) OR (a = 7) OR (a = 8))
 Total runtime: 0.744 ms
(4 rows)

DB=# explain analyse select count(*) from test1 where a=1 or a=2 or a=3 or a=4 or a=5 or a=6 or a=7 or a=8 or a=9;
                          QUERY PLAN
--------------------------------------------------------------
 Aggregate  (cost=42.52..42.52 rows=1 width=0) (actual time=0.249..0.250 rows=1 loops=1)
   ->  Seq Scan on test1  (cost=0.00..42.50 rows=9 width=0) (actual time=0.067..0.182 rows=9 loops=1)
         Filter: ((a = 1) OR (a = 2) OR (a = 3) OR (a = 4) OR (a = 5) OR (a = 6) OR (a = 7) OR (a = 8) OR (a = 9))
 Total runtime: 0.493 ms
(4 rows)

DB=# When used on a real table (ie, with hundreds of thousands of records), the total runtime peaks at over 8000ms (seq scan)...
Re: [GENERAL] [SQL] SQL update function failed in Webmin Interface
Kathiravan Velusamy wrote: I created a database called "test" and created a table called "one" for that DB, which contains a field named "Name" with varchar(10) as its type and allows NULL values. The issue here is that you have created a column "Name" with quotes, which means it is case-sensitive. SQL update "one" set Name = 'Second1' where oid = 25349 failed : column "name" of relation "one" does not exist. You are then trying to access it without quotes, which means it gets folded to lower-case "name" (look carefully at the error message). If you quote the name when you create it, ALWAYS quote it. If you never quote names then you won't have any problems. It might be that the Webmin module quoted the column name for you without your knowledge. You'll need to consult your Webmin documentation for details. -- Richard Huxton Archonet Ltd
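The case-folding behaviour Richard describes can be reproduced directly in psql:

```sql
CREATE TABLE one ("Name" varchar(10));  -- quoted: the column is stored
                                        -- case-sensitively as Name

UPDATE one SET Name = 'x';    -- ERROR: unquoted Name folds to lower-case name,
                              -- and column "name" does not exist
UPDATE one SET "Name" = 'x';  -- works: the quoted identifier matches exactly
```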
[GENERAL] files ending with .1 or .2
I got a table with oid 25459. The file is 1073741824 bytes big. I did some more inserts, and now I have these two new files: size/name: 1073741824 25459.1 21053440 25459.2 What are they? The 25459.1 looks exactly like the 25459. I tried looking at the docs, but searching for .1 or .2 wasn't that helpful...
Re: [GENERAL] SQL update function failed in Webmin Interface
On Wed, 2004-10-20 at 01:03, Kathiravan Velusamy wrote: Hello All, I am a newbie to PostgreSQL. I am using PostgreSQL 7.4.5 on HP-UX 11.11 PA and 11.23 PA. I have a problem with the PostgreSQL Webmin module (Webmin version 1.070) when testing the update function. This problem exists only when I create a new database through the Webmin interface, insert some values, and modify those values afterwards. For example: I created a database called test and created a table called one for that DB, which contains a field named Name with varchar(10) as its type and allows NULL values. I inserted values for two rows as first and second; they can be viewed perfectly. But when I select the second row (which contains the string second as its value) to edit, and change its value to second1 instead of second, it throws the following error when I save it: ~~~ SQL update one set Name = 'Second1' where oid = 25349 failed : column name of relation one does not exist. ~~~ But when I create the database without using the Webmin interface ($ echo create table one(Name varchar(10)) | psql test ), and then edit with the Webmin interface, it works well and gets updated. Is there a problem with PostgreSQL or with the Webmin interface? Any idea how to solve this issue? It looks like the table is being defined with the column quoted, like this: create table one ("Name" text, moredefshere...) but accessed without quotes, like above. Whether a database folds to upper or lower case, the columns need to be accessed consistently, either always quoted or never quoted. An application that mixes quoting and not quoting identifiers is going to have problems.
Re: [GENERAL] files ending with .1 or .2
When a data file for a specific table (or index?) is larger than 1GB, it's split into several parts. This is probably a leftover from the time OSs used to have problems with large files. The file name, that number, is the OID of the table AFAIK. And the suffix is of course the number in the order of the parts. Best regards, Arjen On 20-10-2004 9:39, Leonardo Francalanci wrote: I got a table with oid 25459. The file is 1073741824 bytes big. I did some more inserts, and now I have these two new files: size/name: 1073741824 25459.1 21053440 25459.2 What are they? The 25459.1 looks exactly like the 25459. I tried looking at the docs, but searching for .1 or .2 wasn't that helpful...
Re: [GENERAL] files ending with .1 or .2
When a data file for a specific table (or index?) is larger than 1GB, it's split into several parts. This is probably a leftover from the time OSs used to have problems with large files. Thank you. Is there any documentation I can read about this?
Re: [GENERAL] files ending with .1 or .2
I don't know. I just deduced that from an earlier situation where I knew the size of the data, and noticed that the largest table was split up into enough 1GB parts to fit that size ;) Best regards, Arjen On 20-10-2004 10:14, Leonardo Francalanci wrote: When a data file for a specific table (or index?) is larger than 1GB, it's split into several parts. This is probably a leftover from the time OSs used to have problems with large files. Thank you. Is there any documentation I can read about this?
Re: [GENERAL] beta3 winxp initdb problems
Speaking off-list with Zoltan, it appears this problem was *also* related to the nod32 antivirus. Just a different error message than we've seen before. Seems nod32 is significantly worse for PostgreSQL than any other AV product... //Magnus -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Thursday, October 14, 2004 9:34 AM To: [EMAIL PROTECTED] Cc: Magnus Hagander Subject: Re: [GENERAL] beta3 winxp initdb problems Magnus, et al., This is what initdb -d says:

D:\tmp\data> initdb -d
Running in debug mode.
VERSION=8.0.0beta3
PGDATA=d:/tmp/data
share_path=c:/msys/1.0/share/postgresql
PGPATH=c:/msys/1.0/bin
POSTGRES_SUPERUSERNAME=postgresql
POSTGRES_BKI=c:/msys/1.0/share/postgresql/postgres.bki
POSTGRES_DESCR=c:/msys/1.0/share/postgresql/postgres.description
POSTGRESQL_CONF_SAMPLE=c:/msys/1.0/share/postgresql/postgresql.conf.sample
PG_HBA_SAMPLE=c:/msys/1.0/share/postgresql/pg_hba.conf.sample
PG_IDENT_SAMPLE=c:/msys/1.0/share/postgresql/pg_ident.conf.sample
The files belonging to this database system will be owned by user postgresql.
This user must also own the server process.
The database cluster will be initialized with locale Slovak_Slovakia.1250.
fixing permissions on existing directory d:/tmp/data ... ok
creating directory d:/tmp/data/global ... ok
creating directory d:/tmp/data/pg_xlog ... ok
creating directory d:/tmp/data/pg_xlog/archive_status ... ok
creating directory d:/tmp/data/pg_clog ... ok
creating directory d:/tmp/data/pg_subtrans ... ok
creating directory d:/tmp/data/base ... ok
creating directory d:/tmp/data/base/1 ... ok
creating directory d:/tmp/data/pg_tblspc ... ok
selecting default max_connections ... 10
selecting default shared_buffers ... 50
creating configuration files ... ok
creating template1 database in d:/tmp/data/base/1 ...
DEBUG: TZ Europe/Belgrade matches Windows timezone Central Europe Daylight Time
FATAL: could not select a suitable default timezone
DETAIL: It appears that your GMT time zone uses leap seconds. PostgreSQL does not support leap seconds.
DEBUG: proc_exit(1)
DEBUG: shmem_exit(1)
DEBUG: exit(1)
child process was terminated by signal 1
initdb: failed
initdb: removing contents of data directory d:/tmp/data
D:\tmp\data>

So I said to myself, set the timezone to GMT. This is what I got after switching off the automatic daylight saving time notification:

D:\tmp\data> initdb -d
Running in debug mode.
VERSION=8.0.0beta3
PGDATA=d:/tmp/data
share_path=c:/msys/1.0/share/postgresql
PGPATH=c:/msys/1.0/bin
POSTGRES_SUPERUSERNAME=postgresql
POSTGRES_BKI=c:/msys/1.0/share/postgresql/postgres.bki
POSTGRES_DESCR=c:/msys/1.0/share/postgresql/postgres.description
POSTGRESQL_CONF_SAMPLE=c:/msys/1.0/share/postgresql/postgresql.conf.sample
PG_HBA_SAMPLE=c:/msys/1.0/share/postgresql/pg_hba.conf.sample
PG_IDENT_SAMPLE=c:/msys/1.0/share/postgresql/pg_ident.conf.sample
The files belonging to this database system will be owned by user postgresql.
This user must also own the server process.
The database cluster will be initialized with locale Slovak_Slovakia.1250.
fixing permissions on existing directory d:/tmp/data ... ok
creating directory d:/tmp/data/global ... ok
creating directory d:/tmp/data/pg_xlog ... ok
creating directory d:/tmp/data/pg_xlog/archive_status ... ok
creating directory d:/tmp/data/pg_clog ... ok
creating directory d:/tmp/data/pg_subtrans ... ok
creating directory d:/tmp/data/base ... ok
creating directory d:/tmp/data/base/1 ... ok
creating directory d:/tmp/data/pg_tblspc ... ok
selecting default max_connections ... 10
selecting default shared_buffers ... 50
creating configuration files ... ok
creating template1 database in d:/tmp/data/base/1 ...
DEBUG: TZ Europe/Dublin matches Windows timezone GMT Standard Time
FATAL: could not select a suitable default timezone
DETAIL: It appears that your GMT time zone uses leap seconds. PostgreSQL does not support leap seconds.
DEBUG: proc_exit(1)
DEBUG: shmem_exit(1)
DEBUG: exit(1)
child process was terminated by signal 1
initdb: failed
initdb: removing contents of data directory d:/tmp/data
D:\tmp\data>

Any ideas? Zoltan Hello! Could you please run this with debugging enabled? You do this by passing -d to initdb. The interesting output is the stuff that comes between the creating template1 database message and the FATAL error. Did you restart the command prompt you ran initdb in after changing the timezone? //Magnus
[GENERAL] union query returning duplicates
I am using 8.0 beta 1 on an RH 8 Linux server. I have a union query that I am converting from Access (where it worked) and it is returning duplicates. The only difference between the two rows is the Row field, which is returned automatically. Below is an example of a row that it returned in duplicate. I have verified that the row only shows up 1 time in each select statement when run individually. Here is a sample of the duplicates it returned (one was row 2 and the other row 3, but that didn't seem to come with a copy and paste from pgadmin): 2;486CORE-D16-F4-C66-N0-R3-S-E;6.6;5.274;97;3;6.6 2;486CORE-D16-F4-C66-N0-R3-S-E;6.6;5.274;97;3;6.6 Below is the query:

SELECT a.assemblyid, a.assemblyname, b.fixedprice,
       CASE WHEN sum(packagecount) < totalcount::numeric THEN NULL::double precision
            ELSE sum(calculatedprice) END AS calcprice,
       b.supplierid, a.productid,
       COALESCE(b.fixedprice,
                CASE WHEN sum(packagecount) < totalcount::numeric THEN NULL::double precision
                     ELSE sum(calculatedprice) END) AS activeprice
FROM assemblies a
JOIN qry_assemblyfixedprices b ON a.assemblyid = b.assemblyid
LEFT JOIN qry_assemblycalcprices c ON c.supplierid = b.supplierid AND b.assemblyid = c.assemblyid
WHERE b.supplierid = 97
GROUP BY a.assemblyid, a.assemblyname, b.fixedprice, b.supplierid, totalcount, a.productid
ORDER BY assemblyid
UNION
SELECT a.assemblyid, a.assemblyname, c.fixedprice,
       CASE WHEN sum(packagecount) < totalcount::numeric THEN NULL::double precision
            ELSE sum(calculatedprice) END AS calcprice,
       b.supplierid, a.productid,
       COALESCE(c.fixedprice,
                CASE WHEN sum(packagecount) < totalcount::numeric THEN NULL::double precision
                     ELSE sum(calculatedprice) END) AS activeprice
FROM assemblies a
JOIN qry_assemblycalcprices b ON a.assemblyid = b.assemblyid
LEFT JOIN qry_assemblyfixedprices c ON c.supplierid = b.supplierid AND c.assemblyid = b.assemblyid
WHERE b.supplierid = 97
GROUP BY a.assemblyid, a.assemblyname, c.fixedprice, b.supplierid, totalcount, a.productid
ORDER BY assemblyid

Thank You Sim
Zacks IT Manager CompuLab 04-829-0145 - Office 04-832-5251 - Fax
Re: [GENERAL] union query returning duplicates
It is very weird. I just tried both a group by and a distinct, and both of them still return the duplicates. I also tried a very simple union which didn't return any duplicates, so it is obviously not a problem with union. I just tried the query without the case statement that does the sum and it did work. I am wondering if there might be something about double precision numbers (such as a weird roundoff error or something) that prevents them from comparing equal to another number. In my example it is returning fairly simple numbers (6.6) so I don't see where it could make a mistake. The system automatically put in the ::double precision when I created the view that encases the query I sent. Maybe there is a better typecast that I should use to manually override it? Thank You Sim Zacks IT Manager CompuLab 04-829-0145 - Office 04-832-5251 - Fax Did you try a select distinct? Hagen Sim Zacks wrote: I am using 8.0 beta 1 on an RH 8 Linux server. I have a union query that I am converting from Access (where it worked) and it is returning duplicates. The only difference between the two rows is the Row field, which is returned automatically. I have verified that the row only shows up 1 time in each select statement when run individually. Here is a sample of the duplicates it returned (one was row 2 and the other row 3, but that didn't seem to come with a copy and paste from pgadmin): 2;486CORE-D16-F4-C66-N0-R3-S-E;6.6;5.274;97;3;6.6 2;486CORE-D16-F4-C66-N0-R3-S-E;6.6;5.274;97;3;6.6
Re: [GENERAL] union query returning duplicates
double precision is inexact, and therefore a query returning a field of that type cannot reliably be deduplicated by a group by/distinct... I switched it to type ::numeric(10,4) and it worked fine. It was the system that automatically did the conversion for me, so I will have to figure out why and keep that in mind for the next time. Thank You Sim Zacks IT Manager CompuLab 04-829-0145 - Office 04-832-5251 - Fax It is very weird. I just tried both a group by and a distinct, and both of them still return the duplicates. I also tried a very simple union which didn't return any duplicates, so it is obviously not a problem with union. I just tried the query without the case statement that does the sum and it did work. I am wondering if there might be something about double precision numbers (such as a weird roundoff error or something) that prevents them from comparing equal to another number. In my example it is returning fairly simple numbers (6.6) so I don't see where it could make a mistake. The system automatically put in the ::double precision when I created the view that encases the query I sent. Maybe there is a better typecast that I should use to manually override it? Thank You Sim Zacks IT Manager CompuLab 04-829-0145 - Office 04-832-5251 - Fax Did you try a select distinct? Hagen Sim Zacks wrote: I am using 8.0 beta 1 on an RH 8 Linux server. I have a union query that I am converting from Access (where it worked) and it is returning duplicates. The only difference between the two rows is the Row field, which is returned automatically. I have verified that the row only shows up 1 time in each select statement when run individually.
Here is a sample of the duplicates it returned (one was row 2 and the other row 3, but that didn't seem to come with a copy and paste from pgadmin): 2;486CORE-D16-F4-C66-N0-R3-S-E;6.6;5.274;97;3;6.6 2;486CORE-D16-F4-C66-N0-R3-S-E;6.6;5.274;97;3;6.6
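The roundoff suspicion in this thread is well founded; the effect is easy to demonstrate outside the database, for example in Python, whose float is the same IEEE 754 double precision type:

```python
# Binary floating point cannot represent most decimal fractions exactly,
# so two ways of arriving at "the same" value may compare unequal --
# which is enough to defeat duplicate elimination on a double precision column.
a = 0.1 + 0.2
b = 0.3
print(a == b)       # False
print(repr(a))      # 0.30000000000000004

# The order and grouping of operations matters too:
print(sum([0.1] * 10) == 1.0)  # False
```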
Re: [GENERAL] union query returning duplicates
On Wed, Oct 20, 2004 at 01:54:04PM +0200, Sim Zacks wrote: It is very weird. I just tried both a group by and a distinct, and both of them still return the duplicates. I also tried a very simple union which didn't return any duplicates, so it is obviously not a problem with union. Not related to your underlying problem, but be aware that UNION does eliminate duplicates by design, so that could explain what you are seeing here. If you don't want it to do that, use UNION ALL instead (the same applies to INTERSECT and EXCEPT if you ever happen to use them). -- Alvaro Herrera ([EMAIL PROTECTED]) "Take out the book your religion considers the right one to find the prayer that brings peace to your soul. Then reboot the computer and see if it works" (Carlos Duclós)
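A minimal illustration of the difference Alvaro describes:

```sql
SELECT 1 AS x UNION     SELECT 1;   -- one row: UNION removes duplicates
SELECT 1 AS x UNION ALL SELECT 1;   -- two rows: UNION ALL keeps them
```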
Re: [GENERAL] how much ram do i give postgres?
On Wed, 20 Oct 2004 08:00:55 +0100, Gary Doades [EMAIL PROTECTED] wrote: Unlike many other database engines, the shared buffers of Postgres are not a private cache of the database data. It is a working area shared between all the backend processes. This needs to be tuned for the number of connections and overall workload, *not* the amount of your database that you want to keep in memory. There is still lots of debate about what the sweet spot is. Maybe there isn't one, but it's not normally 75% of RAM. If anything, effective_cache_size needs to be 75% of (available) RAM, as this is telling Postgres the amount of your database the *OS* is likely to cache in memory. Having said that, I think you will need to define crawling. Is it updates/inserts that are slow? This may be triggers/rules/referential integrity checking etc. that is slowing it. If it is selects that are slow, this may be incorrect indexes or sub-optimal queries. You need to show us what you are trying to do and what the results are. It's slow due to several things happening all at once. There are a lot of inserts and updates happening. There is periodically a bulk insert of 500k - 1 million rows happening. I'm doing a vacuum analyze every hour due to the amount of transactions happening, and a vacuum full every night. All this has caused selects to be very slow. At times, a select count(1) from a table will take several minutes. I don't think selects would have to wait on locks from inserts/updates, would they? I would just like to do anything possible to help speed this up. -Josh
Re: [GENERAL] Database Disappeared
Roberts, Adam [EMAIL PROTECTED] writes: So, my main question is, is it reasonable to say that a trans id wraparound failure could create a situation in which you could use/manipulate user data tables if you refer to the data tables directly, but if you tried to use a utility (such as pg_dump) or an internal psql query like \d or \df it would appear that you have no data tables? Yeah, it is possible, because the system's internal catalog fetches use SnapshotNow rules, which only look to see if a row's inserting/deleting transaction(s) committed or not; they don't apply any comparison to the current transaction ID. So what you've got is a situation where the tables' pg_class rows have wrapped around and become invisible to SQL queries, but the system's internal operations are still happy. regards, tom lane
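One way to keep an eye on this (assuming a release that exposes datfrozenxid in pg_database, as servers of this era do):

```sql
-- How many transactions old is each database's frozen-XID horizon?
SELECT datname, age(datfrozenxid) FROM pg_database;

-- A database-wide VACUUM (not table-by-table) advances datfrozenxid
-- and prevents transaction ID wraparound:
VACUUM;
```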
Re: [GENERAL] files ending with .1 or .2
Leonardo Francalanci [EMAIL PROTECTED] writes: Is there any documentation I can read about this? The best concise documentation I know about is in the CVS-tip docs for contrib/oid2name (reproduced below; the bit about tablespaces is irrelevant to pre-8.0 versions, but the rest is accurate). I've been wanting to transpose this into the mainstream admin docs, but haven't decided where to put it. regards, tom lane Databases are placed in directories named after their OIDs in pg_database, and the table files within a database's directory are named by filenode numbers, which are stored in pg_class.relfilenode. Note that while a table's filenode often matches its OID, this is *not* necessarily the case; some operations, like TRUNCATE, REINDEX, CLUSTER and some forms of ALTER TABLE, can change the filenode while preserving the OID. Avoid assuming that filenode and table OID are the same. When a table exceeds 1GB, it is divided into gigabyte-sized segments. The first segment's file name is the same as the filenode; subsequent segments are named filenode.1, filenode.2, etc. Tablespaces make the scenario more complicated. Each non-default tablespace has a symlink inside the pg_tblspc directory, which points to the physical tablespace directory (as specified in its CREATE TABLESPACE command). The symlink is named after the tablespace's OID. Inside the physical tablespace directory there is another directory for each database that has elements in the tablespace, named after the database's OID. Tables within that directory follow the filenode naming scheme. The pg_default tablespace is not addressed via pg_tblspc, but corresponds to $PGDATA/base.
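The filenode described above can be looked up from SQL (the table name is a placeholder):

```sql
-- The on-disk files for this table are <relfilenode>, <relfilenode>.1, ...
-- under the owning database's directory; as noted above, relfilenode
-- need not equal the table's OID.
SELECT relname, oid, relfilenode FROM pg_class WHERE relname = 'mytable';
```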
[GENERAL] Views + UNION ALL = Slow ?
Hello ! I have two tables (which contain individual months' data). One of them contains 500 thousand records and the other one about 40k, 8 columns. When I do a simple query on them individually it takes milliseconds to complete (see gory details below). For some queries I want to include data from multiple months, so I created a view using UNION ALL. But using the view it takes 31 seconds to complete the same query. I am obviously doing something wrong or using something the wrong way. Any ideas? /Otto Blomqvist

test=# explain analyze select fid_2 from file_92_904 where fid_4=1024;
NOTICE: QUERY PLAN:
Index Scan using file_92_904_ltn_idx on file_92_904 (cost=0.00..219.90 rows=65 width=4) (actual time=0.49..0.49 rows=0 loops=1)
Total runtime: 0.57 msec
EXPLAIN

test=# explain analyze select fid_2 from file_92_1004 where fid_4=1024;
NOTICE: QUERY PLAN:
Index Scan using file_92_1004_ltn_idx on file_92_1004 (cost=0.00..4505.20 rows=1197 width=4) (actual time=32.36..32.36 rows=0 loops=1)
Total runtime: 32.46 msec
EXPLAIN

test=# create view twotables as select * from file_92_1004 UNION ALL Select * from file_92_904;
CREATE

test=# explain analyze select fid_2 from twotables where fid_4=1024;
NOTICE: QUERY PLAN:
Subquery Scan twotables (cost=1.00..200023000.53 rows=569553 width=203) (actual time=31590.97..31590.97 rows=0 loops=1)
  ->  Append (cost=1.00..200023000.53 rows=569553 width=203) (actual time=12.13..30683.67 rows=569553 loops=1)
        ->  Subquery Scan *SELECT* 1 (cost=1.00..100021799.06 rows=540306 width=199) (actual time=12.12..28417.81 rows=540306 loops=1)
              ->  Seq Scan on file_92_1004 (cost=1.00..100021799.06 rows=540306 width=199) (actual time=12.09..14946.47 rows=540306 loops=1)
        ->  Subquery Scan *SELECT* 2 (cost=1.00..11201.47 rows=29247 width=203) (actual time=0.19..1525.18 rows=29247 loops=1)
              ->  Seq Scan on file_92_904 (cost=1.00..11201.47 rows=29247 width=203) (actual time=0.14..793.34 rows=29247 loops=1)
Total runtime: 31591.34 msec
EXPLAIN
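The plan above shows the fid_4 filter being applied only after the Append, so both tables are read in full. On planners that cannot push the qualification down into the view's branches, one possible workaround (same tables as above) is to write the condition inside each arm yourself so each branch can use its own index:

```sql
SELECT fid_2 FROM file_92_1004 WHERE fid_4 = 1024
UNION ALL
SELECT fid_2 FROM file_92_904  WHERE fid_4 = 1024;
```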
Re: [GENERAL] download postgreql problem
On Tue, 19 Oct 2004 13:33:07 -0400, Joseph.Dunleavy wrote: I am trying to download PostgreSQL from one of the mirror sites. I get prompted for a username and password. I try anonymous login and my password and I get an error stating either the server doesn't support anonymous logins or that my email address wasn't accepted. What am I doing incorrectly? When you log in anonymously you must use the username "anonymous" and send an email address as the password - you are probably supplying a password that does not contain an @-sign and so is not accepted as a valid password. Graeme
Re: [GENERAL] Numeric user names
I want to use bare numbers because that is how the users (students in this case) are identified on the network and in the student information system. They've been identified this way for over 20 years, so it would be near impossible to change at this point (although it is not always very convenient :-). I'm trying (and almost finished now) to have the postgres server be the source of all user account information on the network (windows and linux). -Ed Tom Lane wrote: Neil Conway [EMAIL PROTECTED] writes: I don't know of an easy workaround. Why do you need numeric usernames? There's always double-quoted identifiers: create user "12345" with password ... Considering that the SQL standard defines authorization identifier as an identifier, I'm not sure why Ed is expecting that he should be able to use a bare number as a user name. regards, tom lane
[GENERAL] Sequence question
Hi, I have a question about sequences. I need a field to have values with no holes in the sequence. However, the values do not need to be in order. My users will draw a number or numbers from the sequence and write to the field. Sometimes, however, these sequence numbers will be discarded (after a transaction is complete), and thus available for use. During the transaction, however, any drawn numbers need to be unavailable. I would like the next user who draws a number to draw the lowest number she can, starting with the holes in the sequence. This continuous sequence is absolutely required by our company, as the fact that the sequence has no holes is used to check for much more serious problems. So my question is: what's the most effective way to get the next available number? My present method is to do a query that finds the first and last number in each of the holes, step through those holes, and then start generating new numbers. Unfortunately, this involves doing a table scan each time - before I generate the number, and does not produce the transaction-safety I want. Does anyone have any better ideas? Places I should look? Thanks, Eric ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
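One way to make the draw itself transaction-safe (a minimal sketch, not from the thread; the table and column names are hypothetical) is to keep the available numbers in a pool table and claim the lowest one under a row lock, so a rollback automatically returns the number to the pool:

```sql
-- Hypothetical pool of available numbers (names are illustrative only).
CREATE TABLE free_numbers (n integer PRIMARY KEY);

BEGIN;
-- Claim the lowest free number; FOR UPDATE blocks concurrent takers
-- of this row until we commit or roll back.
SELECT n FROM free_numbers ORDER BY n LIMIT 1 FOR UPDATE;
-- ... write the number into the target field, then retire it ...
DELETE FROM free_numbers WHERE n = 1;  -- the value returned above
COMMIT;  -- on ROLLBACK the DELETE is undone and the number stays free
```

Note the trade-off Tino raises later in the thread: this serializes all drawers on the lowest pool row, so concurrency suffers.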
Re: [GENERAL] Numeric user names
Thanks. This worked. This is exactly what I was looking for. Stephan Szabo wrote: On Tue, 12 Oct 2004, Ed Stoner wrote: I am unable to use the CREATE USER command with numeric user names (i.e. CREATE USER 35236 WITH PASSWORD '1234';). Is this a limitation or a problem somewhere with how I have things configured? Is there a workaround? I believe you can create a user with a quoted identifier that is all numbers (i.e. CREATE USER "35236") but then you have to quote the username for sql statements like grant and revoke as well. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[GENERAL] OID and PK/FK KEYS
Hi! I'm from Italy, and sorry about my English... I have a question that I know has already been discussed in the groups, but I still have some doubts. I have seen that it is technically possible to use OID as a PRIMARY KEY and as a FOREIGN KEY, but is it correct to do so for the database's logical integrity? Is it better to use other keys and not OID in any case, to avoid the possible wraparound? Or is the wraparound an extreme case, so that I can safely use OID as PRIMARY and FOREIGN KEY? Thanks ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] how much ram do i give postgres?
It's slow due to several things happening all at once. There are a lot of inserts and updates happening. There is periodically a bulk insert of 500k - 1 mill rows happening. I'm doing a vacuum analyze every hour due to the amount of transactions happening, and a vacuum full every night. All this has caused selects to be very slow. At times, a select count(1) from a table will take several mins. I don't think selects would have to wait on locks by inserts/updates, would they? I would just like to do anything possible to help speed this up. If there really are many rows in the table, select count(1) will be a bit slow, since PostgreSQL uses a sequential scan to count the rows. For other kinds of queries, it would help greatly to check whether there is an index on the search condition, or to use the EXPLAIN command to see the query plan. By the way, what version of PostgreSQL are you on? Older versions (7.4?) still suffer from index space bloat. regards Laser ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[GENERAL] index not used?
I'm trying to access a table with about 120M rows. It's a vertical version of a table with 360 or so columns. The new columns are: original item col, original item row, and the value. I created an index: CREATE INDEX idx on table (col, row) however, selects are still very slow. It seems it still needs a sequential scan: EXPLAIN SELECT * FROM table WHERE col=1 AND row=10; QUERY PLAN -- Seq Scan on table (cost=1.00..102612533.00 rows=1 width=14) Filter: ((col = 1) AND (row = 10)) What am I doing wrong? -- Dan Pelleg ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[GENERAL] Oracle Varray to Postgres conversion problem
Hello, We have been working on migrating an Oracle database to Postgres for one of our clients. We have a stored procedure in Oracle which uses a VARRAY, and I have to convert this stored procedure to Postgres. Any help with respect to this will be greatly appreciated. Thanks in advance. Best Regards, Arvind Purohit == The Journey Of Thousand Miles Can Be Started With One Step In Right Direction. ==
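PostgreSQL has no VARRAY type; the usual conversion target for a VARRAY column or variable is a native array. A minimal sketch (the names below are hypothetical, not taken from the original procedure):

```sql
-- Oracle:  TYPE phone_list IS VARRAY(3) OF VARCHAR2(20);
-- A common PostgreSQL equivalent is a native array column.
-- All names below are hypothetical, for illustration only.
CREATE TABLE contacts (
    id     integer PRIMARY KEY,
    phones varchar(20)[]        -- plays the role of the VARRAY
);

INSERT INTO contacts VALUES (1, ARRAY['555-0100', '555-0101']);
SELECT phones[1] FROM contacts WHERE id = 1;  -- PostgreSQL arrays are 1-based
```

Unlike a VARRAY, a PostgreSQL array has no declared maximum length; if the 3-element bound matters, it would have to be enforced with a CHECK constraint.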
Re: [GENERAL] Sequence question
Far from a perfect idea, but a faster solution than stepping through all the holes: 1) Create a second table containing only one field, of the type of your key. 2) When you delete an entry, place the deleted key value in your second table. 3) When you insert a new entry into your old table and your new table contains a value, take the minimum value in the new table as your new key and delete that entry from the new table. If the new table is empty, just use the sequence to get the new key value. Hope that helps David Ecker Eric E wrote: Hi, I have a question about sequences. I need a field to have values with no holes in the sequence. However, the values do not need to be in order. My users will draw a number or numbers from the sequence and write to the field. Sometimes, however, these sequence numbers will be discarded (after a transaction is complete), and thus available for use. During the transaction, however, any drawn numbers need to be unavailable. I would like the next user who draws a number to draw the lowest number she can, starting with the holes in the sequence. This continuous sequence is absolutely required by our company, as the fact that the sequence has no holes is used to check for much more serious problems. So my question is: what's the most effective way to get the next available number? My present method is to do a query that finds the first and last number in each of the holes, step through those holes, and then start generating new numbers. Unfortunately, this involves doing a table scan each time - before I generate the number, and does not produce the transaction-safety I want. Does anyone have any better ideas? Places I should look? Thanks, Eric ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
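The three steps above can be sketched in SQL (a rough sketch; the table and sequence names are hypothetical):

```sql
-- 1) A side table holding freed key values (names are hypothetical).
CREATE TABLE recycled_keys (id integer PRIMARY KEY);
CREATE SEQUENCE main_id_seq;

-- 2) When deleting an entry, park the freed key:
--    DELETE FROM main_table WHERE id = 17;
INSERT INTO recycled_keys VALUES (17);

-- 3) On insert, prefer the smallest recycled key, else draw a fresh one
--    (COALESCE evaluates nextval() only when no recycled key exists):
SELECT coalesce((SELECT min(id) FROM recycled_keys),
                nextval('main_id_seq')) AS next_key;
-- ...then delete the chosen value from recycled_keys if it came from there.
```

As written this is not race-free: two concurrent sessions can read the same minimum, so step 3 would need a lock on the recycled_keys row (or table) to be safe.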
[GENERAL] Free PostgreSQL Training, Philadelphia, Oct 30
All, My company (Chariot Solutions) is sponsoring a day of free PostgreSQL training by Bruce Momjian (one of the core PostgreSQL developers). The day is split into 2 sessions (plus a QA session): * Mastering PostgreSQL Administration * PostgreSQL Performance Tuning Registration is required, and space is limited. The location is Malvern, PA (suburb of Philadelphia) and it's on Saturday Oct 30. For more information or to register, see http://chariotsolutions.com/postgresql.jsp Thanks, Aaron P.S. If you're planning to take the train (from Philly, NYC, etc.) please send me a note off-list so we can arrange to get you from the station to the event -- it's close but not really walking distance. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] Sequence question
Hi, On Tue, 2004-10-19 at 01:16, Eric E wrote: Hi, I have a question about sequences. I need a field to have values with no holes in the sequence. However, the values do not need to be in order. My users will draw a number or numbers from the sequence and write to the field. Sometimes, however, these sequence numbers will be discarded (after a transaction is complete), and thus available for use. During the transaction, however, any drawn numbers need to be unavailable. I would like the next user who draws a number to draw the lowest number she can, starting with the holes in the sequence. This continuous sequence is absolutely required by our company, as the fact that the sequence has no holes is used to check for much more serious problems. I would recheck this requirement. What should actually be achieved with the check for no holes in the numbering? Remember you can always enumerate using a set returning function or by means of a temporary sequence for a query. So my question is: what's the most effective way to get the next available number? There is none. My present method is to do a query that finds the first and last number in each of the holes, step through those holes, and then start generating new numbers. Unfortunately, this involves doing a table scan each time - before I generate the number, and does not produce the transaction-safety I want. You cannot eat the cake and keep it - either you have holes or you have transaction security or you have bad performance by locking the whole table on insert. Regards Tino ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] Upgrade to Win XP Service Pak 2 SP2 causes connection failure
I've got an app written with Access XP using a PostgreSQL backend internet connection (using the latest ODBC driver) that was deployed 2 years ago. Recently a client upgraded to XP Service Pack 2. After the upgrade she was unable to connect to the remote database, getting the error: Unable to connect to remote database File Name=C:\Program Files\Common Files\ODBC\Data Sources\PostgresCNI.dsn Error message: The file is not a valid compound file. I did some research and found people recommending the connection string use 'FileDSN=dsn file' instead of 'File name=dsn file' So I changed it from: Set conn = New ADODB.connection conn.Open File Name=C:\Program Files\Common Files\ODBC\Data Sources\PostgresCNI.dsn to Set conn = New ADODB.connection conn.Open FileDSN=C:\Program Files\Common Files\ODBC\Data Sources\PostgresCNI.dsn Now the app gives this error: This file is located outside your intranet or on an untrusted site. Microsoft Access will not open the file due to potential security problems. To open the file, copy it to your machine or an accessible network location. We've tried turning off the WinXP firewall entirely, adding the server IP to trusted internet sites, all to no avail. Anyone else run into this? -- Randall Perry sysTame Xserve Web Hosting/Co-location Website Design/Development WebObjects Hosting Mac Consulting/Sales http://www.systame.com/ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] index not used?
On Wed, 2004-10-20 at 08:06, Dan Pelleg wrote: I'm trying to access a table with about 120M rows. It's a vertical version of a table with 360 or so columns. The new columns are: original item col, original item row, and the value. I created an index: CREATE INDEX idx on table (col, row) however, selects are still very slow. It seems it still needs a sequential scan: EXPLAIN SELECT * FROM table WHERE col=1 AND row=10; QUERY PLAN -- Seq Scan on table (cost=1.00..102612533.00 rows=1 width=14) Filter: ((col = 1) AND (row = 10)) What am I doing wrong? What type are row and col? If they're bigint (i.e. not int / int4) then you might need to quote the value to get the query to use an index: SELECT * FROM table WHERE col='1' AND row='10'; also, have you vacuumed / analyzed the table? I'm assuming yes. ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] index not used?
Scott Marlowe writes: On Wed, 2004-10-20 at 08:06, Dan Pelleg wrote: I'm trying to access a table with about 120M rows. It's a vertical version of a table with 360 or so columns. The new columns are: original item col, original item row, and the value. I created an index: CREATE INDEX idx on table (col, row) however, selects are still very slow. It seems it still needs a sequential scan: EXPLAIN SELECT * FROM table WHERE col=1 AND row=10; QUERY PLAN -- Seq Scan on table (cost=1.00..102612533.00 rows=1 width=14) Filter: ((col = 1) AND (row = 10)) What am I doing wrong? What type are row and col? If they're bigint (i.e. not int / int4) then you might need to quote the value to get the query to use an index: SELECT * FROM table WHERE col='1' AND row='10'; also, have you vacuumed / analyzed the table? I'm assuming yes. They're not bigints: CREATE TABLE table (col int2, row integer, val double precision) Yes, I vacuumed and analyzed, right after creating the index. Should I try and issue a few queries beforehand? --Dan ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] how much ram do i give postgres?
On Wed, 2004-10-20 at 07:25, Josh Close wrote: It's slow due to several things happening all at once. There are a lot of inserts and updates happening. There is periodically a bulk insert of 500k - 1 mill rows happening. I'm doing a vacuum anaylyze every hour due to the amount of transactions happening, and a vacuum full every night. All this has caused selects to be very slow. At times, a select count(1) from a table will take several mins. I don't think selects would have to wait on locks by inserts/updates would it? 1: Is the bulk insert being done inside of a single transaction, or as individual inserts? 2: Are your fsm settings high enough for an hourly vacuum to be effective? 3: How selective is the where clause for your select (1) query? If there is no where clause or the where clause isn't very selective, then there will be a sequential scan every time. Since PostgreSQL has to hit the table after using an index anyway, if it's going to retrieve a fair percent of a table, it just goes right to a seq scan, which for postgresql, is the right thing to do. Post explain analyze of your slowest queries to the performance list if you can. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] index not used?
On Wed, 2004-10-20 at 09:45, Dan Pelleg wrote: Scott Marlowe writes: On Wed, 2004-10-20 at 08:06, Dan Pelleg wrote: I'm trying to access a table with about 120M rows. It's a vertical version of a table with 360 or so columns. The new columns are: original item col, original item row, and the value. I created an index: CREATE INDEX idx on table (col, row) however, selects are still very slow. It seems it still needs a sequential scan: EXPLAIN SELECT * FROM table WHERE col=1 AND row=10; QUERY PLAN -- Seq Scan on table (cost=1.00..102612533.00 rows=1 width=14) Filter: ((col = 1) AND (row = 10)) What am I doing wrong? What type are row and col? If they're bigint (i.e. not int / int4) then you might need to quote the value to get the query to use an index: SELECT * FROM table WHERE col='1' AND row='10'; also, have you vacuumed / analyzed the table? I'm assuming yes. They're not bigints: CREATE TABLE table (col int2, row integer, val double precision) Yes, I vacuumed and analyzed, right after creating the index. Should I try and issue a few queries beforehand? but one is an int2 (i.e. not int / int4) so you'll need to quote that value to get an index to work. Note this is fixed in 8.0 I understand. ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] Sequence question
On Tue, Oct 19, 2004 at 11:19:05AM -0400, Eric E wrote: My users will draw a number or numbers from the sequence and write to the field. Sometimes, however, these sequence numbers will be discarded (after a transaction is complete), and thus available for use. During the transaction, however, any drawn numbers need to be unavailable. I would like the next user who draws a number to draw the lowest number she can, starting with the holes in the sequence. There are two ways I've seen to do this. One is the low-concurrency way. Another is a sort of clever approach that isn't theoretically perfect, but which provides slightly better concurrency. The low-concurrency approach is pretty much what you'd expect: keep the value in a table which is locked by each transaction which is incrementing it, and complete the incrementing in the transaction scope. That way, if it rolls back, the value hasn't been incremented, and is ready for the next user. The problem, of course, is that this forces every transaction to stand in line. An alternative approach I've heard is to pre-allocate numbers from a sequence into a table:

create table seq_allocation (
    serialno int8 not null unique,
    grant_status int constraint status_limiter
        check (grant_status in (1,2,3))
);

The idea is that a grant_status of 1 means the serial number is unallocated, a grant_status of 2 means it's pending, and 3 means it's granted. When you start, in one transaction you pick the next available serialno with a status of 1. Then you update that row to set it to 2 (make sure you use where grant_status = 1 to avoid a race condition), and then commit. Now you have your serial number. Use it, and then at the end of your transaction where you are committing, set the grant_status to 3, so you know it's really used. Now, how do you handle the cases where either the transaction fails so you can't set it to 3? Simple: your client captures errors and then sets the value back to 1 later. 
For client errors, you need yet another process which will go around periodically and check for grant_status = 2, and make sure nobody's actually in the middle of trying to use them. (You could refine the seq_allocation table by storing the pid of the allocating back end. Then your maintenance script could look for such a back end while cleaning up.) The savepoints features of 8.0 will make some of this even easier for you. Note that this second method is not completely bulletproof, but it might be good enough for the cases you want. I have a feeling, however, that you're creating a new problem for yourself by not being able to skip sequence values. My bet is that you actually need to find a better way to solve the other serious problems you have rather than banging on sequences to get them to fit your intended use. A -- Andrew Sullivan | [EMAIL PROTECTED] I remember when computers were frustrating because they *did* exactly what you told them to. That actually seems sort of quaint now. --J.D. Baldwin ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
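The status transitions described above might look like this in SQL (a hedged sketch against the seq_allocation table from the post; the literal serial number is illustrative only):

```sql
-- Claim the lowest unallocated number: 1 (unallocated) -> 2 (pending).
-- The re-check of grant_status = 1 guards against the race mentioned
-- above; if it updates zero rows, another session won and we retry.
BEGIN;
UPDATE seq_allocation
   SET grant_status = 2
 WHERE serialno = (SELECT min(serialno) FROM seq_allocation
                    WHERE grant_status = 1)
   AND grant_status = 1;
COMMIT;

-- In the transaction that finally uses the number: 2 (pending) -> 3 (granted).
UPDATE seq_allocation SET grant_status = 3 WHERE serialno = 1001;
-- On a caught error, the client returns the number to the pool:
UPDATE seq_allocation SET grant_status = 1 WHERE serialno = 1001;
```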
[GENERAL] undefined symbols
I'm trying to build the Postgres ODBC driver for Mac OS X. I thought I'd try to build it as a bundle from Xcode. It all compiles with no problem, but then at the end of the compile I get an undefined symbols error; here it is: ld: Undefined symbols: _CurrentMemoryContext _MemoryContextAlloc _pfree Any idea what might be causing this? How can I get rid of it? thanks Alex ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [GENERAL] index not used?
Scott Marlowe writes: On Wed, 2004-10-20 at 09:45, Dan Pelleg wrote: Scott Marlowe writes: On Wed, 2004-10-20 at 08:06, Dan Pelleg wrote: I'm trying to access a table with about 120M rows. It's a vertical version of a table with 360 or so columns. The new columns are: original item col, original item row, and the value. I created an index: CREATE INDEX idx on table (col, row) however, selects are still very slow. It seems it still needs a sequential scan: EXPLAIN SELECT * FROM table WHERE col=1 AND row=10; QUERY PLAN -- Seq Scan on table (cost=1.00..102612533.00 rows=1 width=14) Filter: ((col = 1) AND (row = 10)) What am I doing wrong? What type are row and col? If they're bigint (i.e. not int / int4) then you might need to quote the value to get the query to use an index: SELECT * FROM table WHERE col='1' AND row='10'; also, have you vacuumed / analyzed the table? I'm assuming yes. They're not bigints: CREATE TABLE table (col int2, row integer, val double precision) Yes, I vacuumed and analyzed, right after creating the index. Should I try and issue a few queries beforehand? but one is an int2 (i.e. not int / int4) so you'll need to quote that value to get an index to work. Note this is fixed in 8.0 I understand. Bingo. = explain select * from table where col='302' and row =100600400; QUERY PLAN - Index Scan using idx2 on table (cost=0.00..5.27 rows=1 width=14) Index Cond: ((col = 302::smallint) AND (row = 100600400)) (2 rows) = explain select * from table where col=302 and row =100600400; QUERY PLAN Seq Scan on table (cost=1.00..102612533.00 rows=1 width=14) Filter: ((col = 302) AND (row = 100600400)) (2 rows) Wow, that sure is a big difference for such a small change in the query. Thank you very much! ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Numeric user names
At 09:25 AM 10/19/2004 -0400, Ed Stoner wrote: I want to use bare numbers because that is how the users (students in this case) are identified on the network and in the student information system. They've been identified this way for over 20 years, so it would be near impossible to change at this point (although it is not always very convenient :-). I'm trying (and almost finished now) to have the postgres server be the source of all user account information on the network (windows and linux). Just curious - but it doesn't seem obvious why you need all postgresql users == all users in the student information system? Any reasons why? Assuming the student information system is an application, I'd have created a table and each basic user account info would be in its own row in that table, and link rows from other tables to those rows as necessary. Or maybe used something like LDAP (and add the necessary glue :( ). Of course if ALL students need to directly use the same postgresql database with their own individual accounts then that's probably a good reason. Regards, Link. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[GENERAL] Query buffer editing on Win32 version of postgresql?
I was attempting to set up my psql client on the Win32 version of postgres 8.0 beta 2 to be able to use an external editor. I set the environment variable in windows like so: PSQL_EDITOR=c:\progra~1\Textpa~1\Textpad.exe which does appear to work correctly. However, I get the following when attempting to edit the query buffer: db_merrymaids=# \e could not open temporary file .\psqA8C.tmp: File exists db_merrymaids=# If I specify a file using db_merrymaids=# \e somefile.txt Textpad does launch and ask to create the new file. But, I don't need to create a new file. I need to edit the built-in query buffer. :-) Any ideas why I'm getting the File exists error on the temp file? Bug maybe? Thanks for any help ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [GENERAL] how much ram do i give postgres?
On Wed, 20 Oct 2004 09:52:25 -0600, Scott Marlowe [EMAIL PROTECTED] wrote: 1: Is the bulk insert being done inside of a single transaction, or as individual inserts? The bulk insert is being done by COPY FROM STDIN. It copies in 100,000 rows at a time, then disconnects, reconnects, and copies 100k more, and repeats 'till done. There are no indexes on the tables that the copy is being done into either, so it won't be slowed down by that at all. 2: Are your fsm settings high enough for an hourly vacuum to be effective? What is fsm? I'll tell you when I find that out. 3: How selective is the where clause for your select (1) query? If there is no where clause or the where clause isn't very selective, then there will be a sequential scan every time. Since PostgreSQL has to hit the table after using an index anyway, if it's going to retrieve a fair percent of a table, it just goes right to a seq scan, which for postgresql, is the right thing to do. There was no where clause. Post explain analyze of your slowest queries to the performance list if you can. I don't think it's a query problem ( but I could optimize them more I'm sure ), 'cause the same query takes a long time when there are other queries happening, and not long at all when nothing else is going on. -Josh ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Sequence question
On Wed, Oct 20, 2004 at 11:57:42AM -0400, Andrew Sullivan wrote: Now, how do you handle the cases where either the transaction fails so you can't set it to 3? Simple: your client captures errors and then sets the value back to 1 later. Has anyone read the Sagas paper by Garcia-Molina? They present a way to handle extended transaction models, trying to cope sort-of automatically with this kind of situations. More generally, AFAIU the idea is to have multi-transaction recoverability and rollback-ability. It seems interesting. http://portal.acm.org/citation.cfm?doid=38713.38742 I have only skimmed through it, but it sounds somewhat interesting. I'd love to know what do people think of this. -- Alvaro Herrera (alvherre[a]dcc.uchile.cl) Los dioses no protegen a los insensatos. Éstos reciben protección de otros insensatos mejor dotados (Luis Wu, Mundo Anillo) ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Views + UNION ALL = Slow ?
The difficulty is that your view-based statement does not make use of any index, so the query must look at each tuple. It seems that UNION ALL requires a full scan of the participating relations. I don't know if it is possible, but try to create an index on the view ;-) Hagen

Otto Blomqvist wrote: Hello! I have two tables (each containing an individual month's data). One of them contains 500 thousand records and the other one about 40k, 8 columns. When I do a simple query on them individually it takes milliseconds to complete (see gory details below). For some queries I want to include data from multiple months, so I created a view using UNION ALL. But using the view it takes 31 seconds to complete the same query. I am obviously doing something wrong or using something the wrong way. Any ideas? /Otto Blomqvist

test=# explain analyze select fid_2 from file_92_904 where fid_4=1024;
NOTICE: QUERY PLAN:
Index Scan using file_92_904_ltn_idx on file_92_904 (cost=0.00..219.90 rows=65 width=4) (actual time=0.49..0.49 rows=0 loops=1)
Total runtime: 0.57 msec
EXPLAIN

test=# explain analyze select fid_2 from file_92_1004 where fid_4=1024;
NOTICE: QUERY PLAN:
Index Scan using file_92_1004_ltn_idx on file_92_1004 (cost=0.00..4505.20 rows=1197 width=4) (actual time=32.36..32.36 rows=0 loops=1)
Total runtime: 32.46 msec
EXPLAIN

test=# create view twotables as select * from file_92_1004 UNION ALL select * from file_92_904;
CREATE

test=# explain analyze select fid_2 from twotables where fid_4=1024;
NOTICE: QUERY PLAN:
Subquery Scan twotables (cost=1.00..200023000.53 rows=569553 width=203) (actual time=31590.97..31590.97 rows=0 loops=1)
  -> Append (cost=1.00..200023000.53 rows=569553 width=203) (actual time=12.13..30683.67 rows=569553 loops=1)
    -> Subquery Scan *SELECT* 1 (cost=1.00..100021799.06 rows=540306 width=199) (actual time=12.12..28417.81 rows=540306 loops=1)
      -> Seq Scan on file_92_1004 (cost=1.00..100021799.06 rows=540306 width=199) (actual time=12.09..14946.47 rows=540306 loops=1)
    -> Subquery Scan *SELECT* 2 (cost=1.00..11201.47 rows=29247 width=203) (actual time=0.19..1525.18 rows=29247 loops=1)
      -> Seq Scan on file_92_904 (cost=1.00..11201.47 rows=29247 width=203) (actual time=0.14..793.34 rows=29247 loops=1)
Total runtime: 31591.34 msec
EXPLAIN
---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] how much ram do i give postgres?
On 20 Oct 2004 at 11:37, Josh Close wrote: On Wed, 20 Oct 2004 09:52:25 -0600, Scott Marlowe [EMAIL PROTECTED] wrote: 1: Is the bulk insert being done inside of a single transaction, or as individual inserts? The bulk insert is being done by COPY FROM STDIN. It copies in 100,000 rows at a time, then disconnects, reconnects, and copies 100k more, and repeats 'till done. There are no indexes on the tables that the copy is being done into either, so it won't be slowed down by that at all. What about triggers? Also constraints (check constraints, integrity constraints)? All of these will slow the inserts/updates down. If you have integrity constraints, make sure you have indexes on the referenced columns in the referenced tables and make sure the data types are the same. How long does 100,000 rows take to insert exactly? How many updates are you performing each hour? Regards, Gary. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[GENERAL] create table/type
is there a way to create a table with a certain type? CREATE TYPE typename AS (id integer, name varchar); and something like CREATE TABLE names OF TYPE typename. Is there a syntax to support this? thanks, --h ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] Sequence question
Hi Tino, Many thanks for helping me. I know that the sequence issue is a troubling one for many on the list. Perhaps if I explain the need for a continuous sequence I can circumvent some of that: This database is for a laboratory, and the numbers in sequence determine storage locations for a sample. An empty physical space in our storage boxes tells us something has happened - the sample was used up, broken, in use, etc. - and lets us account for that missing sample. If the generated sequence has holes in it, we cannot tell if a sample is properly not in the rack, or if that hole was simply generated by the database. Allowing empties would also fill up limited box space with spaces generated by the database. If anyone has a brilliant idea for how a non-continuous sequence could address these needs, I'd be delighted to hear it, but short of that I think I have to keep this requirement. One thought I had, and I'd love to hear what people think of this, is to build a table of storage location numbers that are available for use. That way the search for new numbers could be pushed off until some convenient moment well after the user requests them. Thanks again for any ideas. Cheers, Eric Tino Wildenhain wrote: Hi, On Tue, 2004-10-19 at 01:16, Eric E wrote: Hi, I have a question about sequences. I need a field to have values with no holes in the sequence. However, the values do not need to be in order. My users will draw a number or numbers from the sequence and write to the field. Sometimes, however, these sequence numbers will be discarded (after a transaction is complete), and thus available for use. During the transaction, however, any drawn numbers need to be unavailable. I would like the next user who draws a number to draw the lowest number she can, starting with the holes in the sequence. This continuous sequence is absolutely required by our company, as the fact that the sequence has no holes is used to check for much more serious problems. 
I would recheck this requirement. What should actually be achieved with the check for no holes in the numbering? Remember you can always enumerate using a set-returning function or by means of a temporary sequence for a query. So my question is: what's the most effective way to get the next available number? There is none. My present method is to do a query that finds the first and last number in each of the holes, step through those holes, and then start generating new numbers. Unfortunately, this involves doing a table scan each time before I generate the number, and does not produce the transaction-safety I want. You cannot have your cake and eat it too - either you have holes, or you have transaction security, or you have bad performance by locking the whole table on insert. Regards Tino ---(end of broadcast)--- TIP 8: explain analyze is your friend
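For reference, the "locking" branch of Tino's trade-off is easy to sketch: a gap-free counter kept in an ordinary table, serialized by a row lock. This is a minimal illustration, not code from the thread; table and column names are made up.

    -- A gap-free counter: the row lock taken by the UPDATE serializes
    -- all allocators until COMMIT, which is exactly the cost Tino warns about.
    CREATE TABLE sample_counter (last_value integer NOT NULL);
    INSERT INTO sample_counter VALUES (0);

    BEGIN;
    UPDATE sample_counter SET last_value = last_value + 1;
    SELECT last_value FROM sample_counter;  -- the new, hole-free number
    -- ... insert the sample row using that number ...
    COMMIT;  -- a ROLLBACK un-does the increment, so no hole appears

Because the number is only incremented inside the transaction that uses it, a rollback leaves no gap - at the price of every allocator standing in line.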
Re: [GENERAL] files ending with .1 or .2
[EMAIL PROTECTED] (Leonardo Francalanci) writes: When a data file for a specific table (or index?) is larger than 1GB, it's split up into several parts. This is probably a leftover from the time OSs used to have problems with large files. Thank you. Is there any documentation I can read about this? It's discussed in the Douglas & Douglas book on PostgreSQL; I'm not sure where else it gets discussed... -- let name=cbbrowne and tld=cbbrowne.com in String.concat "@" [name;tld];; http://www.ntlug.org/~cbbrowne/linuxxian.html A VAX is virtually a computer, but not quite. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [GENERAL] how much ram do i give postgres?
On Wed, Oct 20, 2004 at 08:25:22 -0500, Josh Close [EMAIL PROTECTED] wrote: It's slow due to several things happening all at once. There are a lot of inserts and updates happening. There is periodically a bulk insert of 500k - 1 mill rows happening. I'm doing a vacuum analyze every hour due to the amount of transactions happening, and a vacuum full every night. All this has caused selects to be very slow. At times, a select count(1) from a table will take several mins. I don't think selects would have to wait on locks by inserts/updates, would they? You might not need to do the vacuum fulls that often. If your hourly vacuums run with a high enough fsm setting, they should be keeping the database from continually growing in size. At that point daily vacuum fulls are overkill, and if they are slowing down things you want to run quickly, you should cut back on them. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
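Bruno's fsm advice can be checked directly: on 7.4-era servers, a database-wide VACUUM VERBOSE ends with a free-space-map summary that can be compared against the configured limits. A hedged sketch (run as superuser):

    -- The tail of the output reports roughly:
    --   INFO:  free space map: N relations, N pages stored; N total pages needed
    -- Compare "pages needed" against the configured limits:
    VACUUM VERBOSE;
    SHOW max_fsm_pages;
    SHOW max_fsm_relations;

If "pages needed" exceeds max_fsm_pages, the hourly vacuums cannot track all the reusable space and the tables will keep growing despite them.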
Re: [GENERAL] Sequence question
On Wed, Oct 20, 2004 at 01:52:59PM -0400, Eric E wrote: One thought I had, and I'd love to hear what people think of this, is to build a table of storage location numbers that are available for use. That way the search for new numbers could be pushed off until some convenient moment well after the user requests them. That very application is how I heard the idea for the second method I sent in another email. In the case I was thinking of, someone used it for room allocation. It worked pretty well, as long as you can tolerate occasional periods where you _do_ have gaps. A -- Andrew Sullivan | [EMAIL PROTECTED] In the future this spectacle of the middle classes shocking the avant- garde will probably become the textbook definition of Postmodernism. --Brad Holland ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[GENERAL] max length bit(x) bit varying(x)
cc me please: I can't find in the HTML documentation the max length of a bit string. Anyone know where it is? ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] how much ram do i give postgres?
On Wed, 20 Oct 2004 18:47:25 +0100, Gary Doades [EMAIL PROTECTED] wrote: What about triggers? Also constraints (check constraints, integrity constraints). All these will slow the inserts/updates down. No triggers or constraints. There are some foreign keys, but the tables that have the inserts don't have anything on them, not even indexes, to help speed up the inserts. If you have integrity constraints make sure you have indexes on the referenced columns in the referenced tables and make sure the data types are the same. How long does 100,000 rows take to insert exactly? I believe with the bulk inserts, 100k only takes a couple mins. How many updates are you performing each hour? I'm not sure about this. Is there a pg stats table I can look at to find this out? I suppose I could do a count on the time stamp also. I'll let you know when I find out. Regards, Gary. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] how much ram do i give postgres?
On Wed, 20 Oct 2004 13:35:43 -0500, Bruno Wolff III [EMAIL PROTECTED] wrote: You might not need to do the vacuum fulls that often. If the your hourly vacuums have a high enough fsm setting, they should be keeping the database from continually growing in size. At that point daily vacuum fulls are overkill and if they are slowing stuff down you want to run quickly, you should cut back on them. I have the vacuum_mem set at 32M right now. I haven't changed the fsm settings at all though. -Josh ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] Sequence question
Hi Andrew, I had basically started working on an idea like the second approach, but had not been able to put the status element so clearly. I really like the statuses of available, pending, and granted. There's one more twist I think I can use to optimize this: once a number is assigned, it cannot be reassigned. So I think I can add:

- have the sequence preallocation table hold only numbers with status available or pending, i.e., delete numbers once they have been allocated. This leaves only two possible statuses: available and pending.
- push off getting new numbers into the preallocation table with a full-table search until convenient times.

I also liked your point about the atomicity of: get number, change status to pending, commit. After that one can proceed with writing the number into my data table. My thought was that you could set the status using a session ID. That way a server-side job could look for expired sessions and re-mark those numbers as available. Any thoughts? This point is definitely important. I have a feeling, however, that you're creating a new problem for yourself by not being able to skip sequence values. My bet is that you actually need to find a better way to solve the other serious problems you have rather than banging on sequences to get them to fit your intended use. I just haven't really seen any way around the need to use all of our storage rows that doesn't involve a complicated mapping to boxes. Thanks, Eric Andrew Sullivan wrote: On Tue, Oct 19, 2004 at 11:19:05AM -0400, Eric E wrote: My users will draw a number or numbers from the sequence and write to the field. Sometimes, however, these sequence numbers will be discarded (after a transaction is complete), and thus available for use. During the transaction, however, any drawn numbers need to be unavailable. I would like the next user who draws a number to draw the lowest number she can, starting with the holes in the sequence. There are two ways I've seen to do this.
One is the low-concurrency way. Another is a sort of clever approach that isn't theoretically perfect, but which provides slightly better concurrency. The low-concurrency approach is pretty much what you'd expect: keep the value in a table which is locked by each transaction which is incrementing it, and complete the incrementing in the transaction scope. That way, if it rolls back, the value hasn't been incremented, and is ready for the next user. The problem, of course, is that this forces every transaction to stand in line. An alternative approach I've heard is to pre-allocate numbers from a sequence into a table:

    create table seq_allocation (
        serialno int8 not null unique,
        grant_status int constraint status_limiter
            check (grant_status in (1,2,3))
    );

The idea is that a grant_status of 1 means the serial number is unallocated, a grant_status of 2 means it's pending, and 3 means it's granted. When you start, in one transaction you pick the next available serialno with a status of 1. Then you update that row to set it to 2 (make sure you use where grant_status = 1 to avoid a race condition), and then commit. Now you have your serial number. Use it, and then at the end of your transaction where you are committing, set the grant_status to 3, so you know it's really used. Now, how do you handle the case where the transaction fails and you can't set it to 3? Simple: your client captures errors and then sets the value back to 1 later. For client errors, you need yet another process which will go around periodically and check for grant_status = 2, and make sure nobody's actually in the middle of trying to use them. (You could refine the seq_allocation table by storing the pid of the allocating back end. Then your maintenance script could look for such a back end while cleaning up.) The savepoint features of 8.0 will make some of this even easier for you.
Note that this second method is not completely bulletproof, but it might be good enough for the cases you want. I have a feeling, however, that you're creating a new problem for yourself by not being able to skip sequence values. My bet is that you actually need to find a better way to solve the other serious problems you have rather than banging on sequences to get them to fit your intended use. A ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
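Andrew's allocation dance can be written out as SQL, assuming the seq_allocation table he defines; the serial number 42 below stands in for whatever the first SELECT actually returns.

    -- Step 1: reserve the lowest available number (status 1 -> 2).
    BEGIN;
    SELECT serialno FROM seq_allocation
     WHERE grant_status = 1
     ORDER BY serialno LIMIT 1
     FOR UPDATE;                       -- suppose this returns 42
    UPDATE seq_allocation
       SET grant_status = 2
     WHERE serialno = 42
       AND grant_status = 1;           -- the race-condition guard he mentions
    COMMIT;                            -- 42 is now pending

    -- Step 2: in the transaction that finally uses the number:
    UPDATE seq_allocation SET grant_status = 3 WHERE serialno = 42;

    -- On failure, the client (or the cleanup job) puts it back:
    UPDATE seq_allocation SET grant_status = 1 WHERE serialno = 42;

The FOR UPDATE makes two concurrent allocators collide on the same candidate row; the loser's guarded UPDATE then matches nothing and it must retry with the next number.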
Re: [GENERAL] how much ram do i give postgres?
On 20 Oct 2004 at 13:34, Josh Close wrote: How long does 100,000 rows take to insert exactly? I believe with the bulk inserts, 100k only takes a couple mins. Hmm, that seems a bit slow. How big are the rows you are inserting? Have you checked the cpu and IO usage during the inserts? You will need to do some kind of cpu/IO monitoring to determine where the bottleneck is. What hardware is this on? Sorry if you specified it earlier, I can't seem to find mention of it. Cheers, Gary. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] how much ram do i give postgres?
On Wed, 20 Oct 2004 19:59:38 +0100, Gary Doades [EMAIL PROTECTED] wrote: Hmm, that seems a bit slow. How big are the rows you are inserting? Have you checked the cpu and IO usage during the inserts? You will need to do some kind of cpu/IO monitoring to determine where the bottleneck is. The bulk inserts don't take full cpu. Between 40% and 80%. On the other hand, a select will take 99% cpu. What hardware is this on? Sorry if you specified it earlier, I can't seem to find mention of it. It's on a P4 HT with 1,128 megs ram. -Josh ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] Sequence question
Hi, Am Mi, den 20.10.2004 schrieb Eric E um 19:52: Hi Tino, Many thanks for helping me. I know that the sequence issue is a troubling one for many on the list. Perhaps if I explain the need for a continuous sequence I can circumvent some of that: This database is for a laboratory, and the numbers in sequence determine storage locations for a sample. Having a physical space in our storage boxes tells us something has happened - the sample was used up, broken, in use, etc. - and lets us account for that missing sample. If the generated sequence has holes in it, we cannot tell if a sample is properly not in the rack, or if that hole was simply generated by the database. Allowing empties would also fill up limited box space with gaps generated by the database. If anyone has a brilliant idea for how a non-continuous sequence could address these needs, I'd be delighted to hear it, but short of that I think I have to keep this requirement. Maybe you can skip the sequence thingy altogether in this case and use an approach like this: initialize a table with all possible locations and mark them as empty.

    CREATE TABLE locations (location_id int2, taken bool);

(you might want to have a timestamp for changes too) Whenever you change the state of a location, do it like this (perhaps in a function):

    SELECT INTO loc_id location_id FROM locations WHERE NOT taken FOR UPDATE;
    IF FOUND THEN
        UPDATE locations SET taken = true WHERE location_id = loc_id;
    ELSE
        RAISE EXCEPTION 'no free location anymore';
    ...

AND the other way round for freeing a location. The SELECT ... FOR UPDATE should lock the candidate position in the table so concurrent transactions have to wait and then find another free cell when they wake up. Advantage: not a full table scan. Only the first matching row should be used and locked. Note this is only a rough sketch and you should look for the actual syntax and more flesh for the function. Regards Tino ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faqs/FAQ.html
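Tino's rough sketch can be fleshed out into a complete function. This is a hedged sketch only: the table/column names come from his message, the LIMIT 1 and ORDER BY are additions to match his "only the first matching row" intent, and the quoting style is the pre-8.0 single-quoted function body.

    -- Assumes Tino's table: CREATE TABLE locations (location_id int2, taken bool);
    CREATE OR REPLACE FUNCTION take_location() RETURNS int2 AS '
    DECLARE
        loc_id int2;
    BEGIN
        -- grab and lock the lowest free location
        SELECT INTO loc_id location_id FROM locations
         WHERE NOT taken
         ORDER BY location_id
         LIMIT 1
         FOR UPDATE;
        IF FOUND THEN
            UPDATE locations SET taken = true WHERE location_id = loc_id;
            RETURN loc_id;
        ELSE
            RAISE EXCEPTION ''no free location anymore'';
        END IF;
    END;
    ' LANGUAGE plpgsql;

Callers simply run SELECT take_location(); the row lock from FOR UPDATE makes concurrent callers wait, and each gets a distinct location once the first transaction commits.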
Re: [GENERAL] Sequence question
Hmm, that's a really interesting idea, Tino. Since we're probably talking about 100 numbers max, a query on this table would work fairly fast, and be operationally simple. I'll think about that. Thanks, Eric Tino Wildenhain wrote: Hi, Am Mi, den 20.10.2004 schrieb Eric E um 19:52: Hi Tino, Many thanks for helping me. I know that the sequence issue is a troubling one for many on the list. Perhaps if I explain the need for a continuous sequence I can circumvent some of that: This database is for a laboratory, and the numbers in sequence determine storage locations for a sample. Having a physical space in our storage boxes tells us something has happened - the sample was used up, broken, in use, etc. - and lets us account for that missing sample. If the generated sequence has holes in it, we cannot tell if a sample is properly not in the rack, or if that hole was simply generated by the database. Allowing empties would also fill up limited box space with gaps generated by the database. If anyone has a brilliant idea for how a non-continuous sequence could address these needs, I'd be delighted to hear it, but short of that I think I have to keep this requirement. Maybe you can skip the sequence thingy altogether in this case and use an approach like this: initialize a table with all possible locations and mark them as empty. CREATE TABLE locations (location_id int2, taken bool); (you might want to have a timestamp for changes too) Whenever you change the state of a location, do it like this (perhaps in a function): SELECT INTO loc_id location_id FROM locations WHERE NOT taken FOR UPDATE; IF FOUND THEN UPDATE locations SET taken = true WHERE location_id = loc_id; ELSE RAISE EXCEPTION 'no free location anymore'; ... AND the other way round for freeing a location. The SELECT ... FOR UPDATE should lock the candidate position in the table so concurrent transactions have to wait and then find another free cell when they wake up. Advantage: not a full table scan. Only the first matching row should be used and locked. Note this is only a rough sketch and you should look for the actual syntax and more flesh for the function. Regards Tino ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] Sequence question
On Wed, Oct 20, 2004 at 02:59:27PM -0400, Eric E wrote: - have the sequence preallocation table hold only numbers with status available or pending, i.e., delete numbers once they have been allocated. This leaves only two possible statuses: available and pending. I would argue that you're best to record everything. That is, your proposal moves from a table which covers all the possible states -- granted, pending, and available -- to a table which entails ambiguous answers to some questions. Since you're not going to preallocate every logically possible id, then if an id isn't in the table, you can't tell if it simply has never been allocated, or if it has already been used. If you have that stored elsewhere, though, you can get it from a join, so perhaps there's no need for this. (I note that a real normalisation freak, like the one I occasionally play on TV, would require you to use a REFERENCES constraint on the status value, and use the referenced table as a control for what status values you can use. This has the not inconsiderable benefit that if you have a new status -- say, storage burned down or impounded by SCO for copyright violation or something else -- you have a completely trivial way to add it. It's certainly the way I'd actually do this.) I also liked your point about the atomicity of: get number, change status to pending, commit. The real problem with it is that you do have the possibility of orphaned pending actions. My thought was that you could set the status using a session ID. That way a server-side job could look for expired sessions and re-mark those numbers as available. That's something like what I'd do, yes. It mostly depends on what's available to your application. I tend to be very belt-and-suspenders about this sort of thing. Probably I'd put a wall-clock timestamp on the field, too, to give me clues about when things might be going wrong, etc.
find a better way to solve the other serious problems you have rather than banging on sequences to get them to fit your intended use. I just haven't really seen any way around the need to use all of our storage rows that doesn't involve a complicated mapping to boxes. I was more concerned that you were trying to do this for invoice numbers, another place where people often require serial numbers. In that case, I usually think they're wrong, because I can think of plenty of better ways to solve that one (unless it's a legal requirement, which it sometimes is). But mapping data points to places in space is one of those cases where you probably _do_ need this sort of preallocation mechanism. It's what hotels do, after all. A -- Andrew Sullivan | [EMAIL PROTECTED] The fact that technology doesn't work is no bar to success in the marketplace. --Philip Greenspun ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] how much ram do i give postgres?
On 20 Oct 2004 at 14:09, Josh Close wrote: On Wed, 20 Oct 2004 19:59:38 +0100, Gary Doades [EMAIL PROTECTED] wrote: Hmm, that seems a bit slow. How big are the rows you are inserting? Have you checked the cpu and IO usage during the inserts? You will need to do some kind of cpu/IO monitoring to determine where the bottleneck is. The bulk inserts don't take full cpu. Between 40% and 80%. On the other hand, a select will take 99% cpu. Is this the select count(1) query? Please post an explain analyze for this and any other slow queries. I would expect the selects to take 99% cpu if all the data you were trying to select was already in memory. Is this the case in general? I can do a select count(1) on a 500,000 row table in about 1 second on an Athlon 2800+ if all the data is cached. It takes about 25 seconds if it has to fetch it from disk. I have just done a test inserting (via COPY) 149,000 rows into a table with 23 columns, mostly numeric, some int4, 4 timestamps. This took 28 seconds on my Windows XP desktop, Athlon 2800+, 7200 rpm SATA disk, Postgres 8.0 beta 2. It used around 20% to 40% cpu during the copy. The only index was the int4 primary key, nothing else. How does this compare? What hardware is this on? Sorry if you specified it earlier, I can't seem to find mention of it. It's on a P4 HT with 1,128 megs ram. Disk system?? Regards, Gary. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] how much ram do i give postgres?
On Wed, 20 Oct 2004 20:49:54 +0100, Gary Doades [EMAIL PROTECTED] wrote: Is this the select count(1) query? Please post an explain analyze for this and any other slow queries. I think it took so long 'cause it wasn't cached. The second time I ran it, it took less than a second. How can you tell if something is cached? Is there a way to see what's in cache? I would expect the selects to take 99% cpu if all the data you were trying to select was already in memory. Is this the case in general? I can do a select count(1) on a 500,000 row table in about 1 second on an Athlon 2800+ if all the data is cached. It takes about 25 seconds if it has to fetch it from disk. I think that's what's going on here. I have just done a test inserting (via COPY) 149,000 rows into a table with 23 columns, mostly numeric, some int4, 4 timestamps. This took 28 seconds on my Windows XP desktop, Athlon 2800+, 7200 rpm SATA disk, Postgres 8.0 beta 2. It used around 20% to 40% cpu during the copy. The only index was the int4 primary key, nothing else. Well, there are 3 text columns or so, and that's why the COPY takes longer than yours. That hasn't been a big issue though. It copies fast enough. How does this compare? Disk system?? It's an IDE RAID 1 config, I believe. So it's not too fast. It will soon be on a SCSI RAID 5 array. That should help speed some things up also. Regards, Gary. What about the postgresql.conf config settings? This is what I have and why.

shared_buffers = 21250 -- This is 174 megs, which is 15% of total ram. I read somewhere that it should be between 12-15% of total ram.
sort_mem = 32768 -- This is default.
vacuum_mem = 32768 -- This is 32 megs. I put it that high because of something I read here http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
#max_fsm_pages = 2 -- Default. I would think this could be upped more, but I don't know how much.
effective_cache_size = 105750 -- This is 846 megs ram, which is 75% of total mem.
I put it there 'cause of a reply I got on the performance list. I made all these changes today, and haven't had much of a chance to speed test postgres since. Any thoughts on these settings? -Josh ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
[GENERAL] CREATE TEMPORARY TABLE AS ... ON COMMIT?
Is the ON COMMIT syntax available to temporary tables created using the CREATE TABLE AS syntax? If not, is there a way to drop such a table at the end of a transaction? -tfo -- Thomas F. O'Connell Co-Founder, Information Architect Sitening, LLC http://www.sitening.com/ 110 30th Avenue North, Suite 6 Nashville, TN 37203-6320 615-260-0005 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] OID and PK/FK KEYS
A better solution is to use the serial data type. OID is deprecated and may go away. http://www.postgresql.org/docs/7.4/static/datatype.html#DATATYPE-SERIAL On 19 Oct 2004 07:54:36 -0700, Raffaele Spizzuoco [EMAIL PROTECTED] wrote: Hi! I'm from Italy, and sorry about my English... I have a question that I know has already been discussed on the lists, but I still have some doubts. I have seen that it is technically possible to use OID as a PRIMARY KEY and as a FOREIGN KEY, but is it correct to do so for the database's logical integrity? Is it better that I use other keys, and not OID, to avoid the possible wraparound? Or is the wraparound an extreme case, so that I can quietly use OID as PRIMARY and FOREIGN KEY? Thanks ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] how much ram do i give postgres?
On 20 Oct 2004 at 15:36, Josh Close wrote: On Wed, 20 Oct 2004 20:49:54 +0100, Gary Doades [EMAIL PROTECTED] wrote: Is this the select count(1) query? Please post an explain analyze for this and any other slow queries. I think it took so long 'cause it wasn't cached. The second time I ran it, it took less than a second. How can you tell if something is cached? Is there a way to see what's in cache? No. The OS caches the data as read from the disk. If you need the data to be in memory for performance then you need to make sure you have enough available RAM to hold your typical result sets if possible. What about the postgresql.conf config settings? This is what I have and why. sort_mem = 32768 This is default. This is not the default. The default is 1000. You are telling Postgres to use 32 megs for *each* sort that is taking place. If you have several queries each performing large sorts you can quickly eat up available RAM this way. If you will only have a small number of concurrent queries performing sorts then this may be OK. Don't forget, a single query can perform more than one sort operation. If you have 10 large sorts happening at the same time, you can eat up to 320 megs this way! You will need to tell us the number of updates/deletes you are having. This will determine the vacuum needs. If the bulk of the data is inserted you may only need to analyze frequently, not vacuum. In order to get more help you will need to supply the update/delete frequency and the explain analyze output from your queries. Regards, Gary. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] how much ram do i give postgres?
On Wed, 20 Oct 2004 23:43:54 +0100, Gary Doades [EMAIL PROTECTED] wrote: You will need to tell us the number of updates/deletes you are having. This will determine the vacuum needs. If the bulk of the data is inserted you may only need to analyze frequently, not vacuum. In order to get more help you will need to supply the update/delete frequency and the explain analyze output from your queries. I will have to gather this information for you. -Josh ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[GENERAL] table size/record limit
I am designing something that may be the size of yahoo, google, ebay, etc. Just ONE many-to-many table could possibly have the following characteristics: 3,600,000,000 records, each record being 9 fields of INT4/DATE. Other tables will have about 5 million records of about the same size. There are lots of scenarios here to lessen this. BUT, is postgres on linux, maybe necessarily a 64-bit system, capable of this? And there'd be 4-5 indexes on that table. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] CREATE TEMPORARY TABLE AS ... ON COMMIT?
On Thu, 2004-10-21 at 06:40, Thomas F.O'Connell wrote: Is the ON COMMIT syntax available to temporary tables created using the CREATE TABLE AS syntax? No, but it should be. There's a good chance this will be in 8.1 If not, is there a way to drop such a table at the end of a transaction? DROP TABLE :) -Neil ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
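Until ON COMMIT is accepted by CREATE TABLE AS, one workaround is to split the operation in two: create the temp table with ON COMMIT DROP, then populate it. A hedged sketch (some_table and its columns are hypothetical):

    BEGIN;
    CREATE TEMPORARY TABLE tmp_results (id int4, name text) ON COMMIT DROP;
    INSERT INTO tmp_results
        SELECT id, name FROM some_table WHERE name LIKE 'a%';
    -- ... use tmp_results within this transaction ...
    COMMIT;  -- the table vanishes here automatically

The cost over plain CREATE TABLE AS is having to spell out the column list, but you get the automatic cleanup Neil's DROP TABLE answer otherwise leaves to the application.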
[GENERAL] Invalid page header
I have 5 corrupted page headers as evidenced by these errors: ERROR: Invalid page header in block 13947 of ... The corruption is causing numerous queries to abort. First option is to try to salvage data before attempting a restore from backup. I want to try to edit the file to zero out the bogus headers. I realize there may be data lost from this attempt. I have scalpel (binary editor) in hand. Which bytes should I edit, and what do I make them?

$ pg_filedump -if -R 1343 1343 25268878.38650946

* PostgreSQL File/Block Formatted Dump Utility - Version 1.1
* File: 25268878.38650946
* Options used: -if -R 1343 1343
* Dump created on: Wed Oct 20 19:14:06 2004

Block 1343
Header -
 Block Offset: 0x00a7e000
 Offsets: Lower 0 (0x0000)  Upper 0 (0x0000)
 Block: Size 0  Version 0
 LSN: logid 0 recoff 0x00000000
 Special 0 (0x0000)
 Items: -5  Free Space: 0
 Length (including item array): 24
 Error: Invalid header information.

Data --
 Error: Item index corrupt on block. Offset: -5.

Special Section -
 Error: Invalid special section encountered.
 Error: Special section points off page. Unable to dump contents.

*** End of Requested Range Encountered. Last Block Read: 1343 ***

---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Invalid page header
On Wednesday October 20 2004 5:34, Ed L. wrote: I have 5 corrupted page headers as evidenced by these errors: ERROR: Invalid page header in block 13947 of ... The corruption is causing numerous queries to abort. First option is to try to salvage data before attempting a restore from backup. I want to try to edit the file to zero out the bogus headers. I realize there may be data lost from this attempt. I have scalpel (binary editor) in hand. Which bytes should I edit, and what do I make them? In other words, how do I calculate which bytes to zero to simulate zero_damaged_pages? TIA. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
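For the archives: assuming the default 8 KB block size (BLCKSZ = 8192), zeroing a whole damaged page in place is a one-liner with dd, which is effectively what zero_damaged_pages does. This is a hedged sketch, not a supported procedure: stop the postmaster first, keep a backup copy, and take the file name and block number from the error message, as Ed describes.

```shell
# zero_page FILE BLOCK: overwrite one 8 KB page with zeros, in place.
# conv=notrunc keeps the rest of the file intact; a .bak copy is kept.
zero_page() {
    file=$1; block=$2
    cp "$file" "$file.bak"
    dd if=/dev/zero of="$file" bs=8192 seek="$block" count=1 conv=notrunc
}

# For Ed's case (run inside the relation's directory, postmaster stopped),
# the call would presumably be:
# zero_page 25268878.38650946 1343
```

After restarting, the zeroed page reads back as empty, so the rows it held are gone but the queries touching the rest of the table can proceed.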
[GENERAL] inquiry on readline functionality on psql client
Hello, I appreciate very much the readline functionality of the psql client. However, I'd like to ask if it is possible for the readline functionality to complete even table names and field names. For example, if I have tables `table1' with 3 fields `field1', `field2', and `field3' and `table2' with 3 fields `field1', `field2', and `field3', is it possible to type `select <TAB>' and, after tabbing, select `table1', then type `select table1.<TAB>' and, after tabbing, select `field2'? Thus, I would have the complete statement `select table1.field2 from table1'. Well, the above statement is equivalent to `select field1 from table1' but I was wondering how the name-completion scenario I presented is possible? If it is not possible to do it with any configuration files, could anyone point out at source code level what can be done, since I'd like to try playing around with this feature. Thank you very much. Best Regards, Carlo -- Carlo Florendo Astra Philippines Inc. www.astra.ph ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[GENERAL] generic sql question
My question: is it possible to speed up a query by doing preselects? What I'm working on could end up being a very large dataset. I hope to have 100-1000 queries per second (or more?), and if very large tables are joined with very large tables, I imagine that the memory would get very full, overfull? So in the schema below the following queries: usrs own articles, articles are of article types, issues have dates and names, and issues_published has an issue id and sets of articles; an article can be in many issues. So if I wanted to find articles of a certain article type, within a certain date range, that had actually been published, I believe that this query could find it, joining three tables and then doing the qualifications for date and type (assume values in {} are escaped and proper syntax):

-

SELECT article_id
FROM issues_published, issues, articles
WHERE issues_published.article_id = articles.article_id
  AND issues_published.issue_id = issues.issue_id
  AND articles.article_type = {article_type_id desired}
  AND article.article_date < {highest date}
  AND issues.article_date > {lowest date};

But would the following reduce the size of the join in memory?
SELECT article_id
FROM (SELECT * FROM articles
      WHERE article_date <= {highest date}
        AND article_date >= {lowest date}) AS articles_in_range,
     issues, issues_published
WHERE issues_published.article_id = articles_in_range.article_id
  AND issues_published.issue_id = issues.issue_id
  AND articles_in_range.article_type = {article type desired};

CREATE TABLE usr (
    usr_id SERIAL NOT NULL,
    PRIMARY KEY (usr_id)
);

CREATE TABLE article_types (
    ariticle_type_id SERIAL NOT NULL,
    article_type VARCHAR(40) NOT NULL,
    PRIMARY KEY (ariticle_type_id)
);

CREATE TABLE articles (
    article_id SERIAL NOT NULL,
    ariticle_type_id INT4 NOT NULL,
    author INT4 NOT NULL,
    body TEXT NOT NULL,
    date_written DATE NOT NULL,
    PRIMARY KEY (article_id, ariticle_type_id, author)
);

CREATE TABLE issues (
    issue_id SERIAL NOT NULL,
    issue_title VARCHAR(40) NOT NULL,
    issue_date DATE NOT NULL,
    PRIMARY KEY (issue_id)
);

CREATE TABLE issues_published (
    issue_id INT4 NOT NULL,
    article_id INT4 NOT NULL,
    PRIMARY KEY (issue_id, author, ariticle_type_id, article_id)
);

/*==*/
/* Foreign Keys */
/*==*/

ALTER TABLE articles ADD FOREIGN KEY (author) REFERENCES usr (usr_id);
ALTER TABLE articles ADD FOREIGN KEY (ariticle_type_id) REFERENCES article_types (ariticle_type_id);
ALTER TABLE issue_articles ADD FOREIGN KEY (issue_id) REFERENCES issues (issue_id);
ALTER TABLE issue_articles ADD FOREIGN KEY (author, ariticle_type_id, article_id) REFERENCES articles (author, ariticle_type_id, article_id);
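One way to check whether the subselect formulation actually changes anything is to let the planner answer: run both versions under EXPLAIN ANALYZE and compare row estimates and actual times. A sketch against the schema above -- the literal type id and dates are placeholders standing in for the {} values:

```sql
-- Plan and time the flat-join version; repeat with the subselect
-- version and compare.  '1' and the date range are placeholder values.
EXPLAIN ANALYZE
SELECT articles.article_id
FROM issues_published, issues, articles
WHERE issues_published.article_id = articles.article_id
  AND issues_published.issue_id = issues.issue_id
  AND articles.ariticle_type_id = 1
  AND articles.date_written BETWEEN '2004-01-01' AND '2004-12-31';
```

If the two plans come out identical, the planner has already flattened the subselect and the rewrite buys nothing.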
Re: [GENERAL] inquiry on readline functionality on psql client
Carlo Florendo wrote: [...] Thus, I would have the complete statement `select table1.field2 from table1'. Well, the above statement is equivalent to `select field1 from table1' Sorry, the above sql should read `select field2 from table1'. Thank you very much. Best Regards, Carlo -- Carlo Florendo Astra Philippines Inc. www.astra.ph
Re: [GENERAL] Invalid page header
On Wednesday October 20 2004 10:12, Ed L. wrote: On Wednesday October 20 2004 10:00, Tom Lane wrote: Ed L. [EMAIL PROTECTED] writes: In other words, how do I calculate which bytes to zero to simulate zero_damaged_pages? Why simulate it, when you can just turn it on? But anyway, the answer is the whole page. Old 7.3.4 installation, didn't realize that feature was there. Thx. That worked for 3 of 4 cases, but for the fourth, I see the message that it's zeroing the page, yet it then continues to report an invalid page header for that block... maybe the header is too fouled up to fix? Ed
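For reference, the recovery pass being discussed looks roughly like the sketch below. The table name is a placeholder; zero_damaged_pages is a superuser-only setting, and zeroed pages lose whatever rows they held, so take a filesystem-level copy of the data directory before trying this:

```sql
SET zero_damaged_pages = on;
-- A full sequential read pushes every page of the table through the
-- buffer manager; pages with invalid headers are zeroed (in memory)
-- instead of raising an error, so the scan can complete.
SELECT count(*) FROM damaged_table;
SET zero_damaged_pages = off;
```

After the scan, dumping and reloading the table (or the whole database) is the usual way to make sure no further corruption lurks on disk.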
Re: [GENERAL] Invalid page header
Ed L. [EMAIL PROTECTED] writes: In other words, how do I calculate which bytes to zero to simulate zero_damaged_pages?? Why simulate it, when you can just turn it on? But anyway, the answer is the whole page. regards, tom lane
Re: [GENERAL] Invalid page header
On Wednesday October 20 2004 10:00, Tom Lane wrote: Ed L. [EMAIL PROTECTED] writes: In other words, how do I calculate which bytes to zero to simulate zero_damaged_pages?? Why simulate it, when you can just turn it on? But anyway, the answer is the whole page. Old 7.3.4 installation, didn't realize that feature was there. Thx.
Re: [GENERAL] generic sql question
I do not think writing the query in the second form differs from the first one. In both cases, only the relevant articles (in range and of the desired type) will come out of the scan operator that scans the articles table. --h

Dennis Gearon [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] My question is it possible to speed up a query doing preselects? [...] But would the following reduce the size of the join in memory? [original query and schema snipped]
Re: [GENERAL] table size/record limit
Hi, On Thu, 2004-10-21 at 1:30, Dennis Gearon wrote: I am designing something that may be the size of yahoo, google, ebay, etc. Just ONE many-to-many table could possibly have the following characteristics: 3,600,000,000 records, each record being 9 fields of INT4/DATE. Other tables will have about 5 million records of about the same size. There are lots of scenarios here to lessen this. BUT, is postgres on linux, maybe necessarily a 64-bit system, capable of this? And there'd be 4-5 indexes on that table. Sure. Why not? 3 to 5 million records is not really a problem. We had bigger tables with historic commercial transactions (even on an old dual PIII/1000) with fine performance. I bet, however, yahoo and google at least are much bigger :-) Regards Tino
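For scale, a back-of-envelope size estimate for the 3.6-billion-row table is easy to sketch. The 24-byte per-tuple header is an assumption (the exact per-row and per-page overhead varies by PostgreSQL version, and indexes come on top), but it shows the order of magnitude:

```python
# Rough heap-size estimate for the many-to-many table described above:
# 3.6 billion rows, 9 columns of INT4/DATE (4 bytes each).
# TUPLE_HEADER is an assumed per-row overhead, not an exact figure.

ROWS = 3_600_000_000
FIELDS = 9
BYTES_PER_FIELD = 4   # INT4 and DATE are both 4 bytes on disk
TUPLE_HEADER = 24     # approximate per-row overhead (assumption)

row_bytes = FIELDS * BYTES_PER_FIELD + TUPLE_HEADER  # 60 bytes/row
table_bytes = ROWS * row_bytes                       # 216,000,000,000

print(f"~{row_bytes} bytes/row, ~{table_bytes / 1024**3:.0f} GiB heap")
# prints: ~60 bytes/row, ~201 GiB heap
```

A couple of hundred gigabytes of heap (before indexes) is large but well within what a single 64-bit PostgreSQL instance can address; the practical questions are I/O bandwidth and index maintenance, not hard limits.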
Re: [GENERAL] table size/record limit
Google probably is much bigger, and on mainframes, and probably Oracle or DB2. But the table I am worried about is the one sized at 3.6 GIGA records. Tino Wildenhain wrote: Sure. Why not? 3 to 5 million records is not really a problem. We had bigger tables with historic commercial transactions (even on an old dual PIII/1000) with fine performance. I bet, however, yahoo and google at least are much bigger :-)