Re: [GENERAL] Amazon EC2 | Any recent developments
Hi,

On Tue, Jun 16, 2009 at 11:05 AM, Greg Smith wrote:
> You just have to recognize that the volumes are statistically pretty
> fragile compared to a traditional RAID configuration on dedicated
> hardware and plan accordingly.

I agree completely. I think the advantage is that it FORCES you to plan for failure. Now, I know we all SHOULD plan for failure, but I also know how many do not...

Bye,

Guy.

Family management on rails: http://www.famundo.com
My development related blog: http://devblog.famundo.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Amazon EC2 | Any recent developments
Hi,

> So, when a cloud machine fails does it get de-allocated/wiped out? Or
> is it still out there in a bad state? How do you recover your data?

It depends. Sometimes it dies and you can't do anything with it. In other cases you can restart it. As we store the data on EBS (which is network storage in Amazon's cloud), the data is not tied to the instance directly. And that's the beauty of it. If a machine dies I can launch a replacement machine (it takes about 3-5 minutes for it to boot and be ready), or in some cases I just have a replacement instance waiting. I then mount the EBS volume holding the Postgres data, let the server do the recovery if needed, and I'm back online. No need to replace a chip, go to the DC, etc...

In case the volume got corrupted (a very rare situation, as the EBS volumes are very durable), there are snapshots I can recover from, and the WAL files I stream to another storage system (Amazon's S3). I have some systems where I take daily tar backups of the PG directory and ship those to separate storage.

There is no doubt you can get a much stronger machine by building your own hardware, but the ability to easily recover, launch and cluster is a huge advantage for the cloud. It does take getting used to, though. You need to think of machines as expendable, and plan for easy failure preparation and replacement. It does make you really prepare and test your recovery strategies. I know of too many companies that just trust the DB to be OK. And it is, most of the time, but when a catastrophe happens, recovery is a long and risky process.

Bye,

Guy.

--
Family management on rails: http://www.famundo.com
My development related blog: http://devblog.famundo.com
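The daily tar backup mentioned above can be sketched as a small shell function. The paths are hypothetical, and shipping the tarball to S3 is left as a comment; for a consistent online backup you would also bracket the copy with pg_start_backup()/pg_stop_backup() and keep the WAL files.

```shell
#!/bin/sh
# Sketch of a daily tar backup of a Postgres data directory.
# Usage: backup_pgdata DATA_DIR BACKUP_DIR
# Writes BACKUP_DIR/pgdata-YYYYMMDD.tar.gz. For a consistent online
# backup, bracket this with pg_start_backup()/pg_stop_backup() and
# archive the WAL segments as well.
backup_pgdata() {
    data_dir=$1
    backup_dir=$2
    stamp=$(date +%Y%m%d)
    tar -czf "$backup_dir/pgdata-$stamp.tar.gz" \
        -C "$(dirname "$data_dir")" "$(basename "$data_dir")"
    # A real setup would now ship the tarball to separate storage (e.g. S3).
}
```

The function only packages the directory; retention (keeping "a few days back", as described above) would be a separate cleanup step.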
Re: [GENERAL] Amazon EC2 | Any recent developments
Hi,

I've seen both - some die for unknown reasons (mostly related to the underlying hardware having issues). We also see instance failure from time to time with advance notice. Just like a regular machine dies from time to time, so do cloud instances. I'd say it's a bit more common on the cloud, but not by a big margin. I might see it more because I have hundreds of instances running.

Bye,

Guy.

On Mon, Jun 15, 2009 at 3:46 PM, David Kerr wrote:
> On Mon, Jun 15, 2009 at 12:11:54PM -0700, Just Someone wrote:
> > Hi,
> >
> > I have more than a few Postgres instances on EC2. For reliability I
> > use EBS, and take regular snapshots while also streaming the WAL files
> > to S3. So far, the few times that my machine died, I had no issue with
> > getting it back from EBS or the EBS volume. I also take tar backups
> > every day, and I keep a few days back of tar, snapshots and WAL log
> > files.
>
> Your machine died? Was it the cloud's fault or something else?
>
> Dave

--
Family management on rails: http://www.famundo.com
My development related blog: http://devblog.famundo.com
Re: [GENERAL] Amazon EC2 | Any recent developments
Hi,

I have more than a few Postgres instances on EC2. For reliability I use EBS, and take regular snapshots while also streaming the WAL files to S3. So far, the few times that my machine died, I had no issue with getting it back from EBS or the EBS volume. I also take tar backups every day, and I keep a few days back of tar, snapshots and WAL log files.

If you require high performance you might have to look into the large or XL instances, as their networking is a lot faster, and EBS is accessed through the network. They also have a lot more memory.

I actually think that Postgres has a big advantage on the cloud, and that's the ability to easily recover from crashes. Because of the way Postgres commits to disk and works with WALs, there is a much higher chance of recovering the DB than with most other DB servers out there.

Bye,

Guy.

On Mon, Jun 15, 2009 at 11:12 AM, AJAY A wrote:
> Hello All,
>
> I am investigating the possibility of hosting pgsql 8.3 on Amazon EC2
> & implementing a simple HA solution. My search of postgresql & amazon
> cloud has produced little result. Just wondering if there has been
> any recent development with EBS etc. and anybody would care to share
> their experiences.
>
> Thank you very much.
>
> aj

--
Family management on rails: http://www.famundo.com
My development related blog: http://devblog.famundo.com
Re: [GENERAL] Very slow catalog query
Hi Tom,

> Well, it's hard to be sure what the problem is when you're not showing
> us a problem case ... but I notice that this indexscan is estimated
> awfully high:

Whenever I run it manually it works fast. But in the log I see lots of slow ones. Could it be caused by autovacuum? Or by checkpointing or WAL writing? Is there a way to check that?

> > -> Index Scan using pg_depend_reference_index on pg_depend dep
> >      (cost=0.00..64942.17 rows=247 width=12)
> >      (actual time=396.542..1547.172 rows=22 loops=1)
> >        Index Cond: (refobjid = 30375069::oid)
>
> The reason is not far to seek: the scan is checking only the second
> index key, meaning that it has to scan the entire index. (I am
> surprised it didn't use a seqscan instead. Are you using enable_seqscan
> = off? Not a great idea.) Since you know you are looking for a table,
> you could improve matters by adding a constraint on refclassid:
>
> dep.refclassid = 'pg_class'::regclass

enable_seqscan is on. Is there a way to analyze/vacuum those tables? I will also look at improving the query to be more exact.

Bye,

Guy.
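For reference, Tom's suggested constraint would slot into the query like this (a sketch, not run against the poster's database; pg_depend_reference_index covers (refclassid, refobjid, refobjsubid), so pinning refclassid lets the scan use the leading index key):

```sql
-- Restrict the pg_depend scan to rows referencing pg_class, per Tom's
-- suggestion, so the full index prefix can be used.
SELECT attr.attname, name.nspname, seq.relname
FROM pg_class seq, pg_attribute attr, pg_depend dep,
     pg_namespace name, pg_constraint cons
WHERE seq.oid = dep.objid
  AND seq.relnamespace = name.oid
  AND seq.relkind = 'S'
  AND attr.attrelid = dep.refobjid
  AND attr.attnum = dep.refobjsubid
  AND attr.attrelid = cons.conrelid
  AND attr.attnum = cons.conkey[1]
  AND cons.contype = 'p'
  AND dep.refclassid = 'pg_class'::regclass   -- the added constraint
  AND dep.refobjid = 'activities'::regclass;
```

As for the other question: the catalogs can be maintained like ordinary tables, e.g. VACUUM ANALYZE pg_depend; run by a superuser in the affected database.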
Re: [GENERAL] Very slow catalog query
Hi Tom,

Here is the result of explain analyze (though this one took 1,500 ms and not 169,000):

On Mon, Mar 31, 2008 at 7:37 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Just Someone" <[EMAIL PROTECTED]> writes:
> > Any ideas how to start finding the culprit?
>
> EXPLAIN ANALYZE?

explain analyze
SELECT attr.attname, name.nspname, seq.relname
FROM pg_class seq, pg_attribute attr, pg_depend dep, pg_namespace name, pg_constraint cons
WHERE seq.oid = dep.objid
  AND seq.relnamespace = name.oid
  AND seq.relkind = 'S'
  AND attr.attrelid = dep.refobjid
  AND attr.attnum = dep.refobjsubid
  AND attr.attrelid = cons.conrelid
  AND attr.attnum = cons.conkey[1]
  AND cons.contype = 'p'
  AND dep.refobjid = 'activities'::regclass;

QUERY PLAN
Nested Loop  (cost=64956.07..65025.73 rows=31 width=192) (actual time=1547.720..1547.749 rows=1 loops=1)
  ->  Nested Loop  (cost=64956.07..64987.48 rows=1 width=166) (actual time=1547.662..1547.684 rows=1 loops=1)
        ->  Nested Loop  (cost=64956.07..64981.47 rows=1 width=106) (actual time=1547.616..1547.631 rows=1 loops=1)
              ->  Merge Join  (cost=64956.07..64957.36 rows=4 width=42) (actual time=1547.484..1547.502 rows=3 loops=1)
                    Merge Cond: ("outer"."?column3?" = "inner".refobjsubid)
                    ->  Sort  (cost=4.08..4.08 rows=3 width=30) (actual time=0.149..0.151 rows=1 loops=1)
                          Sort Key: cons.conkey[1]
                          ->  Index Scan using pg_constraint_conrelid_index on pg_constraint cons  (cost=0.00..4.05 rows=3 width=30) (actual time=0.110..0.112 rows=1 loops=1)
                                Index Cond: (30375069::oid = conrelid)
                                Filter: (contype = 'p'::"char")
                    ->  Sort  (cost=64951.99..64952.61 rows=247 width=12) (actual time=1547.303..1547.318 rows=9 loops=1)
                          Sort Key: dep.refobjsubid
                          ->  Index Scan using pg_depend_reference_index on pg_depend dep  (cost=0.00..64942.17 rows=247 width=12) (actual time=396.542..1547.172 rows=22 loops=1)
                                Index Cond: (refobjid = 30375069::oid)
              ->  Index Scan using pg_class_oid_index on pg_class seq  (cost=0.00..6.02 rows=1 width=72) (actual time=0.034..0.035 rows=0 loops=3)
                    Index Cond: (seq.oid = "outer".objid)
                    Filter: (relkind = 'S'::"char")
        ->  Index Scan using pg_namespace_oid_index on pg_namespace name  (cost=0.00..6.00 rows=1 width=68) (actual time=0.039..0.041 rows=1 loops=1)
              Index Cond: ("outer".relnamespace = name.oid)
  ->  Index Scan using pg_attribute_relid_attnum_index on pg_attribute attr  (cost=0.00..38.00 rows=20 width=70) (actual time=0.050..0.052 rows=1 loops=1)
        Index Cond: ((30375069::oid = attr.attrelid) AND (attr.attnum = "outer".refobjsubid))
Total runtime: 1548.082 ms

Bye,

Guy.
Re: [GENERAL] Very slow catalog query
Hi,

> I'm wondering... just 4GB of ram?
> What's the "normal" "hammering" -- a.k.a. user access -- to all of this?
> PG, as expected, launches a separate process for each connection. This
> eats up resources quite quickly.
> Did you check your system processes with 'top'? How's it looking for
> swap usage?

Swap usage is almost nil. And I only have a constant number of connections (about 10-15), as it serves as the backend for a Web application. Transactions run at about 20-40 per second, but mostly very short and simple ones.

Thanks,

Guy.
[GENERAL] Very slow catalog query
Hi,

I have a DB with a large number of schemas (around 10K) and a large number of tables (400K). The app became slow lately, and logging the slow queries, I see more than a few like this:

LOG: duration: 169547.424 ms statement:
SELECT attr.attname, name.nspname, seq.relname
FROM pg_class seq, pg_attribute attr, pg_depend dep, pg_namespace name, pg_constraint cons
WHERE seq.oid = dep.objid
  AND seq.relnamespace = name.oid
  AND seq.relkind = 'S'
  AND attr.attrelid = dep.refobjid
  AND attr.attnum = dep.refobjsubid
  AND attr.attrelid = cons.conrelid
  AND attr.attnum = cons.conkey[1]
  AND cons.contype = 'p'
  AND dep.refobjid = 'activities'::regclass

Almost all slow queries are of this type, though most of them do finish really fast. From time to time it gets really slow.

Some details on the setup:
Dual Opteron with 4GB RAM
RAID1 on 10K SCSI for WAL
RAID10 over 6 x 10K SCSI drives for the DB files
Autovacuum is on, and in addition I do some vacuuming of specific high-use tables nightly.

Any ideas how to start finding the culprit?

Bye,

Guy.

--
Family management on rails: http://www.famundo.com
My development related blog: http://devblog.famundo.com
Re: [GENERAL] Schema search_path and views
Cool! That explains it fully. So I guess the pre-generated views will perform better, at the price of having more views.

Thanks!

On 11/6/06, Merlin Moncure <[EMAIL PROTECTED]> wrote:
> On 11/6/06, Just Someone <[EMAIL PROTECTED]> wrote:
> > I have a database with multiple schemas all with the same structure
> > (but of course different data...).
> >
> > I want to create a view that will be created in a shared schema, and
> > when executed will be executed against the current schema. Whenever I
> > try it, it seems the view is linked to a specific schema used when
> > creating it, and doesn't reevaluate based on the current schema.
>
> no, or not exactly. views resolve the search path when they are
> generated. this is a fundamental part of how they work. functions,
> however, are a bit different. the plans are lazily generated and
> 'stick' to the tables that are resolved in the search path when the
> plan is generated, which is basically the first time you run them in a
> session. so, you could in theory do what you want with a view if it
> called functions for all the switchable parts.
>
> merlin

--
Family management on rails: http://www.famundo.com - coming soon!
My development related blog: http://devblog.famundo.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
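Merlin's function trick might look roughly like this (a hypothetical sketch; the function name is made up). The unqualified t1 inside the function body is resolved against the search_path in effect when the plan is first generated in a session, not at CREATE VIEW time as it would be for a plain view:

```sql
-- Hypothetical sketch of a "switchable" view via a function.
-- Note: RETURNS SETOF t1 fixes the row type to one schema's t1 at
-- creation time, which only works here because all schemas share the
-- same table structure.
CREATE FUNCTION switchable_t1() RETURNS SETOF t1 AS
  'SELECT * FROM t1'
LANGUAGE sql;

CREATE VIEW view1 AS SELECT * FROM switchable_t1();
```

The caveat Merlin gives still applies: once the plan is generated it sticks for the session, so changing search_path mid-session may not retarget it.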
[GENERAL] Schema search_path and views
I have a database with multiple schemas all with the same structure (but of course different data...).

I want to create a view that will be created in a shared schema, and when executed will be executed against the current schema. Whenever I try it, it seems the view is linked to the specific schema used when creating it, and doesn't reevaluate based on the current schema.

Here is the pseudo structure/code:

schema1:
  create table t1 ...

schema2:
  create table t1 ...

shared_schema:
  create table t3 ...

Create the view:

  set search_path to shared_schema, schema1;
  create view view1 as select * from t1;

Try the view:

  set search_path to shared_schema, schema1;
  select * from view1;

  set search_path to shared_schema, schema2;
  select * from view1;

Results:

Both select * from view1; statements return the same data, though the search path changed. Is there a way to make the view use the current search_path?

--
Family management on rails: http://www.famundo.com - coming soon!
My development related blog: http://devblog.famundo.com
Re: [GENERAL] using schema's for data separation
I am using a similar solution, and I tested it with a test containing 20K+ different schemas. Postgres didn't show slowness at all even after the 20K schemas (over 2 million total tables) were created. So I have a feeling it can grow even more.

Guy.

On 9/28/06, snacktime <[EMAIL PROTECTED]> wrote:
> I'm re-evaluating a few design choices I made a while back, and one
> that keeps coming to the forefront is data separation. We store
> sensitive information for clients. A database for each client isn't
> really workable, or at least I've never thought of a way to make it
> workable, as we have several thousand clients and the databases all
> have to be accessed through a limited number of web applications where
> performance is important and things like persistent connections are a
> must. I've always been paranoid about a programmer error in an
> application resulting in data from multiple clients getting mixed
> together.
>
> Right now we create a schema for each client, with each schema having
> the same tables. The connections to the database are from an
> unprivileged user, and everything goes through functions that run at
> the necessary privileges. We use set search_path to public,user. User
> data is in schema user and the functions are in the public schema.
> Every table has a client_id column. This has worked well so far but
> it's a real pain to manage and as we ramp up I'm not sure it's going
> to scale that well.
>
> So anyway my question is this. Am I being too paranoid about putting
> all the data into one set of tables in a common schema? For thousands
> of clients what would you do?
>
> Chris

--
Family management on rails: http://www.famundo.com - coming soon!
My development related blog: http://devblog.famundo.com
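The per-client layout Chris describes can be sketched as follows (all names are illustrative):

```sql
-- One schema per client, same table shape in each; shared functions
-- stay in public. Each unprivileged session is then scoped to a
-- single client via the search path.
CREATE SCHEMA client_0001;
CREATE TABLE client_0001.accounts (
    id        serial PRIMARY KEY,
    client_id int NOT NULL,
    data      varchar
);

-- Executed at login for that client's session:
SET search_path TO public, client_0001;
```

With this shape, a programmer error that forgets a client_id filter still can't see another client's rows, since the other schemas are simply not on the path.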
Re: [GENERAL] SELinux + CREATE TABLESPACE = ?
If you'd rather keep SELinux on, you can still set the SELinux context on the directory where you want the tablespaces to one that postgres will like. To find the permissions you need, you can use ls -Z. It will list the SELinux context. Check /var/lib/pgsql/data (or wherever postgres data is pointing to), and then set this same permission on the target dir using chcon.

For example, on my FC4 system all subdirectories in the data directory have:

root:object_r:postgresql_db_t or user_u:object_r:postgresql_db_t

So if you want to change /path/to/foo/which/is/not/under/pgdata, run (as root or sudo):

chcon root:object_r:postgresql_db_t /path/to/foo/which/is/not/under/pgdata

This way postgres can access it, and you get the SELinux security.

Bye,

Guy.

http://www.famundo.com
http://devblog.famundo.com

On 8/2/06, David Fetter <[EMAIL PROTECTED]> wrote:
> On Wed, Aug 02, 2006 at 04:47:36PM -0700, David Fetter wrote:
> > Folks,
> >
> > This may have come up before, but I didn't see a specific answer in
> > the archives.
> >
> > When I try to do:
> >
> > CREATE TABLESPACE foo LOCATION '/path/to/foo/which/is/not/under/$PGDATA';
> >
> > I get:
> >
> > ERROR: could not set permissions on directory "/path/to/foo/which/is/not/under/$PGDATA"
> >
> > Apparently this is a SELinux problem. How do I set the policy to
> > allow for this, or if that's not possible, how do I disable SELinux?
> >
> > Thanks in advance :)
>
> Pardon my self-followup for the archives :)
>
> Thanks to Talha Khan, who said:
>
> > setenforce 0
> >
> > will disable SELinux enforcement
>
> Thanks also to Clodoaldo Pinto, who said:
>
> > > Apparently this is a SELinux problem.
> >
> > Confirm it by looking for a message in /var/log/messages.
> >
> > > How do I set the policy to allow for this,
> >
> > This Fedora FAQ is good:
> > http://fedora.redhat.com/docs/selinux-faq-fc5/#faq-div-controlling-selinux
> >
> > > or if that's not possible, how do I disable SELinux?
> >
> > edit /etc/selinux/config
>
> Cheers,
> D
> --
> David Fetter <[EMAIL PROTECTED]> http://fetter.org/
> phone: +1 415 235 3778  AIM: dfetter666  Skype: davidfetter
>
> Remember to vote!
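One caveat with chcon: the label is lost if the filesystem is ever relabeled. Where the policy tools are installed, a persistent variant (a sketch; run as root, path reuses the example from the thread) records the context in the local policy first:

```shell
# Persistent labeling: store the rule in the policy, then apply it.
# Requires the policycoreutils tools; needs root.
semanage fcontext -a -t postgresql_db_t '/path/to/foo/which/is/not/under/pgdata(/.*)?'
restorecon -Rv /path/to/foo/which/is/not/under/pgdata
```

After this, a system relabel will restore postgresql_db_t on the tablespace directory instead of wiping it.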
Re: [GENERAL] A better AND query?
The schema can change, but I'd rather not. The use case is a web app where you can tag items with tags (many-to-many). There are multiple items you can tag: contacts, schedules, lists, etc... And then you can search and categorize by tags.

The standard for this, if you look around the web, is to retrieve the tagged records with any of the tags you select. Effectively an OR query. What I'm trying to do is search for items matching multiple tags at the same time - an AND query. So that I can bring up all contacts that are tagged with friends and movie-lovers.

Hope that clears it up a bit...

Guy.

On 5/9/06, Wayne Conrad <[EMAIL PROTECTED]> wrote:
> > tagged_type int -- points to the table this tag is tagging
>
> My head exploded right about here. Is the schema written in stone, or
> can it change? What is the use case for this schema? What's it for?
> What is a "tag" about?
>
> Best Regards,
> Wayne Conrad

--
Family management on rails: http://www.famundo.com - coming soon!
My development related blog: http://devblog.famundo.com
[GENERAL] A better AND query?
I'm trying to generate a query that will handle tag matching in a database. The simplified structure is:

create table contacts ( id serial primary key, name varchar );
create table books ( id serial primary key, name varchar );
create table tags ( id serial primary key, name varchar );
create table taggings (
  tag_id int,
  tagged_id int,
  tagged_type int -- points to the table this tag is tagging
);

What I want to achieve is to find all items that are tagged with the same set of tags. So it's an AND matching on a list of tags I have. I have two types of matching. One is within the same object type (where both tagged objects are the same, say two books with the same set of tags) and one that will find ANY object that's tagged with the same tags (like a book and a contact).

The current query (for the same object type) I am using is the following, for a list of 4 tags called summer, winter, spring and fall:

SELECT * FROM contacts
WHERE 4 = ( SELECT COUNT(*) FROM tags, taggings
            WHERE tags.id = taggings.tag_id
              AND lower(tags.name) IN ( 'summer', 'winter', 'spring', 'fall' )
              AND taggings.tagged_type = 1
              AND taggings.tagged_id = contacts.id );

The query to match all the objects tagged with a given set of tags is:

SELECT DISTINCT taggings.tagged_id, taggings.tagged_type FROM taggings
WHERE 4 = ( SELECT COUNT(*) FROM tags, taggings AS taggings2
            WHERE tags.id = taggings2.tag_id
              AND lower(tags.name) IN ( 'summer', 'winter', 'spring', 'fall' )
              AND taggings.tagged_type = taggings2.tagged_type
              AND taggings.tagged_id = taggings2.tagged_id );

The idea in both is to check that I find the number of tags needed. I've attached a script that will create the tables, insert some data and run the queries to make it easy to try it. Is there a way to simplify this query and make it more efficient?

Thanks!

Guy.

--
Family management on rails: http://www.famundo.com - coming soon!
My development related blog: http://devblog.famundo.com

The attached script:

drop table taggings;
drop table contacts;
drop table books;
drop table tags;

create table contacts ( id serial primary key, name varchar );
create table books ( id serial primary key, name varchar );
create table tags ( id serial primary key, name varchar );
create table taggings (
  tag_id int,
  tagged_id int,
  tagged_type int -- points to the table this tag is tagging
);

insert into contacts (id,name) VALUES (1,'guy');
insert into contacts (id,name) VALUES (2,'michal');
insert into contacts (id,name) VALUES (3,'gal');
insert into contacts (id,name) VALUES (4,'noa');
insert into contacts (id,name) VALUES (5,'edo');

insert into books (id,name) VALUES (1,'B1');
insert into books (id,name) VALUES (2,'B2');
insert into books (id,name) VALUES (3,'B3');
insert into books (id,name) VALUES (4,'B4');
insert into books (id,name) VALUES (5,'B5');

insert into tags (id,name) values (1,'summer');
insert into tags (id,name) values (2,'winter');
insert into tags (id,name) values (3,'spring');
insert into tags (id,name) values (4,'fall');
insert into tags (id,name) values (5,'sea');
insert into tags (id,name) values (6,'beach');

insert into taggings (tag_id,tagged_id,tagged_type) values (1,1,1);
insert into taggings (tag_id,tagged_id,tagged_type) values (2,1,1);
insert into taggings (tag_id,tagged_id,tagged_type) values (3,1,1);
insert into taggings (tag_id,tagged_id,tagged_type) values (4,1,1);
insert into taggings (tag_id,tagged_id,tagged_type) values (1,2,1);
insert into taggings (tag_id,tagged_id,tagged_type) values (2,2,1);
insert into taggings (tag_id,tagged_id,tagged_type) values (3,2,1);
insert into taggings (tag_id,tagged_id,tagged_type) values (4,2,1);
insert into taggings (tag_id,tagged_id,tagged_type) values (5,2,1);
insert into taggings (tag_id,tagged_id,tagged_type) values (1,3,1);
insert into taggings (tag_id,tagged_id,tagged_type) values (4,3,1);
insert into taggings (tag_id,tagged_id,tagged_type) values (1,4,1);
insert into taggings (tag_id,tagged_id,tagged_type) values (2,4,1);
insert into taggings (tag_id,tagged_id,tagged_type) values (3,4,1);
insert into taggings (tag_id,tagged_id,tagged_type) values (4,4,1);
insert into taggings (tag_id,tagged_id,tagged_type) values (1,2,2);
insert into taggings (tag_id,tagged_id,tagged_type) values (2,2,2);
insert into taggings (tag_id,tagged_id,tagged_type) values (3,2,2);
insert into taggings (tag_id,tagged_id,tagged_type) values (4,2,2);
insert into taggings (tag_id,tagged_id,tagged_type) values (1,1,2);
insert into taggings (tag_id,tagged_id,tagged_type) values (4,1,2);

-- Find all items tagged with the same set of tags
SELECT DISTINCT taggings.tagged_id, taggings.tagged_type FROM taggings
WHERE 4 = ( SELECT COUNT(*) FROM tags, taggings AS taggings2
            WHERE tags.id = taggings2.tag_id
              AND lower(tags.name) IN ( 'summer', 'winter', 'spring', 'fall' )
              AND taggings.tagged_type = taggings2.tagged_type
              AND taggings.tagged_id = taggings2.tagged_id );
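A common alternative shape for the AND matching in the script above, which avoids the correlated subquery, is a GROUP BY with a HAVING count (a sketch; it returns the same items on this data, but has not been benchmarked against the original):

```sql
-- Keep only the (tagged_id, tagged_type) pairs that carry all four
-- requested tags; COUNT(DISTINCT ...) guards against duplicate
-- tagging rows inflating the count.
SELECT t.tagged_id, t.tagged_type
FROM taggings t
JOIN tags ON tags.id = t.tag_id
WHERE lower(tags.name) IN ('summer', 'winter', 'spring', 'fall')
GROUP BY t.tagged_id, t.tagged_type
HAVING COUNT(DISTINCT tags.id) = 4;
```

For the same-object-type variant, add AND t.tagged_type = 1 to the WHERE clause and join the result back to contacts.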
[GENERAL] Restoring a PITR backup
I have a process for PITR backups running nicely. I'm pretty amazed by the smoothness of it all! Now I'm looking at the retrieval part, and I have something I'm looking for clarification on.

The documentation says that a recovery.conf file is needed for the restore. My tests indicate that I can just restore the backup, copy over the latest WAL files and launch postgres. Is that OK? Can I just do it this way? I could create the recovery file and have it copy the files, but as the second machine I am restoring into can always have the DB reloaded, I wonder if it isn't easier just to copy the files. I actually rsync my backup directory and my WAL archive directories into the second machine.

Bye,

Guy.

--
Family management on rails: http://www.famundo.com - coming soon!
My development related blog: http://devblog.famundo.com
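For comparison, the recovery.conf the documentation asks for can be as small as one line (the archive path below is an assumption). The advantage over hand-copying WAL files into pg_xlog is that postgres fetches exactly the segments it needs, and a point-in-time target can be added:

```
# recovery.conf -- minimal sketch; /mnt/wal_archive is a made-up path
restore_command = 'cp /mnt/wal_archive/%f "%p"'

# optional: stop recovery at a chosen moment instead of end-of-WAL
# recovery_target_time = '2006-05-01 12:00:00'
```

Postgres renames the file to recovery.done when recovery completes, which also serves as a record that the restore ran.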
Re: [GENERAL] How to find the latest (partial) WAL file
Hi Tom,

> If you sort first by mtime and second by file name you should find the
> right one in all cases, ie, take the latest mtime among the
> properly-named files, breaking ties by taking the higher filename.
>
> It'd probably be better if we had a function to report this, but
> you can get along without one.

For now I'm using ls with grep:

LAST_WAL=$(/bin/ls -t1p $WAL_DIR | /bin/grep -v / | /bin/grep -v backup | /usr/bin/head -1)

But a pg function for that would be great.

Regards,

Guy Naor.

--
Family management on rails: http://www.famundo.com - coming soon!
My development related blog: http://devblog.famundo.com
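Tom's rule (latest mtime among properly-named files, higher name breaking ties) can be sketched as a shell function; the ls pipeline above leaves the tie-break to ls. This sketch assumes GNU stat (on BSD you would use stat -f %m) and 24-hex-digit segment names:

```shell
#!/bin/sh
# latest_wal WAL_DIR -- print the name of the current WAL segment.
# Sort newest mtime first, break mtime ties by the higher file name,
# and skip *.backup files, subdirectories and anything not shaped like
# a WAL segment (24 hex digits).
latest_wal() {
    wal_dir=$1
    for f in "$wal_dir"/*; do
        name=$(basename "$f")
        case "$name" in
            *.backup) continue ;;   # history files from pg_stop_backup
        esac
        [ -f "$f" ] || continue
        echo "$name" | grep -Eq '^[0-9A-F]{24}$' || continue
        echo "$(stat -c %Y "$f") $name"   # GNU stat; BSD: stat -f %m
    done | sort -k1,1nr -k2,2r | head -1 | cut -d' ' -f2
}
```

Note this still can't distinguish a recycled, pre-created segment from a live one in every case, which is why a server-side function would be nicer.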
[GENERAL] How to find the latest (partial) WAL file
What is the best way to find the latest partial WAL file?

Based on my tests, using the mtime isn't 100% accurate: if a pg_start_backup()/pg_stop_backup() operation is run, the .backup file created might be newer than the last WAL file, and the WAL file related to the backup also seems to get its mtime updated. Using the file name is also not 100% accurate, as it seems that postgres will recycle the archived files, giving them "future" names before they are used, leaving the directory with files whose names are ahead of the real current segment but that don't yet hold relevant data.

Is there a way to discover what the real current WAL file is? I've read that one of the todos for 8.2 is to handle partial file archiving, but I'm looking for something I can use now.

Regards,

Guy.

--
Family management on rails: http://www.famundo.com - coming soon!
My development related blog: http://devblog.famundo.com
[GENERAL] WAL archiving and deletion of the WAL segments
I implemented WAL archiving and it seems to be working. The segments are being copied by the shell script, and in the pg_log file I see this line:

LOG: archived transaction log file "0001001D0096"

But the file is still in the pg_xlog directory. In the documentation I read that postgres might either delete or recycle the segment files. As I still see it in the directory, I want to make absolutely sure my archiving is working correctly.

So my question is: does what I'm seeing mean the WAL archiving is working? Or should I expect the file to be deleted?

Bye,

Guy.

--
Family management on rails: http://www.famundo.com - coming soon!
My development related blog: http://devblog.famundo.com
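For the record, the shape of the archiving script matters more than the exact copy command. A common sketch (paths hypothetical) refuses to overwrite an existing archive file and returns nonzero on any failure, so postgres keeps the segment and retries later:

```shell
#!/bin/sh
# Sketch of an archive_command helper: archive_wal <wal-path> <wal-name>.
# Postgres would invoke it roughly as:
#   archive_command = '/usr/local/bin/archive_wal %p %f'
# It must never overwrite an existing archive file, and must exit
# nonzero on failure so postgres retries the segment.
ARCHIVE_DIR=${ARCHIVE_DIR:-/mnt/pg_wal_archive}   # hypothetical default

archive_wal() {
    src=$1
    name=$2
    test ! -f "$ARCHIVE_DIR/$name" || return 1   # refuse to overwrite
    cp "$src" "$ARCHIVE_DIR/$name" || return 1
}
```

Once the "archived transaction log file" message appears, the segment in pg_xlog is simply awaiting recycling at a later checkpoint; it is not an error that it's still there.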
[GENERAL] A place to post pgbench results
After exchanging a few emails regarding pgbench on the list, I was thinking it would be cool to have a place to post pgbench results from all over the place, just so people can get an idea of what others are getting.

If more people think it's a good idea, I'll be glad to host it as part of my blog. I'll create a section for that and will update it every time results are submitted. If that picks up steam I'll create a small database and app to make it more automatic.

Let me know if there's interest and I'll take care of it.

Bye,

Guy.

--
Family management on rails: http://www.famundo.com - coming soon!
My development related blog: http://devblog.famundo.com
Re: [GENERAL] Practical limit on number of tables in a single database
Hi Jim,

Actually, most tables are VERY small. In each schema I would expect 4-5 tables to go over 1000 records. The rest will be much smaller, with most at under 100. And aside from 2-3 tables, the activity will be pretty low (a few records a day at the most extreme). Can I use this to optimize the fsm part in a different way?

Bye,

Guy.

On 3/24/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote:
> On Fri, Mar 24, 2006 at 11:15:56AM -0800, Just Someone wrote:
> > Hi Jim,
> >
> > On 3/24/06, Jim Nasby <[EMAIL PROTECTED]> wrote:
> > > You want max_fsm_relations to be greater than select count(*) from
> > > pg_class where relkind in ('i','t') *across all databases*. And you
> > > want max_fsm_pages to be bigger than that. That's the only way you
> > > can be assured that you'll be tracking free space info for every
> > > table.
> >
> > So I need something like 2,000,000 in max_fsm_relations. So
> > max_fsm_pages now needs to be 16 times that? That's how I read the
> > postgresql.conf comment.
>
> Hrm... I forgot that FSM allocation is done in terms of CHUNKPAGES,
> which is #defined at 16. So yes, you'd need 32M pages to track
> freespace for all tables. Given that that's 250GB, I guess it won't
> work terribly well... :)
>
> Will any of these tables be very small and not see any real update
> activity? If so, you could possibly do without being able to store FSM
> info for them. Keeping a close eye on the last few lines of vacuumdb
> -av would be key here.
>
> Another possibility is to change CHUNKPAGES in
> include/storage/freespace.h to 1. That means you could get by with 2M
> pages, which is 'only' 16GB.
>
> Perhaps it would be worth considering some alternatives to how the FSM
> works. In particular, it might be worth it to be able to store free
> space info for multiple relations on a single page. Or perhaps allow
> the backend to tablescan very small tables to look for free space.
> --
> Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
> Pervasive Software  http://pervasive.com  work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf  cell: 512-569-9461

--
Family management on rails: http://www.famundo.com - coming soon!
My development related blog: http://devblog.famundo.com
Re: [GENERAL] Practical limit on number of tables in a single database
Hi Jim,

On 3/24/06, Jim Nasby <[EMAIL PROTECTED]> wrote:
> You want max_fsm_relations to be greater than select count(*) from
> pg_class where relkind in ('i','t') *across all databases*. And you
> want max_fsm_pages to be bigger than that. That's the only way you can
> be assured that you'll be tracking free space info for every table.

So I need something like 2,000,000 in max_fsm_relations. So max_fsm_pages now needs to be 16 times that? That's how I read the postgresql.conf comment.

> Taking a look at the last few lines of a vacuumdb -av would also be
> insightful, but I'm pretty certain that will only show what you need
> right *now*, not what you might need in the future.

Thanks!

Bye,

Guy

--
Family management on rails: http://www.famundo.com - coming soon!
My development related blog: http://devblog.famundo.com
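Written out as a postgresql.conf fragment, the numbers from this exchange would look like the following (illustrative of the thread's arithmetic for an 8.1-era server, not a tuning recommendation):

```
# Free space map sizing for ~2M relations across all databases.
max_fsm_relations = 2000000     # > count of relkind in ('i','t') rows in pg_class
max_fsm_pages = 32000000        # 16 * max_fsm_relations with the default CHUNKPAGES
```

As Jim notes, 32M FSM pages is impractically large in shared memory, which is why the thread turns to skipping FSM coverage for tiny, mostly-static tables instead.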
Re: [GENERAL] Practical limit on number of tables in a single database
Hi Jim, > Just make sure you increase max_fsm_relations, and that max_fsm_pages is > at least > max_fsm_relations, because each relation must get at least > one page. I increased it to 4 relations; should I go even higher? -- Family management on rails: http://www.famundo.com - coming soon! My development related blog: http://devblog.famundo.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] Practical limit on number of tables in a single database
Hi, I am creating a hosted solution that I want to base on separation by schemas. Each hosted family we have will get a schema (and a user) assigned to it. On login I will set the search path, and so each family will see its tables. This is all tested and works fine. But I would like to know if there's a practical limit to the number of schemas and tables I can have. Please note that I'm using table spaces to make sure the directories are manageable. I tested it so far with 13000 schemas and users, with 26 tables in each schema (a total of more than 33 tables). It works perfectly, but I would like to know if someone has experience with this number of tables/schemas, and if there's a limit I should be careful of. Bye, Guy. -- Family management on rails: http://www.famundo.com - coming soon! My development related blog: http://devblog.famundo.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
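The schema-per-family setup described above boils down to a few SQL statements per tenant. This is a hypothetical sketch, not the actual Famundo provisioning code; the names and the helper function are made up, and only the generated SQL reflects the approach described:

```python
# Hypothetical sketch of per-family provisioning as described above:
# one schema + one login role per hosted family, with search_path
# pinned to that family's schema so it only sees its own tables.
# Not the real application code -- it just generates the SQL strings
# such a setup would run via psql or a driver.

def provisioning_sql(family, password):
    """Return the SQL statements that set up one family's schema/user."""
    return [
        f'CREATE ROLE "{family}" LOGIN PASSWORD \'{password}\';',
        f'CREATE SCHEMA "{family}" AUTHORIZATION "{family}";',
        # Pin the search path so the family sees only its own tables:
        f'ALTER ROLE "{family}" SET search_path = "{family}";',
    ]

for stmt in provisioning_sql("smith_family", "s3cret"):
    print(stmt)
```

(In a real system you'd quote identifiers and passwords through the driver rather than f-strings, of course.)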
Re: [GENERAL] Some pgbench results
Hi Magnus, > It might seem that I'm selling ext3 or something :) but it's the linux > filesystem I know best. > If you want ext3 to perform with large directories, there is an mkfs > option that enables directory hashing that you can try: -O dir_index. Not at all (sell ext3 ;-) ). It's great to get this kind of info! I'd rather use ext3 as it's VERY stable, and it's the default in Fedora anyway. So thanks for the tip! Bye, Guy. -- Family management on rails: http://www.famundo.com - coming soon! My development related blog: http://devblog.famundo.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Some pgbench results
I played a bit with kernel versions as I was getting a kernel panic on my Adaptec card. I downgraded to 2.6.11 (the original that came with Fedora Core 4) and the panic went away, but more than that, the performance on XFS went considerably higher. With the exact same settings as before, I now got an average of 813.65 tps with a standard deviation of 130.33. I hope this kernel doesn't panic on me. But I'll know just tomorrow as I'm pounding on the machine now. Bye, Guy. On 3/23/06, Magnus Naeslund(f) <[EMAIL PROTECTED]> wrote: > Just Someone wrote: > > > > Initialized the data with: pgbench -i -s 100 > > Test runs: pgbench -s 100 -t 1 -c 20 > > I did 20 runs, removed the first 3 runs from each sample to account > > for stabilization. > > Did you re-initialize the test pgbench database between runs? > I get weird results otherwise since some integers gets overflowed in the > test (it doesn't complete the full 1 transactions after the first run). > > > Here are the results in tps without connection > > establishing: > > > > FS: JFS XFS EXT3 > > Avg: 462 425 319 > > Stdev: 10474 106 > > > > Could you please tell me what stripe size you have on the raid system? > Could you also share the mkfs and mount options on each filesystem you > tried? > > I ran some tests on an somewhat similar system: > A supermicro H8SSL-i-B motherboard with one dual core opteron 165 with > 4gb of memory, debian sarge amd64 (current stable) but with a pristine > kernel.org 2.6.16 kernel (there's no debian patches or packages yet). > > It has a 3ware 9550 + BBU sata raid card with 6 disks in a raid 10 > configuration with 256kb stripe size. I think this results in about > 200mb/s raw read performance and about 155mb/s raw write performance (as > in tested with dd:ing a 10gb file back and forth). > I had no separate WAL device/partition, only tweaked postgresql.conf. 
> > I get about 520-530 tps with your pgbench parameters on ext3 but very > poor (order of magnitude) performance on xfs (that's why I ask of your > mkfs parameters). > > A hint on using a raided ext3 system is to use whole block device > instead of partitions to align the data better and use data=journal with > a big journal. This might seem counter-productive at first (it did to > me) but I increased my throughput a lot when using this. > > My filesystem parameters are calculated like this: > stripe=256 # <- 256k raid stripe size > bsize=4 # 4k blocksize > bsizeb=$(( $bsize * 1024 )) # in bytes > stride=$(( $stripe / $bsize )) > > mke2fs -b $bsizeb -j -J size=400 -m 1 -O sparse_super \ > -T largefile4 -E stride=$stride /dev/sdb > > Mounted with: mount -t ext3 -o data=journal,noatime /dev/sdb /mnt/test8 > > I'm a little surprised that I can get more pgbench performance out of my > system since you're using 10K scsi disks. Please try the above settings > and see if it helps you... > > I've not run so many tests yet, I'll do some more after the weekend... > > Regards, > Magnus > > > -- Family management on rails: http://www.famundo.com - coming soon! My development related blog: http://devblog.famundo.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
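The stride arithmetic in Magnus's mke2fs recipe above works out as follows (just his numbers restated, nothing new assumed):

```python
# The ext3 stride calculation from Magnus's mkfs recipe above:
# stride = RAID stripe size / filesystem block size, so that ext3
# block allocation aligns with the RAID stripes.
stripe_kb = 256          # 256k RAID stripe, as in his setup
bsize_kb = 4             # 4k ext3 block size
bsize_bytes = bsize_kb * 1024

stride = stripe_kb // bsize_kb
# -> mke2fs -b 4096 ... -E stride=64
print(bsize_bytes, stride)
```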
Re: [GENERAL] Some pgbench results
Hi, > Did you re-initialize the test pgbench database between runs? > I get weird results otherwise since some integers gets overflowed in the > test (it doesn't complete the full 1 transactions after the first run). No, I didn't. The reason is that I noticed that the first run is always MUCH faster. If I reinit pgbench and run again, my initial runs always hover around 900-970 tps on xfs. And I didn't need this as a real performance test; it was a side effect of a load test I was doing on the server. Also, pgbench isn't close to the load I'll see on my server (a web application which will be mostly reads). > Could you please tell me what stripe size you have on the raid system? > Could you also share the mkfs and mount options on each filesystem you > tried? RAID stripe size of 256K. File system creation: xfs: mkfs -t xfs -l size=64m /dev/md0 jfs: mkfs -t jfs /dev/md0 Mount for xfs with -o noatime,nodiratime,logbufs=8 jfs: -o noatime,nodiratime > A hint on using a raided ext3 system is to use whole block device > instead of partitions to align the data better and use data=journal with > a big journal. This might seem counter-productive at first (it did to > me) but I increased my throughput a lot when using this. Thanks for the advice! Actually, the RAID 10 I have is mounted as /var/lib/pgsql, so it's ONLY for postgres data, and the pg_xlog directory is mounted on another disk. > My filesystem parameters are calculated like this: > stripe=256 # <- 256k raid stripe size > bsize=4 # 4k blocksize > bsizeb=$(( $bsize * 1024 )) # in bytes > stride=$(( $stripe / $bsize )) > > mke2fs -b $bsizeb -j -J size=400 -m 1 -O sparse_super \ > -T largefile4 -E stride=$stride /dev/sdb > > Mounted with: mount -t ext3 -o data=journal,noatime /dev/sdb /mnt/test8 That's an interesting thing to try, though because of other things I want, I prefer xfs or jfs anyway. I will have an extreme number of schemas and files, which makes high demands on the directory structure. 
My tests showed me that ext3 doesn't cope with many files in directories very well. With xfs and jfs I can create 500K files in one directory in no time (about 250 seconds); with ext3 it starts to crawl after about 30K files. > I'm a little surprised that I can get more pgbench performance out of my > system since you're using 10K scsi disks. Please try the above settings > and see if it helps you... > > I've not run so many tests yet, I'll do some more after the weekend... Please share the results. It's very interesting... Bye, Guy. BTW, one thing I also tested is a software RAID0 over two RAID5 SATA arrays. Total disk count in this is 15. The read performance was really good. The write performance was (as expected) not so great. But that was just a test to get a feeling of the speed. This RAID5 system is only used for file storage, not database. -- Family management on rails: http://www.famundo.com - coming soon! My development related blog: http://devblog.famundo.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
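The many-files test I describe above boils down to something like this. A simplified sketch, not the exact script I ran; the original runs used 500K files, so keep `count` small unless you mean it:

```python
# Minimal sketch of the "lots of files in one directory" test
# described above: create N empty files in a directory and time it.
# The original runs used 500K files; this uses a small count so it
# finishes instantly. Not the exact script from the tests.
import os
import tempfile
import time

def create_files(directory, count):
    """Create `count` empty files in `directory`; return elapsed seconds."""
    start = time.time()
    for i in range(count):
        with open(os.path.join(directory, f"f{i:07d}"), "w"):
            pass
    return time.time() - start

with tempfile.TemporaryDirectory() as d:
    elapsed = create_files(d, 1000)
    print(f"created 1000 files in {elapsed:.2f}s")
```

On the filesystems compared above, the interesting part is how the per-file cost grows with the directory size, so a real run would time successive batches rather than one total.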
Re: [GENERAL] Some pgbench results
Jim, I did another test with ext3 using data=writeback, and indeed it's much better: Avg: 429.87 Stdev: 77. A bit (a very tiny bit) faster than xfs and a bit slower than jfs. Still, very much improved. Bye, Guy. On 3/23/06, Jim Nasby <[EMAIL PROTECTED]> wrote: > On Mar 23, 2006, at 11:32 AM, Bernhard Weisshuhn wrote: > > > Just Someone wrote: > > > >> 2 10K SCSI disks in RAID1 for OS and WAL (with it's own partiton on > >> ext3), > > > > You'll want the WAL on its own spindle. IIRC a separate partition > > on a shared disc won't give you much benefit. The idea is to keep > > the disc's head from moving away for other tasks. Or so they say. > > Actually, the OS partitions are normally quiet enough that it won't > make a huge difference, unless you're really hammering the database > all the time. > -- > Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] > Pervasive Software  http://pervasive.com  work: 512-231-6117 > vcard: http://jim.nasby.net/pervasive.vcf  cell: 512-569-9461 > > > > ---(end of broadcast)--- > TIP 4: Have you searched our list archives? > >http://archives.postgresql.org > -- Family management on rails: http://www.famundo.com - coming soon! My development related blog: http://devblog.famundo.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[GENERAL] Some pgbench results
I was doing some load testing on a server, and decided to test it with different file systems to see how it reacts to load/speed. I tested xfs, jfs and ext3. The machine runs FC4 with the latest 2.6.15 kernel from Fedora. Hardware: Dual Opteron 246, 4GB RAM, Adaptec 2230 with battery backup, 2 10K SCSI disks in RAID1 for OS and WAL (with its own partition on ext3), 6 10K SCSI disks in RAID10 (RAID1 in hw, RAID0 on top of that in sw). Postgres config tweaked as per the performance guide. Initialized the data with: pgbench -i -s 100 Test runs: pgbench -s 100 -t 1 -c 20 I did 20 runs, removed the first 3 runs from each sample to account for stabilization. Here are the results in tps without connection establishing: FS: JFS XFS EXT3 Avg: 462 425 319 Stdev: 10474 106 Interestingly, the first 3 samples I removed had a MUCH higher tps count. Up to 900+. Bye, Guy. -- Family management on rails: http://www.famundo.com - coming soon! My development related blog: http://devblog.famundo.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
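The aggregation over runs described above (drop the first warm-up samples, then average and standard deviation) is simple enough to sketch. The tps numbers here are made up for illustration, not results from the tests:

```python
# How the Avg/Stdev figures above are derived: run pgbench N times,
# drop the first few (warm-up) runs, then take the mean and sample
# standard deviation of the rest. The tps values below are fake.
import statistics

def summarize(tps_samples, warmup=3):
    """Drop the first `warmup` runs; return (mean, stdev) of the rest."""
    steady = tps_samples[warmup:]
    return statistics.mean(steady), statistics.stdev(steady)

# First runs much faster, as noted above, then things stabilize:
runs = [910, 905, 890, 470, 455, 462, 458, 468, 451, 466]
avg, sd = summarize(runs)
print(f"Avg: {avg:.1f} Stdev: {sd:.1f}")
```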
Re: [GENERAL] SELinux strangeness with 8.1.2 and 8.1.3
I just finished installing the PGDG rpms on my second server. This one is a single CPU Opteron with 2 SATA based RAID5 arrays. (Just to clear things up, I know RAID5 is bad for postgres, but this is a storage server that has postgres only as a backup for the main machine.) The problem doesn't happen on this second machine. Which leaves me with one big difference between the two machines. The main DB server (the one with the problem) has the RAID10 array mounted exclusively for its access, and under it the pg_xlog directory is mounted on another exclusive partition on the other array. Here are the mount details: /dev/sdb1 on /var/lib/pgsql type xfs (rw,noatime,nodiratime,logbufs=8) /dev/sda3 on /var/lib/pgsql/data/pg_xlog type ext2 (rw,noatime,nodiratime) Any idea if this might be causing the problem? I don't see how it might do it, but as I said I'm not an SELinux expert. Bye, Guy. On 3/3/06, Just Someone <[EMAIL PROTECTED]> wrote: > Hi Tom, > > > Hmm. That seems like a SELinux policy bug. It doesn't happen for me: > > the pid file is created with the same context the other files have. > > I agree! I have the latest FC4 policy update. So I downloaded the > sources as the new one didn't solve the issue. The policy source has > no mention on the pid file, but it seems like it should be created > with the settings of the directory, which is set correctly. I'm not an > expert in SELinux, so I didn't want to mess with the policy, though I > think the pid file could be added to the policy specifically to solve > this issue. Also, I did run restorecon on the directory (that was the > first thing I tried), but it didn't help. Probably because the pid > file isn't there when postgres isn't running. > > Today I will have the results from my second machine update, as it > just finished installing all the FC4 updates through yum. I'll let you > know how it goes. > > Bye, > > Guy. 
> > > > > -rw--- postgres postgres root:object_r:postgresql_db_t > > postmaster.pid > > > > Are you sure that your SELinux policy is up-to-date? Maybe you need to > > do a restorecon on the postgres binaries and/or /var/lib/pgsql/data. > > > > > Some more info about the system: > > > * FC4 fully updated > > > * Postgres 8.1.3 built from the PGDG SRPMs > > > * Dual Opteron > > > > I tried it myself on a freshly-updated FC4 x86_64 system, using the current > > FC5 SRPMs, and couldn't see a problem. Red Hat's SRPMs are not exactly > > like the PGDG ones, but the only difference I can find that looks at all > > relevant to SELinux is this one in the init script: > > > > 132c134 > > < [ -x /usr/bin/chcon ] && /usr/bin/chcon -u system_u -r > > object_r -t postgresql_log_t "$PGLOG" > > --- > > > [ -x /usr/bin/chcon ] && /usr/bin/chcon -t postgresql_log_t > > > "$PGLOG" > > > > and that's not about the pid file. > > > > regards, tom lane > > > > > -- > Bye, > > Guy > > Family management on rails: http://www.famundo.com - coming soon! > -- Bye, Guy Family management on rails: http://www.famundo.com - coming soon! ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] SELinux strangeness with 8.1.2 and 8.1.3
Hi Tom, > Hmm. That seems like a SELinux policy bug. It doesn't happen for me: > the pid file is created with the same context the other files have. I agree! I have the latest FC4 policy update. So I downloaded the sources as the new one didn't solve the issue. The policy source has no mention on the pid file, but it seems like it should be created with the settings of the directory, which is set correctly. I'm not an expert in SELinux, so I didn't want to mess with the policy, though I think the pid file could be added to the policy specifically to solve this issue. Also, I did run restorecon on the directory (that was the first thing I tried), but it didn't help. Probably because the pid file isn't there when postgres isn't running. Today I will have the results from my second machine update, as it just finished installing all the FC4 updates through yum. I'll let you know how it goes. Bye, Guy. > > -rw--- postgres postgres root:object_r:postgresql_db_t postmaster.pid > > Are you sure that your SELinux policy is up-to-date? Maybe you need to > do a restorecon on the postgres binaries and/or /var/lib/pgsql/data. > > > Some more info about the system: > > * FC4 fully updated > > * Postgres 8.1.3 built from the PGDG SRPMs > > * Dual Opteron > > I tried it myself on a freshly-updated FC4 x86_64 system, using the current > FC5 SRPMs, and couldn't see a problem. Red Hat's SRPMs are not exactly > like the PGDG ones, but the only difference I can find that looks at all > relevant to SELinux is this one in the init script: > > 132c134 > < [ -x /usr/bin/chcon ] && /usr/bin/chcon -u system_u -r > object_r -t postgresql_log_t "$PGLOG" > --- > > [ -x /usr/bin/chcon ] && /usr/bin/chcon -t postgresql_log_t > > "$PGLOG" > > and that's not about the pid file. > > regards, tom lane > -- Bye, Guy Family management on rails: http://www.famundo.com - coming soon! ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] SELinux strangeness with 8.1.2 and 8.1.3
Hi Tom, I looked into another system I have and after updating FC4 to the latest and installing the latest from the PGDG srpms, I didn't have this problem. Tomorrow I'm going to do a similar test on another server that I have to install Postgres on. I will report back with what I find on it. But on this machine the change to su solved the issue. Some more clues that might help you see if there's a real problem: the /var/lib/pgsql/data/postmaster.pid file is created with an SELinux context that's different from the rest. It is created with system_u:object_r:file_t while the rest of the files are created with root:object_r:postgresql_db_t. And the postmaster (when using runuser) fails on accessing it according to the audit log. The file is created but it's empty. So the failure is when trying to write the pid and the rest of the info to it. When I run with su, it is able to access it just fine. I retested now just to make sure I wasn't seeing things. Some more info about the system: * FC4 fully updated * Postgres 8.1.3 built from the PGDG SRPMs * Dual Opteron * 4GB RAM * /var/lib/pgsql/data on a RAID10 with xfs on top * WAL on a different RAID on a partition only for itself with ext2 * SELinux in targeted policy mode Bye, Guy. On 3/1/06, Tom Lane <[EMAIL PROTECTED]> wrote: > "Just Someone" <[EMAIL PROTECTED]> writes: > > I researched it a bit, and tried a few things, and discovered that the > > problem is that the init script at /etc/init.d/postgres uses runuser > > instead of su on SELinux enabled systems. But for some reason it won't > > work this way. I manually reverted it to use su and it works fine. > > I don't think I believe this ... not least because the runuser-for-su > substitution has been in there for a long time. It doesn't explain > a breakage during an FC4 update. > > Can you provide a reasonably self-contained demonstration of the problem > you saw? 
> > regards, tom lane > -- Bye, Guy Family management on rails: http://www.famundo.com - coming soon! ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[GENERAL] SELinux strangeness with 8.1.2 and 8.1.3
Hi, I know this isn't directly a postgres issue, but it might help a few other users, so here goes. I did an upgrade on my Fedora Core 4 system, and postgres (8.1.2 from the postgres packages, not FC packages) stopped working because of permission issues when trying to create postmaster.pid in the /var/lib/pgsql/data directory. My system has SELinux active, and it used to work until now. So I upgraded using the latest 8.1.3 release (I built it from SRPMs as an x86_64 binary wasn't available), hoping it will help. It didn't. I researched it a bit, and tried a few things, and discovered that the problem is that the init script at /etc/init.d/postgres uses runuser instead of su on SELinux enabled systems. But for some reason it won't work this way. I manually reverted it to use su and it works fine. If anyone knows of a better fix - please let me know. Thanks, Guy. --- Family management on postgres+rails: http://www.famundo.com - coming soon! ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[GENERAL] Hosting options on Postgres - what's best?
Hi, I'm looking into creating a hosted application with Postgres as the SQL server. I would like to get some ideas and opinions about the different ways to separate the different clients, using Postgres. The options I had in mind: 1) Create a different database per client. How much overhead will this add? 2) Use schemas and authentication. So each client is a different schema on the database. 3) Use application level security (per object security maintained by the app). 4) Any ideas? Opinions? js. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings