Re: [GENERAL] Upgrade questions
On 2012-03-12, Carson Gross wrote:
> We've got a postgres database with *a lot* of data in one table. On the
> order of 100 million rows at this point. Postgres is, of course, handling
> it with aplomb.
>
>    ALTER TABLE my_table ALTER COLUMN id TYPE bigint;
>
> However, given the size of this table, I have no idea how long something
> like this might take. In general I've had a tough time getting feedback
> from postgres on the progress of a query, how long something might take,
> etc.

I would estimate minutes to hours; it also depends on how many foreign
keys must be re-checked.

> So my question is: is there a way to understand roughly how long something
> like this might take? Our DB is out on crappy Amazon ec2 instances, so we
> don't exactly have screamers set up. Any tools I can use?

Use the cloud: set up a clone and do some testing.

--
⚂⚃ 100% natural

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
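Following the clone suggestion: a minimal psql sketch of such a test, using the table and column names from the quoted ALTER (a hedged example of the approach, nothing more):

```sql
-- On the cloned instance: time the conversion, then roll it back
-- so the clone stays reusable for further experiments.
\timing on
BEGIN;
ALTER TABLE my_table ALTER COLUMN id TYPE bigint;
ROLLBACK;
```

Since ALTER COLUMN ... TYPE rewrites the whole table and rebuilds its indexes, a timing taken on a clone with comparable storage should transfer reasonably well.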
Re: [GENERAL] Table growing faster than autovacuum can vacuum
On Sat, Mar 24, 2012 at 9:40 PM, Jasen Betts wrote:
> have you tried using COPY instead of INSERT (you'll have to insert
> into the correct partition)

Triggers fire on COPY, but rules do not. So if he has partitioning
triggers, they'll fire on the parent table etc. HOWEVER, that'll be
slower than copying to the proper partition to begin with.
Re: [GENERAL] Table growing faster than autovacuum can vacuum
On 2012-02-15, Asher Hoskins wrote:
> Hello.
>
> I've got a database with a very large table (currently holding 23.5
> billion rows, the output of various data loggers over the course of my
> PhD so far). The table itself has a trivial structure (see below) and is
> partitioned by data time/date and has quite acceptable INSERT/SELECT
> performance.
>
>   CREATE TABLE rawdata (
>     value       REAL NOT NULL,
>     sample_time TIMESTAMP WITH TIME ZONE NOT NULL,
>     -- Dataset reference.
>     dataset     INTEGER NOT NULL
>   );
>
> The data loggers are collected every month or so and uploaded into the
> database, resulting in another 1-2 billion rows in the table each time.
> Data is never deleted from the table and so it's essentially read-only.
>
> My problem is that the autovacuum system isn't keeping up with INSERTs
> and I keep running out of transaction IDs. SELECT performance also drops
> off the more I insert, which from looking at the output of iostat seems
> to be because the autovacuum tasks are taking up a lot of the disk
> bandwidth - the disks are commodity items in a software RAID and not
> terribly fast.

Have you tried using COPY instead of INSERT (you'll have to insert into
the correct partition), or alternatively putting several rows in the
VALUES part of the INSERT (that should work with trigger-based
partitioning)?

  insert into rawdata values
    (1.0,  '2012-03-25 16:29:01 +13', 1),
    (1.1,  '2012-03-25 16:29:02 +13', 1),
    (1.15, '2012-03-25 16:29:03 +13', 1),
    (1.17, '2012-03-25 16:29:04 +13', 1),
    (1.18, '2012-03-25 16:29:05 +13', 1);

etc...

--
⚂⚃ 100% natural
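To illustrate the COPY route: a sketch assuming a partition named rawdata_2012_03 and a CSV dump from the loggers (both names are assumptions; adjust to the actual partitioning scheme):

```sql
-- Load a month of logger output directly into its partition,
-- bypassing the parent table's partitioning trigger entirely:
COPY rawdata_2012_03 (value, sample_time, dataset)
    FROM '/path/to/logger_dump.csv' WITH CSV;
```

Since the table is append-only, running VACUUM FREEZE on each partition once it is fully loaded also postpones the next transaction-ID wraparound scan of that partition as far as possible.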
[GENERAL] Howto Replication for dummies?
Hi,

I've got a PG 9.1.3 on an OpenSuse 12.1 running that lately crashed
because of memory allocation errors, probably due to failing RAM. In this
case the server died Thursday afternoon and at first didn't complain when
I restarted it. Later some queries produced "out of memory" errors, which
is daft as the box has 16 GB RAM. Friday I found that pg_dump fails on
one of the big tables.

Now I got the advice, with a big stick, that I should avoid data loss in
the future. The DB is not exactly huge: a SQL dump is about 1.2 GB but
has about 150 schemas with many smaller tables, and there are 5 big
tables that currently hold up to 9 million rows.

Is there a howto that explains what to do to get a mirror server that
follows the productive system?

Btw. what would happen to the slave in the event that the main server
messes up a table in a crash as explained above, but goes on running
after an automatic or manual restart?
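For reference, the bare bones of a 9.1 streaming-replication (hot standby) setup look roughly like this; host names, user names, addresses and paths below are placeholders:

```
# master, postgresql.conf:
wal_level = hot_standby
max_wal_senders = 3

# master, pg_hba.conf (allow the standby to connect for replication):
host  replication  repuser  192.168.0.2/32  md5

# standby: clone the master into an empty data directory, e.g.:
#   pg_basebackup -h master.example.com -U repuser -D /var/lib/pgsql/data

# standby, postgresql.conf:
hot_standby = on

# standby, recovery.conf:
standby_mode = 'on'
primary_conninfo = 'host=master.example.com user=repuser'
```

Note that a physical standby replays the master's WAL verbatim, so it protects against losing the machine, not against the master writing bad data: a table the master corrupts and keeps serving would be corrupted on the slave too. A standby complements regular pg_dump backups; it does not replace them.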
Re: [GENERAL] plpgsql function to insert or update problem
On 03/24/2012 05:23 AM, Alban Hertroys wrote:
> On 23 Mar 2012, at 19:49, Andy Colson wrote:
>> Anyway, the problem. I get a lot of DB Error messages:
>>
>> DB Error: ERROR: duplicate key value violates unique constraint "by_ip_pk"
>> DETAIL: Key (ip, sessid, "time")=(97.64.237.59, 2qggi9gcdkcaoecqg3arvo1gu7,
>> 2012-03-23 13:00:00) already exists.
>> CONTEXT: SQL statement "insert into by_ip(vhost, time, ip, sessid, hits)
>>          values (vid, date_trunc('hour', ihittime), iip, isessid, 1)"
>>
>> Why is it hitting the insert statement? Why doesn't the "if not found"
>> seem to work? Any hints would be appreciated.
>
> You forgot about vid in your PK ;)
>
> Alban Hertroys

Wow, jeez... I don't know how many times I looked at that. Thank you very
much!

> If you can't see the forest for the trees,

Your sig is appropriate :-)

-Andy
Re: [GENERAL] pgcon 2012
On Sat, Mar 24, 2012 at 04:46:49AM +0000, Arvind Singh wrote:
> Dear sir,
>
> we are recent users of Postgres and wish to be more aware and soak up the
> activities around the DB. We used to develop more on a MySQL / SQL Server
> combo.
>
> Now we are using PG 9.0.5, NPGSQL + Visual CSharp, Java.
>
> Is http://www.pgcon.org/2012/ a valid and certified Postgres event, or is
> it just a marketing strategy by a private event management firm?
>
> Especially related to
> - latest PG developments
> - learn about PG stats, logs, the pg_settings table
> - hire postgres-related manpower
>
> because we are based in Delhi, and it will take effort to send candidates
> to the event

I will blog a little about PGCon on Monday. PGCon is exactly what it says
it is, and is very non-marketing and very tech-heavy. If you look at the
speaker list from last year, it has almost all the server developers in
attendance. In fact, if anything, it might have too much technology focus
and not enough hands-on/admin focus for you.

--
Bruce Momjian  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [GENERAL] plpgsql function to insert or update problem
On 23 Mar 2012, at 19:49, Andy Colson wrote:
> Anyway, the problem. I get a lot of DB Error messages:
>
> DB Error: ERROR: duplicate key value violates unique constraint "by_ip_pk"
> DETAIL: Key (ip, sessid, "time")=(97.64.237.59, 2qggi9gcdkcaoecqg3arvo1gu7,
> 2012-03-23 13:00:00) already exists.
> CONTEXT: SQL statement "insert into by_ip(vhost, time, ip, sessid, hits)
>          values (vid, date_trunc('hour', ihittime), iip, isessid, 1)"
>
> Why is it hitting the insert statement? Why doesn't the "if not found"
> seem to work? Any hints would be appreciated.

You forgot about vid in your PK ;)

Alban Hertroys

--
If you can't see the forest for the trees, cut the trees and you'll find
there is no forest.
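For readers finding this thread later: the usual shape of such an insert-or-update function, with every key column (here assuming vhost is added to the primary key, per the fix above) covered by the UPDATE's WHERE clause — a sketch following the retry-loop pattern from the PostgreSQL documentation, since the original function isn't quoted in full:

```sql
-- Assumed PK: (vhost, ip, sessid, time). Variables vid, iip, isessid,
-- ihittime come from the surrounding plpgsql function's parameters.
LOOP
    UPDATE by_ip SET hits = hits + 1
     WHERE vhost = vid AND ip = iip AND sessid = isessid
       AND time = date_trunc('hour', ihittime);
    IF FOUND THEN
        RETURN;
    END IF;
    BEGIN
        INSERT INTO by_ip(vhost, time, ip, sessid, hits)
        VALUES (vid, date_trunc('hour', ihittime), iip, isessid, 1);
        RETURN;
    EXCEPTION WHEN unique_violation THEN
        -- a concurrent insert won the race; loop and retry the UPDATE
    END;
END LOOP;
```

Without the exception handler and loop, two sessions can both see "not found" and both attempt the INSERT, which is exactly the duplicate-key error reported here.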
Re: [GENERAL] broken xlog - recovery plan check
On 24 March 2012 00:45, Colin Taylor wrote:
> Hi, I seem to have an 8.3.9 database with a broken xlog:
>
>   PANIC: heap_insert_redo: invalid max offset number
>
> My plan is to run pg_resetxlog.
> Hopefully it then starts up.
> Test recent data as thoroughly as possible (script some SELECT *'s?)
> If ok -> curse ops and their raid caches
> If not -> curse ops and tell them to recover from backup (v. large and
> therefore a very slow process).
>
> Can anyone give me feedback on this plan?

Yes, it's almost certainly corrupted. How old is the backup? I ask this
because if you use pg_resetxlog, it would be a good idea to dump and
restore the database once you get it up and running anyway. This is
because you can't trust that your database will be consistent. I guess
technically it *might* be fine, but you wouldn't know this unless you
went through verifying that all your data made sense from a referential
integrity perspective.

So it will be a trade-off between one of:

- restore from an existing backup, losing the data since you last
  backed up
- doing a dump/restore after resetting the xlog to ensure your database
  is consistent
- running full checks once you've got your database up and running (or
  ignore it and possibly find weird problems later)

Also, PostgreSQL 8.3.9 is over 2 years out of date. I'd recommend
bringing it up to 8.3.18 to take advantage of the hundreds of bug fixes
that have since gone in.

--
Thom
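The dump/restore option, as a command sketch (the data directory path is a placeholder; pg_resetxlog must be run as the cluster owner with the server shut down):

```
pg_ctl -D /var/lib/pgsql/data stop -m immediate   # if anything is still half-up
pg_resetxlog /var/lib/pgsql/data
pg_ctl -D /var/lib/pgsql/data start
pg_dumpall > /backup/post_reset.sql               # dump while it runs
# then initdb a fresh cluster and restore the dump into it
```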
Re: [GENERAL] group by does not show error
AI Rumman wrote on 24.03.2012 09:06:
> I am using PostgreSQL 9.1.0. I found that the following GROUP BY query
> works in my DB:
>
>   \d t1
>        Table "public.t1"
>    Column |  Type   | Modifiers
>   --------+---------+-----------
>    i      | integer | not null
>    nam    | text    |
>   Indexes:
>       "t1_pkey" PRIMARY KEY, btree (i)
>
>   select i,nam from t1 group by i;
>    i | nam
>   ---+-----
>   (0 rows)
>
> I think the query should show an error as I did not maintain the GROUP BY
> standard. Does anyone know why it does not show a syntax error?

That's an enhancement in 9.1. As you included the primary key in the
GROUP BY clause, the rows are going to be unique anyway, and no
additional columns are necessary.
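To see where 9.1 draws the line, compare the accepted query with one where the functional dependency doesn't hold (same table as above):

```sql
-- Accepted in 9.1+: nam is functionally dependent on the primary key i
SELECT i, nam FROM t1 GROUP BY i;

-- Still rejected: nam is not a key, so i is not determined by it
SELECT i, nam FROM t1 GROUP BY nam;
```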
[GENERAL] group by does not show error
I am using PostgreSQL 9.1.0. I found that the following GROUP BY query
works in my DB:

  \d t1
       Table "public.t1"
   Column |  Type   | Modifiers
  --------+---------+-----------
   i      | integer | not null
   nam    | text    |
  Indexes:
      "t1_pkey" PRIMARY KEY, btree (i)

  select i,nam from t1 group by i;
   i | nam
  ---+-----
  (0 rows)

I think the query should show an error as I did not maintain the GROUP BY
standard. Does anyone know why it does not show a syntax error?
Re: [GENERAL] PostgreSQL 64 Bit XIDs - Transaction IDs
On Sat, Mar 24, 2012 at 1:27 AM, Scott Marlowe wrote:
> On Fri, Mar 23, 2012 at 11:18 PM, Bret Stern wrote:
>> VoltDB maybe
>
> VoltDB has a completely different focus than PostgreSQL really.
> PostgreSQL is a general purpose database that can achieve some very
> impressive numbers using super fast hardware, while still being a
> general purpose db that can do more than just shove through thousands
> of transactions a second.

There are some good talks on usenix's YouTube channel about DBs etc.:

One Size Does Not Fit All in DB Systems:
www.youtube.com/watch?v=QQdbTpvjITM&lr=1

NewSQL vs. NoSQL for New OLTP by Mike Stonebraker:
http://www.youtube.com/watch?v=uhDM4fcI2aI

Both are excellent presentations.
Re: [GENERAL] PostgreSQL 64 Bit XIDs - Transaction IDs
On Fri, Mar 23, 2012 at 11:18 PM, Bret Stern wrote:
> VoltDB maybe

VoltDB has a completely different focus than PostgreSQL really.
PostgreSQL is a general purpose database that can achieve some very
impressive numbers using super fast hardware, while still being a general
purpose db that can do more than just shove through thousands of
transactions a second. For instance, I've tested machines with spinning
drives that could sustain, for weeks, 5 to 7k tps, with autovacuum
suitably tuned to be fast enough to keep up (i.e. no sleep time, a
greatly increased cost limit, lots of checkpoint segments, and very
aggressive bgwriter settings as well). I can imagine doubling that with a
rack of SSDs behind it. And all on a single machine that can be running
PITR or streaming replication at the same time.

VoltDB accomplishes this kind of throughput by running in memory, doing a
kind of auto-sharded, RAIDed database setup for redundancy. I imagine a
farm of little 1U 8 or 16 core machines running multiple instances of
VoltDB (individual instances are single threaded) could just stomp pg
right into the ground as a transactional engine, because now you're
scaling horizontally. No matter how fast the hardware underlying it gets,
pg will have a limit by the laws of physics on an individual machine that
VoltDB simply works around by having dozens or hundreds of individual
boxes doing the same job, albeit in a very specialized manner.

I've not played with VoltDB and I don't know how stable and reliable it
is yet, but it was built by Michael Stonebraker, so I'd expect it to at
least hold the promise of stable operation as it matures, if not being
stable and reliable now.

But the % of db systems that need that level of performance is pretty
limited, and they are generally designed to have a large chunk of the
world's population working on them at the same time. For a system that's
more typical, i.e. 95% or more reads, you can throw memcache in front of
even a pretty mediocre box and have it keep up just fine for hundreds or
even thousands of writes per second and hundreds of thousands of reads to
the cache layer per second.

There are real-world problems with redundancy and reliability to work out
with VoltDB that are quite different from PostgreSQL, since you need real
synchronous replication from two geographically separated data centers,
because if something horrible happens in one (it gets hit by a meteor, or
some junior electrician drops a piece of wire into a power conditioner),
one whole segment of your db fabric could go dark. And if it's not
synchronous, then the replication is useless for a true transactional
banking-style system. Keep in mind a pg server simply losing power is
usually safe from data loss, and a nearby machine with synchronous
replication is likely to provide a high enough level of redundancy for
most transactional systems. Since VoltDB is in memory, you HAVE to have a
live backup running somewhere remote, or a simple power loss kills all
your data / latest transactions.