Re: [GENERAL] Upgrade questions

2012-03-24 Thread Jasen Betts
On 2012-03-12, Carson Gross  wrote:

> We've got a postgres database with *a lot* of data in one table.  On the
> order of 100 million rows at this point.  Postgres is, of course, handling
> it with aplomb.

>   ALTER TABLE my_table ALTER COLUMN id TYPE bigint;


> However, given the size of this table, I have no idea how long something
> like this might take.  In general I've had a tough time getting feedback
> from postgres on the progress of a query, how long something might take,
> etc.

I would estimate minutes to hours.

It also depends on how many foreign keys must be re-checked.
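If you want a feel for that up front, here's a small catalog query (just a
sketch; my_table is the assumed name) that lists the foreign keys pointing
at the table:

-- each of these constraints is re-validated when the referenced column's
-- type changes, so more (and larger) referencing tables mean a longer ALTER
SELECT conname, conrelid::regclass AS referencing_table
FROM pg_constraint
WHERE confrelid = 'my_table'::regclass
  AND contype = 'f';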

> So my question is: is there a way to understand roughly how long something
> like this might take?  Our DB is out on crappy Amazon ec2 instances, so we
> don't exactly have screamers set up.  Any tools I can use?

Use the cloud: set up a clone and do some testing.
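Something along these lines on the throwaway clone (a sketch, assuming psql;
\timing just reports how long the rewrite takes, which gives a rough estimate
for production on comparable hardware):

\timing on
ALTER TABLE my_table ALTER COLUMN id TYPE bigint;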


-- 
⚂⚃ 100% natural




Re: [GENERAL] Table growing faster than autovacuum can vacuum

2012-03-24 Thread Scott Marlowe
On Sat, Mar 24, 2012 at 9:40 PM, Jasen Betts  wrote:
>
> have you tried using COPY instead of INSERT (you'll have to insert
> into the correct partition)

Triggers fire on COPY, but rules do not.  So if he has partitioning
triggers, they'll fire when he copies into the parent table.

HOWEVER, that'll be slower than copying into the proper partition to begin with.
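For example, a minimal sketch (partition and file names are assumptions),
loading a monthly batch straight into the child table so the parent's
redirect trigger never runs:

-- rawdata_2012_03 is a hypothetical child partition; the CSV file must
-- be readable by the server for a server-side COPY
COPY rawdata_2012_03 (value, sample_time, dataset)
FROM '/var/tmp/logger_2012_03.csv' WITH CSV;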



Re: [GENERAL] Table growing faster than autovacuum can vacuum

2012-03-24 Thread Jasen Betts
On 2012-02-15, Asher Hoskins  wrote:
> Hello.
>
> I've got a database with a very large table (currently holding 23.5 
> billion rows, the output of various data loggers over the course of my 
> PhD so far). The table itself has a trivial structure (see below) and is 
> partitioned by data time/date and has quite acceptable INSERT/SELECT 
> performance.
>
>CREATE TABLE rawdata (
>  value REAL NOT NULL,
>  sample_time TIMESTAMP WITH TIME ZONE NOT NULL,
>  -- Dataset reference.
>  dataset INTEGER NOT NULL
>);

> The data loggers are collected every month or so and uploaded into the 
> database, resulting in another 1-2 billion rows in the table each time. 
> Data is never deleted from the table and so it's essentially read-only.
>
> My problem is that the autovacuum system isn't keeping up with INSERTs 
> and I keep running out of transaction IDs. SELECT performance also drops 
> off the more I insert, which from looking at the output of iostat seems 
> to be because the autovacuum tasks are taking up a lot of the disk 
> bandwidth - the disks are commodity items in a software RAID and not 
> terribly fast.

Have you tried using COPY instead of INSERT?  (You'll have to insert
into the correct partition.)

Or, alternatively, put several rows in the VALUES part of the INSERT
(that should work with trigger-based partitioning):

insert into rawdata 
values
 (1.0,'2012-03-25 16:29:01 +13',1),
 (1.1,'2012-03-25 16:29:02 +13',1),
 (1.15,'2012-03-25 16:29:03 +13',1),
 (1.17,'2012-03-25 16:29:04 +13',1),
 (1.18,'2012-03-25 16:29:05 +13',1);

etc...


-- 
⚂⚃ 100% natural




[GENERAL] Howto Replication for dummies?

2012-03-24 Thread Andreas

Hi,

I've got PG 9.1.3 running on OpenSuse 12.1 that lately crashed
because of memory allocation errors, probably due to failing RAM.


In this case the server died Thursday afternoon and at first didn't
complain when I restarted it.
Later some queries produced "out of memory" errors, which is daft as the
box has 16 GB RAM.

On Friday I found that pg_dump fails on one of the big tables.

Now I got the advice with a big stick that I should avoid data loss in 
the future.


The DB is not exactly huge.
An SQL dump is about 1.2 GB, but it covers about 150 schemas with many
smaller tables, plus 5 big tables that currently hold up to 9 million
rows each.


Is there a howto that explains how to set up a mirror server that
follows the production system?



Btw, what would happen to the slave in the event that the main server
messes up a table in a crash like the one described above, but keeps
running after an automatic or manual restart?




Re: [GENERAL] plpgsql function to insert or update problem

2012-03-24 Thread Andy Colson

On 03/24/2012 05:23 AM, Alban Hertroys wrote:

> On 23 Mar 2012, at 19:49, Andy Colson wrote:
>
>> Anyway, the problem.  I get a lot of DB Error messages:
>> DB Error: ERROR:  duplicate key value violates unique constraint "by_ip_pk"
>> DETAIL:  Key (ip, sessid, "time")=(97.64.237.59, 2qggi9gcdkcaoecqg3arvo1gu7,
>> 2012-03-23 13:00:00) already exists.
>> CONTEXT:  SQL statement "insert into by_ip(vhost, time, ip, sessid, hits)
>>            values (vid, date_trunc('hour', ihittime), iip, isessid, 1)"
>
>> Why is it hitting the insert statement?  Why doesn't the "if not found" seem
>> to work?  Any hints would be appreciated.
>
> You forgot about vid in your PK ;)
>
> Alban Hertroys
>
> --
> If you can't see the forest for the trees,
> cut the trees and you'll find there is no forest.




Wow, jeez...  I don't know how many times I looked at that.  Thank you very 
much!


> If you can't see the forest for the trees,


Your sig is appropriate :-)

-Andy



Re: [GENERAL] pgcon 2012

2012-03-24 Thread Bruce Momjian
On Sat, Mar 24, 2012 at 04:46:49AM +, Arvind Singh wrote:
> Dear sir,
>  
> we are recent users of Postgres and wish to be more aware and soak up the
> activities around the DB.
> we used to develop more on MySql , SQl Server combo
>  
> Now we are using PG 9.0.5 , NPGSQL + Visual CSharp , Java
>  
> Is http://www.pgcon.org/2012/?2nd
>  
> a valid and certified Postgres Event or is it just a marketing strategy by
> Private Event Management firm
>  
> Especially related to 
> - latest PG developments 
> - learn about PG Stats , Logs , PG_Settings table
> - hire postgres related manpower
>  
> because we are based in Delhi, and it will take efforts to send candidates to
> the event

I will blog a little about PGCon on Monday.  PGCon is exactly what it
says it is, and is very non-marketing and very tech-heavy.  If you look
at the speaker list from last year, it has almost all the server
developers in attendance.

In fact, if anything, it might have too much technology focus and not
enough hands-on/admin focus for you.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [GENERAL] plpgsql function to insert or update problem

2012-03-24 Thread Alban Hertroys
On 23 Mar 2012, at 19:49, Andy Colson wrote:

> Anyway, the problem.  I get a lot of DB Error messages:
> DB Error: ERROR:  duplicate key value violates unique constraint "by_ip_pk"
> DETAIL:  Key (ip, sessid, "time")=(97.64.237.59, 2qggi9gcdkcaoecqg3arvo1gu7, 
> 2012-03-23 13:00:00) already exists.
> CONTEXT:  SQL statement "insert into by_ip(vhost, time, ip, sessid, hits)
>values (vid, date_trunc('hour', ihittime), iip, 
> isessid, 1)"

> Why is it hitting the insert statement?  Why doesn't the "if not found" seem 
> to work?  Any hints would be appreciated.


You forgot about vid in your PK ;)
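A minimal sketch of the implied fix (assuming the table from the error
message): rebuild the constraint so the key includes vhost, i.e. the same
columns the insert-or-update logic actually checks:

-- hedged sketch: by_ip's primary key currently covers (ip, sessid, "time")
-- only, so two vhosts sharing an ip/sessid/hour collide on insert
BEGIN;
ALTER TABLE by_ip DROP CONSTRAINT by_ip_pk;
ALTER TABLE by_ip ADD CONSTRAINT by_ip_pk
  PRIMARY KEY (vhost, ip, sessid, "time");
COMMIT;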

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.




Re: [GENERAL] broken xlog - recovery plan check

2012-03-24 Thread Thom Brown
On 24 March 2012 00:45, Colin Taylor  wrote:
> Hi I seem to have an 8.3.9 database with a broken xlog,
>
> PANIC:  heap_insert_redo: invalid max offset number
>
> My plan is to run pg_resetxlog.
> Hopefully it then starts up.
> Test recent data as thoroughly as possible - (script some Select * ' s?)
> If ok -> curse ops and their raid caches
> If not -> curse ops and tell them to recover from backup (v. large and
> therefore very slow process).
>
> Can anyone give me feedback on this plan?

Yes, it's almost certainly corrupted.  How old is the backup?  I ask
this because if you use pg_resetxlog, it would be a good idea to dump
and restore the database once you get it up and running anyway.  This
is because you can't trust that your database will be consistent.  I
guess technically it *might* be fine, but you wouldn't know this
unless you went through verifying all your data made sense from a
referential integrity perspective.  So it will be a trade-off between
one of:

- restore from an existing backup, losing the data since you last backed up
- do a dump/restore after resetting the xlog, to ensure your database is
  consistent
- run full checks once you've got your database up and running (or ignore
  it and possibly find weird problems later); see the sketch below
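A hedged sketch of that kind of check, repeated for each foreign-key pair
(the table and column names here are made up):

-- look for child rows whose parent vanished, which would indicate the
-- reset left logically inconsistent data behind
SELECT c.*
FROM child_table c
LEFT JOIN parent_table p ON p.id = c.parent_id
WHERE c.parent_id IS NOT NULL
  AND p.id IS NULL;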

Also, PostgreSQL 8.3.9 is over 2 years out of date.  I'd recommend
bringing it up to 8.3.18 to take advantage of the hundreds of bug
fixes that have since gone in.

-- 
Thom



Re: [GENERAL] group by does not show error

2012-03-24 Thread Thomas Kellerer

AI Rumman wrote on 24.03.2012 09:06:

> I am using Postgresql 9.1.0.
>
> I found that the following GROUP BY query works in my DB:
>
> \d t1
>       Table "public.t1"
>  Column |  Type   | Modifiers
> --------+---------+-----------
>  i      | integer | not null
>  nam    | text    |
> Indexes:
>     "t1_pkey" PRIMARY KEY, btree (i)
>
>
> select i,nam
> from t1
> group by i;
>
>  i | nam
> ---+-----
> (0 rows)
>
>
> I think the query should return an error, since I did not follow the
> GROUP BY standard.
> Does anyone know why it does not show a syntax error?


That's an enhancement in 9.1.

Since you grouped by t1's primary key, every other column is functionally
dependent on it, so it can appear in the select list without being listed
in the GROUP BY clause.
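A quick illustration against the same table (a sketch, not from the
original post):

-- allowed in 9.1+: i is t1's primary key, so nam is functionally
-- dependent on it
select i, nam from t1 group by i;

-- still an error: nam is not a key, so i is not functionally
-- dependent on it
select i, nam from t1 group by nam;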




[GENERAL] group by does not show error

2012-03-24 Thread AI Rumman
I am using Postgresql 9.1.0.

I found that the following GROUP BY query works in my DB:

   \d t1
         Table "public.t1"
 Column |  Type   | Modifiers
--------+---------+-----------
 i      | integer | not null
 nam    | text    |
Indexes:
    "t1_pkey" PRIMARY KEY, btree (i)


   select i,nam
   from t1
   group by i;

 i | nam
---+-----
(0 rows)


I think the query should return an error, since I did not follow the
GROUP BY standard.
Does anyone know why it does not show a syntax error?


Re: [GENERAL] PostgreSQL 64 Bit XIDs - Transaction IDs

2012-03-24 Thread Scott Marlowe
On Sat, Mar 24, 2012 at 1:27 AM, Scott Marlowe  wrote:
> On Fri, Mar 23, 2012 at 11:18 PM, Bret Stern
>  wrote:
>> VoltDB maybe
>> - Original Message -
>
> VoltDB has a completely different focus than PostgreSQL really.
> PostgreSQL is a general purpose database that can achieve some very
> impressive numbers using super fast hardware, while still being a
> general purpose db that can do more than just shove through thousands
> of transactions a second.

There are some good talks on USENIX's YouTube channel about DBs, etc.:

One Size Does Not Fit All in DB Systems:
www.youtube.com/watch?v=QQdbTpvjITM&lr=1

NewSQL vs. NoSQL for New OLTP by Mike Stonebraker:
http://www.youtube.com/watch?v=uhDM4fcI2aI

are both excellent presentations.



Re: [GENERAL] PostgreSQL 64 Bit XIDs - Transaction IDs

2012-03-24 Thread Scott Marlowe
On Fri, Mar 23, 2012 at 11:18 PM, Bret Stern
 wrote:
> VoltDB maybe
> - Original Message -

VoltDB has a completely different focus than PostgreSQL really.
PostgreSQL is a general purpose database that can achieve some very
impressive numbers using super fast hardware, while still being a
general purpose db that can do more than just shove through thousands
of transactions a second.

For instance, I've tested machines with spinning drives that could
sustain 5 to 7k tps for weeks, with autovacuum suitably tuned to keep up
(i.e. no sleep time, a greatly increased cost limit, lots of checkpoint
segments, and very aggressive bgwriter settings as well).
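The autovacuum half of that tuning can be expressed per table; as an
illustration only (hypothetical table name, values are not recommendations;
checkpoint_segments and the bgwriter settings live in postgresql.conf):

-- let autovacuum run on a hot table with no sleep and a much higher
-- cost limit, so it can keep up with a heavy write load
ALTER TABLE busy_table SET (
  autovacuum_vacuum_cost_delay = 0,
  autovacuum_vacuum_cost_limit = 10000
);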

I can imagine doubling that with a rack of SSDs behind it.  And all on
a single machine that can be running PITR or streaming replication at
the same time.

VoltDB accomplishes this kind of throughput by running in memory,
doing a kind of auto-sharded, RAIDed database setup for redundancy.  I
imagine a farm of little 1U 8 or 16 core machines running multiple
instances of VoltDB (individual instances are single threaded) could
just stomp pg right into the ground as a transactional engine.
Because now you're scaling horizontally.

No matter how fast the hardware underlying it gets, pg will be limited
by the laws of physics on an individual machine, a limit VoltDB simply
works around by having dozens or hundreds of individual boxes doing the
same job, albeit in a very specialized manner.

I've not played with VoltDB and I don't know how stable and reliable
it is yet, but it was built by Michael Stonebraker, so I'd expect it
to at least hold the promise of stable operation as it matures, if it
isn't stable and reliable already.

But the percentage of db systems that need that level of performance is
pretty small, and those are generally designed to have a large chunk of
the world's population working on them at the same time.

For a system that's more typical, i.e. 95% or more read, you can throw
memcache in front of even a pretty mediocre box and have it keep up
just fine for hundreds or even thousands of writes per second and
hundreds of thousands of reads to the cache layer per second.

There are real-world problems with redundancy and reliability to work
out with VoltDB that are quite different from Postgresql, since you
need real synchronous replication between two geographically separated
data centers: if something horrible happens in one (it gets hit by a
meteor, or some junior electrician drops a piece of wire into a power
conditioner), one whole segment of your db fabric could go dark.  And
if it's not synchronous, then the replication is useless for a true
transactional, banking-style system.  Keep in mind that a pg server
simply losing power is usually safe from data loss, and a nearby
machine with synchronous replication is likely to provide a high
enough level of redundancy for most transactional systems.  Since
VoltDB is in memory, you HAVE to have a live backup running somewhere
remote, or a simple power loss kills all your data / latest
transactions.
