Re: [GENERAL] SSDs - SandForce or not?
On 11/14/12 2:11 AM, Toby Corkindale wrote:
> So on the face of it, I think the Sandforce-based drives are probably a
> winner here, so I should look at the Intel 520s for evaluation, and
> whatever the enterprise equivalent are for production.

As far as I know the 520 series drives fail the requirements outlined at
http://wiki.postgresql.org/wiki/Reliable_Writes and you can expect
occasional data corruption after a crash when using them. As such, any
performance results you get back aren't meaningful. You can't trust that
the same results will come back from drives that do handle writes
correctly. I'm not aware of any SSD with one of these compressing
Sandforce controllers on the market right now that does this correctly;
they're all broken for database use. The quick rule of thumb is that if
the manufacturer doesn't brag about the capacitors on the drive, it
doesn't have any and isn't reliable for PostgreSQL.

The safe Intel SSD models state very clearly in their specifications how
they write data in case of a crash. The data sheet for the 320 series
drives, for example, says "To reduce potential data loss, the Intel® SSD
320 Series also detects and protects from unexpected system power loss by
saving all cached data in the process of being written before shutting
down". The other model I've deployed and know is safe is the 710 series,
which is the same basic drive but with different quality flash and tuning
for longevity. See
http://blog.2ndquadrant.com/intel_ssds_lifetime_and_the_32/ for details.
The 710 series drives are quite a bit more expensive than Intel's other
models.

Intel's recently released DC S3700 drives also look to have the right
power-loss protection to be reliable for PostgreSQL. Those are expected to
be significantly cheaper than the 710 models, while having the same
reliability characteristics. I haven't been able to get one yet though, so
I don't really know for sure how well they perform.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.com
Re: [GENERAL] Unexpectedly high disk space usage
On 11/7/12 3:58 PM, Jeff Janes wrote:
> WHERE nspname NOT IN ('pg_catalog', 'information_schema')
>
> I question the wisdom of that where clause (from the wiki). If the
> pg_catalog relations are big, then they are big and why shouldn't they
> get reported as such? And if they are not big, that is why the limit is
> there on the wiki page.

The idea was that in a new database with a relatively small number of
tables, your own tables will be lost among the catalog data unless you
filter them out. Testing against an install with a single real table, the
query there will show something like this right now:

    relation    | total_size
----------------+------------
 public.t       | 3568 kB
 public.t_k_seq | 8192 bytes

But if the filter on pg_catalog is removed, you get this instead:

          relation           | total_size
-----------------------------+------------
 public.t                    | 3568 kB
 pg_catalog.pg_depend        | 808 kB
 pg_catalog.pg_proc          | 752 kB
 pg_catalog.pg_attribute     | 568 kB
 pg_catalog.pg_rewrite       | 464 kB
 pg_catalog.pg_description   | 392 kB
 pg_catalog.pg_statistic     | 328 kB
 pg_catalog.pg_operator      | 208 kB
 pg_catalog.pg_collation     | 152 kB
 pg_catalog.pg_type          | 152 kB
 pg_catalog.pg_amop          | 136 kB
 pg_catalog.pg_class         | 136 kB
 pg_catalog.pg_constraint    | 112 kB
 pg_catalog.pg_conversion    | 104 kB
 pg_catalog.pg_index         | 88 kB
 pg_catalog.pg_amproc        | 80 kB
 pg_catalog.pg_opclass       | 80 kB
 pg_catalog.pg_ts_config_map | 80 kB
 pg_catalog.pg_cast          | 80 kB
 pg_catalog.pg_authid        | 72 kB

That is overload for a lot of people, and confusing to new users. That's
why I opted for the shorter version. There's no perfect answer to all use
cases here.

This sort of thing is why there are three sets of queries for
pg_stat_user_tables, pg_stat_sys_tables, and pg_stat_all_tables. The wiki
disk space queries aim to be like the user tables version from that trio.
Adding a note pointing out that you might want to remove pg_catalog and
see the size of those relations would be appropriate. I wouldn't make
that the default case though, due to the issue highlighted above. I'd
rather optimize the initially suggested query so that new users get
simple output, even if it means that might hide problems on larger
installs, where the catalog data became big.

The other way I sometimes balance these two requirements--wanting to show
all the big data, but not clutter small installs with the catalog--is to
make the filter size-based instead:

SELECT nspname || '.' || relname AS "relation",
    pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
  FROM pg_class C
  LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
  WHERE nspname NOT IN ('information_schema')
    AND C.relkind <> 'i'
    AND nspname !~ '^pg_toast'
    AND relpages > 100
  ORDER BY pg_total_relation_size(C.oid) DESC
  LIMIT 20;

On my trivial test install that gives me just the one user table:

 relation | total_size
----------+------------
 public.t | 3568 kB

while still showing larger catalog tables if they grow to be noticeable.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.com
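If catalog bloat is the specific concern, a variation along the same
lines that flips the filter around and looks only at the system catalogs
is an easy follow-up check (just a sketch):

SELECT nspname || '.' || relname AS "relation",
    pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
  FROM pg_class C
  JOIN pg_namespace N ON (N.oid = C.relnamespace)
  WHERE nspname = 'pg_catalog'
  ORDER BY pg_total_relation_size(C.oid) DESC
  LIMIT 20;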
Re: [GENERAL] Plug-pull testing worked, diskchecker.pl failed
On 10/24/12 4:04 PM, Chris Angelico wrote:
> Is this a useful and plausible testing methodology? It's definitely
> showed up some failures. On a hard-disk, all is well as long as the
> write-back cache is disabled; on the SSDs, I can't make them reliable.

On Linux systems, you can tell when Postgres is busy writing data out
during a checkpoint because the "Dirty:" amount (in /proc/meminfo) will
be dropping rapidly. At most other times, that number goes up. You can
try to increase the odds of finding database level corruption during a
pull-the-plug test by trying to yank during that most sensitive moment.
Combine a reasonably write-heavy test like you've devised with that
"optimization", and systems that don't write reliably will usually
corrupt within a few tries.

In general, though, diskchecker.pl is the more sensitive test. If it
fails, storage is unreliable for PostgreSQL, period. It's good that
you've followed up by confirming the real database corruption implied by
that is also visible, but normally that's not needed. Diskchecker says
the drive is bad, you're done--don't put a database on it. Doing the
database level tests is more for finding false positives: where
diskchecker says the drive is OK, but perhaps there is a filesystem
problem that makes it unreliable, one that it doesn't test for.

What SSD are you using? The Intel 320 and 710 series models are the only
SATA-connected drives still on the market I know of that pass a serious
test. The other good models are direct PCI-E storage units, like the
FusionIO drives.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.com
Re: [GENERAL] PostgreSQL training recommendations?
On 10/16/12 3:24 PM, Thalis Kalfigkopoulos wrote:
> Now I'd understand the Pg manual writers being reluctant about shifting
> from manual to DB-book, but I'm guessing, the manual being as well
> written as it is, that many of us are already using it as a learning
> book anyway.

The official manual is a reference manual that also includes some good
tutorial material. Just trying to cover that depth well, it's already so
large as to be cumbersome--both from the perspective of new readers and
the people maintaining it. Expecting to expand its scope even further
toward the tutorial and example side is not something I'd expect to gain
much traction.

Every example that appears in the manual is yet another place for the
documentation to break when code changes are made. And it's the same
group of people maintaining both the documentation and the code. Anyone
who tries to ramp up adding even more docs is going to pull focus off new
code. Would you like the core features to expand, or to get a new type of
documentation? The way things are organized right now, you can't get
both.

I would say that it's easier to write 400 pages of material outside of
the manual and distribute them to the world than to add 40 pages to the
official manual. And I say that as someone who has tried wandering down
both paths to see which was more productive.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.com
Re: [GENERAL] Partitioning Advice
On 06/06/2012 01:07 AM, Ben Carbery wrote:
> The new server has a great deal more memory which I am hoping will help
> (shared_buffers = 8GB, total RAM 20GB), but I am looking at what might
> be optimal for the storage configuration. From looking at previous
> conversations here I am thinking of something like this..
>
> 100GB OS (ext3)
> 50GB pg_xlog (ext2)
> 400GB pg_data (ext3 data=writeback noatime?)
>
> Hopefully this would mean the small writes can continue while a large
> read is going.

Latency on ext3 is better on RHEL6 than earlier versions, but it's still
hard to keep it low with that filesystem. You should consider ext4 or xfs
instead if you're already running into slow periods limited by disk I/O.

Large values of shared_buffers can also make write latency spikes worse,
particularly when the underlying storage isn't very capable--which is
likely to be the case in a VM environment. Most of the performance gain
comes from going from the tiny default (<=32MB) for shared_buffers to a
moderate size. You'll probably get most of the performance gain setting
that to around 1GB instead, and the worst case performance might improve.

If you are already seeing problems on your existing server, there are two
things you could do to monitor what's going on:

-Turn on log_checkpoints on the server. If you see high numbers for the
"sync=" section, that normally narrows your problem very specifically to
the database's background checkpoints.

-Watch /proc/meminfo , specifically the "Dirty:" number. If that number
gets very high during the same periods the slowdowns happen at, it might
be possible to make things better by decreasing the amount of caching
Linux does.

There's some intro material on that subject at
http://notemagnet.blogspot.com/2008/08/linux-write-cache-mystery.html and
http://blog.2ndquadrant.com/tuning_linux_for_low_postgresq/ (note that
some of the links in that second one, to the test pgbench results, are
broken; http://www.highperfpostgres.com/pgbench-results/index.htm is the
right URL now)

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.com
Re: [GENERAL] oracle linux
On 03/28/2012 10:38 AM, Gregg Jaskiewicz wrote:
> They seem to claim up to 70% speed gain. Did anyone proved it, tested
> it - with PostgreSQL in particular ?

RedHat's RHEL5 kernel is 2.6.18 with a bunch of backported features.
Oracle just yanks that out and puts a closer to stock 2.6.32 based kernel
in there instead. Basically the speed gain is for people who don't want
to update their whole distribution, because of nonsense like "SAP etc. is
only supported on RHEL5 based platforms" I think, but need the better
high-speed hardware support of a newer kernel. Of course a several year
newer kernel runs much faster on latest generation hardware.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.com
Re: [GENERAL] Why checkpoint_timeout had maximum value of 1h?
On 03/29/2012 06:57 AM, Maxim Boguk wrote:
> Is there any real reason why checkpoint_timeout limited to 1hour?

Just to keep people from accidentally setting a value that's dangerously
high. There can be some pretty bad drops in performance if you let writes
pile up for too long, once the checkpoint really does start running.

> In my case I have some replicas with WAL on SAS raid and PGDATA on SSD
> with limited write endurance. In that case having
> checkpoint_timeout=10hour could reduce amount of writes on SSD by
> factor of 10, and increase planned ssd lifetime by the same amount.

The big write endurance problem is WAL data, and you're already
addressing that. Note that if nothing has been written out since the last
one, the checkpoint won't actually do anything. So this 10X endurance
idea might only work out on a system that's always doing something.
You'll certainly get less wear; without measuring your workload better, I
can't say just what the multiplier is. The other idea you should be
considering, if you haven't already, is not provisioning all of the
space.

> I would like to have ability to set checkpoint_timeout=high value and
> (whats even better) checkpoint_timeout=0 - in that case checkpoint
> happen when all checkpoint_segments were used. Is there any serious
> drawbacks in that idea? Is it safe to increase that limit in source and
> rebuild database? (9.0 and 9.1 case)

You can edit src/backend/utils/misc/guc.c , find checkpoint_timeout, and
change the 3600 value there to something higher. You will need to rebuild
the whole database cluster with that setting (initdb), and moving a
database cluster of files between your tweaked version to/from a regular
PostgreSQL will do strange things. You can prevent that from happening
accidentally by editing src/include/catalog/catversion.h ; find the line
that looks like this:

#define CATALOG_VERSION_NO	201202141

And change it. It's just YYYYMMDDN to create a complete catalog serial
number, where N is an incrementing number if more than one change is made
on the same day. If you do that and increase the upper bound on
checkpoint_timeout, that should do what you want, while protecting
against the dangerous situation where the system catalog doesn't match
the database binaries.

Setting checkpoint_timeout to 0 instead won't work--it will checkpoint
all of the time then. The bottom limit is 30 seconds and you don't want
to touch that. It's possible to make 0 mean "never timeout", but that
would require juggling a couple of code pieces around. The idea of just
making the timeout big is a safer thing to consider. I'm not sure if
you'll really see the gains you're hoping for, but it should be easy
enough to test.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.com
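For reference, you can see the compiled-in bounds you'd be raising before
and after the rebuild with a query like this (just a sketch; max_val is
the limit the guc.c edit changes):

SELECT name, setting, unit, min_val, max_val
  FROM pg_settings
 WHERE name IN ('checkpoint_timeout', 'checkpoint_segments');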
Re: [GENERAL] pgcon 2012
On 03/24/2012 12:46 AM, Arvind Singh wrote:
> Is http://www.pgcon.org/2012/?2nd a valid and certified Postgres Event
> or is it just a marketing strategy by Private Event Management firm

There is no such thing as a "certified" Postgres event. All of the
conferences in the world are run by private groups and/or companies. In
addition to what attendees normally pay, there are normally a number of
companies involved in sponsoring each event. In many cases the events
wouldn't be feasible without that sponsorship help. Some conferences also
benefit heavily from the various PostgreSQL user groups around the world,
like the yearly European conference that you might consider too:
http://pgconf.eu

There are some policies for who can use PostgreSQL community resources
like the postgresql.org web site to promote their event, which you can
see at http://wiki.postgresql.org/wiki/NewsEventsApproval But many
organizations and companies meet those guidelines, and none of them are
any more or less official than the others.

One helpful thing to consider when deciding which conference to attend is
whether the speakers are involved in the areas you're interested in. For
the database itself, a list of many frequent contributors is found at
http://www.postgresql.org/community/contributors/ If you check the PGCon
speaker list, you'll see many of them are also on that contributor list.
That's one reason that PGCon is an excellent conference. It's also very
well run by its organizers, even though they are "private event
management" by your definition.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.com
Re: [GENERAL] Optimise PostgreSQL for fast testing
On 02/23/2012 07:16 PM, Dmytrii Nagirniak wrote:
> That's totally fine if PG can't beat SQLite on speed in **this
> particular case**. I just want to try to tune it to be as fast as it
> can (for **this particular case**, see my reply to Adrian).

You can find all of the big tunable parameters at
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server That's as
good a general "how do I make this faster by tweaking the server" guide
as you'll get. Once you've hit the big tunables--shared_buffers,
checkpoint_segments, work_mem, effective_cache_size, and tweaking either
synchronous_commit or fsync--there's not too much else you can do except
dig into what's slow in individual queries. The only other thing that
might help is running ANALYZE against the whole database after any major
loading of test data, just to make sure the queries are being executed
with good statistics.

If you can extract the SQL from the test cases so they can be executed
directly with the psql client, you could add "\timing" before them to see
how long each individual query runs, to look for the long running ones.
It's possible that every statement is a little slower, which would be
unsurprising and not something you can really resolve if so. It could
just be a small number that are being executed poorly though, in which
case specific query tweaking might be possible. You might get further
insight by posting the EXPLAIN ANALYZE plans of whatever the slowest
single query is. More on that subject at
http://wiki.postgresql.org/wiki/Slow_Query_Questions

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.com
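A minimal sketch of the sort of timing session I mean, assuming the test
suite's SQL can be captured into a file and run through psql; the table
and query here are made up, and relaxing synchronous_commit is only
sensible on a throwaway test database:

-- trade away durability for speed on this disposable test database
SET synchronous_commit = off;
\timing on

EXPLAIN ANALYZE
SELECT * FROM test_accounts WHERE email = 'someone@example.com';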
Re: [GENERAL] fsync on ext4 does not work
On 12/19/2011 10:52 AM, Havasvölgyi Ottó wrote:
> PgSql 9.1.2
> Debian, 2.6.32 kernel
> WAL filesystem: ext4 with defaults

There's a pg_test_fsync program included with the postgresql-contrib
package that might help you sort out what's going on here. This will
eliminate the possibility that you're doing something wrong with pgbench,
and give an easy to interpret number relative to the drive RPM rate.

You said default settings, which eliminates "nobarrier" as a cause here.
The only other thing I know of that can screw up fsync here is using one
of the incompatible LVM features to build your filesystem. I don't know
which currently work and don't work, but last I checked there were a few
ways you could set LVM up that would keep filesystem barriers from
working properly. You might check:

dmesg | grep barrier

To see if you have any kernel messages related to this.

Here's a pg_test_fsync example from a Debian system on 2.6.32 with an
ext4 filesystem and a 7200 RPM drive, default mount parameters and no
LVM:

$ ./pg_test_fsync
2000 operations per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                         n/a
        fdatasync                 113.901 ops/sec
        fsync                      28.794 ops/sec
        fsync_writethrough                    n/a
        open_sync                 111.726 ops/sec

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                         n/a
        fdatasync                 112.637 ops/sec
        fsync                      28.641 ops/sec
        fsync_writethrough                    n/a
        open_sync                  55.546 ops/sec

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
        16kB open_sync write      111.909 ops/sec
         8kB open_sync writes      55.278 ops/sec
         4kB open_sync writes      28.026 ops/sec
         2kB open_sync writes      14.002 ops/sec
         1kB open_sync writes       7.011 ops/sec

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
        write, fsync, close        28.836 ops/sec
        write, close, fsync        28.890 ops/sec

Non-Sync'ed 8kB writes:
        write                  112113.908 ops/sec

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
Re: [GENERAL] Hope for a new PostgreSQL era?
On 12/12/2011 10:33 PM, Jayadevan M wrote:
> But I miss all those user-friendly trouble-shooting utilities (like
> Automatic Workload Repository, Active Session History etc etc) in
> PostgreSQL. Yes - some of them are there, but one has to search,
> download, configure etc. I hope many of these features will become part
> of the 'core' soon.

That's unlikely, simply because the definition of "core" in PostgreSQL
doesn't quite include the full experience of user tools like this. For
example, you might see the core collect the data needed for something
that acts like an AWR baseline. But you're unlikely to ever get the sort
of easy baseline management and graphing tools that Oracle's Enterprise
Manager layers on top of them in core. There's no place to put a GUI/web
tool like that there, and I wouldn't expect that to ever change. You
might see it provided as a feature of the pgAdmin tool though, or as a
separate web application.

The fact that there are multiple pieces of software involved doesn't have
to make this hard. Ultimately the problem you're identifying is a
packaging one. Something doesn't have to be in the PostgreSQL core to be
packaged nicely so that you can easily install and use it. It's probably
easy for you to get pgAdmin installed and working, for example, and
that's not a part of core. There's just been a lot more work put into
packaging it than most tools have gotten so far.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
Re: [GENERAL] Hope for a new PostgreSQL era?
On 12/09/2011 08:54 PM, Greg Smith wrote:
> I decided about a year ago that further work on using Systemtap was a
> black hole: time goes in, nothing really usable on any production
> server seems to come out.

My off-list e-mail this weekend has, quite rightly, pointed out that this
cheap shot is unfair bordering on libel toward the hard working Systemtap
developers. I'd like to publicly apologize for that and clarify my
frustrated statement here (I'd *really* like this sort of tool to be more
widely available).

The main problem I've had with Systemtap is its reputation; I don't
actually have any real, informed gripes about its current state. But the
sort of customers I have are very risk-averse. PostgreSQL does a good job
attracting that sort of user. I'm sure we have a disproportionate number
of them relative to your average open-source program. Accordingly, unless
a piece of software is very trusted, it's hard for me to convince anyone
to use it. (See "why Greg hates the disclaimers around the PostgreSQL
contrib modules") That makes it hard for me to give Systemtap a serious
spin on most of the production servers I see my hardest problems on.
That's the reason behind the statement I made badly here--regardless of
how much I know about it, I can't seem to get Systemtap deployed in the
places I spend the most time working at.

True or false, the "Systemtap is more likely to crash your kernel than
DTrace" meme is out there. I think some of that is an unexpectedly bad
side-effect of its open-source development. DTrace had the luxury of
being hidden from the world at large until it was well formed. Whereas a
lot of people saw Systemtap in a really early state, formed opinions
several years ago, and the oldest of those are some of the highest
ranking pages when you search for information. I just searched again
today, and there are mounds of stuff from 2006 and 2007 coming back that
surely doesn't reflect the current state of things. Systemtap didn't get
a 1.0 release until September 2009.

As someone who has spent a lot of time at the wrong end of the
"PostgreSQL is slower than MySQL" meme, I shouldn't have just thrown this
sort of criticism out there without explaining the basis for my
statement. I hope this clears up what I meant. Ultimately I think we need
both more tools like Systemtap and DTrace, as well as more
instrumentation inside PostgreSQL, to cover all of the things people
would like visibility into.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
Re: [GENERAL] Hope for a new PostgreSQL era?
On 12/11/2011 11:39 PM, Jayadevan M wrote:
> At the db level, Oracle provides the "Database Replay" feature that
> lets you replay the production server events in the development/test
> environment.
> http://docs.oracle.com/cd/B28359_01/server.111/e12253/dbr_intro.htm
> Won't something like this be useful in PostgreSQL? It will let us mimic
> the production environment load and analyze it better.

There are several projects aiming at this goal in various ways:
http://wiki.postgresql.org/wiki/Statement_Playback

Some of the features currently under development will make this sort of
thing easier to build into the core database. For example, the recent
"Command Triggers" feature submission will make it easier to catch DDL
changes as well as queries for this sort of thing.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
Re: [GENERAL] Hope for a new PostgreSQL era?
On 12/10/2011 09:28 PM, Craig Ringer wrote:
> One thing I think would be interesting for this would be to identify
> slow queries (without doing detailed plan timing) and flag them for
> more detailed timing if they're run again within time. I suspect this
> would only be practical with parameterised prepared statements where
> the query string remained the same, but that'd still be interesting.

There are actually two patches sitting in the current PostgreSQL
CommitFest that allow normalizing query strings in a way that they could
be handled like this even if not prepared, as part of pg_stat_statements.
What you're asking for is basically a hybrid of that and auto_explain,
with something smarter deciding when the explain is triggered.
Interesting idea, I hadn't thought of that heuristic before. It won't be
hard to do if the query normalization stuff commits.

> Personally I'd choose good performance monitoring over user/query
> priorities any day. With good perf monitoring I can script from the
> outside, I have a lot more control, can send alerts, etc etc.

Luckily for you it's hard to do it in any other order. When I think about
how we'd have to validate whether query prioritization code was operating
as expected or not, I imagine some extra monitoring tools really need to
get built first. Might as well expose those for people like yourself too,
once they're built for that purpose.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
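In the meantime, the existing contrib module is already useful for seeing
where the time goes, even without the normalization work; a quick sketch
(9.1 syntax--on earlier versions you install the contrib SQL script
instead, and the module has to be listed in shared_preload_libraries):

CREATE EXTENSION pg_stat_statements;

-- the five statements responsible for the most accumulated runtime
SELECT calls, total_time, query
  FROM pg_stat_statements
 ORDER BY total_time DESC
 LIMIT 5;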
Re: [GENERAL] Hope for a new PostgreSQL era?
On 12/08/2011 09:48 AM, Satoshi Nagayasu wrote:
> For example, I've been working on investigating PostgreSQL LWLock
> behaviors precisely for a few weeks, and it could not be obtained
> within PostgreSQL itself, therefore, I picked up SystemTap. However,
> SystemTap could not be used in a production system, because it often
> kills the target processes. :( How can I observe LWLocks in the
> production system?

I decided about a year ago that further work on using SystemTap was a
black hole: time goes in, nothing really usable on any production server
seems to come out. It can be useful for collecting data in a developer
context. But the sort of problems people are more interested in all
involve "why is the production server doing this?", and as you've also
discovered the only reasonable answer so far doesn't involve SystemTap;
it involves DTrace and either Solaris or FreeBSD (or Mac OS, for smaller
server hardware deployments). Since those platforms are problematic to
run database servers on in many cases, that doesn't help very much.

I'm planning to put that instrumentation into the database directly,
which is what people with an Oracle background are asking for. There are
two underlying low-level problems to solve before even starting that:

-How can the overhead of collecting the timing data be kept down? It's
really high in some places. This is being worked out right now on
pgsql-hackers; see "Timing overhead and Linux clock sources".

-How do you log the potentially large amount of data collected without
killing server performance? Initial discussions are also happening right
now; see "logging in high performance systems".

I feel this will increasingly be the top blocker for performance
sensitive deployments in the coming year; people used to having these
tools in Oracle cannot imagine how they would operate without them. One
of my big picture goals is to have this available as a compile-time
option starting in PostgreSQL 9.3 in 2013, piggybacked off the existing
DTrace support. And the earlier the better--since many migrations have a
long lead time, just knowing it's coming in the next version would be
good enough for some people who are blocked right now to start working on
theirs.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
Re: [GENERAL] Performance degradation 8.4 -> 9.1
On 11/17/2011 02:24 PM, Joseph Shraibman wrote:
> This query is taking much longer on 9.1 than it did on 8.4. Why is it
> using a seq scan?

To answer that question in all cases, it's necessary to know a) the
query, b) the PostgreSQL version, c) the table definitions including what
indexes exist, d) the statistics collected about each column, e) the
sizes of all the indexes on any referenced table, and f) the server
parameters. Sometimes you can get useful feedback from just the first
three of those, but no one can guess why an index is or isn't being used
without at least knowing the indexes that are defined.

For example, it looks like the query is using an index on
(eventlog_uid, jobid, type). It probably wants an index on jobid instead,
but I can't tell whether you don't have one, or if one is there but it's
not being used for some reason.

How did you build the 9.1 system from the 8.4 data? There might just be a
physical difference between the two tables.

In addition to showing the table definition, two other suggestions:

-Show what the better plan on 8.4 looks like; we're just seeing the slow
one.

-Try running the individual EXISTS parts of this plan on both versions
and compare. You might be able to isolate which of them is the source of
the difference here.

There's a longer guide to the things people tend to find useful at
http://wiki.postgresql.org/wiki/SlowQueryQuestions ; this question might
get a better response on the lower volume pgsql-performance mailing list
too.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
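As a first check, running something like this on both servers will show
whether the index situation even matches between them; the table name
here is a guess based on the column names in your plan, so adjust it to
the real one:

SELECT indexname, indexdef
  FROM pg_indexes
 WHERE tablename = 'eventlog';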
Re: [GENERAL] : Looking for a PostgreSQL book
Achilleas Mantzios wrote:
> code for the cache utils pg_cacheutils is mentioned to exist somewhere
> but it's not there

At this point the pgfincore project is the most popular way to do the
work that pg_cacheutils was suggested for there. See
http://pgfoundry.org/projects/pgfincore/ and
http://www.pgcon.org/2010/schedule/events/261.en.html for more
information.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
Re: [GENERAL] Materialized views in Oracle
On 09/22/2011 01:34 AM, Mike Christensen wrote:
> If Oracle was a swimming pool, I would have those little floaty duck
> things on my arms.

Yes, it's too bad the license to get Oracle 11g with Floaties is cost
prohibitive for most companies.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
Re: [GENERAL] heavy swapping, not sure why
On 08/29/2011 06:12 PM, Lonni J Friedman wrote:
> OK, I'll reduce it to 10GB and see if there's any noticable change in
> performance. thanks

I've never heard a report of a Linux system using more than 8GB of
shared_buffers usefully, and peak performance on systems I've tested has
sometimes come at far smaller settings than that. (I have one server
that's stuck at 512MB!) The only report of even 10GB helping came from a
Solaris test. I doubt this has anything to do with your problem; I'm just
pointing it out as future guidance.

Until there's a breakthrough in the PostgreSQL buffer cache code, there
really is no reason to give more than 8GB of dedicated memory to the
database on Linux via shared_buffers. You're better off letting the OS do
caching with it instead.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
Re: [GENERAL] Whether the function exists a in pgsql table or not?
On 08/27/2011 01:21 PM, shuaixf wrote:
> In Oracle, when a table has few records and is used frequently, it
> would use "storage(buffer_pool keep)" to keep the data in cache instead
> of the LRU algorithm. Does the same capability exist for a pgsql table
> or not?

There aren't any controls to pin a table into memory in PostgreSQL. Given
the way the database uses both its dedicated buffer cache as well as the
OS one, it's hard to even define such a UI that would make sense--the
database doesn't have any control over or interaction with the OS cache.
The mechanism for deciding what stays in the PostgreSQL cache is a bit
more complicated than an LRU, while the one in the OS probably is an LRU.
If the table is used frequently, it's very likely to stay in one of the
two caches anyway.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.
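If you want to confirm that a frequently used table really is staying
resident in the PostgreSQL cache, the pg_buffercache contrib module can
show that; a rough sketch (install with CREATE EXTENSION on 9.1, or the
contrib SQL script on older releases--and note it only sees the
database's own buffers, not the OS cache):

CREATE EXTENSION pg_buffercache;

SELECT c.relname, count(*) AS buffers
  FROM pg_buffercache b
  JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
 WHERE b.reldatabase = (SELECT oid FROM pg_database
                        WHERE datname = current_database())
 GROUP BY c.relname
 ORDER BY buffers DESC
 LIMIT 10;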
Re: [GENERAL] Streaming Replication: Observations, Questions and Comments
On 08/24/2011 11:33 AM, Samba wrote:
> One strange thing I noticed is that the pg_xlogs on the master have
> outsized the actual data stored in the database by at least 3-4 times,
> which was quite surprising. I'm not sure if 'restore_command' has
> anything to do with it. I did not understand why transaction logs would
> need to be so many times larger than the actual size of the database,
> have I done something wrong somewhere?

That's common to see. Systems that regularly UPDATE the same rows can
easily end up with a WAL stream much larger than the database. The WAL
data contains enough information to replay every point in time from the
base backup until the current time. That can be significantly larger
than the database, which just holds the latest copy of the data.

One of the biggest things that makes your WAL large is the full page
writes that protect against incomplete writes. See the "question
regarding full_page_writes" thread happening on this list recently for
details. Each time you touch a page, per checkpoint, another full copy of
that page is written out.

What I have to do in a lot of cases is significantly decrease the number
of checkpoints in order to keep this overhead under control. The default
config has a checkpoint every checkpoint_segments of work, and every
checkpoint_timeout of time. That makes for a checkpoint every 5 minutes,
and even more often under heavy load. If you increase
checkpoint_segments a whole lot, all of your checkpoints will be based on
the timeout instead. Then you can see how the WAL load decreases as you
increase checkpoint_timeout. I've had to set checkpoint_timeout as high
as 30 minutes before on busy systems, to lower the WAL overhead.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
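A quick way to see which of those two limits is actually triggering your
checkpoints is the background writer statistics view; just a sketch:

SELECT checkpoints_timed, checkpoints_req, buffers_checkpoint
  FROM pg_stat_bgwriter;
-- checkpoints_req growing faster than checkpoints_timed means the
-- segment limit, not the timeout, is driving checkpoint frequency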
Re: [GENERAL] question regarding full_page_write
On 08/24/2011 11:12 AM, Martín Marqués wrote:
> Why aren't deltas good enough for the first 8Kb? Is there other
> information in the first 8Kb that makes those more important?

The fundamental problem is what's called a torn page. You write out an 8K
page; only part of it actually makes it to disk; the server crashes. What
you now have on disk is completely unpredictable: it's a mix of the old
and new page, and which portion of each you got is almost random. You
cannot fix it with any delta. The only way to make sure it's back into a
good state is to write an entire 8K page, a known correct copy, from some
point in time.

The way this is done in PostgreSQL, one of those is written out to the
WAL the first time any page is touched after a checkpoint. Those become
the known good copies to recover from any torn page problem. Then
database recovery only has to replay activity since that checkpoint
marker to fix all torn pages.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
Re: [GENERAL] question regarding full_page_write
On 08/22/2011 05:07 PM, Martín Marqués wrote:
> My question regarding your answer is, why is it important for the first
> page after a checkpoint and not on other page writes?

The first time a page is written after a checkpoint, when
full_page_writes is on, the entire 8K page is written out to the WAL at
that point. The idea is that if the page is corrupted in any way by a
partial write, you can restore it to a known good state again by using
this version. After that copy, though, additional modifications to the
page only need to save the delta of what changed, at the row level. If
there's a crash, during recovery the full page image will be written,
then the series of deltas, ending up with the same data as was intended.

This whole mechanism resets each time a checkpoint finishes, and the full
page writes start all over again. One of the main purposes of checkpoints
is to move forward the pointer of how far back crash recovery needs to
replay from. Starting each new checkpoint over again, with a full copy of
all the modified data going into the WAL, is part of that logic.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
Re: [GENERAL] INSERTing rows from external file
On 08/16/2011 05:34 PM, Rich Shepard wrote:
> I have a file with 5500 rows formatted as 'INSERT INTO <table_name>
> (column_names) VALUES <values>;' that I thought I could read using psql
> from the command line. However, the syntax, 'psql < filename.sql'
> throws an error at the beginning of the first INSERT statement.

Sounds like a problem with your file. Messing up CR/LF characters when
moving things between Windows and UNIX systems is a popular one. Proof
it works:

$ psql -c "create table t(i integer)"
CREATE TABLE

$ cat test.sql
INSERT INTO t(i) VALUES (1);
INSERT INTO t(i) VALUES (2);
INSERT INTO t(i) VALUES (3);
INSERT INTO t(i) VALUES (4);

$ psql < test.sql
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1

You might also try this:

psql -ef filename.sql

Which will show you the commands being executed interleaved with the
output; that can be helpful for spotting what's wrong with your input
file.

P.S. The fast way to get lots of data into PostgreSQL is to use COPY, not
a series of INSERT statements. You may want to turn off
synchronous_commit to get good performance when doing lots of INSERTs.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
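To sketch what that COPY route looks like against the same toy table (the
path is just an example, with one value per line in the file); the
server-side form needs the file to be readable by the server's user,
while \copy from psql reads it on the client side instead:

COPY t(i) FROM '/tmp/data.txt';
-- or, from within psql, reading the file on the client:
-- \copy t(i) FROM 'data.txt'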
Re: [GENERAL] backup-strategies for large databases
On 08/13/2011 05:44 PM, MirrorX wrote:
> at the moment, the copy of the PGDATA folder (excluding pg_xlog
> folder), the compression of it and the storing of it in a local storage
> disk takes about 60 hours while the file size is about 550 GB. the
> archives are kept in a different location so that's not a problem. so,
> i dont want even to imagine how much time the uncompress and copy will
> take in a 'disaster' scenario.

If you haven't actually run this test--confirmed that you can uncompress
the whole thing and get a working copy out of it again--I'd be concerned
that you haven't tested your backup procedure fully. You can't really
tell if a backup is good or not unless you restore it. And that process
will give you a read on just how bad the recovery situation will look if
it comes to that one day.

One technique I've used to accelerate the situation you're in is to
always keep a real filesystem copy of the last backup somewhere. Then,
rather than archive the main database directly for the base backup, you
execute rsync to make that secondary copy identical to the one on the
master. That should happen quite a bit faster than making a whole new
backup, so long as you use the --inplace option. Once the standby copy is
done, if you want a compressed archive you can then make it from the
copy--with no extra load on the master. And you can then copy that again
to another place too, followed by having it consume WAL files so that it
eventually turns into a warm standby. If you want a true fail-over here,
you're going to have to make one that is replaying WAL files as they
arrive.

> any (file-system) solutions that keep the disks in sync like DRBD are
> suitable? so that the disk of the 2nd server would be in sync with the
> 1st. even if that works, i would still like to have a 3rd backup in the
> storage disks so my question remains.

I doubt you'll be able to get DRBD to keep up with the volume you've got
reliably. The only filesystem level solution I've seen scale nicely to
handle the exact problem you have is using ZFS snapshots to make some of
this easier. It's worth buying a Solaris license for some people to have
that technology available. I had been hoping some of the new things in
FreeBSD 9.0 would finally make it a lot more practical to consider for
this sort of thing once that ships. But it looks like the issues around
not supporting Intel's latest graphics drivers on recent "Sandy Bridge"
servers may postpone adopting that further for me.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
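One detail worth spelling out: the rsync needs to be bracketed by the
usual base backup calls so the copy lines up with the archived WAL. A
minimal sketch, with the label and paths being only examples:

SELECT pg_start_backup('rsync base backup', true);
-- now rsync $PGDATA (minus pg_xlog) to the secondary copy, e.g.
--   rsync -a --inplace --delete ...
-- then finish with:
SELECT pg_stop_backup();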
Re: [GENERAL] Postgres on SSD
On 08/12/2011 04:24 PM, Vick Khera wrote:
> 2011/8/10 Ondrej Ivanič:
>> Ups! Well spotted Tomas! The actual values are:
>> random_page_cost = 2
>> seq_page_cost = 1
>
> With the SSD I would set these to the same value of 1. That's what I
> do.

That probably makes sense on your RAMSAN. Sequential access on FusionIO
drives is at least 3X as fast as completely random access though, and in
the worst case random access can be even slower relative to what a
sequential scan can deliver. It's not the >50X difference seen on regular
drives, but there's an easily measurable gap. I'm not sure if it's that
the flash cells deliver data faster when you read a sequential series
from the same cell of flash, or if it's just that there are fewer
physical IOs happening.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
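If only some tables live on the SSD, note that since 9.0 these two
settings can also be attached to just that tablespace rather than changed
globally; a sketch with a made-up tablespace name and illustrative
values:

ALTER TABLESPACE ssd_space
  SET (seq_page_cost = 1.0, random_page_cost = 1.5);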
Re: [GENERAL] Postgres on SSD
On 08/09/2011 07:17 PM, Ondrej Ivanič wrote:
> I'm about to evaluate this SSD card: FusionIO ioDrive Duo [1]. The main
> reason for this experiment is to see if SSD can significantly improve
> query performance. Database size is around ~1.4TB. Main tables occupy
> around 1/3 (450GB, ~220mil rows) and aggregated data occupies 2/3
> (900GB). All indexes are on a separate tablespace (~550GB)

ioDrive hardware is fast at executing all sorts of I/O, but it
particularly excels compared to normal drives with really random
workloads. For example, I recently tested them in two different systems,
both head to head against regular 20 disk RAID10 arrays (Dell MD units).
At sequential reads and writes, all drives were basically the same:
>1.2GB/s reads, >600MB/s writes. The regular drive array was actually a
bit faster on sequential writes, which is common with SSD showdowns.

Your tables are pretty big; not much of them will fit in memory. If your
aggregated queries end up executing a lot of sequential scans of the data
set, either to build the aggregates or to use them, you will probably
discover this is barely faster on FusionIO. And you certainly could speed
that up for far less money spent on other hardware.

Is there a component to your workload that does a lot of random read or
write requests? If so, is that chunk of the data set bigger than RAM, but
small enough to fit on the FusionIO drive? Only when all those conditions
are true does that hardware really make sense. For example, running a
300GB pgbench test on a system with 128GB of RAM, the FusionIO drive was
almost 10X as fast as the 20 disk array. And its raw seek rate was 20X as
fast at all concurrency levels. But at the same time, tests on database
sizes that fit into RAM were slower on FusionIO than the regular disk
array. When there's no random I/O to worry about, the slower read/write
rate of the SSD meant it lost the small database tests.

You really need to measure your current system carefully to figure out
just what it's doing as far as I/O goes to make this sort of decision.
Given what ioDrives cost, if you're not sure how to do that yourself it's
surely worth hiring a storage oriented database consultant for a few days
to help figure it out.

> XEN host with 16 CPU (Intel(R) Xeon(R) CPU L5520 @ 2.27GHz). CentOS 5.6
> 80GB RAM
> Storage: some Hitachi Fibre channel SAN with two LUNs:
> 1st LUN has *everything* under $PG_DATA (used 850 GB)
> 2nd LUN has *all* indexes (index table space) (used 550GB)

Make sure you do basic benchmarks of all this hardware before you start
adding even more pieces into the mix. Both Xen hosts and SANs can cause
all sorts of performance bottlenecks. It's possible you won't even be
able to fully utilize the hardware you've already got if it's running
with a virtual machine layer in there. I have no idea how a FusionIO
drive will work in that environment, but I wouldn't expect it to be
great. They need a fast CPU to run well, and some processing is done in
the driver rather than on the card.

> checkpoint_segments  | 48
> maintenance_work_mem | 256MB
> shared_buffers       | 9GB
> wal_buffers          | 50MB
> work_mem             | 256MB

checkpoint_segments should be higher, at least 64 and probably 128 to
256. shared_buffers should be lower (at most 8GB, maybe even less).
maintenance_work_mem should be 1 to 2GB on a server with 80GB of RAM.
There's no proven benefit to increasing wal_buffers over 16MB. This
setting for work_mem can easily allow your server to allocate over 25GB
of RAM for query working memory (100 connections x 256MB each), if all
100 connections do something. Either reduce that a lot, or decrease
max_connections, if you want this server to run safely.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
Re: [GENERAL] is max connections in a database table somewhere
On 08/10/2011 02:46 PM, Geoffrey Myers wrote:
> Is the max connections value in a system table somewhere?

If you intend to do anything with the value, you probably want one of
these forms:

SELECT CAST(current_setting('max_connections') AS integer);

SELECT CAST(setting AS integer) FROM pg_settings
  WHERE name='max_connections';

The setting comes back as a text field when using either current_setting
or the pg_settings view (which isn't a real table; under the hood it's
calling a system function).

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
Re: [GENERAL] heavy load-high cpu itilization
On 07/27/2011 04:37 AM, Filippos wrote:
> P.S i will send a message to the admins, to ask them to move the topic
> to the sub-forum of perfomance

Don't do that; it will just waste their time. pgsql-general is a mailing
list, and the "forum" view you're seeing at Nabble is just a web
interface to it. They can't move things around there because they don't
really own the list; they just make a copy of all its messages.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
Re: [GENERAL] 100 times faster than mysql
On 07/26/2011 10:02 AM, Allan Kamau wrote:
> If the speed is to be measured purely (and simply) on these numbers,
> 186/12 yields 15.5 (or maybe 16 if you round it up, or 15 if you use
> integer division). Maybe about 15~16 times faster would be more in line
> with the numbers provided.

I guess he did the math on MySQL, too.

Could be worse; he could have run into
http://bugs.mysql.com/bug.php?id=33704 which, as you can see, is totally
not a bug.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
Re: [GENERAL] heavy load-high cpu itilization
On 07/26/2011 01:47 PM, Filippos wrote:
> we have a dedicated server (8.4.4, redhat) with 24 cpus and 36 GB of
> RAM. i would say that the traffic in the server is huge and the cpu
> utilization is pretty high too (avg ~ 75% except during the nights when
> it is much lower). i am trying to tune the server a little bit to
> handle this problem. the incoming data in the database are about 30-40
> GB /day.

Well, the first question is where the CPU usage is coming from. There are
two basic schools of thought here:

1) Use real-time monitoring utilities like "top -c" and see what is
gobbling time up. It's possible to miss what's happening, but if you're
at 75% a large chunk of the day that doesn't seem likely.

2) Set log_min_duration_statement and the other logging parameters;
analyze the resulting log files to see where the CPU time is going.

You seem to be focused on the background writer and its checkpoint
process right now. That cannot be the source of high CPU usage; at most
it could fully use one of your 24 cores. You should fix wal_buffers to a
reasonable value regardless, but your problem is not in that area.

Importing 30-40 GB/day is extremely difficult to do in PostgreSQL. My
guess is that most of the server time is spent running the data import
process itself--even COPY, the most efficient way to get data in, is very
CPU intensive. The second layer of problems that can increase CPU usage
comes from autovacuum taking up a lot of resources to run, which it will
do all the time given this volume of activity. And there's always the
possibility that the queries you're running against the data are just
taking a long time to execute.

Another layer of problems you'll hit eventually in this scenario is that
you'll need to have your tables partitioned in order to prune old data
out efficiently. Presumably you can't keep up with that rate for very
long before you have to start dropping older data, and that's really hard
to do efficiently unless you've used partitions.

P.S. You should upgrade to PostgreSQL 8.4.8 as soon as possible. There is
a bug in autovacuum that's been resolved as of 8.4.6 that you are very
likely to run into:
http://www.postgresql.org/docs/8.4/static/release-8-4-6.html

P.P.S. The pgsql-performance list would be a more appropriate place to
have this discussion. Some of the people who provide good input over
there on topics like this don't read pgsql-general; there are too many
messages on this list for them.

> are there any suggestions what i can do to tune better the server? i
> can provide any information you find relevant for the configuration of
> the server, the OS, the storage etc

There's a chapter on each of these in my PostgreSQL performance book, and
I'm not aware of any other resource that takes on all of these topics
usefully. If you're trying to keep up with this volume of data, buying a
copy of that should repay itself in time savings--where you can look
something up rather than trying to figure it out from scratch--about once
every week.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
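To give a flavor of what that partitioning looks like on 8.4, here is a
minimal sketch using table inheritance and constraint exclusion, with
made-up table and column names; retiring old data then becomes a quick
DROP TABLE instead of a huge DELETE:

CREATE TABLE events (
    event_time  timestamptz NOT NULL,
    payload     text
);

-- one child table per day, constrained so the planner can skip it
-- when queries against "events" don't touch that date range
CREATE TABLE events_2011_07_26 (
    CHECK (event_time >= DATE '2011-07-26'
       AND event_time <  DATE '2011-07-27')
) INHERITS (events);

-- bulk load with COPY directly into the current day's child table;
-- dropping a full day of old data is then just:
DROP TABLE events_2011_07_19;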
Re: [GENERAL] Building an home computer for best Poker Tracker performance
On 07/20/2011 02:14 AM, Stuart Cooper wrote:
> Nice to see some poker being discussed on this list. Much more
> reputable than Stock Trading.

If the casinos did something like what traders call "front-running",
they'd all be put in jail the next day for cheating their customers. Just
a day in the life of many trading firms though.

> Opponent: AT
> Stuart the LuckyBum: 88
> Flop: ATT
>
> Myself and opponent checked that flop,

Since they decided to slow-play after flopping three of a kind in a limit
game, they deserved to lose the hand. They never should have let you see
the next card for free; they got greedy and paid for it.

> Opponent was pretty surprised to lose, after his third river raise I
> began to consider he might even have held TT for better quads than
> mine.

I know that sinking feeling well. My last one involved hitting an
Ace-high flush on the turn, only to see a fourth card of that suit appear
on the river. When those ragged cards sorted in my head for the first
time, I realized my suddenly excited opponent must have just finished his
straight flush. Oops. At least I made him pay to see the winning card.

I used to average around 150 hands per hour; that rate can show you one
of these 990:1 shots every 7 hours of play. It leaves one with a healthy
respect for the sharp pointy bit on the end of the bell curve; after
you've been stabbed with it a few times you start to remember it's there.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
Re: [GENERAL] Building an home computer for best Poker Tracker performance
Bruce Momjian wrote: Greg, tell me you didn't get involved with Postgres because of Poker Tracker. :-O :-) Nah, both came out of my working on stock trading systems. I just wrote a bit more about this whole subject at http://blog.2ndquadrant.com/en/2011/07/pushing-allin-with-postgresql.html if anyone would like to see what the actual queries look like against the Poker Tracker database. One of the things you have to come to grips with when writing a trading system is that you can never assume something won't happen just because it's really unlikely. Traders have adopted the term "black swan" to talk about these sorts of completely unexpected things that destroy their bankroll. On-line poker is quite good at teaching you this lesson quickly, whereas traders might go a decade or more before encountering a similarly unlikely event. Lose an ace-high flush to a straight flush; lose 4 of a kind to a higher 4 of a kind; rare, but I watched both happen to me multiple times when playing. If I'd bet so much that I'd have been wiped out by either event, even though I was *sure* I would win that hand, I'd have been done for. And a couple of times, I've seen the rarest thing of all: the initial 3 cards come out, and I have a hand where I can only lose if the other player gets the exact two cards they need. For example: Greg: KK LuckyBum: QQ Flop: K82 The only way the other player can win here is if the remaining cards are QQ, giving them 4 of a kind over my full house at the end of the hand. Assuming no other players are involved, that's 45 cards left in the deck, and the odds of them getting one of those two followed by the other are 2/45 * 1/44 (about 1 in 990). And yet I've lost this sort of 990:1 long shot multiple times. It definitely gives you a better gut feel for "gambler's ruin", one that translates back into stock trading--and into thinking about how to really achieve high-availability for a computing system, too. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Programmer ( Postgres), Milwaukee - offsite-Remote - onsite
On 07/18/2011 05:58 PM, MS Rao wrote: *Programmer ( Postgres), Milwaukee -- offsite-Remote - onsite* We are looking for Postgres skilled programmer with the following skills: Job listings are not appropriate for the pgsql-general mailing list. Please direct them to the pgsql-jobs list in the future instead: http://archives.postgresql.org/pgsql-jobs/ In addition to that being the policy here, using that list instead means that we can block people replying to the whole list with their resumes and similar details they didn't mean to make public (which does happen). -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD
Re: [GENERAL] announcements regarding tools
On 07/19/2011 09:00 PM, Bruce Momjian wrote: We have the same problem with people posting to pgsql-jobs where they don't mention the location of the job. I usually email the people privately about this. Used to have. pgsql-jobs is now moderated because the junk posts were starting to outnumber the real ones. I just ejected an off-topic J2EE job posting earlier today that would have made it through until this recent change. Jobs that are too vague on details to be useful should get similarly bounced back to the poster, to be reworked before they'll be accepted. It's impractical to do the same thing for something as complicated as platform support, at least using negative reinforcement, on the more controversial and popular announce list. The list of rules at http://wiki.postgresql.org/wiki/NewsEventsApproval is already too big for moderators to be expected to enforce them all correctly in every case; adding more isn't likely to do anything useful. The best I think we could do here is update that page to encourage people to list this information for their own benefit, and then have some regular announcers lead by example, by including the sort of info Scott is asking for into their messages. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Building an home computer for best Poker Tracker performance
ting for 4 to 8 of them probably. Two final notes: -Make sure to follow the basic database tuning guidelines at http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server to get your configuration in the right area. I think your workload might really benefit from turning off synchronous_commit in particular, so try both settings there. It won't kill you to lose a hand or two of history if you disable that, and the speed improvement could be large. -If you want to start up a discussion about optimizing your server, that would be better done on the pgsql-performance list than this one. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
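A quick sketch of how to try the synchronous_commit experiment without committing to it server-wide: it can be flipped per session first, then made permanent in postgresql.conf if the results look good.

    -- test in one session; no restart needed
    SET synchronous_commit = off;
    -- if it helps, set it globally in postgresql.conf and reload:
    -- synchronous_commit = off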
Re: [GENERAL] How to create "auto-increment" field WITHOUT a sequence object?
On 06/30/2011 03:01 PM, Dmitry Koterov wrote: It may not be "fully" gapless. The main cause is to keep uniq_id as low as it can be, so as not to exhaust 10 values too fast. I think solutions with additional tables look too complicated for this case; is there a possibility to not use an additional table? You may think it's possible to build a gapless design that is less complicated by writing some application or server code to enforce it. You've already tried this and learned that it's much harder than it seems. Doing this correctly without causing timeout and deadlock issues is a hard problem. Meanwhile, generating a gapless translation table that only includes things that have been committed is easy, and you're not likely to run into really strange and unexpected bugs in that implementation later. Given those are the two situations you're comparing here, I would say using the extra table is less complicated in every way. Sure, you're adding another table, but the process happening against it is really easy. The alternative doesn't have the extra table, but that doesn't make it less complicated. Complexity needs to consider how difficult a program is going to be to debug and maintain. And in those areas, making a single table gapless is quite complicated. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD Comprehensive and Customized PostgreSQL Training Classes: http://www.2ndquadrant.us/postgresql-training/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
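To show the shape of the translation table approach being recommended here--all of the names are made up for illustration--the whole thing can be as small as one counter row that each publishing transaction updates, so the numbers only ever come from committed work:

    CREATE TABLE gapless_counter (last_value bigint NOT NULL);
    INSERT INTO gapless_counter VALUES (0);

    -- inside the transaction that makes an item visible; the UPDATE
    -- locks the counter row until commit, serializing the numbering
    UPDATE gapless_counter SET last_value = last_value + 1
        RETURNING last_value;

Concurrent sessions will queue behind that row lock, which is exactly the serialization a gapless sequence requires.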
Re: [GENERAL] Setup postgres with automatic table and user intitiallisation
On 06/28/2011 05:34 AM, Lodron, Gerald wrote: I have the problem that when i run my batch i always have to let the user type in the password, thats nasty... I also tried PGPASSWORD but it does not work, it always prints that it is the wrong password... I'm not sure if setting PGPASSWORD in a Windows batch file is going to work correctly...have you tried creating a .pgpass file instead? http://www.postgresql.org/docs/9.0/interactive/libpq-pgpass.html Another option to consider is swapping out the default pg_hba.conf file with one that trusts local users fully. You could put that in place, signal the server to reload its configuration, execute your setup script, then put back a properly secured file and reload again. There will be a brief window where any local user could gain access to the database server as a superuser that way. But if you were that worried about locking down security, you probably wouldn't be deploying an auto-installer on Windows in the first place. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD Comprehensive and Customized PostgreSQL Training Classes: http://www.2ndquadrant.us/postgresql-training/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
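If the .pgpass route is used, note that on Windows the file is named %APPDATA%\postgresql\pgpass.conf rather than ~/.pgpass; a minimal example, with placeholder connection details:

    # %APPDATA%\postgresql\pgpass.conf
    # format is host:port:database:user:password
    localhost:5432:mydb:setupuser:secretpassword

psql and the other libpq tools will pick that file up automatically when the connection parameters match a line in it.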
Re: [GENERAL] rationale behind quotes for camel case?
On 06/28/2011 12:09 PM, dennis jenkins wrote: Example: Suppose that I have a table called "foo" and another table called "barBiz" (or "bar_biz" if you prefer). Further, both of these tables have a "serial" primary key. Now I want to create a third table that represents a many-to-many relationship between "foo" and "barBiz". So far I have been keeping compound-noun table names in camel case, but mapping tables separate the base table names with underscores. Thus the table name would be "foo_barBiz". The options are basically: 1) Use case to help sort this out 2) Use lots of underscores and cope with the ambiguity 3) Pick something to put in the middle to represent relationships between things, to make them less ambiguous. You might name this foo_to_barbiz or the compact but expressive foo2barbiz as two examples. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD Comprehensive and Customized PostgreSQL Training Classes: http://www.2ndquadrant.us/postgresql-training/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Multi-tenancy in Postgres
On 06/28/2011 05:45 PM, Rob Sargent wrote: I think Greg might be forgetting that some of us don't always get to choose what we work on. I was in a shop that decided to go with multi-tenancy for reason both technical and um, er envious. There are certainly successful deployments of multi-tenant PostgreSQL out there, ones that make sense. What I was trying to communicate is that the particular variation proposed by this academic paper doesn't seem the right direction for PostgreSQL development to head in to me. This project is stubborn about resolving the problems people actually have, and the ones the paper tries to solve are not the ones I've seen in my own experiments in multi-tenant deployments. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD Comprehensive and Customized PostgreSQL Training Classes: http://www.2ndquadrant.us/postgresql-training/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Multi-tenancy in Postgres
Emrul Islam wrote: I've just read through a paper here: http://www.edbt.org/Proceedings/2011-Uppsala/papers/edbt/a12-schiller.pdf about multi-tenancy. They used Postgres for their work and while it is academic and would need further work I'm just wondering if anyone in the Postgres team is looking at implementing some of the functionality described? They seem fuzzy about what the businesses that implement multi-tenant environments, such as hosting companies, actually want and find missing in PostgreSQL right now. They've tried to solve a problem I never would have considered interesting in the first place. On the "Shared Machine" side of things, we find complaints like how individual PostgreSQL instances use too much power. See "Latch implementation that wakes on postmaster death", currently under development for 9.2, which is aimed right at addressing that one. This "Shared Table" approach they spend so much time worrying about and improving? No one cares about that except companies hosting a single application on their giant box. This idea that there is a large number of tenants running the same application, but who need to be isolated from one another in some way, is not the normal state of things. Yes, it happens on the big servers at Salesforce.com, which all run the same application; that is not a common situation, however. What the hosting companies actually want from PostgreSQL is a good implementation of "Shared Process". One database install, where every tenant gets their own schema and tables and is expected to use only some of the resources. You can do this right now; I believe the infrastructure at Heroku is built that way for example. How do the ideas in this paper actually solve the problems they're seeing with that approach? I don't know for sure, but I don't see anything exciting there. It makes me kind of sad when people put a lot of work into doing a good job on a problem that doesn't really matter very much in the real world, and that's the overwhelming feel I get from reading this paper. Advanced schema inheritance stuff? Don't care. Providing query cost constraint limits for individual tenants? Now that's a useful problem to talk about, one that people deploying multi-tenant databases are actually being killed by. And discussion of aspects of that problem does flare up among the PostgreSQL developers regularly. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD Comprehensive and Customized PostgreSQL Training Classes: http://www.2ndquadrant.us/postgresql-training/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] rationale behind quotes for camel case?
fluca1...@infinito.it wrote: first of all I'm not expressing any critique against the use of quotes for identifiers expressed using camel case. However a lot of new postgresql users seem to be unhappy with the use of quotes for camel case identifiers, so I'd like to know what is the rationale behind it. I mean, is it a feature of the parser or is it needed for it to work? Can anyone point me to an explanation? The need for quoting and use of CamelCase are only vaguely related to one another. There are situations other than CamelCase where quotes are needed, and using CamelCase doesn't necessarily require quotes. If you follow the end of the documentation at http://www.postgresql.org/docs/9.0/static/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS you'll find this explanation of what the quoting issue is all about: Quoting an identifier also makes it case-sensitive, whereas unquoted names are always folded to lower case. For example, the identifiers FOO, foo, and "foo" are considered the same by PostgreSQL, but "Foo" and "FOO" are different from these three and each other. (The folding of unquoted names to lower case in PostgreSQL is incompatible with the SQL standard, which says that unquoted names should be folded to upper case. Thus, foo should be equivalent to "FOO" not "foo" according to the standard. If you want to write portable applications you are advised to always quote a particular name or never quote it.) So saying you're unhappy with the need for quoting of mixed-case items isn't going to get you very far, as the behavior here is all tied up with the trivia of the SQL spec. The specification intends that anything other than all upper-case naming requires quoting, and PostgreSQL turns that around to say that everything goes to lower-case by default. It's actually possible to leave out the quoting if you really want to, but as advised here you have to be consistent about it. Once you've created something with a fully quoted name, instead of letting it "fold" the name to all lower-case, you must continue to reference it that way in the future. There's nothing stopping you from just never quoting anything though. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD Comprehensive and Customized PostgreSQL Training Classes: http://www.2ndquadrant.us/postgresql-training/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
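A quick psql session makes the folding rules easy to see; the table names here are just examples:

    CREATE TABLE BarBiz (id int);   -- unquoted, so the name folds to barbiz
    SELECT * FROM barbiz;           -- works
    SELECT * FROM BARBIZ;           -- also works, folded the same way
    SELECT * FROM "BarBiz";         -- fails: no table with that exact name

    CREATE TABLE "FooBar" (id int); -- quoted, so the mixed case is preserved
    SELECT * FROM "FooBar";         -- must be written exactly this way from now on
    SELECT * FROM foobar;           -- fails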
Re: [GENERAL] Oracle / PostgreSQL comparison...
On 06/23/2011 10:28 PM, Stephen Frost wrote: I love how he finishes with the claim that Oracle "keep their finger on the pulse of where IT is headed", right after admitting that their client is actually a huge piece of junk. Oracle is able to keep their finger on the pulse of their customers, because they have their hands where they can firmly squeeze their...uh, wallets. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Help making tablespaces work for my application
On 06/23/2011 09:37 AM, Natusch, Paul wrote: I have an application for which data is being written to many disks simultaneously. I would like to use a postgres table space on each disk. If one of the disks crashes it is tolerable to lose that data, however, I must continue to write to the other disks. Tablespaces are not useful for implementing this idea yet. 1. There is a single WAL log for the entire cluster, located in the pg_log subdirectory. If the disk containing the pg_log file crashed, does that mean the system would come to a halt? Is there any way to distribute this data so that WAL is located on the same media as the table space? An alternative would be to use raid with the disk that stores the pg_log subdirectory but that adds cost to the system. Loss of the pg_xlog subdirectory and the WAL contained in it normally results in catastrophic database failure. Recommended practice is to use a RAID-1 volume to make odds of that failure lower. 2. If #1 was solved by using the raid approach, what happens if one of the disks containing one of my table spaces crashes. At some point postgres will want to write the data from the WAL file to the crashed (unavailable) disk. Will postgres be blocked at this point? Is there some way to notify postgres that a specific disk is no longer available and that the entries in the WAL for this disk should either be purged or ignored? (I'm willing to "throw away" the data on the crashed disk). PostgreSQL can't be expected to operate sanely when faced with the loss of an individual tablespace. It may be possible to recover from it, but you'll be doing something it's not designed to handle, and that effort may not succeed. Note that any tablespace failure is likely to require taking down the database to repair the involved tablespaces, so you're likely to have downtime between a component failure and when you notice and take action. The database really does not like having tablespaces just go away in the middle of operations. PostgreSQL 9.1 (not released yet, currently in beta) includes a new feature called "unlogged tables" that might make this sort of deployment possible. If you created a tablespace for disposable data and put an unlogged table onto it, loss of that tablespace would be much less likely to cause a problem. So long as you recreated a new space for the unlogged table after restarting, you could probably recover having only lost the data on the crashed disk in this situation. Clearly using raid on all of the disks would be a solution, but that is cost prohibitive. On a cheap server I can easily RAID-1 mirror a pair of drives on Linux using software RAID, and individual drives are $50 to $100 each. If your data isn't worth that much, you're effectively saying it's disposable. And even that's not enough to really make me feel secure about the data--you really need to keep another copy around as a backup, too. You can treat your data as disposable and expect to lose it when any single component fails, or you can include some good redundancy practices in the design to reduce odds of a failure. There isn't really a good solution providing partial protection in the middle of those two. -- Greg Smith 2ndQuadrant usg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
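For anyone testing the 9.1 beta approach described above, the rough shape of it--paths and names here are hypothetical--looks like this:

    CREATE TABLESPACE disposable LOCATION '/mnt/disk3/pgspace';
    CREATE UNLOGGED TABLE sensor_data (
        recorded_at timestamptz,
        reading     double precision
    ) TABLESPACE disposable;

Nothing about the unlogged table goes into the WAL, which is what makes losing that one disk survivable in principle; it's also why the table's contents are not crash-safe even when the disk is healthy.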
Re: [GENERAL] Oracle / PostgreSQL comparison...
On 06/23/2011 10:28 PM, Stephen Frost wrote: Next, PG doesn't even use the same basic technology as Oracle regarding how transaction isolation and versioning works. Oracle uses rollback segments to store 'old' rows in, while PG uses a Multi-Version Concurrency Control (MVCC) system. They're fundamentally different things, so the notion that PG is somehow a 'reverse engineered' Oracle is complete bunk. I stole some inspiration from this comment for my own response, which I just posted to the site. I'll save a copy here in case the author becomes so embarrassed by his mistakes he deletes it: The idea that PostgreSQL is reverse engineered from Oracle is ridiculous. Just look at the vast differences in the MVCC implementations of the two; Oracle's redo logs vs. PostgreSQL's WAL are completely different designs. As for there being no unique features in PostgreSQL, that's completely wrong too. A good example is how deep the transactional DDL <http://wiki.postgresql.org/wiki/Transactional_DDL_in_PostgreSQL:_A_Competitive_Analysis> features go--Oracle has started to catch up recently, but PostgreSQL still has a lead there. The ability to extend the type system and indexes with your own custom items is also better in PostgreSQL than in any other database. This is why the PostGIS add-on (built using the type extension facility) is busy displacing the weaker Oracle Spatial at installations all over the world right now. As for support, there are half a dozen companies in the world you can buy PostgreSQL support from at a fraction of the rate Oracle charges for it. I routinely fix bugs in the database itself within hours of a report for my customers, as part of a service contract, which is an option on top of the free community support. Because PostgreSQL is open-source, there are multiple vendors available who provide this service. With Oracle as a closed source product, there can be only one vendor capable of offering this quality of support. And that single source vendor has quite a history of squeezing as many dollars out of its customers as it can. Since there is choice among PostgreSQL support companies, you'll never get into that position with it. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
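A small example of the transactional DDL point, since it surprises people coming from other databases; the whole schema change becomes visible atomically or not at all:

    BEGIN;
    CREATE TABLE t (id serial PRIMARY KEY, v text);
    ALTER TABLE t ADD COLUMN created_at timestamptz DEFAULT now();
    -- changed my mind; none of the above ever becomes visible
    ROLLBACK;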
Re: [GENERAL] Help needed with PostgreSQL clustering/switching from MySQL
On 06/21/2011 10:00 AM, Vick Khera wrote: Postgres has nothing quite like the MySQL cluster mode with NDB. You will have to re-think your solution if you want to use postgres to distribute your queries and data across multiple servers. The closest thing to a NDB cluster in PostgreSQL is using PL/Proxy to split data across multiple nodes. Both use similar hash-key methods to distribute things across more than one system, and you can run queries that return a combined set of results bigger than any single node could have handled. But even that's extremely different from NDB as far as what the interface for executing queries is like. Vick is absolutely right here: asking about whether PostgreSQL solves the very specific problems that MySQL NDB has isn't even the right question. The two don't compare directly at all; different replication approach, different node distribution approach, different query approach. You need to return to basics instead: what is the actual business and/or technology need that has to be solved? From that there may be a PostgreSQL solution that makes sense, using its replication and query distribution mechanisms. But it's extremely unlikely that will look like a NDB cluster at all, and therefore very unlikely to have the same problems at all. You'll get a whole new mystery set instead! One of the most common mistakes I see people make when architecting database systems is assuming they have to use one of these really complicated sharded approaches to make their server perform well. Unless you have a massive database or extremely high write volume, it's way more trouble than it's worth to go through distributing writes onto multiple nodes. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
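To give a feel for the PL/Proxy style mentioned above--the cluster name and function here are invented, and the real setup also requires defining the cluster configuration functions--a proxy function that routes each call to one partition by hashing the key looks roughly like:

    CREATE FUNCTION get_user_email(i_username text)
    RETURNS SETOF text AS $$
        CLUSTER 'userdb';
        RUN ON hashtext(i_username);
    $$ LANGUAGE plproxy;

Queries go through functions like that one rather than plain SQL against the combined data set, which is one of the big interface differences from something like NDB.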
Re: [GENERAL] Tuning for a tiny database
On 06/21/2011 01:49 AM, CSS wrote: Some raw numbers: We're only looking at a total of about six tables in one db. In total there are going to be well under 10,000 records in ALL tables. That might increase to at most 100,000 in the next few years. Our raw DNS queries/second tops out around 50 qps over three distinct servers. Keeping in mind that PowerDNS is doing heavy caching, we should never really see more than a few db queries per second. I doubt you really need to do any tuning for this scenario. I would set shared_buffers to a modest value--maybe 256MB--and stop further tuning until there's some evidence it's necessary. If presented with the same problem but with the much harder twist "I need to support >10,000 queries/second", I would recommend: -Populate a prototype with a representative amount of data -Measure the database size -Set shared_buffers to that -Whenever the database is restarted, construct a series of queries that forces all the data used regularly into the database's cache -Use pg_buffercache to confirm what's in there is what you expect Getting all the data into cache is sometimes harder than expected. Some optimizations in PostgreSQL keep it from caching large amounts of a table when you do a sequential scan of its contents, as one example that complicates things. But if you get to where this is necessary, building such a tool isn't difficult, and there are some projects out there that address this particular need: filling the cache back up with relevant data after restart. This is the main one, along with a talk describing it: http://pgfoundry.org/projects/pgfincore/ http://www.pgcon.org/2010/schedule/events/261.en.html -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
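For the pg_buffercache step, a query along these lines (after installing the pg_buffercache contrib module in the database) summarizes what is currently cached, by relation:

    SELECT c.relname, count(*) AS buffers
    FROM pg_buffercache b
    JOIN pg_class c ON b.relfilenode = c.relfilenode
        AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                                  WHERE datname = current_database()))
    GROUP BY c.relname
    ORDER BY buffers DESC
    LIMIT 10;

Each buffer is 8KB with the default build, so multiplying the count by 8192 gives the cached bytes per relation.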
Re: [GENERAL] While converting the Master to Standby , FATAL: timeline 2 of the primary does not match recovery target timeline 1
On 06/18/2011 05:22 AM, Nithya Rajendran wrote: We are doing a Master/Standby setup. We unplugged the master server, then the standby server became the master (by creating the trigger file). While converting the old master to a new slave server, we are getting the following error on the old master: You can't make a system that used to be in the replication set join it again the way you're trying to do. The timeline feature you're seeing errors from is there to keep you from making mistakes like this. Even though this system used to be the master, it still needs to go through the full procedure for creating a new standby: start a backup, sync all the files from the master, send new archive WAL data over. The fact that this used to be the master doesn't change that. If the database is large, it may be worthwhile to use rsync and options such as its "--inplace" flag to make the copy from new master -> old master faster. Since it has older copies of the files, the copy can go faster than one to an empty system would. But you can't just convert the old master to be a standby of a new master. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
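The resync itself ends up looking roughly like this--host names and paths are placeholders, and this glosses over the recovery.conf and WAL archiving details covered in the standby setup documentation:

    # on the new master
    psql -c "SELECT pg_start_backup('resync');"
    rsync -av --inplace --delete \
        --exclude pg_xlog --exclude postmaster.pid \
        /var/lib/pgsql/data/ oldmaster:/var/lib/pgsql/data/
    psql -c "SELECT pg_stop_backup();"
    # then create a recovery.conf on the old master pointing at the new one
    # and start it up as a standby

The --inplace option is what lets rsync take advantage of the mostly-identical files already sitting on the old master.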
Re: [GENERAL] Another RAID controller recommendation question
On 06/18/2011 02:46 AM, David Boreham wrote: We're looking to deploy a bunch of new machines. Our DB is fairly small and write-intensive. Most of the disk traffic is PG WAL. Historically we've avoided RAID controllers for various reasons, but this new deployment will be done with them (also for various reasons ;) If the traffic is heavy on WAL, avoiding RAID controllers isn't a great practice. They're by far the best way possible to speed that up. http://www.newegg.com/Product/Product.aspx?Item=N82E16816101339 manufacturer page : http://www.supermicro.com/products/system/1U/6016/SYS-6016T-URF4_.cfm?UIO=N This is a solid basic server model. The Intel 5520 chipset they're built on is nice and fast if you load it up with a bunch of RAM. these boxes have a proprietary controller slot, with these cards: http://www.supermicro.com/products/nfo/UIO.cfm#Adapters specifically this LSI-based one which seems to be the newest/fastest, with BBWBC: http://www.supermicro.com/products/accessories/addon/AOC-USAS2LP-H8iR.cfm I don't really like the whole proprietary controller slot thing if it can be avoided. We seem to have a lot of customers buying from Dell recently, and it's partly because they've made it pretty straightforward to swap out their PERC controller. That makes troubleshooting a broken server easier, spare parts are simple to manage, lots of advantages. You almost need to stock your own spares for things like the RAID cards if they're these proprietary slot ones, because you're unlikely to find one in an emergency. That said, the card itself looks like plain old simple LSI MegaRAID. Get the battery backup unit, check the battery and cache policy to make sure they're sane, and learn how to use MegaCli to monitor it. Fast and generally trouble free after that initial setup time investment. These machines are operated in a lights-out mode, and will handle heavy constant load (hundreds of write txn/s) with 15K SAS drives in a RAID-1 setup (2 drives, or 2 + 2 with data and WAL split between spindle groups). If you can, try to measure the exact ratio of database to WAL traffic here; that might help guide which of these configurations makes more sense. Hard to answer in a general way. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
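For the monitoring piece, the usual MegaCli checks--the binary is sometimes installed as MegaCli64, and the package layout varies by vendor--look something like:

    MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL   # battery state and charge
    MegaCli64 -LDInfo -Lall -aALL              # logical drives and current cache policy

The thing to confirm in that output is that the cache policy is WriteBack with a healthy battery, and that it degrades to WriteThrough if the battery fails.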
Re: [GENERAL] 2 questions re RAID
On 06/17/2011 01:02 PM, Scott Ribe wrote: 1) Is my impression correct that given a choice between Areca & Highpoint, it's a no-brainer to go with Areca? I guess you could call Highpoint a RAID manufacturer, but I wouldn't do so. They've released so many terrible products over the years that it's hard to take seriously the possibility that they may have something reasonable you can buy now (the 43XX cards, I think). And, in further digging, I discover that Atto is an option for me. Anyone got comments on these? (I notice that they use ultracapacitor/flash to protect cache...) Atto is so Mac focused that you're not going to find much experience here, for the same reason you didn't get any response to your original question. Their cards are using the same Intel IO Processor (IOP) hardware as some known capable cards. For example, the ExpressSAS R348 is named that because it has an Intel 348 IOP. That's the same basic processor as on the medium sized Areca boards: http://www.areca.us/products/pcietosas1680series.htm So speed should be reasonable, presuming they didn't make any major errors in board design or firmware. The real thing you need to investigate is whether the write cache setup is done right, and whether monitoring is available in a way you can talk to. What you want is for the card to run in write-back mode normally, degrading to write-through when the battery stops working well. If you don't see that sort of thing clearly documented as available, you really don't want to consider their cards. 2) I understand why RAID 5 is not generally recommended for good db performance. But if the database is not huge (10-20GB), and the server has enough RAM to keep most all of the db cached, and the RAID uses (battery-backed) write-back cache, is it still really an issue? You're basically asking "if I don't write to the database, does the fact that write performance on RAID5 is slow matter?" When asked that way, sure, it's fine. If after applying the write cache to help, your write throughput requirements don't ever exceed what a single disk can provide, then maybe RAID5 will be fine for you. Make sure you keep shared_buffers low though, because you're not going to be able to absorb a heavy checkpoint sync on RAID5. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] how to find a tablespace for the table?
On 06/17/2011 06:50 PM, hyelluas wrote: I'm looking into the pg_tables view and the only tablespace displayed is pg_global. All my tables are created in my custom tablespace and that column is empty for them. I'm not sure what's wrong here, but the query you are trying to use to decode this information doesn't look quite right. pg_tables is just a view over a regular query; here is its source code: CREATE VIEW pg_tables AS SELECT N.nspname AS schemaname, C.relname AS tablename, pg_get_userbyid(C.relowner) AS tableowner, T.spcname AS tablespace, C.relhasindex AS hasindexes, C.relhasrules AS hasrules, C.relhastriggers AS hastriggers FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) LEFT JOIN pg_tablespace T ON (T.oid = C.reltablespace) WHERE C.relkind = 'r'; I think that if you start with this and try to experiment from there, you may be able to figure out what's going on here a little better. This connects up the main relevant tables in the right way. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
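One specific thing to check while experimenting (the table name is a placeholder): in pg_class, reltablespace is 0 for any relation sitting in the database's default tablespace, which is also why pg_tables shows an empty tablespace column for those tables:

    SELECT c.relname, c.reltablespace, t.spcname
    FROM pg_class c
    LEFT JOIN pg_tablespace t ON t.oid = c.reltablespace
    WHERE c.relkind = 'r' AND c.relname = 'your_table_here';

If the tables were created in a tablespace that is the database's default, they will show 0 there rather than the tablespace's OID.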
Re: [GENERAL] Postgres performance and the Linux scheduler
On 06/16/2011 02:10 PM, Simon Windsor wrote: Can the performance of Postgres be boosted, especially on busy systems, using the non-default DEADLINE scheduler? Some systems report moderate improvements in throughput, and sometimes larger ones in worst-case latency, when switching from the default to the deadline scheduler. Others report best performance with the noop scheduler. Changes here are not effective in a lot of cases though. I wrote an article for the first issue of PostgreSQL Magazine that mentions this, as the last of the most useful things you can tweak on Linux; that's available at http://pgmag.org/ -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
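Experimenting with this is cheap since the scheduler can be switched at runtime per block device; the device name below is an example, and the change does not survive a reboot unless it's added to the boot parameters or an init script:

    cat /sys/block/sda/queue/scheduler      # current choice shown in [brackets]
    echo deadline > /sys/block/sda/queue/scheduler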
Re: [GENERAL] PostgreSQL 9.0 or 9.1 ?
On 06/16/2011 10:06 AM, Achilleas Mantzios wrote: Till the end of July i must have finished all the migration to the new versions. So i am asking what would be better from your perspective to do? Go for 9.1? or stick to 9.0 and try to deploy it and take the most out of it? When is a stable (release) version of 9.1 be available? Has any one faced any issues migrating from 9.0 to 9.1 I would place odds at about 1/3 that 9.1 will be available by the end of July. But you will still need to do testing of your application first before deploying onto that version. Realistically, even the earliest of 9.1 adopters is unlikely to launch before August. As such, there's not very much experience about the migration available yet, either. A large number of the new features in 9.1 aim at making certain types of development easier. The must-have features I am hearing demand for from my customers (who admittedly care more about replication and performance features than most), such that they are postponing some deployments until 9.1 ships because 9.0 just doesn't do what they want, are: -Synchronous replication -Support for MIN/MAX queries against partitioned tables -Feedback mechanism to reduce query conflict resolution when using Hot Standby -Much improved monitoring for replication and Hot Standby queries I'd suggest you take a look at the 9.1 release notes and beta announcement: http://www.postgresql.org/about/news.1313 , http://www.postgresql.org/docs/9.1/static/release-9-1.html And if you don't see a major compelling reason to wait for 9.1, some feature in that list that makes your life a lot easier, you really should just deploy 9.0 and move on. The most critical thing fixed in 9.1 development that may apply to what you're doing--some bug fixes to pg_upgrade--have all been backported to 9.0 now. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] You could be a PostgreSQL Patch Reviewer!
On 06/15/2011 04:49 PM, Josh Berkus wrote: You do not have to be a C coder to be a patch reviewer. Pretty much all you need to know is:
- how to checkout PostgreSQL from Git
- how to build PostgreSQL from source
- how to apply a patch
And you don't even really need to know these things well. If you're on a UNIX system that has git installed, here's a complete example of how to build a custom PostgreSQL that includes a patch for review:

    cd
    git clone git://github.com/gregs1104/peg.git
    export PATH="$HOME/peg:$PATH"
    mkdir pgwork
    peg init test
    cd pgwork/src/test
    patch -p1 < ~/mytest.patch
    . peg build
    psql

Just substitute the name of the patch you're using in the "patch" step here, and if it applies correctly (one of the first things patch review intends to test for) you're done. This does a local installation of PostgreSQL into a tree under your home directory, with reasonable defaults for all of the parts related to downloading the source code and compiling it. See the documentation for the peg utility at https://github.com/gregs1104/peg for more examples -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] determine client os
On 06/13/2011 07:04 AM, Sim Zacks wrote: I didn't see a function for this, but is there a way in a postgresql query to determine the client OS? A PostgreSQL client application is something that speaks a correct protocol to the server. The server has no concept of what the client is other than the fact that it speaks a particular version of its language. It knows the IP address and port number it connected on, but that's basically it. So the idea of a query determining the client OS doesn't make sense; all it knows is what the client tells it, and the information required to connect to the server and execute queries does not include any such details. If you want information about a client to make its way into a statement run on the server, you have to drive that from the direction of the client you're using yourself. If your client is psql for example, you might pass client-side information into the program by using the -v/--set/--variable substitution mechanism, possibly combined with the SQL interpolation facility of psql. But if your client program is in another programming language, you'll have to use some facility in it to fill in this information. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
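A small sketch of the psql variable approach, with made-up names: the client shell pushes its own OS name in, and the script interpolates it:

    psql -v client_os="$(uname -s)" -f report.sql

    -- inside report.sql:
    INSERT INTO client_audit (os_name) VALUES (:'client_os');

The :'variable' form quotes the value as a literal; with older psql versions that lack it, the quoting has to be handled on the client side instead.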
Re: [GENERAL] Write performance on a large database
On 06/09/2011 08:15 AM, Håvard Wahl Kongsgård wrote: Hi, I have performance issues on a very large database (100GB). Reading from the database is no problem, but writing (or heavy writing) is a nightmare. I have tried tuning postgresql, but that does not seem to improve the writing performance. To improve the write performance, what are my options? Well, technically you have performance issues on a medium-sized database. The simplest answer to your question is "buy a server with 96GB of RAM". If it's still possible to get a server that holds your entire database in memory for a moderate investment, it's really not large yet. There are many free guides that discuss various aspects of write performance and tuning around them, some of which are excerpts from my book, which goes over all of this territory: http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server : Covers general server tuning http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm : Discusses the exact way checkpoints work and how people commonly tune them http://wiki.postgresql.org/wiki/Reliable_Writes : all four of the references there cover this area. http://projects.2ndquadrant.com/talks : "The Write Stuff" presentation goes over some of the limitations people run into with high write volume applications. I'd suggest taking a look at those. If you want to talk more about this afterwards, start a new discussion on the pgsql-performance list with some of the information recommended at http://wiki.postgresql.org/wiki/SlowQueryQuestions : disk controller and disk info, PostgreSQL version, and database server configuration all have a lot of impact here. The contents of pg_stat_bgwriter would be interesting too. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Regular disk activity of an idle DBMS
On 05/29/2011 02:42 PM, Andrej Podzimek wrote: I identified the most active process, at least twenty times more active than any other process on the system: postgres 3086 0.1 0.0 34688 2584 ?Ss 03:11 1:16 postgres: stats collector process So it's the statistics collector. However, there does not seem to be any database activity at all. I tried looking at the numbers returned by this query: select datname, tup_returned, tup_fetched from pg_stat_database ; Nothing changes there. When OpenFire, Courier-MTA and Apache are restarted, a few numbers change, but othrewise they remain unchanged pretty long. There is no obvious activity that could trigger a disk write 20 times a minute... There are things that the statistics collector might be updating that don't show up in the pg_stat_database totals. It aims to write updates approximately every 500ms, so your write rate sounds normal. The expectation is that the operating system is actually caching most of those, so that the actual load on the system is minimal. So it sounds like you've identified the cause here, and it is normal, expected activity. One thing that can cause statistics overhead to be higher than it should be is a larger statistics file than is strictly necessary. We hear reports of those sometimes, I've never been completely clear on all of the possible causes that make this happen. But running "select pg_stat_reset();" should clear that out and start fresh again. That will sometimes eliminate situations where the I/O seems larger than it should be for people. If you do that, and there's still activity going on, there's no easy way to fix that. As mentioned in http://www.postgresql.org/docs/9.0/static/monitoring-stats.html , it's possible to change PGSTAT_STAT_INTERVAL at server compile time to make it write statistics less frequently. There's no easier way to adjust that though. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
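A quick way to see whether the statistics file has grown unusually large is just to check it on disk; with the default settings it lives under pg_stat_tmp inside the data directory:

    ls -lh $PGDATA/pg_stat_tmp/pgstat.stat

If that file is many megabytes on a system with only a handful of databases and tables, a pg_stat_reset() is more likely to help.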
Re: [GENERAL] Regular disk activity of an idle DBMS
On 05/28/2011 11:02 AM, Andrej Podzimek wrote: after configuring a new home server with PostgreSQL 9.0.4, I observe some regular disk activity, even though the server is completely idle (disconnected from the network, no users but one logged in). There are very short write bursts once in about 3 seconds. There are a couple of things that can cause unexpected disk activity: -autovacuum running in the background. Setting log_autovacuum_min_duration may help you determine when this is happening. -checkpoint activity. Turning on log_checkpoints, as well as looking for changes in the pg_stat_bgwriter view, may help explain if this is the case. -Hint bit updates. Even if you are only reading from a table, in some situations write activity can be generated. See http://wiki.postgresql.org/wiki/Hint_Bits for more information. -Statistics collector updates. If the one logged in user is doing anything at all, they might be generating something here. Figuring out if the writes are happening from a regular PostgreSQL process, or if they are happening via the background writer, might also be useful here. Saving the output from "top -b -c" can be useful for this. The iotop command is very helpful for tracking down this sort of problem too. The background writer process, which also handles checkpoints, will have the same process ID once it's started. So will the statistics collector. If you track I/O to one of those two, it should narrow possible causes quite a bit. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
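A minimal way to turn on the suggested instrumentation (postgresql.conf, followed by a reload):

    log_checkpoints = on
    log_autovacuum_min_duration = 0    # log every autovacuum run

and then take two snapshots of the background writer totals a few minutes apart to see what is moving:

    SELECT checkpoints_timed, checkpoints_req, buffers_checkpoint,
           buffers_clean, buffers_backend
    FROM pg_stat_bgwriter;

If buffers_checkpoint or buffers_backend climbs while the system looks idle, that points at where the background writes are coming from.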
Re: [GENERAL] SELECT COUNT(*) execution time on large tables (v9.0.4-1)
On 05/27/2011 12:41 PM, Carlos Sotto Maior (SIM) wrote: I have browsed catalog tables, digging for a real time Row.count but so far did not find any. See http://wiki.postgresql.org/wiki/Slow_Counting which shows you where the one system count estimate is at, as well as suggesting links to where you can find alternate approaches here. If you need an exact count and can't afford to generate a full query to find one, some sort of trigger-based approach is likely where you'll need to go. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
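The simplest form of the trigger-based approach looks like this--table names are hypothetical, and under heavy concurrent writes the fancier variant that inserts per-transaction delta rows and aggregates them periodically avoids contention on the single counter row:

    CREATE TABLE row_counts (tablename text PRIMARY KEY, n bigint NOT NULL);
    INSERT INTO row_counts SELECT 'big_table', count(*) FROM big_table;

    CREATE FUNCTION track_count() RETURNS trigger AS $$
    BEGIN
        IF TG_OP = 'INSERT' THEN
            UPDATE row_counts SET n = n + 1 WHERE tablename = TG_TABLE_NAME;
        ELSE
            UPDATE row_counts SET n = n - 1 WHERE tablename = TG_TABLE_NAME;
        END IF;
        RETURN NULL;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER big_table_count AFTER INSERT OR DELETE ON big_table
        FOR EACH ROW EXECUTE PROCEDURE track_count();

After that, SELECT n FROM row_counts WHERE tablename = 'big_table' returns the exact count without scanning the table.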
Re: [GENERAL] Inspecting a DB - psql or system tables ?
On 05/27/2011 01:24 PM, Andre Majorel wrote: While parsing the output of psql is cumbersome, accessing the system tables seems more likely to break whenever a new version of PostgreSQL comes out. I think you have this backwards. If there's a change in this area big enough to justify changing the format of the system tables, odds are the text output from psql is going to be changed too. psql gets tweaked to display information better more often than the internals are altered. Approaches you can take here, from most robust in the face of changes to most fragile, in my mind are: 1) Use information_schema. If all the info you need is in here, great; it may not be though. 2) Use the system catalog data directly 3) Parse text output from psql. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
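As an example of option 1, the kinds of structure questions most scripts ask can be answered from information_schema with queries like these ('mytable' is a placeholder), and that layout is the most stable thing to code against across versions:

    SELECT table_schema, table_name
    FROM information_schema.tables
    WHERE table_schema NOT IN ('pg_catalog', 'information_schema');

    SELECT column_name, data_type, is_nullable
    FROM information_schema.columns
    WHERE table_name = 'mytable'
    ORDER BY ordinal_position;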
Re: [GENERAL] max_connections proposal
On 05/25/2011 10:58 PM, Craig Ringer wrote: max_connections = 100 # (change requires restart) # WARNING: If you're about to increase max_connections above 100, you # should probably be using a connection pool instead. See: # http://wiki.postgresql.org/max_connections # # Note: Increasing max_connections costs ~400 bytes of shared memory # per connection slot, plus lock space (see max_locks_per_transaction). While tempting to do something this simple, the most useful path to follow is probably to nail this head-on and comprehensively in the docs instead. Discussion of this topic on the hackers list seems to have concluded that connection pooling isn't as vital to do inside the database, as a high priority relative to other development, because it's addressed so well via external projects. Pointing people toward them seems quite appropriate given that position. Really addressing this well would take the following steps: -Add a section to the external projects section of the documentation: http://www.postgresql.org/docs/current/interactive/external-projects.html introducing connection pooling as a useful type of additional software to add. Shouldn't be controversial to suggest pgbouncer and pgpool-II as examples there. -Expand the documentation on max_connections to warn about how snapshot visibility overhead makes extremely large numbers of connections impractical -Also expand the documentation to suggest that CPU switching inefficiency may make a much smaller number of connections than expected optimal, and point toward the external project section for more information about pooling. -Add a warning to the postgresql.conf suggesting people read the documentation for max_connections before increasing this value. This area pops up enough that I've made a discussion of it part of even my shortest talk about PostgreSQL performance issues to be wary of. There's a good documentation patch project for somebody here, I just haven't had time to get to it yet. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Shared Buffer Size
On 05/27/2011 05:33 PM, preetika tyagi wrote: Hows does the shared buffer in Postgres rely on the Operating System cache? Suppose my RAM is 8 GB and shared_buffer is 24 MB in postgres. And there are some dirty pages in shared_buffer and I need to write a dirty page back to the disk to bring in a new page. What happens in this case? The dirty page will be written to the disk considering the shared_buffer size as 24 MB? or it will not be written and will stay in RAM which is 8 GB? In this case, the dirty page will be written to the operating system, which will initially store it in its own write cache. Eventually the operating system will write that page out to disk from that cache. The delay before that happens can easily be 30 seconds or more on operating systems like Linux. If PostgreSQL needs to read that data again, it may find it still in the OS cache, in which case the read will happen very quickly. Eventually, the dirty data will be written to disk, and if it's not used for long enough the OS cache memory will be re-used for something else. When you read a page into the database, and you don't dirty it, it might be evicted from the database cache without a write. If the database needs that page again, it will ask the OS for it. If the OS still has it in its own read cache, it may just read it from the cache again, without a real disk read happening. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
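On Linux the delay before that dirty data gets flushed is controlled by the vm.dirty_* settings, and the amount currently waiting is visible in /proc/meminfo; a quick look at both (values shown will vary by distribution):

    sysctl vm.dirty_ratio vm.dirty_background_ratio vm.dirty_expire_centisecs
    grep -E 'Dirty|Writeback' /proc/meminfo

The default dirty_expire_centisecs of 3000 is where the roughly 30 second write-back delay mentioned above comes from.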
Re: [GENERAL] Shared Buffer Size
On 05/28/2011 04:42 AM, Carl von Clausewitz wrote: I've tried to get information about optimal memory config, and working, but there wasn't any "optimal memory setting calculator" on the internet, just some guides in the postgres documentation There's a simple one at https://github.com/gregs1104/pgtune and the guide at http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server covers many of the most popular questions here too. The right proportions for the memory settings are very dependent on workload, which makes any sort of calculator hard to create. pgtune takes a workload type as an input to help with that, but the settings that come out should be considered starting values only. You'll need to monitor how much memory is actually being used by the server, as well as the output from parameters like log_temp_files, to know for sure if things are working well. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Postgresql CBT
On 05/24/2011 02:05 AM, sade...@yahoo.com wrote: Id like to familiarize with postgresql and looking for a decent CBT but not able to find it. Most PostgreSQL training happens in one of the following ways: -Self-learning using the manual or one of the PostgreSQL books -In-person training at shared classrooms or on-site at companies who hire a trainer -Training offered as part of the many PostgreSQL conferences -Webcasts of training material You can find many of the schedules for these at http://www.postgresql.org/about/eventarchive , books are at http://www.postgresql.org/docs/books/ , and the large manual is at http://www.postgresql.org/docs/manuals/ The closest thing to CBT I know of are the videos recorded of past conference and user's group sessions. See http://vimeo.com/channels/postgres and http://fosslc.org/drupal/category/community/databases/postgresql for some samples. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Postgresql CBT
On 05/24/2011 10:49 AM, Scott Marlowe wrote: On Tue, May 24, 2011 at 8:36 AM, Vick Khera wrote: On Tue, May 24, 2011 at 10:33 AM, Scott Marlowe wrote: Id like to familiarize with postgresql and looking for a decent CBT but not able to find it. Could someone help pls? CBT? Please define. my guess is computer based training. Oh good. My first response from google, with safe search turned off, was much more distressing... : Cognitive behavioral therapy is only necessary for people migrating to PostgreSQL after using Access as if it were a database for too long. That's a very specific type of post-traumatic stress disorder, and mild cases can be treated with CBT. Severe cases will instead require ECT, aka electroshock. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: Fwd: Re: [GENERAL] SSDD reliability
On 05/04/2011 08:31 PM, David Boreham wrote: Here's my best theory at present: the failures ARE caused by cell wear-out, but the SSD firmware is buggy in so far as it fails to boot up and respond to host commands due to the wear-out state. So rather than the expected outcome (SSD responds but has read-only behavior), it appears to be (and is) dead. At least to my mind, this is a more plausible explanation for the reported failures vs. the alternative (SSD vendors are uniquely clueless at making basic electronics subassemblies), especially considering the difficulty in testing the firmware under all possible wear-out conditions. One question worth asking is: in the cases you were involved in, was manufacturer failure analysis performed (and if so what was the failure cause reported?). Unfortunately not. Many of the people I deal with, particularly the ones with budgets to be early SSD adopters, are not the sort to return things that have failed to the vendor. In some of these shops, if the data can't be securely erased first, it doesn't leave the place. The idea that some trivial fix at the hardware level might bring the drive back to life, data intact, is terrifying to many businesses when drives fail hard. Your bigger point, that this could just as easily be software failures due to unexpected corner cases rather than hardware issues, is both a fair one to raise and even more scary. Intel claims their Annual Failure Rate (AFR) on their SSDs in IT deployments (not OEM ones) is 0.6%. Typical measured AFR rates for mechanical drives are around 2% during their first year, spiking to 5% afterwards. I suspect that Intel's numbers are actually much better than the other manufacturers here, so an SSD from anyone else can easily be less reliable than a regular hard drive still. Hmm, this is speculation I don't support (non-intel vendors have a 10x worse early failure rate). The entire industry uses very similar processes (often the same factories). One rogue vendor with a bad process...sure, but all of them ?? I was postulating that you only have to be 4X as bad as Intel to reach 2.4%, and then be worse than a mechanical drive for early failures. If you look at http://labs.google.com/papers/disk_failures.pdf you can see there's a 5:1 ratio in first-year AFR just between light and heavy usage on the drive. So a 4:1 ratio between best and worst manufacturer for SSD seemed possible. Plenty of us have seen particular drive models that were much more than 4X as bad as average ones among regular hard drives. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] SSDD reliability
On 05/05/2011 10:35 AM, David Boreham wrote: On 5/5/2011 8:04 AM, Scott Ribe wrote: Actually, any of us who really tried could probably come up with a dozen examples--more if we've been around for a while. Original design cutting corners on power regulation; final manufacturers cutting corners on specs; component manufacturers cutting corners on specs or selling outright counterfeit parts... These are excellent examples of failure causes for electronics, but they are not counter-examples. They're unrelated to the discussion about SSD early lifetime hard failures. That's really optimistic. For all we know, these problems are the latest incarnation of something like the bulging capacitor plague of circa 5 years ago: some part unique to the SSDs, other than the flash cells, that there's a giant bad batch of. I think your faith in PC component manufacturing is out of touch with the actual field failure rates for this stuff, which is produced with enormous cost cutting pressure driving tolerances to the bleeding edge in many cases. The equipment of the 80's and 90's you were referring to ran slower and was more expensive, so better quality components could be justified. Quality at the board and component level has been trending for a long time toward cheap over good in almost every case nowadays. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: Fwd: Re: [GENERAL] SSDD reliability
On 05/04/2011 03:24 PM, David Boreham wrote: So if someone says that SSDs have "failed", I'll assume that they suffered from Flash cell wear-out unless there is compelling proof to the contrary. I've been involved in four recovery situations similar to the one described in that coding horror article, and zero of them were flash wear-out issues. The telling sign is that the device should fail to read-only mode if it wears out. That's not what I've seen happen though; what reports from the field are saying is that sudden, complete failures are the more likely event. The environment inside a PC of any sort, desktop or particularly portable, is not a predictable one. Just because the drives should be less prone to heat and vibration issues doesn't mean individual components can't slide out of spec because of them. And hard drive manufacturers have a giant head start at working out reliability bugs in that area. You can't design that sort of issue out of a new product in advance; all you can do is analyze returns from the field, see what you screwed up, and do another design rev to address it. The idea that these new devices, which are extremely complicated and based on hardware that hasn't been manufactured in volume before, should be expected to have high reliability is an odd claim. I assume that any new electronics gadget has an extremely high failure rate during its first few years of volume production, particularly from a new manufacturer of that product. Intel claims their Annual Failure Rate (AFR) on their SSDs in IT deployments (not OEM ones) is 0.6%. Typical measured AFR rates for mechanical drives are around 2% during their first year, spiking to 5% afterwards. I suspect that Intel's numbers are actually much better than the other manufacturers here, so an SSD from anyone else can easily be less reliable than a regular hard drive still. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Question on Wal time lines
dabicho wrote: For restoring a database from WAL files, if I omit a target in the recovery.conf file, can I make it so the database continues the timeline instead of starting a new one? Or is there a tool to pick the most recent timeline from a bunch of WAL files? When recovery finishes, you get a new timeline. That way it's always possible to distinguish a server that's exited recovery and started generating new WAL data from one that is still sync'd to the master and running recovery on the original timeline. If you don't want a new timeline, don't let recovery finish. As for the options you can tweak, see http://www.postgresql.org/docs/9.0/static/continuous-archiving.html#BACKUP-TIMELINES and the recovery target parameters at http://www.postgresql.org/docs/9.0/static/recovery-target-settings.html You can navigate among multiple timelines in a set of WAL files using recovery_target_timeline and the other target settings. It really is worth the trouble to run some experiments with these ideas to see what you can do, before you're forced to do so by an emergency. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
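As a sketch of what those experiments look like, a recovery.conf for point-in-time recovery that explicitly follows a particular timeline might contain something like the lines below. The archive path, timestamp, and timeline number are made-up values for illustration:

restore_command = 'cp /mnt/archive/%f %p'
recovery_target_time = '2011-05-20 14:30:00'
# Follow timeline 2 from the archive instead of the timeline the base backup
# was taken on; the special value 'latest' picks the newest one found there.
recovery_target_timeline = '2'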
Re: [GENERAL] Bidirectional replication
Merlin Moncure wrote: I know some people do some cool, usable things with that stuff, but the whole concept seems awfully awkward to me. I suppose I'm a crotchety, cane shaking fundamentalist... It's possible--do you sometimes find yourself yelling at young developers, telling them to stop replicating in your yard? -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys
David Johnston wrote: Are there any rules of thumb on the performance of a PK as a function of key length? I like using varchar based identifiers since I tend to query tables directly and writing where clauses is much easier if you can avoid the joins. I'm likely better off creating views and querying those but am still curious on any basic thoughts on having a 100+ length primary key. The shorter the better, but it may not be as bad as you fear. The way B-tree indexes are built, it isn't that expensive to hold a longer key so long as the unique part doesn't average out to be that long. So if you insert keys like "12345-A" and "67890-B", comparisons stay cheap, because those values diverge at the very first character; the extra length after that point doesn't add much navigation cost. But if your entries share a really long common prefix (say they all start with the same long product code and only differ in the last couple of characters), every comparison has to walk through that shared prefix before it finds a difference, which is more expensive to deal with even though the strings are the same length. If your identifiers become unique after only a few characters, it may not be so bad. But if they go many characters before you can distinguish between any two entries, you're probably not going to be happy with the performance or size of the indexes, relative to simple integer keys. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
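If you want to measure this on your own data rather than guess, one rough way is to build both forms of the table and compare index sizes; the table and column names here are made up for the example:

CREATE TABLE parts_natural (part_code varchar(100) PRIMARY KEY, descr text);
CREATE TABLE parts_surrogate (id serial PRIMARY KEY, part_code varchar(100) UNIQUE, descr text);
-- load the same sample data into both, then compare the index sizes:
SELECT relname, pg_size_pretty(pg_relation_size(oid)) FROM pg_class
 WHERE relname IN ('parts_natural_pkey', 'parts_surrogate_pkey', 'parts_surrogate_part_code_key');

Timing the same lookup against each version with EXPLAIN ANALYZE will show the navigation cost difference, too.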
Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys
Jeff Davis wrote: On Mon, 2011-05-02 at 23:07 -0400, Greg Smith wrote: I see this whole area as being similar to SQL injection. The same way that you just can't trust data input by the user to ever be secure, you can't trust inputs to your database will ever be unique in the way you expect them to be. So, don't trust them to be unique then. Make up your own unique identifier, and use that. If you're making up your own unique identifier, that's closer to a surrogate key as far as I'm concerned, even though it doesn't fit the strict definition of that term (it doesn't have the subtle idea that "surrogate" implies "meaningless"). Now, there is some value to doing that well, instead of just using the typical incrementing integer "pointer" approach, as you've called it. But if it's not derived from external data you're storing anyway, it's not a true natural key either. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys
Merlin Moncure wrote: If your data modeler made the assumption that a MAC is unique (a mistake obviously), at least the other tables are protected from violations of that assumption, because the database would reject them with an error, which is a Good Thing. Without a uniqueness constraint you now have ambiguous data which is a Very Bad Thing. With a uniqueness constraint in this situation, the unexpected data--a row with a non-unique MAC--will be rejected and possibly lost when the insertion happens. You say that's a good thing; plenty of people will say that's the worst possible thing that can happen. When dealing with external data, it's often impossible to know everything you're going to see later at design time. Approaching that problem with the idea that you're going to lose any data that doesn't fit into the original model is not what everyone finds reasonable behavior. I don't think it's possible to decide in a generic way which of these is the better approach: to reject unexpected data and force the problem back onto the application immediately (commit failure), or to accept it anyway because you're using a surrogate key and discover the problems down the line. Both are valid approaches with a very different type of risk associated with them. I think it's fair to say that real-world data is not always well known enough at design time to follow the idea you're suggesting though, and that does factor into why there is such a preference for surrogate keys in the industry. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys
John R Pierce wrote: otoh, there's plenty of places where natural keys are optimal. my company makes widgets, and we make damn sure our serial #s and part numbers are unique, and we use them as PK's for the various tables. Sure; what I was commenting on is that you normally can't ever trust external sources for identifiers. If you want to come up with your own, internally unique keys for things, great. But one of the goals of using a natural key is often to avoid the overhead of storing both that ID and some made up internal number, too. And whether the number is made up by the computer (the classic SERIAL or similar surrogate key), or you make one up yourself, it's still another chunk of data that gets stored for every item. It's just one that means something more useful in your case. Probably going to take up more space in the process and possibly be slower though--part number strings can easily end up longer than SERIAL-like integers. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys
On 05/02/2011 10:06 PM, Rob Sargent wrote: You would be surprised how many "bob smith"s were born on the same day. But then they weren't all born in a hospital etc etc etc. I wouldn't be surprised. I once lived in a mile-square town (Hoboken, that's its nickname). In that town were 40K residents and three gyms. I forgot my ID card one day when going to mine, and they took my name and street name as alternate proof of identity. Some designer along the line figured that was unique enough. Number of Greg Smiths living on that street who were members of that one gym? Three. I see this whole area as being similar to SQL injection. The same way that you just can't trust data input by the user to ever be secure, you can't trust inputs to your database will ever be unique in the way you expect them to be. So if you build a so-called "natural key" based on them, expect that to break one day. That doesn't mean you can't use them as a sort of foreign key indexing the data; it just means you can't make them the sole unique identifier for a particular entity, where that entity is a person, company, or part. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys
On 05/01/2011 06:12 PM, Karsten Hilbert wrote: Good to know since I'm only a lowly medical doctor not having much schooling in database matters beyond this list, the PostgreSQL docs, and the Celko book. This debate exists at all levels of experience, and the only thing that changes as you get more experienced people involved is an increase in anecdotes on each side. The sole time I ever found myself arguing with Joe Celko was over an article he wrote recommending natural keys, using an example from the automotive industry. Problem was, the specific example he gave was flat out wrong. I was working in automotive MIS at the time, and the thing he was saying would never change did, in fact, change every year--in only a fraction of a percent of cases, in an extremely subtle way that snuck up on people and wreaked much confusion. That's typical for an early natural key design: you get it working fine in V1.0, only to discover months or years down the road there's a case you never considered that you don't model correctly, and it may take some sort of conversion to fix. The reason why there's a strong preference for surrogate keys is that they always work, and you can avoid ever needing to come up with a better design if you just use them and forget about it. The position Merlin has advocated here, that there should always be a natural key available if you know the data well enough, may be true. But few people are good enough designers to be sure they've made the decision correctly, and the downside of being wrong can be a long, painful conversion process. Easier for most people to just eliminate the possibility of making a mistake by using auto-generated surrogate keys, where the primary problem you'll run into is merely using more space/resources than you might otherwise need to have. It minimizes the worst case--a mistake made in the model, an expensive re-design--by adding overhead that makes the average case more expensive. Software design usually has enough risks that any time you can eliminate one just by throwing some resources at it, that's normally the right thing to do. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Postgresql, PSN hack and table limits
On 05/01/2011 01:50 AM, Mark Morgan Lloyd wrote: Somebody is making a very specific claim that Postgres can support a limited number of rows Did you find this via http://www.reversecurity.com/2011/04/new-details-from-psn-hack.html ? That was the only Google-indexed source leading to it I found. I just left a note there about the silliness of these claims. I could run more than a 10M row PostgreSQL instance on my phone. Unless there's a new 16-bit only Vic 20 port of PostgreSQL available or something, it seems unlikely the data had to be partitioned due to any hard limit. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Switching Database Engines
On 04/28/2011 12:19 PM, Carlos Mennens wrote: It seems that the 'mysql2postgres.pl' tool has instructions embedded into the file so I ran the command as instructed to take the output file and insert it into my PostgreSQL server and got the following error message: $ psql -p 5432 -h db1 -U wiki -f mediawiki_upgrade.pg Password for user wiki: BEGIN SET SET SET psql:mediawiki_upgrade.pg:25: ERROR: relation "category" does not exist My guess is that you need to run the main MediaWiki installer for PostgreSQL first, to create a blank install, in order for the category table to exist. The export tool is aimed to get the data out, not the schema to create all the tables. After you create a blank instance, then you do the data export. If you have additional problems, try running that like this instead: $ psql -p 5432 -h db1 -U wiki -e -f mediawiki_upgrade.pg Note the extra "-e" on the command line. That will show you the line it is executing as the script runs, so you'll see the one that fails too. Very handy for debugging what's gone wrong in this sort of situation. I wouldn't fight with this too much though. Unless you have some really customized stuff in your wiki, there really is nothing wrong with the idea of dumping everything into XML, creating a blank PostgreSQL-backed MediaWiki install, then restoring into that. That's what I always do in order to get a plain text backup of my server, and to migrate a wiki from one server to another. There are all kinds of issues you could have left here before this works, trying to do a database-level export/reload--encoding, foreign key problems, who knows what else. The database-agnostic export/import into XML avoids all of those. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
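If you do go the XML route, the outline is roughly the following, assuming the standard MediaWiki maintenance scripts; double-check the script names and options against your MediaWiki version before relying on them:

# on the old MySQL-backed wiki
$ php maintenance/dumpBackup.php --full > wiki-dump.xml
# after installing a fresh PostgreSQL-backed wiki of the same MediaWiki version
$ php maintenance/importDump.php wiki-dump.xml
$ php maintenance/rebuildrecentchanges.php

Uploaded images and user accounts aren't part of that XML dump, so plan on handling those separately.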
Re: [GENERAL] Looking for Silicon Valley/Peninsula/San Francisco users group
On 04/29/2011 06:13 PM, Jeff Davis wrote: I'm not sure which reference you found, but SFPUG is certainly active with meetings every month. http://pugs.postgresql.org/sfpug ; last meeting listed there is January 2009. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] SSDs with Postgresql?
On 04/29/2011 06:42 AM, Scott Marlowe wrote: I think you misunderstood. He's not storing 480GB on the drives, that's how much WAL is moving across it. It could easily be a single 80GB SSD drive or something like that. Right; that's why you don't necessarily get saved by the fact that larger databases must go onto more flash cells, too. Sometimes, yes, but not always. The WAL is really close to a worst-case for flash: lots of redundant information that's constantly overwritten. It's the last thing you want to consider putting onto SSD. There's a good reason why so many of the "enterprise" SSDs try to distinguish themselves with redundancy and wear leveling advancements; it's so this sort of workload doesn't kill them. Combine that workload possibility with the limitations of MLC flash, and you can see why the lifetimes actually are a serious concern in some situations. Not all of them, of course, but this is why I recommend things like directly measuring your WAL volume. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Partitioning an existing table
Vick Khera wrote: I've been trying to get the OSCON folk to accept this talk for several years now, to reach a wider audience. Seems they don't like me... :( Too specific for OSCON. It's not you, it's the topic. I've tried submitting far more generic things than that, but still with a PostgreSQL tilt to them, and I never get those accepted either. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] SSDs with Postgresql?
On 04/26/2011 10:30 AM, Toby Corkindale wrote: I see Intel is/was claiming their SLC SSDs had a *minimum* lifetime of 2PB in writes for their 64GB disks; for your customer with a 50GB db and 20GB/day of WAL, that would work out at a minimum lifetime of a hundred thousand days, or about 273 years! The cheaper "consumer grade" MLC drives should still last a minimum of 5 years at 20GB/day according to their literature. (And what I found was fairly out of date) That doesn't seem too bad to me - I don't think I've worked anywhere that keeps their traditional spinning disks in service beyond 5 years either. The comment I made there was that the 20GB/day system was a very small customer. One busy server (and the busy ones are also the ones most likely to want SSD) was recently chugging through 16MB of WAL every 3 seconds as I watched, which works out to around 450GB/day. Now, you're right that those systems also aren't running with a tiny amount of flash, either. But the write volume scales along with the size, too. If you're heavily updating records in particular, the WAL volume can be huge relative to the drive space needed to store the result. As for the idea that I'm just singling out one anecdote, I have terabytes of lost data on multiple systems behind my negativity here. I was just pointing out a public failure that included some post-mortem I liked. I'm not sure if I have any happy customers who were early adopters of regular SLC or MLC drives really; the disaster rate is very close to 100% for the first few generations of those drives I've seen, and I've been around 50-ish of them. I'm hoping the current models shipping now are better; getting the write cache handling sorted out will be a big help. But it's been a scary technology for database use so far. The published numbers from the manufacturer literature are a very rosy best case when you're hitting the disk with this type of workload. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PG 9.0 - Default postgresql.conf?
Phoenix Kiula wrote: What changes does initdb make -- can I make them by myself? The main thing is that it adjusts shared_buffers down until the server will start without exceeding the OS limits. It also sets some locale parameters within the database. Btw, the default file is FAILING. The server does not restart. How can I check what the problem is? I don't see any errors, at least on the command line. All I see is: The errors are probably in /var/lib/pgsql/9.0/pgstartup.log. Most of the time the problem is either:
a) shared_buffers and the other memory parameters are set too high for the OS settings.
b) There is already an instance running, or some file left behind from an earlier instance, that is blocking startup of the new server.
c) A file is missing, corrupted, or has the wrong permissions.
All of the common cases identify themselves clearly in the log file. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
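A sketch of the usual first diagnostic steps, with paths that assume the standard RPM layout for 9.0 (adjust for your packaging):

$ tail -n 50 /var/lib/pgsql/9.0/pgstartup.log
# if it complains about shared memory, compare the server's request with the kernel limits:
$ sysctl kernel.shmmax kernel.shmall
# check for a leftover instance, or a stale lock file from one:
$ ps aux | grep postgres
$ ls -l /var/lib/pgsql/9.0/data/postmaster.pid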
Re: [GENERAL] Partitioning an existing table
On 04/25/2011 10:10 AM, Vick Khera wrote: Basically, you create your partitions and set up the necessary triggers you want (I re-write the app to insert directly into the correct partition). Then all new data starts going into the partitions. Next, write a program that loops over the current master table, and moves the data into each partition some small hunk at a time, in a transaction. This can take a long time. For us, it took about 7 days to move O(100m) rows. Then, when you're done, truncate the master table, and enforce that no new data is allowed to be inserted into it. Vick's presentation at http://cdn.mailermailer.com/documents/PGCon2008TablePartitioningCaseStudy.pdf is still one of the best case studies of how to do this sort of migration around. I was inspired by several of the ideas there when doing the partitioning chapter of my book, which is the only place I'm aware of covering this in even more detail than his case study. Cedric's idea for how to do this even more aggressively (multiple workers) is what you want if this is a one-time operation you're taking the system down for. In other situations, the gradual migration strategy Vick is suggesting is more appropriate. Some of the other ideas suggested in this thread won't work at all, so be careful who you listen to here. You can't leave copies of the data in the parent and put it into the child partition without all sorts of potential downsides. And you really, really want to do this as a proper database transaction, which is easiest to express using INSERT instead of COPY. If any step of the migration goes wrong, being able to do ROLLBACK and undo the recent bad steps is vital. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
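The per-hunk move Vick describes boils down to a loop of transactions shaped roughly like this; the table, key, and range values below are made up for illustration, and you'd pick a hunk size your system digests comfortably:

BEGIN;
-- copy one hunk of rows into the right partition
INSERT INTO measurements_2011_04
  SELECT * FROM ONLY measurements
  WHERE id >= 4200000 AND id < 4210000;
-- then remove exactly those rows from the parent table
DELETE FROM ONLY measurements
  WHERE id >= 4200000 AND id < 4210000;
COMMIT;

If anything goes wrong mid-hunk, ROLLBACK leaves both tables exactly as they were, which is the property being argued for above.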
Re: [GENERAL] 10 missing features
On 04/25/2011 04:54 PM, Nicholson, Brad (Toronto, ON, CA) wrote: The problem is that there is a lot of noise in the add-on space. There are lots of things out there that are no longer supported or partially supported. There is a fairly high barrier of entry into figuring out which tools to use, how to put them together and what you can and can't do with them. Right. This is why I advise people to start on it as soon as possible, to make it part of the initial testing of PostgreSQL. You have to start early on figuring out which of the additional packages make sense for you, because in some environments you can't easily introduce them later if they weren't part of earlier QA. I tried to work on the barrier-to-entry part in my book; there are a lot of specific recommendations for add-ons in there and context about what they are and aren't good for. If (and I stress the word if) the target is winning over DBA's from the commercial crowd this is an important point, as those DBA's are going to be used to getting most of what they need in one package along with the DB. Unfortunately that whole line of thinking is fundamentally incompatible with how open source databases are built and packaged. What some people don't seem to get is that the "one package" here is "one operating system distribution with a good package manager". It's not like you have to build all this stuff from source or anything; in many cases the packages are available to add with a quick search and a few clicks. I do think the areas that are lacking in PG though do come down to finer grain profiling of tasks. The ability to isolate CPU and IO utilization of particular queries or sets of queries is something I find very useful in the commercial DB space that I'd love to see in Postgres. Same goes for troubleshooting locking conflicts if you aren't looking at the system when they are happening, and tracing the causes of those locks down to finer grained details (IE - am I waiting on buffer eviction or xlog writes). This is all true. Unfortunately the way I expect these areas to be addressed doesn't start with "how can PostgreSQL duplicate the Oracle solution to this problem", which is how many of these incoming requests for features start. The alternate question of "how do you provide something with the same feature set for PostgreSQL?" is the more useful one, and it doesn't lead to the same solutions. That disconnect is important to address. If people are only willing to work toward or fund solving a familiar problem, they may not recognize an alternate solution that is just as useful--just not as familiar--that is available or being built. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] 10 missing features
On 04/25/2011 10:48 AM, Andrew Sullivan wrote: You can see this in certain items in the top 10. Three, four, five, seven, maybe eight, and ten all seemed to me to be things I've actually done before, but not using something directly inside Postgres. The idea that something must ship in the database to be useful is really engrained in some people. I do this talk nowadays about common mistakes people make when deploying PostgreSQL, and one of the top items I put on there is not actively investigating external tools. None of the items on this list would be on my own top list of missing things in PostgreSQL. I see "Better fragmentation management" as a footnote and there's an intro discussion to that on the blog at http://blog.kimiensoftware.com/2011/04/compacting-postgresql-tables/ The struggles required to sort out a little 25GB table apparently didn't make enough of an impression to put that into its proper place, which is way ahead of every item listed on the suggested missing feature set. Query progress is #1? It's annoying, yes, but it's not even close to pole position for me. From reading the blog a bit, it sounds like the author is managing lots of smallish (to me) databases, so putting so much emphasis on making each individual one easier to troubleshoot makes more sense. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Poor performance of btrfs with Postgresql
On 04/21/2011 02:22 AM, Toby Corkindale wrote: I also tested btrfs, and was disappointed to see it performed *dreadfully* - even with the recommended options for database loads. Best TPS I could get out of ext4 on the test machine was 2392 TPS, but btrfs gave me just 69! This is appalling performance. (And that was with nodatacow and noatime set) I don't run database performance tests until I've tested the performance of the system doing fsync calls, what I call its raw commit rate. That's how fast a single committing process will be able to execute individual database INSERT statements, for example. Whether barriers are turned on or not is the biggest influence on that, and from what you're describing it sounds like the main issue here is that you weren't able to get btrfs+nobarrier performing as expected. If you grab http://projects.2ndquadrant.it/sites/default/files/bottom-up-benchmarking.pdf page 26 will show you how to measure fsync rate directly using sysbench. Other slides cover how to get sysbench working right; you'll need to get a development snapshot to compile on your Ubuntu system. General fsync issues around btrfs are still plentiful it seems. Installing packages with dpkg sometimes does that (I haven't been following exactly which versions of Ubuntu do and don't fsync), so there are bug reports like https://bugs.launchpad.net/ubuntu/+source/dpkg/+bug/570805 and https://bugs.launchpad.net/ubuntu/+source/dpkg/+bug/607632 One interesting thing from there is an idea I'd never thought of: you can link in an alternate system library that just ignores fsync if you want to test turning it off above the filesystem level. Someone has released a package to do just that, libeatmydata: http://www.flamingspork.com/projects/libeatmydata/ -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
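If you don't want to set up sysbench just to get a rough number, one crude proxy for the raw commit rate, assuming GNU dd, is to time synchronous 8K writes on the volume in question (the path below is a placeholder):

$ dd if=/dev/zero of=/path/to/testvolume/syncfile bs=8k count=1000 oflag=dsync

Dividing 1000 by the elapsed time gives synchronous writes per second. A plain 7200 RPM drive honoring those flushes should land in the low hundreds, roughly its rotation rate; numbers in the many thousands usually mean a write cache is absorbing them, or barriers aren't really working.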
Re: [GENERAL] Poor performance of btrfs with Postgresql
On 04/21/2011 06:16 AM, Henry C. wrote: Since Pg is already "journalling", why bother duplicating (and pay the performance penalty, whatever that penalty may be) the effort for no real gain (except maybe a redundant sense of safety)? ie, use a non-journalling battle-tested fs like ext2. The first time your server is down and unreachable over the network after a crash (it ran fsck to recover, the check couldn't complete automatically, and now manual intervention is required before the system will finish booting), you'll never make that mistake again. On real database workloads, there's really minimal improvement to gain for that risk--and sometimes actually a drop in performance--using ext2 over a properly configured ext3. If you want to loosen the filesystem journal requirements on a PostgreSQL-only volume, use "data=writeback" on ext3. And I'd still expect ext4/XFS to beat any ext2/ext3 combination you can come up with, performance-wise. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
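For reference, loosening the journal that way is just a mount option on a dedicated database volume; the device and mount point below are placeholders for whatever your system actually uses:

# /etc/fstab entry for a PostgreSQL-only ext3 volume
/dev/sdb1  /var/lib/pgsql  ext3  noatime,data=writeback  0  2

Only the filesystem metadata is journaled in that mode; PostgreSQL's own WAL is what protects the data files, which is why this is reasonable for a volume that holds nothing but the database.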
Re: [GENERAL] SSDs with Postgresql?
On 04/21/2011 11:33 AM, Florian Weimer wrote: Is there an easy way to monitor WAL traffic? It does not have to be fine-grained, but it might be helpful to know if we're doing 10 GB, 100 GB or 1 TB of WAL traffic on a particular database, should the question of SSDs ever come up. You can use functions like pg_current_xlog_location() : http://www.postgresql.org/docs/9.0/interactive/functions-admin.html Save a copy of this periodically: select now(),pg_current_xlog_location(); And you can see WAL volume over time given any two points from that set of samples. To convert the internal numbers returned by that into bytes, you'll need to do some math on them. Examples showing how that works and code in a few languages: http://archives.postgresql.org/pgsql-general/2010-10/msg00077.php (by hand) http://munin-monitoring.org/browser/trunk/plugins/node.d/postgres_streaming_.in?rev=3905 (in Perl) http://archives.postgresql.org/pgsql-general/2010-10/msg00079.php (in C) http://postgresql.1045698.n5.nabble.com/How-can-we-tell-how-far-behind-the-standby-is-td3252297.html (in bash with bc(!), other links) What I keep meaning to write is something that does that as part of the SQL itself, so it gets pulled out of the database already in bytes. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
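A rough sketch of doing that math in SQL on a 9.0 server follows; it assumes the same convention the scripts linked above use, where each xlogid covers 0xFF000000 bytes, so double-check that constant against your version before trusting the numbers. Sample this periodically and subtract any two results to get bytes of WAL written between them:

SELECT now() AS sampled_at,
       ('x' || lpad(split_part(loc, '/', 1), 16, '0'))::bit(64)::bigint * 4278190080 +
       ('x' || lpad(split_part(loc, '/', 2), 16, '0'))::bit(64)::bigint AS xlog_bytes
  FROM (SELECT pg_current_xlog_location() AS loc) AS s;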
Re: [GENERAL] SSDs with Postgresql?
On 04/20/2011 01:50 AM, Toby Corkindale wrote: Also, the number of erase cycles you can get, over the whole disk, is quite large on modern disks! So large that you'll probably go decades before you wear the disk out, even with continual writes. Don't buy into the SSD FUD myths.. There is no FUD being spread here. Particularly given the PostgreSQL WAL write pattern, it's not impossible to wear out an SSD placed there in a small number of years. A system with a trivial but not completely idle workload will generate one 16MB WAL segment every 5 minutes, which works out to 4.5GB/day of writes. That's the baseline--the reality is much, much higher than that on most systems. The fact that every row update can temporarily use more than 8K means that actual write throughput on the WAL can be shockingly large. The smallest customer I work with regularly has a 50GB database, yet they write 20GB of WAL every day. You can imagine how much WAL is generated daily on systems with terabyte databases. As for what this translates into in the real world, go read http://archives.postgresql.org/message-id/BANLkTi=GsyBfq+ApWPR_qCA7AN+NqT=z...@mail.gmail.com as one worked out sample. Anyone deploying PostgreSQL onto MLC flash can't just ignore this issue. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] SSDs with Postgresql?
Henry C. wrote: I believe this perception that SSDs are less "safe" than failure-prone mechanical hard drives will eventually change. Only because the manufacturers are starting to care about write durability enough to include the right hardware for it. Many of them are less safe right now on some common database tasks. Intel's gen 1 and gen 2 drives are garbage for database use. I've had customers lose terabytes of data due to them. Yes, every system can fail, but these *will* fail and corrupt your database the first time there's a serious power problem of some sort. And the idea that a UPS is sufficient to protect against that ever happening is wildly optimistic. See http://wiki.postgresql.org/wiki/Reliable_Writes for more background here, and links to reading on the older Intel drives. I summarized the situation with their newer 320 series drives at http://blog.2ndquadrant.com/en/2011/04/intel-ssd-now-off-the-sherr-sh.html Those finally get the write flushing right. But the random seek IOPS are far lower than you might expect on read/write workloads. My own tests and other sources have all come up with around 3500 IOPS as being a real-world expectation for the larger sizes of these drives. Also, it is cheap flash, so durability in a server environment won't be great. Don't put your WAL on them if you have a high transaction rate. Put some indexes there instead. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Raid Controller Write Cache setting for WAL and Data
Piotr Gasidło wrote: I _can_ afford losing some data in case of power failure. But I'm afraid of having the database in an unrecoverable state after a crash. Then turn off synchronous_commit. That's exactly the behavior you get when it's disabled: some data loss after a crash, no risk of database corruption, and faster performance without needing a controller with a battery. If you've already got a RAID controller that accepts a battery, it would be silly not to then buy one though. The controller is normally 75% of the price of the combination, so getting that but not the final piece to really make it perform well wouldn't be a good move. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
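In postgresql.conf terms, the tradeoff described above is a single setting; the figures in the comment are just the documented defaults, not a tuning recommendation:

synchronous_commit = off    # commits return before their WAL is flushed to disk;
                            # a crash can lose roughly the last 3 * wal_writer_delay
                            # (about 600ms by default) of commits, but the database
                            # itself stays consistent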
Re: [GENERAL] question regarding full_page_write
AI Rumman wrote: I can't clearly understand what the full_page_writes parameter stands for. The documentation suggests that if I turn it off, then there is a chance of database corruption after a crash. Can anyone please tell me how that could happen? The database writes to disk in 8K blocks. If you can be sure that your disk drives and operating system will always write in 8K blocks, you can get a performance improvement from turning full_page_writes off. But if you do that, and it turns out that when the power is interrupted your disk setup will actually do partial writes of less than 8K, your database can get corrupted. Your system needs to ensure that when a write happens, either the whole thing goes to disk, or none of it does. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] effective_io_concurrency
Yves Weißig wrote: I was wondering if there is more information about this switch in the configuration. Does it really work? Where in the source code can I follow how it works? "smgr.c" seems to be an entry point, but where exactly is it used? Currently the code only kicks in when you're doing a Bitmap Heap Scan, which is really helpful for those scans, but of no help for any other type of query. This style of heap scan already knows in advance exactly what blocks it needs from the database, and normally it just asks for them one at a time. That can turn into a fair amount of random I/O, and it's done serially: the next block isn't requested until the last one arrives. What effective_io_concurrency does is advise the operating system of the next few blocks the database is going to ask for, before the actual read requests, in hopes that it might grab them if it happens to be nearby that area of the disk. I've only seen this feature work at all on Linux. It might work on BSD and Mac OS X systems; it certainly doesn't do anything on Solaris and Windows. The basic idea is that you start by setting the value to the number of working drives in the disk array the database is on, and see if I/O performance goes up and/or query run time drops afterwards. If it does you might try further increases beyond that even. As for why there isn't a better tuning guide than just those simple guidelines, it's not that easy to show a situation where the type of bitmap scan this parameter impacts is used on a generated data set, even though it's not that uncommon in real-world data. It's hard both to make generic suggestions here and to even demonstrate the feature at work. Moving up the source code chain from smgr, src/backend/storage/buffer/bufmgr.c has PrefetchBuffer, and the one place the executor calls that is BitmapHeapNext inside src/backend/executor/nodeBitmapHeapscan.c -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
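A sketch of what testing it looks like in practice, assuming a table large enough that the planner already chooses a bitmap scan; the table and column names here are invented for the example:

SET effective_io_concurrency = 4;   -- starting point: one per working drive in the array
EXPLAIN ANALYZE SELECT count(*) FROM events WHERE event_type IN (3, 7, 9);
-- look for a Bitmap Index Scan feeding a Bitmap Heap Scan in the plan output;
-- the prefetch advice only happens while that Bitmap Heap Scan node is running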
Re: [GENERAL] Book recommendation?
Herouth Maoz wrote: My sysadmin ran into the following book: PostgreSQL 9.0 High Performance, by Gregory Smith, ISBN 184951030X http://amzn.com/184951030X That guy's a troublemaker, but I guess he writes OK. There are three customer reviews at http://www.amazon.com/PostgreSQL-High-Performance-Gregory-Smith/product-reviews/184951030X/ and two longer ones at: http://people.planetpostgresql.org/andrew/index.php?/archives/130-Buy-this-book,-now..html http://www.postgresonline.com/journal/archives/192-postgresql9highperformance.html And here's some free samples: https://www.packtpub.com/article/postgresql-9-reliable-controller-disk-setup https://www.packtpub.com/article/postgresql-9-balancing-hardware-spending https://www.packtpub.com/article/server-configuration-tuning-postgresql https://www.packtpub.com/article/unix-monitoring-tool-for-postgresql https://www.packtpub.com/article/postgresql-tips-tricks https://www.packtpub.com/sites/default/files/0301OS-Chapter-2-Database-Hardware.pdf -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] SHMMAX and SHMALL question
DM wrote: RAM = 16GB, what value should i set for shmall? Given that PostgreSQL rarely sees increasing improvement as shared_buffers goes over 50% of RAM, I just use that figure for the shmall and then compute shmmax based on the page size to match it. I use the attached script to do all the hard work, haven't found a Linux system yet it didn't do the right thing on. It sounds like you might have the math on the relation between the two backwards, look at the output and code of this once and that should sort things out for you. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

#!/bin/bash
# Output lines suitable for sysctl configuration based
# on total amount of RAM on the system. The output
# will allow up to 50% of physical memory to be allocated
# into shared memory.

# On Linux, you can use it as follows (as root):
#
# ./shmsetup >> /etc/sysctl.conf
# sysctl -p

# Early FreeBSD versions do not support the sysconf interface
# used here. The exact version where this works hasn't
# been confirmed yet.

page_size=`getconf PAGE_SIZE`
phys_pages=`getconf _PHYS_PAGES`

if [ -z "$page_size" ]; then
  echo Error: cannot determine page size
  exit 1
fi

if [ -z "$phys_pages" ]; then
  echo Error: cannot determine number of memory pages
  exit 2
fi

shmall=`expr $phys_pages / 2`
shmmax=`expr $shmall \* $page_size`

echo \# Maximum shared segment size in bytes
echo kernel.shmmax = $shmmax
echo \# Maximum number of shared memory segments in pages
echo kernel.shmall = $shmall

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] migrate hashname function from 8.1.x to 8.4
Nicolas Garfinkiel wrote: I'm trying to upgrade our pgsql from 8.1 to 8.4, but our system's login uses the hashname() function in order to get the proper password validation. Now pgsql's 8.4 hashname function is not compatible with 8.1's function. Do you have any ideas how I can reproduce 8.1 function in 8.4? https://github.com/petere/pgvihash provides the function you're looking for. I agree with Craig's concerns here, but this may let you convert toward a better long-term format more easily. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
Re: [GENERAL] PG84 and SELinux
Tom Lane wrote: "James B. Byrne" writes: I wrote too soon. What I did was uncomment the ssl option. I neglected to change the setting from off to on. When I try to start the server with ssl=on it fails with this error: Auto configuration failed 29006:error:0E065068:configuration file routines:STR_COPY:variable has no value:conf_def.c:629:line 207 AFAIK there is no place in the standard Postgres sources that could emit an error message even vaguely like that. That looks to be the str_copy routine from conf_def.c in the OpenSSL code, i.e. line 624 of the version at: http://code.google.com/p/commitmonitor/source/browse/trunk/common/openssl/crypto/conf/conf_def.c So guessing something in the SSL autonegotiation is failing here in a really unexpected way. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services and Supportwww.2ndQuadrant.us
Re: [GENERAL] Considering Solid State Drives
Allan Kamau wrote: I am now thinking of investing in a SSD (Solid State Drive), and maybe choosing between "Crucial Technology 256GB Crucial M225 Series 2.5-Inch Solid State Drive (CT256M225)" and "Intel X25-M G2 (160GB) - Intel MLC". Both of these are worthless for database applications if you care about your data. In order to perform well, SSDs need to have a write cache to buffer small writes. For PostgreSQL to work as intended, that write cache needs to be non-volatile. It is not in either of those drives. I hear tales of lost PostgreSQL data on Intel SSDs every month, the database is lucky to survive a single power outage. The only relatively inexpensive SSD we've heard about on the Performance list that's survived all of the durability on crash tests thrown at it is the OCZ Vertex 2 Pro; see http://archives.postgresql.org/pgsql-performance/2010-07/msg00449.php for a summary. That avoids this problem by having an Ultracapacitor integrated with the drive, to allow orderly processing of the write cache if power is lost. There are other SSD devices that are similarly reliable, but the costs are quite a bit higher. More background about this topic at http://wiki.postgresql.org/wiki/Reliable_Writes -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services and Supportwww.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] The first dedicated PostgreSQL forum
Tom Lane wrote: I'm not nearly as concerned about whether there are forums as about having "rogue" forums outside the postgresql.org domain. People could misperceive such things as having some official status That the site is now mirroring recent news from postgresql.org doesn't help with that. I find it hard to get too excited about yet another forum style discussion area when there's already more PostgreSQL talk on http://stackoverflow.com/ than I have time to keep up with. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services and Supportwww.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general