> The most significant impact is that it takes up twice as much space,
> including the primary key index. This means fewer entries per block,
> which means slower scans and/or more blocks to navigate through. Still,
> compared to the rest of the overhead of an index row or a table row, it
> is low.
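The "twice as much space" claim is easy to sanity-check with back-of-envelope arithmetic. The sketch below ignores per-entry overhead (item pointers, page headers, alignment), so real Postgres numbers are lower; it only illustrates the 2x ratio between 8-byte and 16-byte keys on an 8 KB page:

```python
# Rough upper bound: how many raw index keys fit in one 8 KB page?
# Per-entry overhead is deliberately ignored -- actual Postgres btree
# entries are larger -- so only the ratio is meaningful here.
PAGE_BYTES = 8192
BIGINT_KEY = 8    # bytes for a bigint key
UUID_KEY = 16     # bytes for a uuid (or MD5) key

bigint_per_page = PAGE_BYTES // BIGINT_KEY
uuid_per_page = PAGE_BYTES // UUID_KEY

print(bigint_per_page, uuid_per_page)  # the uuid index needs ~2x the pages
```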
I have a system where it would be very useful for the primary keys for
a few tables to be UUIDs (actually MD5s of files, but UUID seems to be
the best 128-bit type available). What is the expected performance of
using a UUID as a primary key which will have numerous foreign
references to it, versus
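Since an MD5 digest is exactly 128 bits, it can be stored directly in a `uuid` column. A minimal sketch of the conversion in Python (the resulting value is not a standards-compliant UUID, since the version/variant bits are whatever the hash produced, but it stores and compares fine):

```python
import hashlib
import uuid

# Hash some file contents and reinterpret the 128-bit digest as a UUID.
digest = hashlib.md5(b"example file contents").hexdigest()  # 32 hex chars
file_uuid = uuid.UUID(hex=digest)

print(digest)
print(file_uuid)  # same bits, hyphenated uuid formatting
```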
For the reasons indicated (that is, MVCC), PG cannot do a DISTINCT or the
equivalent GROUP BY from index values alone.
Ok, that makes sense. Thanks for the help everybody!
If this table is large, perhaps you could denormalize and maintain a
summary table with date (using truncation) and count.
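A sketch of that denormalization idea, using SQLite via Python's sqlite3 purely as a stand-in (the table name `daily_summary` is made up; in Postgres you would truncate with `date_trunc('day', inserted)` and keep the summary current from a trigger, whereas here it is just rebuilt in one pass):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (path TEXT, inserted TEXT)")
conn.executemany("INSERT INTO messages VALUES (?, ?)", [
    ("a", "2006-10-01 09:00:00"),
    ("b", "2006-10-01 17:30:00"),
    ("c", "2006-10-03 08:15:00"),
])

# Summary table: one row per day that saw messages, with a count.
# SQLite's date() plays the role of Postgres's date_trunc('day', ...).
conn.execute("""
    CREATE TABLE daily_summary AS
    SELECT date(inserted) AS day, count(*) AS msg_count
    FROM messages
    GROUP BY date(inserted)
""")

rows = list(conn.execute("SELECT day, msg_count FROM daily_summary ORDER BY day"))
print(rows)
```

Queries for "which days saw mail" then hit the small summary table instead of scanning the large messages table.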
I have a table of messages with paths and inserted dates (among other
things), like so:
CREATE TABLE Messages (
    msgkey   BIGSERIAL PRIMARY KEY,
    path     TEXT NOT NULL,
    inserted TIMESTAMP WITHOUT TIME ZONE DEFAULT NOW()
);
I run a query to determine which days actually saw emails come in, like
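The query itself is cut off above; presumably it was a DISTINCT over the truncated date. A guess at its shape, again using SQLite via sqlite3 as a stand-in (SQLite's `date()` substitutes for Postgres's `date_trunc('day', inserted)`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (msgkey INTEGER PRIMARY KEY, path TEXT, inserted TEXT)"
)
conn.executemany("INSERT INTO messages (path, inserted) VALUES (?, ?)", [
    ("a", "2006-10-01 09:00:00"),
    ("b", "2006-10-01 17:30:00"),
    ("c", "2006-10-03 08:15:00"),
])

# Which days actually saw messages come in?
days = [r[0] for r in conn.execute(
    "SELECT DISTINCT date(inserted) FROM messages ORDER BY 1")]
print(days)
```

As noted elsewhere in the thread, Postgres (at the time) had to visit the heap for visibility checks, so this DISTINCT could not be answered from the index alone.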
http://www.powerpostgresql.com/Downloads/annotated_conf_80.html is a good
intro to the various parameters you might set, with some valuable hints on
the effective range you should be considering. I'd suggest you use that
to identify the most likely things to increase, then read the manuals at
htt
For scaling you should consider Slony. Either hang out on #slony on
Freenode.net or ask on the mailing list if you have questions.
For some reason I had thought slony was really immature, but it
actually looks really usable.
Intel chips => please be more specific. There are Intel boxes known to have issues
To quickly find out a subset of slow queries on your production system, you
can use the pgfouine tool:
http://pgfouine.projects.postgresql.org/
If you then want to measure PostgreSQL performance under different
settings and compare the results, consider using the tsung tool (and
Unqualified SELECT COUNT(*) FROM foo is one of the most expensive operations
you can do on your system, since the visibility information has to be
checked
on disk for each row. Instead, try real queries on real data, and post here
if some are too slow for you.
Ok, that's a bad example. I'm learning
I'm writing a webmail-type application that is meant to be used in a
corporate environment. The core of my system is a Postgres database
that is used as a message header cache. The two (relevant) tables
being used are pasted into the end of this message. My problem is
that, as the messages table