Re: [PERFORM] PostgreSQL to host e-mail?

2007-01-04 Thread Grega Bremec


Charles A. Landemaine wrote:
| I'm building an e-mail service that has two requirements: It should
| index messages on the fly to have lightning search results, and it
| should be able to handle large amounts of space. The server is going
| to be dedicated only for e-mail with 250GB of storage in Raid-5. I'd
| like to know how PostgreSQL could handle such a large amount of data.
| How much RAM would I need? I expect my users to have a 10GB quota per
| e-mail account.
| Thanks for your advice,
|

Hello, Charles.

I'll second people's suggestions to stay away from RAID5. The workload of a
mail store is approximately an even mix of writes (in database terms,
INSERTs, UPDATEs and DELETEs) and reads, and we all know RAID5 is a loser
at write-heavy workloads, at least in arrays of fewer than 10-15 drives.
I'd suggest RAID10 for the database cluster and an extra drive for WAL.

Another point of interest I'd like to mention is one particular aspect of
the workflow of an e-mail user: we will typically touch the main inbox a lot
and leave most of the other folders pretty much intact for most of the time.
This suggests per-inbox quota might be useful, maybe in addition to the
overall quota, because then you can calculate your database working set more
easily, based on usage statistics for a typical account. Namely, if the
maximum size of an inbox is x MB, with y% average utilization, and you plan
for z users, of which w% will be typically active in one day, your database
working set will be somewhere in the general area of (x * y%) * (z * w%) MB.
Add to that the size of the indexes you create, and you have a very
approximate idea of the amount of RAM you need to place in your machines to
keep your performance from becoming I/O-bound.
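To make that concrete, here is a quick back-of-the-envelope calculation in
shell arithmetic. All the numbers are made up for illustration: a 100MB
inbox cap, 40% average utilization, 10,000 accounts, and 20% of them active
on a typical day.

```shell
# Working set ~= (x * y%) * (z * w%); every figure below is hypothetical.
x=100     # maximum inbox size, MB
y=40      # average utilization, percent
z=10000   # total accounts
w=20      # percent of accounts active per day
echo "approximate working set: $(( x * y / 100 * z * w / 100 )) MB"
```

With these particular numbers you would be looking at roughly 80GB of
working set plus index overhead, which mostly tells you to measure the
utilization and activity percentages carefully before sizing RAM.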

The main reason I'm writing, though, is to suggest you take a look at Oryx
(http://www.oryx.com/). They used to have a product called Mailstore, which
was designed as a mail store with PostgreSQL as its backend, and it has
since evolved into a bit more than just that, it seems. Perhaps it could be
of help while you build your system, and I'm sure the people at Oryx will
be glad to hear from you while, and after, you build it.

Kind regards,
- --
~Grega Bremec
~gregab at p0f dot net

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] sql-bench

2006-09-14 Thread Grega Bremec
Tom Lane wrote:
> >> It'd be interesting to see what mysql's performance looks like on this
> >> test using innodb tables, which should be compared against fsync = true
> >> ... but I don't know how to change it to get all the tables to be
> >> innodb.)

Just a point (I've taught some MySQL courses before, sorry 'bout that :)) -
the crash-proof flavour of transactional tables in MySQL was supposed to be
the Berkeley DB ones, but (oh, the irony) they're still beta. InnoDB tables
were just supposed to be optimized to perform well with loads of data and a
mediocre number of clients, and *finally* support referential integrity and
the rest of the lot.

Anyways... with Oracle buying up all that stuff, I don't even know if it
still matters. The incantation is to add an ENGINE= or TYPE= clause to each
CREATE TABLE statement, which would look like

  CREATE TABLE foo (
...
  ) ENGINE=InnoDB;

or to specify the --default-storage-engine or --default-table-type server
startup option (or, alternatively, set the default-storage-engine or
default-table-type option in my.cnf).

The catch is that mysqldump is quite explicit in its CREATE TABLE
statements, so a vi(1) session and a regular expression will probably be needed.
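As a sketch of that regular-expression step (using GNU sed instead of vi;
the sample dump content and engine names are purely illustrative):

```shell
# Create a tiny stand-in for a mysqldump output file...
cat > /tmp/dump.sql <<'EOF'
CREATE TABLE foo (
  id INT
) TYPE=MyISAM;
EOF
# ...and rewrite its storage-engine clause (\| alternation is a GNU sed extension).
sed 's/\(ENGINE\|TYPE\)=MyISAM/ENGINE=InnoDB/' /tmp/dump.sql
```

The same expression works as a `:%s` substitution inside vi, of course.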

Kind regards,
-- 
Grega Bremec
gregab at p0f dot net




Re: [PERFORM] Arguments Pro/Contra Software Raid

2006-05-09 Thread Grega Bremec

Hannes Dorbath wrote:
> Hi,
> 
> I've just had some discussion with colleagues regarding the usage of
> hardware or software raid 1/10 for our linux based database servers.
> 
> I myself can't see much reason to spend $500 on high end controller
> cards for a simple Raid 1.
> 
> Any arguments pro or contra would be desirable.
> 

One pro and one con off the top of my head.

Hotplug. Depending on your platform, SATA may or may not be hotpluggable
(last I heard, AHCI mode is the only one promising some kind of hotplug,
which means ICH6+ or Silicon Image controllers). SCSI isn't hotpluggable
without special hotplug backplanes and disks. You lose that with software
RAID, which effectively means you need to shut the box down to do
maintenance. Hassle.

CPU. It's cheap. Much cheaper than your average hardware RAID card. For
the 5-10% overhead usually imposed by software RAID, you can throw in a
faster CPU and never notice; most setups are I/O-bound rather than
CPU-bound for the better part anyway. This does raise the question of how
much I/O bandwidth your standard SATA or SCSI controller comes with,
though. If you're careful about that and handle hotplug sufficiently,
you'll probably never notice you're not running on metal.
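On the maintenance point: with Linux software RAID the disk-swap sequence
is at least scriptable, even if the box has to go down for the physical
swap. A hedged sketch only - the array and device names below are made up:

```shell
# Mark a dying member failed, pull it out of the array, then re-add
# its replacement; md resyncs the new member in the background.
mdadm /dev/md0 --fail /dev/sdb1
mdadm /dev/md0 --remove /dev/sdb1
# ...power down and swap the disk here if the controller can't hotplug...
mdadm /dev/md0 --add /dev/sdb1
cat /proc/mdstat    # watch the resync progress
```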

Kind regards,
- --
Grega Bremec
gregab at p0f dot net



Re: [PERFORM] Read only transactions - Commit or Rollback

2005-12-20 Thread Grega Bremec


Nörder-Tuitje wrote:
|> We have a database containing PostGIS MAP data, it is accessed
|> mainly via JDBC. There are multiple simultaneous read-only
|> connections taken from the JBoss connection pooling, and there
|> usually are no active writers. We use connection.setReadOnly(true).
|>
|> Now my question is what is best performance-wise, if it does make
|> any difference at all:
|>
|> Having autocommit on or off? (I presume "off")
|>
|> Using commit or rollback?
|>
|> Committing / rolling back occasionally (e. G. when returning the
|> connection to the pool) or not at all (until the pool closes the
|> connection)?
|>
| afaik, this should be completely negligible.
|
| starting a transaction implies write access. if there is none, You do
| not need to think about transactions, because there are none.
|
| postgres needs to schedule the writing transactions with the reading
| ones, anyway.
|
| But I am not that performance profession anyway ;-)

Hello, Marcus, Nörder, list.

What about isolation, though? If several dependent calculations run with
autocommit on, each statement gets its own snapshot, so MVCC gives you no
consistent view across them, right?
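For instance, a sequence of reads that must see one consistent snapshot can
be wrapped explicitly rather than left to autocommit. A hedged sketch - the
database and table names are invented:

```shell
psql gisdb <<'SQL'
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SET TRANSACTION READ ONLY;
-- both reads now share one snapshot, unlike two autocommitted statements
SELECT count(*) FROM map_features;
SELECT max(updated) FROM map_features;
COMMIT;
SQL
```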

Cheers,
- --
~Grega Bremec
~gregab at p0f dot net



Re: [PERFORM] JFS fastest filesystem for PostgreSQL?

2005-07-14 Thread Grega Bremec


Dawid Kuroczko wrote:
|
| If you still have a chance, could you do tests with other journaling
| options for ext3 (journal=writeback, journal=data)?  And could you
| give figures about performace of other IO elevators?  I mean, you
| wrote that anticipatory is much worse -- how much worse? :)  Could
| you give numbers for deadline,anticipatory,cfq elevators? :)
|
| And, additionally would it be possible to give numbers for bonnie++
| results?  To see how pgbench relates to bonnie++?
|

Hello, list.

I've been thinking about this one for a while - I'm not sure what ratio of
CPU to I/O stress pgbench generates. One thing is definitely worth
mentioning though: in the tests I did with bonnie++ and iozone at my former
job, while building a distributed indexing engine, jfs was the filesystem
with the least strain on the CPU, which might be one of the deciding
factors that make it look good for a particular workload.

I'm afraid I don't have any concrete figures to offer, as the material
itself was classified. I can say, though, that we compared it with ext2 and
ext3 as well as xfs, and notably, xfs was the worst CPU hog of all. The CPU
load difference between jfs and xfs was about 10% in favour of jfs in all
random read/write tests, and interestingly, jfs managed to shuffle around
quite a lot of data: the MB/s per CPU% ratio of xfs was much worse. As
expected, there wasn't much difference in the block transfer tests, but jfs
was winning slightly on CPU consumption and lagging slightly in transfer
rate.

What is a little concerning, though, is that some Linux distributors like
SuSE have removed jfs support from their admin tooling due to technical
problems with jfs
(http://your-local-suse-mirror/.../suse/i386/9.3/docu/RELEASE-NOTES.en.html#14).

I'm curious as to what this means - did they have problems integrating
it into their toolchain or are there actual problems going on in jfs
currently?
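For reference, the knobs Dawid asks about look like this. This is only a
sketch: the device, mount point and boot-loader line are assumptions.

```shell
# ext3 journaling modes are mount options:
mount -o data=writeback /dev/sdb1 /srv/pgdata   # journal metadata only
mount -o data=ordered   /dev/sdb1 /srv/pgdata   # the default
mount -o data=journal   /dev/sdb1 /srv/pgdata   # journal data as well
# the I/O elevator is chosen at boot time on 2.6 kernels, e.g.:
#   kernel /vmlinuz ... elevator=deadline       # or anticipatory, cfq
```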

Kind regards,
- --
Grega Bremec
gregab at p0f dot net



Re: [PERFORM] Filesystem

2005-06-08 Thread Grega Bremec
 another disk drive, and you
have several contenders for that storage unit, you may also want to ease
contention a bit by using larger logbufs and logbsize settings, to provide
more slack in the others when a particular one needs to spill buffers to
disk.

All of these ideas share one common thought: you can tune a filesystem so
it helps reduce the amount of iowait. The filesystem itself can help
prevent unnecessary work being performed by the disk and eliminate
contention for the bandwidth of the transport subsystem. This can be
achieved by improving the internal organization of the filesystem to better
suit the requirements of a typical database workload, and by eliminating
the (undesired part of the) book-keeping work in your filesystem.

Hope to have helped.

Kind regards,
- --
Grega Bremec
gregab at p0f dot net



Re: [PERFORM] 'Fastest' PC's are slowest in the house

2005-06-01 Thread Grega Bremec


Justin Davis wrote:
| I have five PC's accessing a PG database that is mounted on a
| Dell Windows 2003 server.  The PC's are accessing the database with a
| Fujitsu cobol program via ODBC (all machines have same (newest) ODBC
| driver from PG).  2 of the machines are the newest I have and both
| pretty identically configured but are very slow by comparison to the
| others.  My colleagues and I are still in the exploration / decision
| process, we have been working with and learning the database about 2
months.
|
| I'm looking to see if anyone knows of O/S or hardware issues right off
| the bat or can recommend a debug method, log checking, etc. path we
| might follow.
|
| The program in question reads the PG database and displays matching
| query results on a cobol screen, for the point of this topic that is all
| it is doing.  We run the same query from each PC which returns 15
| records out of a 6,000 record customer DB.
|
| The machines:
|
| - 2 are 2.0 Ghz Dells with 512 Ram & XP SP2 - they take just over 2
minutes
| - 1 AMD 2.4 with 256 Ram & XP SP2 - just under 2 secs.
| - 1 AMD 900 Mhz with 256 Ram & XP SP 1 - just under 2 secs
| - 1 Intel 266 Mhz with 256 Ram & Windows 2000 - 11-13 secs
|

Hello, Justin.

While re-reading your post, I was (still) under the impression that those
machines are all client machines and that there is only one database they
are all accessing. Is that impression correct?

If so, then I'm afraid there must be some other issue at play, because from
the viewpoint of the postmaster it is completely irrelevant who the client
is. If not, can you please provide some evidence that the issue at hand
really has to do with PostgreSQL query shipping to those Dells (profiling,
for example), so we have something to work from?

My assertion though is that there's either an issue in the ODBC layer,
or the COBOL program you're running (be it your code or the runtime).

While at it, and completely unrelated: I'm not sure that running production
PostgreSQL on a Windows machine is the best possible decision, from both a
performance and a reliability viewpoint. If you have the luxury of
experimenting, and unless your side-goal is to battle-test the Windows
version of PostgreSQL, I'd suggest you try a couple of alternatives, such
as Linux, BSD or even Solaris, whichever you feel will offer you better
future support.

If you choose to run it on Windows after all, I'd kindly advise you to stay
on the safe side with a double-checked backup strategy, solely because the
Windows version of PostgreSQL is a new product and not widely used in
production environments, so there is not much expertise yet in the
specifics of keeping it performant and stable, and most of all, in how to
tackle things after the worst has happened.

Kind regards,
- --
Grega Bremec
gregab at p0f dot net



Re: [PERFORM] Swapping and Kernel 2.6

2005-05-15 Thread Grega Bremec
Guillaume Nobiron wrote:
| Hi,
|
| Environment :
|  - Fedora Core 2 (kernel 2.6.10)
|  - postgresql 7.4.7
|
| I am running a huge query with several joins that needs around 900
| Mbytes of Memory (System RAM is 512 Mbytes).
|
| When the system starts to swap the postgres process, CPU consumed drops
| to around 2% (instead of around 50% on another system with kernel 2.4).
| The query was still working after more than 4 hours, spending the time
| with 98% of IO wait.
|
| So I tried to run the postmaster with the environment variable
| LD_ASSUME_KERNEL=2.4.1. With this, the query lasted 7 minutes !!!
|
| Why swapping is so bad with the new kernel ?
|
Hello, Guillaume.
Your swapping issue may not necessarily be down to bad virtual memory
management - perhaps just management slightly mistuned for your usage
profile. I think /proc/sys/vm/swappiness only appeared in the 2.6 series of
kernels (or late in 2.4), but it has a rather significant effect on the way
the kernel handles pages that are swapped out to disk and, most
importantly, those that have been swapped back in again yet still occupy
swap space - this new mechanism usually results in a certain amount of
memory being reserved as swap cache, which leaves less of it where it's
more important to have it. You might want to read more about it in these
(rather lengthy) KernelTrap articles: http://kerneltrap.org/node/3000
and http://kerneltrap.org/node/3828
What we'd probably need here, though (after you've verified that playing
with swappiness doesn't help), is a couple of minutes' worth of typical
memory manager behaviour while this transaction of yours is taking place,
especially where swapin/swapout goes mad. Try running "vmstat 5" while it's
running and see if there are any interesting patterns; also, be sure to
include enough context before and after such events (up to half a minute on
each side should do).
My guess is, you'd do well with an extra gigabyte or so of memory -
512MB really isn't much nowadays. Why make I/O even worse bottleneck
than it needs to be by forcing pages of active memory in and out? :)
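A minimal sketch of both suggestions, assuming a 2.6 kernel; the value 20
is only an example, chosen to bias the VM away from swapping:

```shell
echo 20 > /proc/sys/vm/swappiness   # or: sysctl -w vm.swappiness=20
vmstat 5                            # watch the si/so columns while the query runs
```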
Kind regards,
- --
~Grega Bremec
~gregab at p0f dot net


Re: [PERFORM] sequence scan on PK

2005-05-08 Thread Grega Bremec
Tom Lane wrote:
| Jeroen van Iddekinge <[EMAIL PROTECTED]> writes:
|
|>>You could tweak with several settings to get it to do an index scan
|>>earlier, but these would probably break other queries. You don't need to
|>>tune for 100 rows, morelike 100k or 100M.
|
|
|>Which settings shoud I change for this?
|
|
| I'd agree with John's response: if you change any settings based on just
| this one test case, you're a fool.  But usually random_page_cost is the
| best knob to twiddle if you wish to encourage indexscans.
|
Perhaps just a small comment - before starting the tuning process, you want
to make sure the query planner has the right idea about the nature of the
data contained in your indexed column.

Sometimes, if you insert reasonably sized batches of records containing the
same value for that column (for example in a multicolumn key where you
usually retrieve by only one column), the statistics collector (used to)
get out of sync with reality with regard to the cardinality of the data,
because the default statistics sample is too small to give it objective
insight.
If you're intimate with your data, you probably want to increase the
statistics target on that column and/or do some other statistics-related
magic, and ANALYZE the table again; that alone can mean the difference
between a sequential and an index scan where appropriate. Most importantly,
you don't need to distort the database's understanding of your hardware to
achieve optimal plans (provided the cost settings reflect reality, of
course), so you won't get bitten where you don't expect it. :)
Again, this might not pertain to a 100-row table, but is a good thing
[tm] to know when optimizing. I personally would prefer to look at that
aspect of optimizer's understanding of data before anything else.
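In SQL terms, that statistics-related magic might look like this; the
table, column and target value are purely illustrative:

```shell
psql mydb <<'SQL'
-- sample this column more thoroughly, then refresh the statistics
ALTER TABLE parts ALTER COLUMN category_id SET STATISTICS 100;
ANALYZE parts;
SQL
```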
Hope this helps.
Regards,
- --
Grega Bremec
gregab at p0f dot net


Re: [PERFORM] which dual-CPU hardware/OS is fastest for PostgreSQL?

2005-01-11 Thread Grega Bremec
...and on Mon, Jan 10, 2005 at 08:31:22PM -0800, Joshua D. Drake used the 
keyboard:
> 
> >
> >RAID controllers tend to use i960 or StrongARM CPUs that run at speeds
> >that _aren't_ all that impressive.  With software RAID, you can take
> >advantage of the _enormous_ increases in the speed of the main CPU.
> >
> >I don't know so much about FreeBSD's handling of this, but on Linux,
> >there's pretty strong indication that _SOFTWARE_ RAID is faster than
> >hardware RAID.
> > 
> >
> Unless something has changed though, you can't run raid 10
> with linux software raid and raid 5 sucks for heavy writes.
> 
> J

Hello, Joshua.

Things have changed. :)

From 2.6.10's drivers/md/Kconfig:

config MD_RAID10
tristate "RAID-10 (mirrored striping) mode (EXPERIMENTAL)"
depends on BLK_DEV_MD && EXPERIMENTAL
---help---
  RAID-10 provides a combination of striping (RAID-0) and
  mirroring (RAID-1) with easier configuration and more flexable
  layout.
  Unlike RAID-0, but like RAID-1, RAID-10 requires all devices to
  be the same size (or atleast, only as much as the smallest device
  will be used).
  RAID-10 provides a variety of layouts that provide different levels
  of redundancy and performance.

  RAID-10 requires mdadm-1.7.0 or later, available at:

  ftp://ftp.kernel.org/pub/linux/utils/raid/mdadm/
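Creating such an array is then a one-liner with a recent mdadm; the device
names below are illustrative:

```shell
# Four-disk RAID-10 array; needs mdadm 1.7.0+ and the MD_RAID10 module.
mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[abcd]1
```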

There is a problem, however, that may render software RAID non-viable.
According to one of my benchmarks, it accounts for up to a 10% increase in
system time consumed under full load, so if the original poster's
application is going to be CPU-bound - which might be the case, as he is
looking for a machine that's strong on the CPU side - that may be the "too
much" bit.

Of course, if Opteron is being chosen for the increase in the amount of
memory it can address, this is not the issue.

HTH,
-- 
Grega Bremec
gregab at p0f dot net




Re: [PERFORM] Hardware purchase question

2005-01-03 Thread Grega Bremec
...and on Mon, Jan 03, 2005 at 03:44:44PM -0500, Mitch Pirtle used the keyboard:
>
> You are right, I now remember that setup was originally called "RAID
> 10 plus 1", and I believe is was an incorrect statement from an
> overzealous salesman ;-)
>

Just an afterthought - that could well be the unfortunate consequence of
salesmen specializing in the act of selling rather than in the goods they
sell. He or she may have been referring to the specifics of the concrete
configuration they were selling you (or trying to sell you), which, in the
case you mention, should probably be called "a RAID10 array with a hot
spare drive" - that is, one preconfigured so that upon the failure of an
array member, it detects the failed drive and automatically replaces it
with one that has been sitting there all along, doing nothing but waiting
for one of its active companions to fail.

But this already falls into the category that has, so far, probably
caused the vast majority of misunderstandings, failed investments and
grey hair in RAID, namely data safety, and I don't feel particularly
qualified for getting into specifics of this at this moment, as it
happens to be 2AM, I had a couple of beers (my friend's birthday's due)
and I'm dying to get some sleep. :)

HTH, cheers,
-- 
Grega Bremec
gregab at p0f dot net




Re: [PERFORM] PG Logging is Slow

2004-12-19 Thread Grega Bremec
...and on Mon, Dec 20, 2004 at 03:17:11PM +1100, Theo Galanakis used the 
keyboard:
> Under postgres 7.3 logging is incredibly slow!
> 
> I have applied the following settings:
> 
> syslog = 2
> syslog_facility = 'LOCAL0'
> syslog_ident = 'postgres'
>  
>  log_connections =  true 
> log_duration =  true 
> log_pid =  true 
> log_statement =  true 
> log_timestamp =  true 
>  
> This severely impacted the performance of our production system, a search
> page which took 1-3 seconds now takes over 30, is this normal?
>  
> I need to get some performance indicators from our production db, however I
> cant turn on logging with such performance degradation.
>  

Hi Theo,

One thing you should be sure about is that whichever logfile you have
configured for the local0 facility is being written to asynchronously.
Synchronous logging is REALLY expensive.

If you're using the standard syslogd, you can achieve that by prefixing
the filename in syslogd.conf with a dash. For example,

local0.*        /var/log/postgresql.log

would become

local0.*        -/var/log/postgresql.log

One other option would be to turn off syslog logging completely and let
postmaster take care of the log on its own, which may or may not be
possible for you, depending on the policy in effect (remote logging, etc.).

Hope this helped,
-- 
Grega Bremec
gregab at p0f dot net




Re: [PERFORM] \d output to a file

2004-12-15 Thread Grega Bremec
...and on Wed, Dec 15, 2004 at 06:38:22AM -0800, sarlav kumar used the keyboard:
> Hi All,
>  
> I would like to write the output of the \d command on all tables in a 
> database to an output file. There are more than 200 tables in the database. I 
> am aware of \o command to write the output to a file. But, it will be tough 
> to do the \d for each table manually and write the output to a file. Is there 
> a command/ way in which I can achieve this without having to do it for each 
> table?
> Any help in this regard would be really appreciated.
>  

Hello Sarlav.

You don't say which platform you're doing this on. If it's Windows, someone
else will have to advise you; if it's a UNIX-like platform though, the
following simple shell script should be helpful in achieving what you want:

---CUT-HERE---
#!/bin/bash
# List every table in the given database, then feed a "\d <table>"
# command for each one back into psql.
if [ -z "$1" ]; then
    echo "Please specify a database to query."
    exit 1
fi
DATABASE=$1

# \t = tuples only, \a = unaligned output, \dt = list tables;
# the table name is the second '|'-separated field.
MYTABLES="`echo '\t\a\dt' | psql -q ${DATABASE} | cut -f 2 -d '|'`"

for table in ${MYTABLES}; do
    echo '\d '${table}
done | psql ${DATABASE}
---CUT-HERE---

You can store this script into a file called, for example, describe.sh and
invoke it like so:

$ ./describe.sh mydatabase > description.txt

It should then do what you want.

Should you have additional arguments to specify to psql, such as a host,
a username, a password and so on, it is easy to modify the script to do
that. Just supply those arguments in places where the "psql" command is
used.

Hope this helped,
-- 
Grega Bremec
gregab at p0f dot net




Re: [PERFORM] high load caused by I/O - a hint

2004-08-18 Thread Grega Bremec
...and on Wed, Aug 18, 2004 at 10:18:19AM +0200, eleven used the keyboard:
> Hello,
> 
> This is not strictly PostgreSQL performance hint, but may be
> helpful to someone with problems like mine.
> 
> As I earlier posted, I was experiencing very high load average
> on one of my Linux database servers (IBM eServer 345, SCSI disks on LSI 
> Logic controller) caused by I/O bottleneck.
> 
> INSERTs were really slow, even after many days of
> tweaking PostgreSQL configuration. The problem appeared to be
> in the Linux kernel itself - using acpi=ht and noapic boot parameters
> solved my performance problems. Load average dropped below 1.0
> (before, it was as high as ten in peak) and the database
> works much, much faster.

Hello,

Did you try with acpi=noidle? This proved to be of help on many an
occasion before, and you don't have to give up any functionality over
it. It's just that the ACPI BIOS is broken and overloads the system
with idle calls.

Other than that, general guidelines would be, don't combine APM and
ACPI, and rather use proper SMP code for hyperthreaded machines than
just the ACPI CPU enumeration feature.

There's also a new option with 2.6.8.1, called CONFIG_SCHED_SMT that
is supposed to handle some cases SMP code had problems with better,
at the cost of slight overhead in other areas.

My advice would be, if you have an option to choose between APM and
ACPI, go for ACPI. It's the future, it's being developed actively,
it does a whole lot more than APM (that was really only about power
management), and last but not least, I've been using it for four
years on over fifty SMP machines and I never ever had a problem
beyond the scope of what noidle could fix (knocks-on-wood). :)

HTH,
-- 
Grega Bremec
Senior Administrator
Noviforum Ltd., Software & Media
http://www.noviforum.si/




Re: [PERFORM] NAS, SAN or any alternate solution ?

2004-07-20 Thread Grega Bremec
> 
> Oh, and not to forget - the price for a 3ware 9500S-12, the version
> we're testing ranges between EUR1000 and EUR1500, depending on the
> contract you have with the reseller and the intended use of the
> device. SATA disks are dirt-cheap nowadays, as has been mentioned
> before.
> 

Correction, EUR500 and EUR1000, VAT not included. :)

Sorry for the mix-up.
-- 
Grega Bremec
Senior Administrator
Noviforum Ltd., Software & Media
http://www.noviforum.si/




Re: [PERFORM] NAS, SAN or any alternate solution ?

2004-07-20 Thread Grega Bremec
...and on Tue, Jul 20, 2004 at 09:52:56AM +0200, [EMAIL PROTECTED] used the keyboard:
> Hi all,
> 
> I've been searching the list for a while but couldn't find any up-to-date 
> information relating to my problem.
> We have a production server with postgresql on cygwin that currently deals 
> with about 200 Gigs of data (1 big IDE drive). We plan to move to linux 
> for some reasons I don't have to explain.
> Our aim is also to be able to increase our storage capacity up to 
> approximately 1 or 2 terabytes and to speed up our production process. As 
> we are a small "microsoft addicted" company , we have some difficulties to 
> choose the best configuration that would best meet our needs.
> Our production process is based on transaction (mostly huge inserts) and 
> disk access is the main bottleneck.
> 
> Our main concern is hardware related :
> 
> Would NAS or SAN be good solutions ? (I've read that NAS uses NFS which 
> could slow down the transfer rate ??)
> Has anyone ever tried one of these with postgresql ? 
> 
> I would appreciate any comments.
> Thanks in advance.

Hello Simon,

We're testing the 3ware Escalade 9000, a hardware-RAID SATA controller
with VERY good support for Linux (including direct access for S.M.A.R.T.
applications, which is a serious problem with other RAID controllers). It
features RAID levels 0, 1, 10, 5 and JBOD, up to 12 SATA channels (that's
the 3ware Escalade 9500S-12; they also come in 4- and 8-channel versions,
and up to four cards can be fitted into a system), up to 1GB of
battery-backed ECC RAM (128MB out of the box) and, most of all, excellent
tuning guides that actually manage to exceed the scope of merely making
you produce good benchmark results for that controller in a specific test
environment.

Our preliminary tests show that a setup of four 250GB SATA Maxtors, which
don't really qualify as fast drives, can deliver in RAID5 block writes of
50MB/s, rewrites at about 35MB/s and reads of approximately 180MB/s,
which is roughly 2.5 times the performance of previous Escalades.
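Figures like those can be roughly reproduced with a crude dd probe. This is
a sketch, not a benchmark: the path is made up, the test file must exceed
RAM to defeat caching, and conv=fdatasync assumes a reasonably recent GNU dd.

```shell
dd if=/dev/zero of=/mnt/array/testfile bs=1M count=4096 conv=fdatasync  # block writes
dd if=/mnt/array/testfile of=/dev/null bs=1M                            # block reads
```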

You can find more info on Escalade 9000 series, benchmarks and
other stuff here:

http://www.3ware.com/products/serial_ata9000.asp
http://www.3ware.com/products/benchmarks_sata.asp

http://www.3ware.dk/fileadmin/3ware/documents/Benchmarks/Linux_kernel_2.6_Benchmarking.pdf

Oh, and not to forget - the price for a 3ware 9500S-12, the version
we're testing ranges between EUR1000 and EUR1500, depending on the
contract you have with the reseller and the intended use of the
device. SATA disks are dirt-cheap nowadays, as has been mentioned
before.

I do agree, though, that the cache-usage setting those drives report may or
may not be truthful - but one never knows that for sure with SCSI drives
either. At least you can assert that with proper controller cache sizing,
even the largest amount of data that could fit into the hard disk caches of
the entire array (96MB, with drives that usually feature 8MB (!!!) of cache
each) will still be available in the controller cache after a power
failure, for it to be re-checked and ensured it is properly written.

Hope this helps,
-- 
Grega Bremec
Senior Administrator
Noviforum Ltd., Software & Media
http://www.noviforum.si/




Re: [PERFORM] Quad processor options

2004-05-11 Thread Grega Bremec
...and on Tue, May 11, 2004 at 03:02:24PM -0600, scott.marlowe used the keyboard:
> 
> If you get the LSI megaraid, make sure you're running the latest megaraid 
> 2 driver, not the older, slower 1.18 series.  If you are running linux, 
> look for the dkms packaged version.  dkms, (Dynamic Kernel Module System) 
> automagically compiles and installs source rpms for drivers when you 
> install them, and configures the machine to use them to boot up.  Most 
> drivers seem to be slowly headed that way in the linux universe, and I 
> really like the simplicity and power of dkms.
> 

Hi,

Given that the LSI MegaRAID seems to be a popular solution around here,
and many of you folks use Linux as well, I thought sharing this piece
of info might be of use.

Running the v2 megaraid driver on a 2.4 kernel is actually not a good idea
_at_ _all_, as it will silently corrupt your data in the event of a disk
failure.

Sorry to have to say so, but we tested it (on kernels up to 2.4.25, not sure
about 2.4.26 yet) and it turns out it doesn't handle hotswap the way it
should.

Somehow the replaced disk drives are not _really_ added to the array, which
continues to work in degraded mode for a while and then (even worse) starts
to think the replaced disk is in order without actually having resynced it,
thus beginning to issue writes to nonexistent areas of it.

The 2.6 megaraid driver indeed seems to be a merged version of the above
driver and the old one, giving both improved performance and correct
functionality in the event of a hotswap taking place.

Hope this helped,
-- 
Grega Bremec
Senior Administrator
Noviforum Ltd., Software & Media
http://www.noviforum.si/



Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks

2004-05-06 Thread Grega Bremec
...and on Thu, Apr 22, 2004 at 06:59:10AM -0700, Eduardo Almeida used the keyboard:


>
> To reference, Sun has java 64bits just to IA64 and
> Solaris Sparc 64 not to Opteron.
> 

As I mentioned, that is true for the 1.4.x release of the JVMs. We have been
testing some JCA builds of 1.5.0 on x86_64 so far, but it is too unstable for
any kind of serious work.

Cheers,
-- 
Grega Bremec
Senior Administrator
Noviforum Ltd., Software & Media
http://www.noviforum.si/



Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-22 Thread Grega Bremec
...and on Thu, Apr 22, 2004 at 05:53:18AM -0700, Eduardo Almeida used the keyboard:
> 
> - The configuration of the machine is:
> Dual opteron 64 bits model 240
> 4GB RAM
> 960 GB on RAID 0
> Mandrake Linux 64 with Kernel 2.6.5  (I compiled a
> kernel for this test)
> Java SDK  java version "1.4.2_04"
> PostgreSQL JDBC pg74.1jdbc3.jar
> 
> - The TPC-H configuration is:
> TPC-H 2.0.0
> 100GB
> load using flat files
> Refresh functions using java
> 

I'll just add, for reference, for those who aren't aware of it: the Java
virtual machine for x86_64 only exists in the 1.5 branch so far, and it's
so utterly unstable that almost any notable shuffling around in memory
crashes it. :)

Hence the 1.4.2_04 is a 32-bit application running in 32-bit mode.

I won't get into how much this affects the benchmarks, as I haven't really
looked into how CPU- and memory-intensive the refresh functions are in
these tests, so as I said - let's keep it as a reference.

Cheers,
-- 
Grega Bremec
Senior Administrator
Noviforum Ltd., Software & Media
http://www.noviforum.si/



Re: [PERFORM] [ADMIN] Raw devices vs. Filesystems

2004-04-07 Thread Grega Bremec
...and on Wed, Apr 07, 2004 at 09:09:16AM -0700, Josh Berkus used the keyboard:
> 
> Does it work, though?   Without Oracle admin tools?

Hello, Josh. :)

Well, as I said, that's why I was asking - I'm willing to give it a go
if nobody can prove me wrong. :)

> > Now, if both goals can be achieved in one go, hell, I'm willing to try
> > it out myself in an attempt to extract off of it, some performance
> > indicators that could be compared to other database performance tests
> > sent to both this and the PERFORM mailing list.
> 
> Hey, any test you wanna run is fine with us.I'm pretty sure that OCFS 
> belongs to Oracle, though, patent & copyright, so we couldn't actually use it 
> in practice.

I thought you knew - OCFS, OCFS-Tools and OCFSv2 have not only been
open-source for quite a while now - they're released under the GPL.

http://oss.oracle.com/projects/ocfs/
http://oss.oracle.com/projects/ocfs-tools/
http://oss.oracle.com/projects/ocfs2/

I don't know what that means to you (probably nothing good, as PostgreSQL
is released under the BSD license), but it most definitely can be considered
a good thing for the end user, as she can download it, compile, and set it
up on her disks, without the need to pay Oracle royalties. :)

> If your intention in this test is to show the superiority of raw devices, let 
> me give you a reality check: barring some major corporate backing getting 
> involved, we can't possibly implement our own PG-FS for database support.  We 
> already have a TODO list which is far too long for our developer pool, and 
> implementing a custom FS either takes a large team (OCFS) or several years of 
> development (Reiser).   

Not really - I was just thinking about something not-entirely-a-filesystem
and POK!, OCFS sprang to mind. It omits many POSIX features that slow down
a traditional filesystem, yet it does know the concept of inodes and most
of all, it's _really_ heavy on caching. As such, it sounded quite promising
to me, but trial, I think, is the best test.

The question does spring up though, that Steve raised in another post - just
for the record, what POSIX semantics can a postmaster live without in a
filesystem?

Cheers,
-- 
Grega Bremec
Senior Administrator
Noviforum Ltd., Software & Media
http://www.noviforum.si/



Re: [PERFORM] [ADMIN] Raw devices vs. Filesystems

2004-04-07 Thread Grega Bremec
...and on Wed, Apr 07, 2004 at 01:26:02AM -0400, Tom Lane used the keyboard:
> 
> After that, we get to implement our own filesystem-equivalent management
> of disk space allocation, disk I/O scheduling, etc.  Are we really
> smarter than all those kernel hackers doing this for a living?  I doubt it.
> 
> After that, we get to re-optimize all the existing Postgres behaviors
> that are designed to sit on top of a standard Unix buffering filesystem
> layer.
> 
> After that, we might reap some performance benefits.  Or maybe not.
> There's not a heck of a lot of hard evidence that we would --- and
> what there is traces to twenty-year-old assumptions about disk drive
> and OS behavior, which are quite unlikely to still apply today.
> 
> Personally, I have a lot of more-promising projects to pursue...
> 

Has anyone tried PostgreSQL on top of OCFS? Personally, I'm not sure it
would even work, as Oracle clearly state that OCFS was _never_ meant to
be a fully fledged UNIX filesystem with POSIX features such as correct
timestamp updates, inode changes, etc., but OCFSv2 brings some features
that might lead one into thinking they're about to make it suitable for
uses beyond that of just having Oracle databases sitting on top of it.

Furthermore, this filesystem would be a blazing one stop solution for
all replication issues PostgreSQL currently suffers from, as its main
design goal was to present "a consistent file system image across the
servers in a cluster".

Now, if both goals can be achieved in one go, hell, I'm willing to try
it out myself in an attempt to extract from it some performance
indicators that could be compared to other database performance tests
sent to both this and the PERFORM mailing list.

So, anyone? :)

Cheers,
-- 
Grega Bremec
Senior Administrator
Noviforum Ltd., Software & Media
http://www.noviforum.si/



Re: [PERFORM] Linux filesystem shootout

2003-10-09 Thread Grega Bremec
I should at least read the URLs before re-posting info.

My bad, I'm utterly sorry about this... :-(

Cheers,
-- 
Grega Bremec
Sistemska administracija in podpora
grega.bremec-at-noviforum.si
http://najdi.si/
http://www.noviforum.si/



Re: [PERFORM] Linux filesystem shootout

2003-10-09 Thread Grega Bremec
...and on Thu, Oct 09, 2003 at 04:42:53PM +0530, Shridhar Daithankar used the keyboard:
>
> http://www.ussg.iu.edu/hypermail/linux/kernel/0310.1/0208.html
> 
> Shridhar

My $0.02:

I just stumbled across an interesting filesystem comparison table today,
comparing ext2/ext3/reiser/reiser4/jfs/xfs on a single UP P2/450 machine
with an old UDMA2 Seagate.

Now however archaic this box may have been, I think that the tests still
bear some objectivity, as it's a comparison test and not some "how much
can we squeeze out of xyz" type of bragging.

The tests were done using bonnie++ and IOZone and are essentially just a
couple of tables listing the average results achieved by each of those
tests.

Also, ext3, reiser and reiser4 were tested in a couple of different
configurations (reiser4 extents, reiser notail, ext3 journal, ordered and
writeback mode).

Oh, I shouldn't forget - the address is http://fsbench.netnation.com/ :)

Cheers,
-- 
Grega Bremec
Sistemska administracija in podpora
grega.bremec-at-noviforum.si
http://najdi.si/
http://www.noviforum.si/



Re: [PERFORM] PostgreSQL vs. MySQL

2003-07-05 Thread Grega Bremec
...and on Sat, Jul 05, 2003 at 12:24:18AM +0200, Bjoern Metzdorf used the keyboard:
> >> Afaik, your original posting said postgresql was 3 times slower than
> >> mysql and that you are going to leave this list now. This implied
> >> that you have made your decision between postgresql and mysql,
> >> taking mysql because it is faster.
> >
> > Well, that shows what you get for making implications.  The client is
> > sticking with postgres and we are coding around the issue in other
> > ways.
> 
> As many other guys here pointed out, there are numerous ways to tune
> postgresql for maximum performance. If you are willing to share more
> information about your particular project, we might be able to help you out
> and optimize your application, without the need to code around the issue as
> much as you may be doing right now.
> Even if it is not possible for you to share enough information, there are a
> lot of places where you can read about performance tuning (if not in the
> docs then in the archives).
> 

Also, I should think the clients would not be too offended if Brian posted
some hint about the actual quantity of data involved here, both the total
expected database size and some info about the estimated "working set" size,
such as a sum of sizes of tables most commonly used in JOIN queries and the
percentage of data being shuffled around in those. Are indexes big? Are
there any multicolumn indexes in use? Lots of sorting expected? Lots of
UPDATEs/INSERTs/DELETEs?

Also, it would be helpful to know just how normalized the database is, to
provide some advice about possible query optimization, which could again
prove helpful in speeding the machinery up.

Another useful piece of information would be the amount of memory consumed
by other applications vs. the amount of memory reserved by the OS for cache,
and the nature of those other applications running - are they big cache
consumers, such as Apache with static content and a large load would be,
or do they keep a low profile?

I think this would, in combination with the information already posted, such
as the amount of memory and I/O subsystem info, at least enable us to advise
about the recommended shared_buffers, effective_cache_size, sort_mem,
vacuum_mem, and others, without compromising the intellectual property of
Brian's clients.
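Purely for illustration, advice of that sort might end up looking something
like the sketch below for a hypothetical box with 2GB of RAM running a
7.x-era postmaster. The values are placeholders, not recommendations - the
right numbers depend entirely on the information asked for above - and note
the units: shared_buffers and effective_cache_size are in 8kB pages,
sort_mem and vacuum_mem in kB.

```
# postgresql.conf sketch - placeholder values for a hypothetical 2GB box
shared_buffers = 16384          # 128MB of shared buffers (8kB pages)
effective_cache_size = 131072   # ~1GB expected in the OS cache (8kB pages)
sort_mem = 8192                 # 8MB per sort operation (kB)
vacuum_mem = 65536              # 64MB for VACUUM to work with (kB)
```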

> > over and out.

I CC'd this post over to you, Brian, 'cause this signoff made me rather
unsure as to whether or not you're still on the list. Hope you don't mind.

Sincerely,
-- 
Grega Bremec
System Administration & Development Support
grega.bremec-at-noviforum.si
http://najdi.si/
http://www.noviforum.si/

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])