Re: [PERFORM] What's wrong with JFS configuration?

2007-04-26 Thread Cosimo Streppone

Jim Nasby wrote:


On Apr 25, 2007, at 8:51 AM, Paweł Gruszczyński wrote:
where u6 stores Fedora Core 6 operating system, and u0 stores 3 
partitions with ext2, ext3 and jfs filesystem.


Keep in mind that drives have a faster data transfer rate at the 
outer-edge than they do at the inner edge [...]


I've been wondering from time to time whether partition position
can be a (probably modest, of course) performance factor.

If I create a partition at the beginning or the end of the disk,
will it map to a predictable physical position on the platters?

I remember having heard that every manufacturer has its own
allocation logic.

Has anyone got some information, just out of curiosity?

--
Cosimo


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] PostgreSQL in virtual machine

2007-03-13 Thread Cosimo Streppone

Andreas Tille wrote:


Are there any experiences
with reasonable performance-improvement strategies? Are there any
special things to consider in a VM?


Not directly about PostgreSQL, but I'm seeing evidence that upgrading
from VMware 2.5.3 to 3.0.1 seems to have solved our disk access
performance issues (measured with simple dd runs).

With VMware 2.5.3 + RedHat Enterprise 4.0, I measured sequential
read performance of less than 10 MB/sec on 1-2 GB files on an
IBM FastT600 SAN volume partition.
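A sketch of the kind of dd run meant here (file path and size are
placeholders):

  # time a sequential read of a ~2 GB file on the SAN volume
  time dd if=/san/volume/testfile of=/dev/null bs=1M count=2048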

After the upgrade to 3.0, I had feedback from the sysadmins that the
issue was solved, but I haven't had the opportunity to repeat the read
tests yet.

--
Cosimo


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] int4 vs varchar to store ip addr

2007-01-29 Thread Cosimo Streppone

Nicolas wrote:

I have an authorization table that associates 1 customer IP to a service 
IP to determine a TTL (used by a radius server).


table auth
 client varchar(15);
 service varchar(15);
 ttl int4;
client and service are both IP addresses.

The number of distinct clients can be rather large (say around 4 
million) and the number of distinct services around 1000.


there's a composite index on (client, service).


Another solution comes to mind... I don't know if it is better or worse,
but you could give it a try.
Store IP addresses as 4 distinct columns, like the following:

CREATE TABLE auth (
client_ip1 smallint,
client_ip2 smallint,
client_ip3 smallint,
client_ip4 smallint,
service varchar(15),
ttl int4
);

And then index by client_ip4/3/2/1, then service.

CREATE INDEX auth_i1 ON auth
(client_ip4, client_ip3, client_ip2, client_ip1);

or:

CREATE INDEX auth_i1 ON auth
(client_ip4, client_ip3, client_ip2, client_ip1, service);

I'm curious to hear from pg internals experts whether this could be a
valid idea or is total nonsense.

Or probably the builtin inet type is better suited for these tasks?
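For reference, a minimal sketch of the inet-based variant (untested,
index name is made up):

  CREATE TABLE auth (
  client  inet,
  service inet,
  ttl int4
  );
  CREATE INDEX auth_client_service_idx ON auth (client, service);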

--
Cosimo

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-16 Thread Cosimo Streppone

Alexander Staubo wrote:


On Dec 15, 2006, at 17:53 , Ron wrote:


At 09:50 AM 12/15/2006, Greg Smith wrote:


On Fri, 15 Dec 2006, Merlin Moncure wrote:

The slowdown is probably due to the loop-unrolling switch, which can
actually hurt code due to the larger footprint (less cache coherency).


The cache issues are so important with current processors that I'd
suggest throwing -Os (optimize for size) into the mix of options people test.


So far I have been compiling PostgreSQL and running my pgbench script
manually, but this makes me want to modify my script to run pgbench
automatically using all possible permutations of a set of compiler flags.


I don't know if it's practical, but this link comes to mind:

http://clusty.com/search?query=acovea

--
Cosimo

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-13 Thread Cosimo Streppone

Michael Stone wrote:


On Tue, Dec 12, 2006 at 01:42:06PM +0100, Cosimo Streppone wrote:

-O0 ~ 957 tps
-O1 -mcpu=pentium4 -mtune=pentium4 ~ 1186 tps
-O2 -mcpu=pentium4 -mtune=pentium4 ~ 1229 tps
-O3 -mcpu=pentium4 -mtune=pentium4 ~ 1257 tps
-O6 -mcpu=pentium4 -mtune=pentium4 ~ 1254 tps

I'm curious now to get the same tests run with
a custom-cflags-compiled glibc.


I'd be curious to see -O2 with and without the arch-specific flags, 
since that's mostly what the discussion is about.


I ran the same tests only for:

1) '-O2'
2) '-O2 -march=pentium4 -mtune=pentium4 -mcpu=pentium4'
   (so no more doubts here, and thanks for the gcc hints :-)

and I obtained, respectively, an average of 1238 tps (plain -O2)
vs. 1229 tps over 9 runs.
The disk subsystem is a standard desktop SATA drive, no more than that.

I also tried recompiling *only* pgbench with various options, but as
I expected (and hoped) nothing changed.

Interesting, eh?

--
Cosimo


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-12 Thread Cosimo Streppone

Luke Lonergan wrote:


Can you try this with just -O3 versus -O2?


Thanks to Daniel for doing these tests.
I happened to do the same tests about 3-4 years ago,
and concluded that gcc flags did *not* influence performance.

Moved by curiosity, I have now redone those tests on a test
machine (single P4 @ 3.2 GHz, with 2 MB cache and 512 MB RAM).

Here are the results:

http://www.streppone.it/cosimo/work/pg/gcc.png

In short: tests executed with postgresql 8.2.0,
gcc version 3.4.3 20041212 (Red Hat 3.4.3-9.EL4),
tps figures computed as average of 9 pgbench runs (don't ask why 9... :-),
with exactly the same commands given by Daniel:

-O0 ~ 957 tps
-O1 -mcpu=pentium4 -mtune=pentium4 ~ 1186 tps
-O2 -mcpu=pentium4 -mtune=pentium4 ~ 1229 tps
-O3 -mcpu=pentium4 -mtune=pentium4 ~ 1257 tps
-O6 -mcpu=pentium4 -mtune=pentium4 ~ 1254 tps

I'm curious now to get the same tests run with
a custom-cflags-compiled glibc.

--
Cosimo

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] Looking for hw suggestions for high concurrency OLTP app

2006-12-11 Thread Cosimo Streppone

Hi all,

I'd like to get suggestions from all of you out there for
a new PostgreSQL server that will replace an existing one.

My performance analysis shows very *low* iowaits,
and very high loads at times of peak system activity.
The average number of concurrent processes is 3-4, with peaks of 10-15.
*Sustained* system load varies from 1.5 to 4, while peak load
reaches 20 and above, always with low iowait%.
I read this as a clear sign that more processor power is needed.

I'm aware of the context-switching storm problem, but the cs rate
stays well under 50,000, so I don't think that's the issue here.

Current machine is an Acer Altos R700 (2 Xeon 2.8 Ghz, 4 Gb RAM),
similar to this one (R710):
http://www.acer.co.uk/acereuro/page9.do?sp=page4dau34.oid=7036UserCtxParam=0GroupCtxParam=0dctx1=17CountryISOCtxParam=UKLanguageISOCtxParam=enctx3=-1ctx4=United+Kingdomcrc=334044639

So, I'm looking for advice on a mid-range OLTP server with
the following (flexible) requirements:

- 4 physical processors expandable to 8 (dual core preferred),
  either Intel or AMD
- 8 Gb RAM exp. to at least 32 Gb
- Compatibility with the Acer S300 external storage enclosure.
  The old server uses that, and it is boxed with 15k rpm hdds
  in RAID-10, which do their job perfectly well.
  The enclosure is connected via 2 x LSI Logic PCI U320 controllers
- On-site, same-day hardware support contract
- Rack mount

Machines we're currently evaluating include:
- Acer Altos R910
- Sun Fire V40Z
- HP Integrity RX4640
- IBM eServer 460

Any experience with these machines?
Other suggestions?

Thanks.

--
Cosimo


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Context switch storm

2006-11-14 Thread Cosimo Streppone

Richard Huxton wrote:

Cosimo Streppone wrote:

Richard Huxton wrote:


The average context switching for this server as vmstat shows is 1
but when the problem occurs it goes to 25.



I seem to have the same exact behaviour for an OLTP-loaded 8.0.1 server


upgrade from 8.0.1 - the most recent is 8.0.9 iirc
[...]
Are you seeing a jump in context-switching in top? You'll know when you 
do - it's a *large* jump. That's the key diagnosis. Otherwise it might 
simply be your configuration settings aren't ideal for that workload.




Sorry for the delay.

I have logged vmstat results for the last 3 days.
Max context switches figure is 20500.

If I understand correctly, this does not mean a storm,
but only that the 2 Xeons are overloaded.
Probably I'd gain something by switching off HyperThreading.
I get something like 12-15 *real* concurrent processes hitting
the server.

I must say I lowered shared_buffers to 8192, as it was before.
I tried raising it to 16384, but I can't seem to find a relationship
between shared_buffers and performance level for this server.

Well, the client I saw it with just bought a dual-opteron server and 
used their quad-Xeon for something else. However, I do remember that 8.1 
seemed better than 7.4 before they switched. Part of that might just 
have been better query-planning and other efficiencies though.


An upgrade to 8.1 is definitely the way to go.
Any 8.0 -> 8.1 migration advice?

Thanks.

--
Cosimo


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Context switch storm

2006-11-14 Thread Cosimo Streppone

Merlin wrote:


On 11/14/06, Jim C. Nasby [EMAIL PROTECTED] wrote:


On Tue, Nov 14, 2006 at 09:17:08AM -0500, Merlin Moncure wrote:
 On 11/14/06, Cosimo Streppone [EMAIL PROTECTED] wrote:
 I must say I lowered shared_buffers to 8192, as it was before.
 I tried raising it to 16384, but I can't seem to find a relationship
 between shared_buffers and performance level for this server.

 My findings are pretty much the same here.
 [...]

BTW, shared_buffers of 16384 is pretty low by today's standards


Can you think of a good way to construct a test case that would
demonstrate the difference?


Not sure of its actual relevance, but some time ago I performed
(with 8.0) several pgbench tests with 1, 5, 10, 20 concurrent
clients, with the same pg configuration except for one parameter
changed on every run.

In one of these tests I ran pgbench with shared_buffers starting
at 1024 and doubling it (2048, 4096, ...) up to 16384.
I found the best performance in terms of transactions per second
around 4096-8192.
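A sketch of how such a sweep could be scripted (paths, database name
and pgbench workload are illustrative, not the actual commands from
those tests):

  # restart the postmaster with a different shared_buffers value
  # each time, then measure tps with pgbench
  for sb in 1024 2048 4096 8192 16384; do
      pg_ctl restart -w -D "$PGDATA" -o "-B $sb"
      pgbench -c 10 -t 1000 bench
  done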

That said, I don't know if pgbench stresses the database
like my typical oltp application does.

Also, I suspect that shared_buffers should not be evaluated as an
absolute number, but rather as a number relative to total main memory
(say 1/2 of total RAM, 1/3, 2/3, ...).

--
Cosimo


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Easy read-heavy benchmark kicking around?

2006-11-08 Thread Cosimo Streppone

Merlin Moncure wrote:


On 11/8/06, Markus Schaber [EMAIL PROTECTED] wrote:

Hi, Brian,

Brian Hurt wrote:

 So the question is: is there an easy to install and run, read-heavy
 benchmark out there that I can wave at them to get them to fix the
 problem?

For sequential read performance, use dd. Most variants of dd I've seen
output some timing information, and if not, do a
`time dd if=/your/device of=/dev/null bs=1M' on the partition.


We had a similar problem with a Hitachi SAN, the AMS200. Their
performance group refused to admit that a 50 MB/sec dd test was
a valid performance benchmark and needed to be addressed.


 [...]


oh, the unit also lost a controller after about a week of
operation...the unit being a HITACHI SAN, the AMS200.

any questions?


Yes, one.
What was that unit?

;-)

--
Cosimo

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Context switch storm

2006-11-06 Thread Cosimo Streppone

Andreas Kostyrka wrote:


The solution for us has been twofold:

upgrade to the newest PG version available at the time while we waited
for our new Opteron-based DB hardware to arrive.


Do you remember the exact Pg version?

--
Cosimo


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Context switch storm

2006-11-03 Thread Cosimo Streppone

Richard Huxton wrote:


[EMAIL PROTECTED] wrote:

Hi,

We've migrated one of our servers from pg 7.4 to 8.1, and from time
to time (4 hours) the server starts doing a lot of context switching
and all transactions become very slow.

The average context switching for this server as vmstat shows is 1
but when the problem occurs it goes to 25.


You'll tend to see it when you have multiple clients and most queries 
can use RAM rather than disk I/O. My understanding of what happens is 
that PG requests data from RAM - it's not in cache so the process gets 
suspended to wait. The next process does the same, with the same result. 
  You end up with lots of processes all fighting over what data is in 
the cache and no-one gets much work done.


Does this also happen with 8.0, or is it specific to 8.1?
I seem to have the same exact behaviour on an OLTP-loaded 8.0.1 server
when I raise `shared_buffers' from 8192 to 4.
I would expect an increase in tps with more concurrent clients, but instead
average performance holds only below a certain threshold of users, and when
concurrent users get above that level, performance starts to drop, no
matter what I do.

Server logs and io/vm statistics seem to indicate that there is little
or no disk activity, but machine load increases to 7.0-8.0.
After a few minutes, the problem goes away, and performance returns
to acceptable levels.

When the load increases, *random* database queries show this slowness,
even if they are perfectly planned and indexed.

Is there anything we can do?

--
Cosimo


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Update on high concurrency OLTP application and Postgres

2006-09-23 Thread Cosimo Streppone

Christian Storm wrote:


At the moment, my rule of thumb is to check out the ANALYZE VERBOSE
messages to see if all table pages are being scanned.

  INFO: mytable: scanned xxx of yyy pages, containing ...

If xxx = yyy, then I keep statistics at the current level.
When xxx is way less than yyy, I increase the numbers a bit
and retry.

It's probably primitive, but it seems to work well.



What heuristic do you use to up the statistics for such a table?


No heuristics, just try and see.
For tables of ~10k pages, I set statistics to 100-200.
For ~100k pages, I set them to 500 or more.
I don't know the exact relation.


Once you've changed it, what metric do you use to
see if it helps or was effective?

I re-run an ANALYZE and look at the results... :-)
If you mean checking its usefulness, I can see that only
under heavy load, when particular db queries run in the order
of a few milliseconds.

If I see normal queries taking longer and longer, or even
appearing in the server's log (> 500 ms), then I know an
ANALYZE is needed, or statistics should be set higher.
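In practice, the knobs involved are these (table and column names are
hypothetical):

  -- raise the per-column statistics target, then re-analyze
  ALTER TABLE mytable ALTER COLUMN rowid1 SET STATISTICS 200;
  ANALYZE VERBOSE mytable;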

--
Cosimo



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] Update on high concurrency OLTP application and Postgres 8 tuning

2006-09-20 Thread Cosimo Streppone

Hi all,

I was searching for tips to speed up / reduce load on a Pg 8 app.
Thank you for all your suggestions on the matter.
Thread is archived here:

http://www.mail-archive.com/pgsql-performance@postgresql.org/msg18342.html

After intensive application profiling and database workload analysis,
I managed to reduce CPU load with application-level changes.

For database overload in the presence of many concurrent
transactions, I found that just doing an ANALYZE on the most
heavily used relations improves the situation.

I scheduled a cron job that runs every hour or so, analyzing the
4-5 busiest relations and sleeping 30 seconds between each
ANALYZE.
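A sketch of how such a job could look (table names, database and
script path are made up):

  # crontab entry: hourly ANALYZE of the hottest relations
  0 * * * *  /usr/local/bin/analyze_hot_tables.sh

  # analyze_hot_tables.sh -- stagger the ANALYZEs to spread the I/O
  for t in orders order_rows customers stock; do
      psql -c "ANALYZE $t" mydb
      sleep 30
  done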

This has improved db response times when many clients run together.
I wanted to report this; maybe it can be helpful for others
out there... :-)

--
Cosimo


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Update on high concurrency OLTP application and Postgres

2006-09-20 Thread Cosimo Streppone

Andrew wrote:


On Wed, Sep 20, 2006 at 11:09:23AM +0200, Cosimo Streppone wrote:

I scheduled a cron job that runs every hour or so, analyzing the
4-5 busiest relations and sleeping 30 seconds between each
ANALYZE.


This suggests to me that your statistics need a lot of updating.


Agreed.


You _might_ find that setting the statistics to a higher number on some
columns of some of your tables will allow you to analyse less
frequently.


At the moment, my rule of thumb is to check out the ANALYZE VERBOSE
messages to see if all table pages are being scanned.

  INFO: mytable: scanned xxx of yyy pages, containing ...

If xxx = yyy, then I keep statistics at the current level.
When xxx is way less than yyy, I increase the numbers a bit
and retry.

It's probably primitive, but it seems to work well.

 [...] ANALYSE will impose an I/O load.

In my case, analyze execution doesn't impact performance
in any noticeable way. YMMV of course.

--
Cosimo


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] High concurrency OLTP database performance tuning

2006-08-31 Thread Cosimo Streppone

Good morning,

I'd like to ask you for some advice on pg tuning in a high
concurrency OLTP-like environment.
The application I'm talking about is running on Pg 8.0.1.
Under average user load, iostat and vmstat show that iowait stays
well under 1%. Table and index scan and seek times are also good.
I can be reasonably sure that disk I/O is not the *main* bottleneck
here.

These OLTP transactions are each composed of 50-1000+ small queries, on
single tables or 2-3 joined tables. Write operations are very frequent,
and done concurrently by many users on the same data.

Often there are also queries which involve record lookups like:

  SELECT DISTINCT rowid2 FROM table
   WHERE rowid1 IN (long_list_of_numerical_ids) OR
     refrowid1 IN (long_list_of_numerical_ids)

These tables are structured with rowid fields which link to
other external tables, and the links are fairly complex to follow.
SQL queries and indexes have been carefully(?) built and tested,
each with its own explain analyze.

The problem is that under peak load, when the number of concurrent
transactions rises, there is a noticeable performance degradation.
I'm looking for tuning ideas/tests. I plan to concentrate,
in priority order, on:

- postgresql.conf, especially:
 effective_cache_size (now 5000)
 bgwriter_delay (500)
 commit_delay/commit_siblings (default)
- start to use tablespaces for most intensive tables
- analyze the locks situation while queries run
- upgrade to 8.1.n
- convert db partition filesystem to ext2/xfs?
  (now ext3+noatime+data=writeback)
- ???

Server specs:
   2 x P4 Xeon 2.8 GHz
   4 GB RAM
   LSI Logic SCSI 2 x U320 controller
   6 disks in RAID 1 for os, /var, WAL
   14 disks in RAID 10 for db on FC-connected storage

Current config is now (the rest is like the default):
  max_connections = 100
  shared_buffers = 8192
  work_mem = 8192
  maintenance_work_mem = 262144
  max_fsm_pages = 20
  max_fsm_relations = 1000
  bgwriter_delay = 500
  fsync = false
  wal_buffers = 256
  checkpoint_segments = 32
  effective_cache_size = 5000
  random_page_cost = 2

Thanks for your ideas...

--
Cosimo


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Monitoring Postgresql performance

2005-09-29 Thread Cosimo Streppone

Arnau wrote:


Hi all,

  I have been googling a bit, searching for info about a way to monitor 
postgresql (CPU & memory, num processes, ...) and I haven't found 
anything relevant. I'm using munin to monitor other parameters of my 
servers and I'd like to include postgresql, or have a similar tool. Is any 
of you using anything like that? All kinds of hints are welcome :-)


Probably, as you said, this is not so relevant,
as it is something at an *early* stage of usability :-)
but have you looked at pgtop?

The basic requirement is that you enable the postmaster's stats
collector and query command string collection.
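In postgresql.conf terms (7.4/8.0-era settings):

  # needed by pgtop to see current queries
  stats_start_collector = true
  stats_command_string  = true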

Here is the first announcement email:
  http://archives.postgresql.org/pgsql-announce/2005-05/msg0.php

And its home on the CPAN:
  http://search.cpan.org/dist/pgtop/pgtop

--
Cosimo


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] tricky query

2005-06-28 Thread Cosimo Streppone

Merlin Moncure wrote:


I need a fast way (sql only preferred) to solve the following problem:
I need the smallest integer that is greater than zero that is not in the
column of a table.

I've already worked out a query using generate_series (not scalable) and
pl/pgsql.  An SQL only solution would be preferred, am I missing
something obvious?


Probably not, but I thought about this brute-force approach... :-)
This should work well provided that:

- you have a finite number of integers. Your column's values should have
  a reasonable upper bound, like 100,000 or 1,000,000.
  #define YOUR_MAX 9

- you can accept that query execution time depends on the smallest integer
  found: the bigger the found integer, the slower the execution.

Ok, so:

Create a relation integers (or whatever) with every single integer from 1 to 
YOUR_MAX:


  CREATE TABLE integers (id integer primary key);
  INSERT INTO integers (id) VALUES (1);
  INSERT INTO integers (id) VALUES (2);
  ...
  INSERT INTO integers (id) VALUES (YOUR_MAX);
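On 8.0 and later, generate_series (which Merlin already mentioned)
spares you the long list of INSERTs; a one-liner sketch, taking
YOUR_MAX = 100,000:

  INSERT INTO integers (id) SELECT * FROM generate_series(1, 100000);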

Create your relation:

  CREATE TABLE merlin (id integer primary key);
  and fill it with values

Query is simple now:

  SELECT a.id FROM integers a
LEFT JOIN merlin b ON a.id=b.id
WHERE b.id IS NULL
 ORDER BY a.id LIMIT 1;

Execution times with 100k tuples in integers and
99,999 tuples in merlin:

  \timing
  Timing is on.
  select i.id from integers i left join merlin s on i.id=s.id where s.id is 
null order by i.id limit 1;

   9

  Time: 233.618 ms
  insert into merlin (id) values (9);
  INSERT 86266614 1
  Time: 0.579 ms
  delete from merlin where id=241;
  DELETE 1
  Time: 0.726 ms
  select i.id from integers i left join merlin s on i.id=s.id where s.id is 
null order by i.id limit 1;

   241

  Time: 1.336 ms
  

--
Cosimo


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] tricky query

2005-06-28 Thread Cosimo Streppone

John A Meinel wrote:

Merlin Moncure wrote:


I need the smallest integer that is greater than zero that is not in the
column of a table.  In other words, if an 'id' column has values
1,2,3,4,6 and 7, I need a query that returns the value of 5.



 [...]


Well, I was able to improve it to using appropriate index scans.
Here is the query:

SELECT t1.id+1 as id_new FROM id_test t1
   WHERE NOT EXISTS
   (SELECT t2.id FROM id_test t2 WHERE t2.id = t1.id+1)
   ORDER BY t1.id LIMIT 1;


I'm very interested in this tricky query.
Sorry John, but if I populate the `id_test' relation
with only 5 tuples, with id values (10, 11, 12, 13, 14),
the result of this query is:

  cosimo= create table id_test (id integer primary key);
  NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 'id_test_pkey' 
for table 'id_test'

  CREATE TABLE
  cosimo= insert into id_test values (10); -- and 11, 12, 13, 14
  INSERT 7457570 1
  INSERT 7457571 1
  INSERT 7457572 1
  INSERT 7457573 1
  INSERT 7457574 1
  cosimo= SELECT t1.id+1 as id_new FROM id_test t1 WHERE NOT EXISTS (SELECT 
t2.id FROM id_test t2 WHERE t2.id = t1.id+1) ORDER BY t1.id LIMIT 1;

   id_new
  
   15
  (1 row)

which, if I understand correctly, is the wrong answer to the problem.
At this point, I'm starting to think I need some sleep... :-)
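For completeness, one possible (untested) amendment that also covers
the case where 1 itself is missing from the column:

  -- branch 1: the hole before the first row (1 itself missing)
  SELECT 1 AS id_new
   WHERE NOT EXISTS (SELECT 1 FROM id_test WHERE id = 1)
  UNION ALL
  -- branch 2: the first gap after an existing id
  SELECT t1.id+1 FROM id_test t1
   WHERE NOT EXISTS
     (SELECT 1 FROM id_test t2 WHERE t2.id = t1.id+1)
  ORDER BY 1 LIMIT 1;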

--
Cosimo


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] faster search

2005-06-11 Thread Cosimo Streppone

Steve Atkins wrote:


On Fri, Jun 10, 2005 at 01:45:05PM -0400, Clark Slater wrote:


Hi-

Would someone please enlighten me as
to why I'm not seeing a faster execution
time on the simple scenario below?



 [...]


Create an index on (productlistid, typeid, partnumber) then

  select * from test where productlistid=3 and typeid=9
   order by productlistid, typeid, partnumber LIMIT 15;



Clark, also try adding (just for testing) partnumber to your
where clause, like this:

   select * from test where productlistid=3 and typeid=9
   and partnumber='foo' order by productlistid,
   typeid, partnumber;

and check output of explain analyze.
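For reference, Steve's suggested index spelled out (the index name is
made up):

  CREATE INDEX test_plid_tid_pn_idx
  ON test (productlistid, typeid, partnumber);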

I've had experiences of the planner making bad use of indexes when
attribute types were integer and cardinality was low (a single attribute
value, like typeid=9, selecting one or a few rows).
However, this was on 7.1.3 and is probably not relevant to your case.

--
Cosimo


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Adaptec/LSI/?? RAID

2005-06-03 Thread Cosimo Streppone

Stacy White wrote:


We're in the process of buying another Opteron server to run Postgres, and
based on the suggestions in this list I've asked our IT director to get an
LSI MegaRaid controller rather than one of the Adaptecs.

But when we tried to place our order, our vendor (Penguin Computing)
advised against it: "we find LSI does not work well with 4GB of RAM.
Our engineering find that LSI card could cause system crashes. One of
our customer ... has found that Adaptec cards works well on PostGres SQL"


Probably your vendor is just trying to avoid problems altogether, but
"one of our customers" is not a very general case, and as for
"we find LSI does not work well": is there a documented reason?

Anyway, my personal experience has been with an Acer Altos R701 + S300
external storage unit, equipped with LSI Logic Megaraid U320 aka
AMI Megaraid aka LSI Elite 1600
(honestly, these cards come with zillions of names and subnames, so
I don't know exactly what to call them).

This system was configured in various ways. The final layout is
3 x RAID1 arrays (each of 2 disks) and 1 x RAID10 array (12 disks).
This configuration is only available when you use 2 LSI cards (one
for each S300 scsi bus).

The system behaves pretty well, with a sustained sequential write rate
of 80 MB/s and, more importantly, a quite high load in our environment
of 10 OLTP transactions per second, without any problems and with
`cat /proc/loadavg` < 1.

I don't like LSI's RAID configuration system, which is
counter-intuitive for RAID 10 arrays. It took me 4 hours and
a tech support call to figure out how to do it right.

Also, I think LSI cards don't behave well with particular
RAID configurations, like RAID 0 with 4 disks, or RAID 10
also with 4 disks. These configurations seemed to put
the controller under heavy load, making it behave unreasonably
worse than, for example, a 6-disk RAID 0 or RAID 1.
Sorry, I can't be more scientific on this.

For Adaptec, I don't have any direct experience.

--
Cosimo


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] postgresql-8.0.1 performance tuning

2005-05-31 Thread Cosimo Streppone

Martin Fandel wrote:

i'm trying to tune my postgresql-db but i don't know if the values are
right. I use the following environment for the postgres-db:


I assume you're running Linux here; you don't mention it.


# Hardware 
cpu: 2x P4 3Ghz
ram: 1024MB DDR 266Mhz


I think 1 GB of RAM is quite minimal, nowadays.
Read below.


partitions:
/dev/sda3  23G  9,6G   13G  44% /
/dev/sda1  11G  156M  9,9G   2% /var
/dev/sdb1  69G   13G   57G  19% /var/lib/pgsql

/dev/sda is in raid 1  (2x 35GB / 1upm / sca)
/dev/sdb is in raid 10 (4x 35GB / 1upm / sca)


I've seen a good performance boost (and lower machine load)
after switching to 15k rpm disks.


# Config 
/etc/sysctl.conf:
kernel.shmall = 786432000
kernel.shmmax = 786432000


I think you have a problem here.
kernel.shmmax is measured in bytes (the maximum size of a single
shared memory segment), while kernel.shmall is measured in pages,
which on a typical Linux system are 4 kB each, so the two should
not be set to the same number. Google around:

  http://www.google.com/search?q=kernel.shmall+tuning+postgresql+shared+memory
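For example (purely illustrative numbers, reserving about 512 MB for
shared memory):

  # /etc/sysctl.conf -- assuming 4 kB pages
  kernel.shmmax = 536870912    # max single segment, in bytes (512 MB)
  kernel.shmall = 131072       # total shared memory, in 4 kB pages (512 MB)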


/etc/fstab:
/dev/sdb1 /var/lib/pgsql reiserfs acl,user_xattr,noatime,data=writeback 1 2


I use similar settings on ext3 (which I'm told is slower than reiser
or xfs or jfs).

Below I indicate the values I use for a machine with 4 GB RAM
and more 15k rpm disks, but a layout similar to yours
(3 x RAID 1 arrays for os, logs, ..., and 1 x RAID 10 array with 12 disks).

For the Pg configuration (others, please comment on these values;
it is invaluable to have feedback from this list):


/var/lib/pgsql/data/postgresql.conf
superuser_reserved_connections  = 2
shared_buffers  = 3000

I use 16384.


work_mem= 131072

I use 32768.


maintenance_work_mem= 131072

I use 262144.


max_fsm_pages   = 2

I use 20.


fsync   = true

I use false.


commit_delay= 0
commit_siblings = 5

If you have a high transaction volume, you should
really investigate these two.


effective_cache_size= 1

I use 4.


random_page_cost= 4

Watch out for unwanted seq scans. If you have really fast
disks, you should experiment with lowering this parameter a little.


max_locks_per_transaction   = 64

I use 512.


I'm really new at using postgres, so i need some experience to set these
parameters in the postgresql- and system-config. I can't find standard
calculations for this. :/ The postgresql documentation doesn't help me
set the best values for these.


There's no such thing as standard calculations :-)


The database must be highly available. I configured rsync to sync the
complete /var/lib/pgsql directory to my hot standby

 [...]
In my tests the synchronization works fine. I synchronised the whole
[...] consistent.

 [...]
 Is this solution recommended? Or must i use archived WALs with
 real system snapshots?


In some situations, I also used rsync to do the job.
Obviously, always stop the postmaster before syncing.
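Something like this, with paths and host name as placeholders:

  # on the master: stop the postmaster, sync the whole cluster, restart
  pg_ctl stop -D /var/lib/pgsql/data
  rsync -a --delete /var/lib/pgsql/data/ standby:/var/lib/pgsql/data/
  pg_ctl start -D /var/lib/pgsql/data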

Maybe you can look at slony, if you haven't yet.

  http://www.slony.info

--
Cosimo

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] PostgreSQL strugling during high load

2005-05-13 Thread Cosimo Streppone
Mindaugas Riauba wrote:
The vacuum cost parameters can be adjusted to make vacuums fired
by pg_autovacuum less of a burden.  I haven't got any specific numbers
to suggest, but perhaps someone else does.
  It looks like vacuum is not the only cause of our problems. vacuum_cost
seems to lower vacuum impact but we are still noticing slow query storms.
We are logging queries that take > 2000 ms to process.
  And there are quiet periods, and then suddenly 30+ slow queries appear in
the log within the same second. What else could cause such behaviour?
I've seen that happen when you're placing (explicitly or
*implicitly*) locks on the records you're trying to update/delete.
If you're willing to investigate, `pg_locks' system view holds
information about db locks.
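A starting point could be something like this (8.0-era catalog columns,
where pg_stat_activity still calls the backend pid "procpid"; note that
current_query is only populated if the stats command string is enabled):

  -- who is waiting on locks that have not been granted?
  SELECT l.relation::regclass, l.mode, l.granted, a.current_query
    FROM pg_locks l
    JOIN pg_stat_activity a ON l.pid = a.procpid
   WHERE NOT l.granted;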
--
Cosimo
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-02-28 Thread Cosimo Streppone
Cosimo Streppone wrote:
Merlin Moncure wrote:
 If everything is working the way it's supposed to, 8.0 should be faster
 than 7.1 (like, twice faster) for what you are probably trying to do.
In the next days I will be testing the entire application with the
same database only changing the backend from 7.1 to 8.0, so this is
a somewhat perfect condition to have a real-world benchmark
of Pg 8.0 vs 7.1.x performances.
The next days have come. I did a complete migration to Pg 8.0.1
from 7.1.3. It was a *huge* jump.
The application is exactly the same, also the database structure
is the same. I only dumped the entire 7.1.3 db, changed the backend
version, and restored the data in the 8.0.1 db.
The performance level of Pg 8 is at least *five* times higher
(faster!) than 7.1.3 in query-intensive transactions,
which is absolutely astounding.
In my experience, Pg 8 handles non-unique, low-cardinality indexes
built on numeric and integer types far better, and such indexes are
very common in our application.
--
Cosimo
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-02-02 Thread Cosimo Streppone
Merlin Moncure wrote:
 [...]
  (...DBI + DBD::Pg), so that switching to 8.0 should
 automatically enable the single-prepare, multiple-execute behavior,
 saving a lot of query planner processing, if I understand correctly.
[...]
I know that the perl people were pushing for certain features into the
libpq library (describing prepared statements, IIRC).  I think this
stuff made it into 8.0...have no clue about DBD::pg.
For the record: yes, DBD::Pg in CVS (> 1.32) has support
for server prepared statements.
If everything is working the way it's supposed to, 8.0 should be faster
than 7.1 (like, twice faster) for what you are probably trying to do.
In the next days I will be testing the entire application with the
same database only changing the backend from 7.1 to 8.0, so this is
a somewhat perfect condition to have a real-world benchmark
of Pg 8.0 vs 7.1.x performances.
--
Cosimo
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-02-01 Thread Cosimo Streppone
Merlin Moncure wrote:
Corollary: use pl/pgsql.  It can be 10 times or more faster than query
by query editing.
Merlin, thanks for your good suggestions.
So far, our system has never used the stored-procedure approach,
because we stay on the minimum common SQL features
that are supported by most db engines.
I realize though that it would provide a big performance boost.
You also have the parse/bind interface
This is something I have already engineered in our core classes
(that use DBI + DBD::Pg), so that switching to 8.0 should
automatically enable the single-prepare, multiple-execute behavior,
saving a lot of query planner processing, if I understand correctly.
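At the SQL level, this corresponds to something like the following
(statement name, table and query are hypothetical):

  -- parse and plan once...
  PREPARE get_cust (integer) AS
  SELECT * FROM customers WHERE id = $1;
  -- ...then execute many times with different parameters
  EXECUTE get_cust(42);
  EXECUTE get_cust(43);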
--
Cosimo
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-02-01 Thread Cosimo Streppone
Alex Turner wrote:
To be honest I've used compaq, dell and LSI SCSI RAID controllers and
got pretty pathetic benchmarks from all of them.
I also have seen average-low results for LSI (at least the 1020 card).
2xOpteron 242, Tyan S2885 MoBo, 4GB Ram, 14xSATA WD Raptor drives:
2xRaid 1, 1x4 disk Raid 10, 1x6 drive Raid 10.  2x3ware (now AMCC)
Escalade 9500S-8MI.
Thanks, this is valuable information.
I would be interested in starting a site listing RAID benchmarks under
linux.  If anyone is interested let me know.  I would be interested in
at least some bonnie++ benchmarks, and perhaps other if people would
like.
I have also used tiobench [http://tiobench.sourceforge.net/]
Any experience with it?
--
Cosimo
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


[PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-01-31 Thread Cosimo Streppone
Hi all,
I've been following this list for nearly a year now.
I've always managed to get PostgreSQL 7.1.x right for the job,
which in my case is a large and complex OLTP system,
run under Pg for 6 years now.
We were already planning the switch from 7.1 to 7.4 (or even 8.0).
The latest project we're facing has a transaction volume
unlike anything we've ever dealt with. By transaction I mean
something involving 10 to 10,000 (and more) SQL queries
(a complex mix of insert/update/delete/select).
I'd like to ask:
1) What kind of performance gain can I expect switching from
   7.1 to 7.4 (or 8.0)? Obviously I'm doing my own testing,
   but I'm not very impressed by 8.0 speed; maybe I'm doing
   my testing on a low-end server...
2) The goal is to make the db handle 100 tps (something like
   100 users). What kind of server and storage should I provide?
   The actual servers our application runs on normally have
   2 Intel Xeon processors, 2-4 Gb RAM, RAID 0/1/5 SCSI
   disk storage with hard drives @ 10,000 rpm
3) Which SCSI adapters give the highest I/O throughput? Adaptec?
4) Is it correct to suppose that multiple RAID 1 arrays
   can provide the fastest I/O ?
   I usually reserve one RAID1 array to db data directory,
   one RAID1 array to pg_xlog directory and one RAID1 array
   for os and application needs.
5) OS and Pg specific tuning?
   Usually I modify shared memory settings and most of postgresql.conf
   available settings for 7.1, like `effective_cache', `shared_buffers',
   `wal_buffers', `wal_files', and so on.
--
Cosimo
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-01-31 Thread Cosimo Streppone
Tom Lane wrote:
Cosimo writes:
1) What kind of performance gain can I expect switching from
   7.1 to 7.4 (or 8.0)? Obviously I'm doing my own testing,
   but I'm not very impressed by 8.0 speed, may be I'm doing
   testing on a low end server...
Most people report a noticeable speedup in each new release
 [...]
I'm surprised that you're not seeing any gain at all.
What was your test case exactly?  Have you perhaps tuned your app
so specifically to 7.1 that you need to detune it?
We tend to use the lowest common SQL features that will allow
us to work with any db, so the problem is probably the opposite:
there is no pg-specific overtuning.
Also, the real pg load, that should be my ideal test case,
is somewhat difficult to reproduce (~ 50 users with handhelds
and browser clients).
Another good test is a particular procedure that opens
several (~1000) consecutive transactions, composed of many
repeated selection queries with massive write loads on 6-7
different tables, as big as 300-400k tuples.
Every transaction ends in either commit or rollback.
Indexing here should be ok, for I've analyzed every single query,
also under database stress.
Probably one big issue is that I need to vacuum/reindex too often
to keep db performance at a good(tm) level. I realize that this
has been addressed in several ways with newer PGs.
However, I need to do a lot of application and performance
tests and do them more seriously. Then I'll report the results here.
--
Cosimo
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


[PERFORM] select count(*) on large tables

2004-04-08 Thread Cosimo Streppone
Hello,

I've followed the recent discussion about the particular case of
select count(*) on large tables being somewhat slow.
I've also seen this issue on the todo list, so I know
it is not a simple question.
This problem arises for me on very large tables, by which I mean
1 million rows and above.
The alternative solution I tried, which gives an optimal
speedup, is unfortunately not a way out, since it is based
on parsing the output of EXPLAIN SELECT count(*), which
is obviously *not* reliable.
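For the record, the same estimate that EXPLAIN prints can be read
directly from the catalog, with the same accuracy caveat ('mytable'
is a placeholder):

  -- planner's row count estimate, refreshed by VACUUM/ANALYZE
  SELECT reltuples FROM pg_class WHERE relname = 'mytable';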
The times always improve after a vacuum (and possibly a reindex)
of the table, and then slowly degrade again.
Is there an estimated time for this issue to be resolved?
Can I help in some way (code, test cases, ...)?
--
Cosimo
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match