Re: [PERFORM] Benchmark Data requested

2008-02-04 Thread Claus Guttesen
 There are some results here that show PostgreSQL is slower in some cases
 than Monet and MySQL. Of course these results were published immediately
 prior to 8.2 being released, plus run out-of-the-box, so without even
 basic performance tuning.

 Would anybody like to repeat these tests with the latest production
 versions of these databases (i.e. with PGSQL 8.3), and with some
 sensible tuning settings for the hardware used? It will be useful to get
 some blind tests with more sensible settings.

 http://monetdb.cwi.nl/projects/monetdb//SQL/Benchmark/TPCH/

 Multiple runs from different people/different hardware is useful since
 they help to iron-out differences in hardware and test methodology. So
 don't worry if you see somebody else doing this also.

Here is another graph: http://tweakers.net/reviews/649/7

Without monetdb though.

-- 
regards
Claus

When lenity and cruelty play for a kingdom,
the gentlest gamester is the soonest winner.

Shakespeare

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Benchmark Data requested

2008-02-04 Thread Luke Lonergan
Hi Simon,

Note that MonetDB/X100 does not have a SQL optimizer, they ran raw
hand-coded plans.  As a consequence, these comparisons should be taken as an
executor-executor test and we/you should be sure that the PG planner has
generated the best possible plan.

That said, we've already done the comparisons internally and they've got a
good point to make about L2 cache use and removal of unnecessary
abstractions in the executor.  We've been aware of this since 2005/6 and
have been slowly working these ideas into our/PG executor.

Bottom line: it's a good thing to work to get close to the X100/Monet
executor with a more general purpose DB.  PG is a looong way from being
comparable, mostly due to poor L2 D-cache locality and I-cache thrashing in
the executor.  The only way to cure this is to work on more rows than one at
a time.

- Luke 


On 2/4/08 10:37 AM, Simon Riggs [EMAIL PROTECTED] wrote:

 Can I ask for some help with benchmarking?
 
 There are some results here that show PostgreSQL is slower in some cases
 than Monet and MySQL. Of course these results were published immediately
 prior to 8.2 being released, plus run out-of-the-box, so without even
 basic performance tuning.
 
 Would anybody like to repeat these tests with the latest production
 versions of these databases (i.e. with PGSQL 8.3), and with some
 sensible tuning settings for the hardware used? It will be useful to get
 some blind tests with more sensible settings.
 
 http://monetdb.cwi.nl/projects/monetdb//SQL/Benchmark/TPCH/
 
 Multiple runs from different people/different hardware is useful since
 they help to iron-out differences in hardware and test methodology. So
 don't worry if you see somebody else doing this also.
 
 Thanks,




[PERFORM] Benchmark Data requested

2008-02-04 Thread Simon Riggs
Can I ask for some help with benchmarking?

There are some results here that show PostgreSQL is slower in some cases
than Monet and MySQL. Of course these results were published immediately
prior to 8.2 being released, plus run out-of-the-box, so without even
basic performance tuning.

Would anybody like to repeat these tests with the latest production
versions of these databases (i.e. with PGSQL 8.3), and with some
sensible tuning settings for the hardware used? It will be useful to get
some blind tests with more sensible settings.

http://monetdb.cwi.nl/projects/monetdb//SQL/Benchmark/TPCH/

Multiple runs from different people/different hardware is useful since
they help to iron-out differences in hardware and test methodology. So
don't worry if you see somebody else doing this also.

Thanks,

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 




Re: [PERFORM] Benchmark Data requested

2008-02-04 Thread Simon Riggs
On Mon, 2008-02-04 at 10:47 -0800, Luke Lonergan wrote:

 Note that MonetDB/X100 does not have a SQL optimizer, they ran raw
 hand-coded plans.  As a consequence, these comparisons should be taken as an
 executor-executor test and we/you should be sure that the PG planner has
 generated the best possible plan.

If it doesn't then I'd regard that as a performance issue in itself.

 That said, we've already done the comparisons internally and they've got a
 good point to make about L2 cache use and removal of unnecessary
 abstractions in the executor.  We've been aware of this since 2005/6 and
 have been slowly working these ideas into our/PG executor.

 Bottom line: it's a good thing to work to get close to the X100/Monet
 executor with a more general purpose DB.  PG is a looong way from being
 comparable, mostly due to poor L2 D-cache locality and I-cache thrashing in
 the executor.  

You may be right, but I want to see where it hurts us the most.

 The only way to cure this is to work on more rows than one at a time.

Do you have any results to show that's true, or are you just referring
to the Cray paper? (Which used fixed length tuples and specific vector
hardware).

(With regard to benchmarks, I'd rather not download Monet at all. Helps
avoid legal issues around did-you-look-at-the-code questions.)

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 




Re: [PERFORM] Benchmark Data requested

2008-02-04 Thread Greg Smith

On Mon, 4 Feb 2008, Simon Riggs wrote:


Would anybody like to repeat these tests with the latest production
versions of these databases (i.e. with PGSQL 8.3)


Do you have any suggestions on how people should run TPC-H?  It looked 
like a bit of work to sort through how to even start this exercise.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Benchmark Data requested

2008-02-04 Thread Jignesh K. Shah

Doing it at low scale factors is not attractive.

Commercial databases are publishing results at scale factor 1000 (about 1 TB) 
up to 10000 (10 TB), with one result in the 30 TB space. So ideally, tuning 
right now should start at scale factor 1000.


Unfortunately I have tried that before with PostgreSQL the few of the 
problems are as follows:


The single-stream loader in PostgreSQL takes hours to load the data. (Single 
stream load... wasting all the extra cores out there.)


Multiple table loads (one per table) spawned via script are a bit better, 
but hit WAL contention.


To avoid the WAL overhead, I created the tables and issued the load 
statements within the same transaction. That is faster, but you cannot create 
indexes before the load or it starts writing to WAL again... AND if the 
indexes are created after the load, it takes about a day or so to build all 
the indices required. (Index creation is single-threaded, and building 
multiple indexes at the same time can run out of temporary DISK space because 
the tables are so big, which means one thread at a time is the answer for 
really big tables. It is slow.)


Boy, by this time most people doing TPC-H at the high end give up on 
PostgreSQL.


I have not even started partitioning the tables yet, since with the current 
framework you have to load each partition separately, which means for the 
TPC-H data you need extra logic to take the table data and split it into 
each partition child table. Not something many people want to do by hand.
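For illustration, under 8.3-style inheritance partitioning that extra logic 
looks roughly like this (names are hypothetical, and the dbgen output must 
be pre-split externally into one flat file per child table):

```sql
-- Sketch: one child table per date range, 8.3 inheritance style.
CREATE TABLE lineitem_1994 (
    CHECK (l_shipdate >= DATE '1994-01-01'
       AND l_shipdate <  DATE '1995-01-01')
) INHERITS (lineitem);

-- Each child must be loaded from its own pre-split file:
COPY lineitem_1994 FROM '/data/tpch/lineitem_1994.tbl' WITH DELIMITER '|';

-- The planner only skips irrelevant partitions when this is enabled:
SET constraint_exclusion = on;
```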


Then there is the power run, which essentially runs one query at a time and 
so should be able to utilize the full system (especially on multi-core 
systems); unfortunately PostgreSQL can use only one core. (Plus, since this 
is read-only and there is no separate disk reader, all other processes are 
idle.) The system runs at 1/Nth capacity (where N is the number of 
cores/threads).


(I am not sure here: with partitioned tables, do you get N processes running 
when you scan the partitioned table?)


Even off-loading work like fetching data into the buffer pool to separate 
processes would pay off in a big way for this type of workload.


I would be happy to help out if folks here want to do work related to this. 
In fact, if you have time, I can request a project in one of the Sun 
benchmarking centers to see what we can learn with community members 
interested in understanding where PostgreSQL performs and where it fails.


Regards,
Jignesh

Greg Smith wrote:

On Mon, 4 Feb 2008, Simon Riggs wrote:


Would anybody like to repeat these tests with the latest production
versions of these databases (i.e. with PGSQL 8.3)


Do you have any suggestions on how people should run TPC-H?  It looked 
like a bit of work to sort through how to even start this exercise.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Benchmark Data requested

2008-02-04 Thread Simon Riggs
On Mon, 2008-02-04 at 15:09 -0500, Greg Smith wrote:
 On Mon, 4 Feb 2008, Simon Riggs wrote:
 
  Would anybody like to repeat these tests with the latest production
  versions of these databases (i.e. with PGSQL 8.3)
 
 Do you have any suggestions on how people should run TPC-H?  It looked 
 like a bit of work to sort through how to even start this exercise.

The link referred to a few different scale factors, so you could try
those. But anything that uses the hardware you have to its full
potential is valuable.

Everybody's test method is going to be different, whatever I say...

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 




Re: [PERFORM] Benchmark Data requested

2008-02-04 Thread Jignesh K. Shah

Hi Simon,

I have some insight into TPC-H on how it works.

First of all, I think it is a violation of TPC rules to publish numbers 
without auditing them first. So even if I run the test and show the better 
performance of PostgreSQL 8.3, I cannot post it here or on any public forum 
without going through that process. (Even though it is a partial benchmark, 
since they are just doing the equivalent of the power run of TPC-H.) Maybe 
the PostgreSQL PR team should email 
[EMAIL PROTECTED] about those results and see what they have to say about that comparison.


On the technical side:

Remember that all TPC-H queries, when run sequentially on PostgreSQL, use 
only one core or virtual CPU, so a big system is a very poor fit for 
PostgreSQL here (same for MySQL too).


Also very important: unless you are running the update functions, which are 
separate queries, all of the Q1-Q22 queries are pure READ-ONLY queries. 
Traditionally, I think PostgreSQL lacks read speed, especially since it is 
bottlenecked by the size of the reads it does (BLOCKSIZE). Major databases 
provide multi-block parameters to do reads/writes in multiples of the 
blocksize to reduce IOPS, and for read-only work they also have a READ-AHEAD 
or prefetch size, generally bigger than the multi-block or extent size, to 
aid reads.


Scale factor is in gigabytes, so the maximum scale used there, 5 (5 GB), is 
pretty useless since most of the rows can be cached on modern systems. And 
comparing at 0.01: that is what, 10 MB? The L2 cache of a recent Intel CPU 
is probably bigger than that.
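One way to sanity-check whether a chosen scale factor actually exceeds 
memory is to compare the on-disk size of the biggest table against RAM (a 
sketch; 'lineitem' assumes the standard TPC-H table name):

```sql
-- Total on-disk size (heap + indexes + TOAST) of the largest table.
SELECT pg_size_pretty(pg_total_relation_size('lineitem'));
-- If this comes back smaller than RAM, the run is effectively an
-- in-memory test, not an I/O test.
```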


If you are doing tuning for the TPC-H queries, focus on a few of them. For 
example, Query 1 is very join-intensive; if your CPU is not 100% used, then 
you have an I/O problem to solve before tuning it.


Another example: Query 16 is literally an I/O scan-speed test; many people 
use it to see whether the database can scan at the line speed of the 
storage. Ending up at 100% CPU means the database cannot process that many 
rows (just to bring them in).


In essence, each query exercises some combination of system features to 
highlight performance. However, since it is an old benchmark, database 
companies end up re-engineering their technologies to gain advantage in it 
(hence the successor in the works, TPC-DS, which will have more than 100 
such queries).


A few of the technologies that have really helped gain ground in the TPC-H 
world:
* Hash and/or range partitioning of tables (PostgreSQL 8.3 can do that, but 
the setup cost of writing the schema is high, especially since data has to 
be loaded into separate tables)
* Automated aggregated views, used by optimizers: database technology to 
maintain frequently used aggregations in smaller views
* Cube-view indexes: like bitmap indexes but multidimensional (I think... 
but not sure)


That said, is it useful for regression testing in the PostgreSQL build 
farms? I would think yes... especially Q16.


Hope this helps.
Regards,
Jignesh




Simon Riggs wrote:

Can I ask for some help with benchmarking?

There are some results here that show PostgreSQL is slower in some cases
than Monet and MySQL. Of course these results were published immediately
prior to 8.2 being released, plus run out-of-the-box, so without even
basic performance tuning.

Would anybody like to repeat these tests with the latest production
versions of these databases (i.e. with PGSQL 8.3), and with some
sensible tuning settings for the hardware used? It will be useful to get
some blind tests with more sensible settings.

http://monetdb.cwi.nl/projects/monetdb//SQL/Benchmark/TPCH/

Multiple runs from different people/different hardware is useful since
they help to iron-out differences in hardware and test methodology. So
don't worry if you see somebody else doing this also.

Thanks,

  




Re: [PERFORM] Benchmark Data requested

2008-02-04 Thread Greg Smith

On Mon, 4 Feb 2008, Jignesh K. Shah wrote:

Doing it at low scales is not attractive.  Commercial databases are 
publishing at scale factor 1000 (about 1 TB) to 10000 (10 TB), with one in 
the 30 TB space. So ideally right now tuning should start at 1000 scale 
factor.


I think what Simon was trying to get at is some sort of debunking of 
Monet's benchmarks which were running in-memory while not giving 
PostgreSQL any real memory to work with.  What you're talking about is a 
completely separate discussion which is well worth having in addition to 
that.


I'm well aware of how painful it is to generate + load + index even a single 
TB worth of data with PostgreSQL right now, because I've been doing just 
that for weeks (there are two processing phases in there as well for me that 
take even longer, but the raw operations are still a significant portion of 
the total time).


I would be happy to help out if folks here want to do work related to 
it. Infact if you have time, I can request a project in one of the Sun 
Benchmarking center to see what we can learn with community members 
interested in understanding where PostgreSQL performs and fails.


Sounds like a good 8.4 project.  Maybe pick this topic back up at the East 
convention next month; we could talk about it then.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Benchmark Data requested

2008-02-04 Thread Gregory Stark

Jignesh K. Shah [EMAIL PROTECTED] writes:

 Then there is the power run, which essentially runs one query at a time and so
 should be able to utilize the full system (especially multi-core systems);
 unfortunately PostgreSQL can use only one core. (Plus, since this is read only
 and there is no separate disk reader, all other processes are idle.) The system
 runs at 1/Nth capacity (where N is the number of cores/threads).

Is the whole benchmark like this or is this just one part of it?

Is the i/o system really able to saturate the cpu though?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [PERFORM] Benchmark Data requested

2008-02-04 Thread Luke Lonergan
Hi Simon,

On 2/4/08 11:07 AM, Simon Riggs [EMAIL PROTECTED] wrote:

 executor-executor test and we/you should be sure that the PG planner has
 generated the best possible plan.
 
 If it doesn't then I'd regard that as a performance issue in itself.

Agreed, though that's two problems to investigate - I think the Monet/X100
stuff is clean in that it's a pure executor test.
 
 You may be right, but I want to see where it hurts us the most.

You'll see :-)
 
 The only way to cure this is to work on more rows than one at a time.
 
 Do you have any results to show that's true, or are you just referring
 to the Cray paper? (Which used fixed length tuples and specific vector
 hardware).

No paper referenced, just inference from the results and their (and others)
conclusions about locality and re-use.  It's a similar enough situation to
scientific programming with vector processors versus cache based superscalar
that these are the right conclusions.  We've done the profiling to look at
cache misses and have some data to back it up as well.
 
 (With regard to benchmarks, I'd rather not download Monet at all. Helps
 avoid legal issues around did-you-look-at-the-code questions.)

None of us have looked at the code or downloaded it.  There are a number of
presentations out there for Monet/X100 to see what their results are.

- Luke




Re: [PERFORM] Benchmark Data requested

2008-02-04 Thread Luke Lonergan
Hi Greg,

On 2/4/08 12:09 PM, Greg Smith [EMAIL PROTECTED] wrote:

 Do you have any suggestions on how people should run TPC-H?  It looked
 like a bit of work to sort through how to even start this exercise.

To run TPC-H officially requires a license to publish results, etc.

However, I think you can use their published data and query generation kit 
to run the queries, which aren't the benchmark per se.  That's what the 
Monet/X100 people did.

- Luke




Re: [PERFORM] Benchmark Data requested

2008-02-04 Thread Greg Smith

On Mon, 4 Feb 2008, Luke Lonergan wrote:


However, I think you can use their published data and query generation kit
to run the queries, which aren't the benchmark per se.  That's what the
Monet/X100 people did.


Right; I was just hoping someone might suggest a relatively standardized way 
to do that with PostgreSQL.  I read Simon's original note and was afraid 
that multiple people might end up duplicating a non-trivial amount of work 
just to get the kits set up and running, or get frustrated, not expecting 
that part, and give up on the whole idea.


I'm very interested in this particular topic (non-trivial database 
micro-benchmarks) but have no time to spare this week to hack on this one 
myself.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Benchmark Data requested

2008-02-04 Thread Jignesh K. Shah



Gregory Stark wrote:

Incidentally we found some cases that Solaris was particularly bad at. Is
there anybody in particular that would be interested in hearing about them?
(Not meant to be a knock on Solaris, I'm sure there are other cases Linux or
BSD handle poorly too)


  


Send me the details; I can file bugs for Solaris on behalf of the community. 
Since I am involved in a lot of PostgreSQL testing on Solaris this year, I 
have a small list myself (mostly related to file system stuff, though).


I know of one regarding bonnie rewriting blocks that you sent out. (I still 
haven't done anything about it, but I finally have some test machines for 
such work instead of using my workstation to test it out. :-)


But I am really interested in seeing which ones hit PostgreSQL 
performance/usability.


Thanks in advance.

Regards,
Jignesh




Re: [PERFORM] Benchmark Data requested

2008-02-04 Thread Jignesh K. Shah


TPC-H has two runs:
The power run, which is a single stream (Q1-Q22, RF1, RF2).
And the throughput run, which has N streams (N depends on scale) running 
simultaneously, each a mixed sequence of the same queries plus the two 
update functions. During the throughput run you can expect to max out the 
CPUs. But commercial databases generally have power runs performing quite 
well even on multi-cores (Oracle, without RAC, has published with 144 cores 
on Solaris).


As for the I/O system saturating the CPU, it is two-fold:
the kernel fetching in the data, which saturates at some value,
and in this case PostgreSQL reading the data and putting it into its 
buffer pool.


An example of how I use it is as follows:
Do a select query on a table such that it results in a table scan without 
actually returning any rows back.
Now keep throwing hardware (better storage) at it until it saturates the 
CPU. That's the practical maximum you can do with that CPU/OS combination 
(assuming unlimited storage bandwidth). This is primarily used to guess how 
fast one of the TPC-H queries will complete.
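A minimal version of that probe might look like this (a sketch; the table 
name is the standard TPC-H one, and the predicate is chosen to force a full 
sequential scan while matching nothing):

```sql
-- Force a sequential scan of the whole table without shipping rows back
-- to the client; run with \timing in psql and watch CPU versus disk
-- throughput while it executes.
SELECT count(*) FROM lineitem WHERE l_comment = 'no-such-comment';
```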


In my tests with PostgreSQL, I generally reach the CPU limit without even 
reaching the bandwidth of the underlying storage.

Just to give numbers:
A single 2 Gb Fibre Channel port can practically go up to 180 MB/sec.
Single 4 Gb ports have been proven to go up to 360-370 MB/sec.
So to saturate one FC port, PostgreSQL has to be able to scan 370 MB/sec 
without saturating the CPU.
Then comes software striping, which lets you stripe across multiple ports, 
increasing the available bandwidth... Now the scan has to be able to drive 
N x 370 MB/sec (all on a single core).


I had some numbers, and some limits based on CPU frequency, blocksize, 
etc., but those were from the 8.1 days or so.


I think to take PostgreSQL a bit higher-end, we first have to scale up these 
numbers.
Doing these sorts of tests in the PostgreSQL build farms for every release 
would actually help people see how much data it can drive through...


We could also work out some database-operation metrics to gauge how much 
each release improves over older releases... I have ideas for a few of them.


Regards,
Jignesh


Gregory Stark wrote:

Jignesh K. Shah [EMAIL PROTECTED] writes:

  

Then there is the power run, which essentially runs one query at a time and so
should be able to utilize the full system (especially multi-core systems);
unfortunately PostgreSQL can use only one core. (Plus, since this is read only
and there is no separate disk reader, all other processes are idle.) The system
runs at 1/Nth capacity (where N is the number of cores/threads).



Is the whole benchmark like this or is this just one part of it?

Is the i/o system really able to saturate the cpu though?

  

