Re: [GENERAL] transaction limits?

2005-10-21 Thread Mark Rae
On Fri, Oct 21, 2005 at 12:25:36PM +0200, Nicolas Barbier wrote:
 On 10/21/05, Richard Huxton dev@archonet.com wrote:
  jeff sacksteder wrote:
   Are there known limits to how many rows can be inserted by one 
   transaction,

  Well, the system will need to be able to roll back the transaction, ...

 Just by not indicating that a transaction did commit, others will keep
 ignoring its rows. There is nothing to rollback here, thanks to MVCC.

Also, I've just done 300 million row INSERTs, from a 'mysqldump', inside
a single transaction. So there doesn't appear to be any implementation
problems that would stop you from reaching the theoretical table limits.

-Mark

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] PostgreSQL on Dual Processors, Dual-Core AMD Chips

2005-10-19 Thread Mark Rae
On Wed, Oct 19, 2005 at 12:32:36AM -0700, William Yu wrote:
 Expect to need to upgrade to later Linux cores though. Previous kernel 
 on this server was 2.6.9+ (FC3 64-bit) -- promptly kernel panic'd upon 
 install of the DCs. FC3 installer did the same thing. Went to FC4 
 (2.6.11+) and it has run perfect for the past 3 months. We'll probably 
 do a yum update to 2.6.13+ to take advantage of the better NUMA code now 
 that we've successfully done this update on a less critical server.

You should definitely do the upgrade to 2.6.13, the previous
versions had problems were they would allocate memory to
the wrong bank of memory when using dual core CPUs. So you
would find only about a 20% improvement over a single core.

With the newer kernels you should find that a dual core will 
be giving you about 80% increase over a single core.

-Mark

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] PostgreSQL on Dual Processors, Dual-Core AMD Chips

2005-10-19 Thread Mark Rae
On Wed, Oct 19, 2005 at 09:55:18AM -0700, William Yu wrote:
 Mark Rae wrote:
 With the newer kernels you should find that a dual core will 
 be giving you about 80% increase over a single core.
 
 I'm not experiencing this problem right now because I have NUMA disabled 
 in the BIOS. :) 

That is the recommended 'fix' :-D

 I'm not sure NUMA will help that much for Postgres due 
 to Postgres shared memory + OS caching architecture -- 

The 80% figure is for general data processing code. I actually got
a 72% performance increase using postgres for read only queries.
I would think that any write activity would still be limited by
how fast you can write the WAL to disk.

-Mark

---(end of broadcast)---
TIP 6: explain analyze is your friend


[GENERAL] Database Comments

2005-10-07 Thread Mark Rae
Hi,

Using COMMENT ON DATABASE only allows you to create a comment
for your current database which is then only visible from that database.
However, if I manually insert appropriate values into pg_description
in other databases I can make comments appear for another database.

e.g. If some users have access to databases A,BC, I can insert
values so that the users can see descriptions for each database
from any of them.

Is it safe to do this, or can it have unexpected side effects?


Also is there any intention to have a 'COMMENT ON [ROLE|USER|GROUP]'
facility. It would be quite useful to be able to do something like

COMMENT ON USER jbloggs IS 'Joe Bloggs [EMAIL PROTECTED]'

-Mark

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] [PERFORM] Projecting currentdb to more users

2005-07-12 Thread Mark Rae
On Tue, Jul 12, 2005 at 01:41:14PM -0500, Scott Marlowe wrote:
 On Tue, 2005-07-12 at 13:24, Mohan, Ross wrote:
  From AMD's suit against Intel. Perhaps relevant to some PG/AMD issues. 
 Well, this is, right now, just AMD's supposition about Intel's
 behaviour, I'm not sure one way or the other if Intel IS doing this. 

I think its more a case of AMD now having solid evidence to back
up the claims. 

This discovery, and that fact that you could get round it by
toggling some flags, was being discussed on various HPC mailing 
lists around about the beginning of this year.

-Mark


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] [PERFORM] Projecting currentdb to more users

2005-07-12 Thread Mark Rae
On Tue, Jul 12, 2005 at 03:11:35PM -0500, Scott Marlowe wrote:
 On Tue, 2005-07-12 at 15:06, Mark Rae wrote:
  I think its more a case of AMD now having solid evidence to back
  up the claims. 
 
 Wow!  That's pretty fascinating.  So, is the evidence pretty
 overwhelming that this was not simple incompetence, but real malice?


I suppose that depends on the exact nature of the 'check'.

As far as I was aware it was more a case of 'I don't recognise this
processor, so I'll do it the slow but safe way'.

However from what AMD are claiming, it seems to be more of a 
'Its an AMD processor so I'll be deliberately slow and buggy'


Having said that, I have tried compiling PG with the intel compiler 
in the past, and haven't noticed any real difference. But in a database
there isn't much scope for vectorization and pipelining
compared with numerical code, which is where the Intel compiler
makes the greatest difference.

-Mark

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[GENERAL] SMP scaling

2005-03-18 Thread Mark Rae
On Tue, Mar 15, 2005 at 07:00:25PM -0500, Bruce Momjian wrote:
 Oh, you have to try CVS HEAD or a nightly snapshot.  Tom made a major
 change that allows scaling in SMP environments.

Ok, I've done the tests comparing 8.0.1 against a snapshot from the 16th
and the results are impressive.
As well as the 16CPU Altix, I've done comparisons on two other 4CPU
machines which previously didn't scale as well as expected.


Clients1  2  3  4  6  8 12 16 
32 64
---
Altix pg-8.0.1  1.00   2.02   2.98   3.97   5.87   7.23   7.51   5.54   
4.68   5.10
Altix pg-20050316   1.00   1.97   2.86   3.68   5.29   6.90   9.00   9.88  
10.03   9.94
AMD64 pg-8.0.1  1.00   1.87   2.77   3.34   2.73   2.57   2.58   2.62
AMD64 pg-20050316   1.00   1.95   2.84   3.69   3.61   3.66   3.70   3.69
IA64  pg-8.0.1  1.00   1.97   2.91   3.82   2.91   2.92   2.94   2.98
IA64  pg-20050316   1.00   1.98   2.95   3.87   3.80   3.78   3.86   3.90

Altix == 16x 1.6GHz Itanium2192GB memory
AMD64 ==  4x 2.2GHz Opteron 848   8GB memory
IA64  ==  4x 1.5GHz Itanium2 16GB memory


The altix still only scales up to 10x rather than 16x, but that probably 
is the NUMA configuration taking effect now. 
Also this machine isn't set up to run databases, so only has 1 FC I/O card, 
which means a CPU can end up being 4 hops away from the memory and disk. 

As the database is so small (8GB), relative to the machine, the data will 
be on average 2 hops away. This gives an average of 72% of the speed of
local memory, based on previous measurements of speed vs hops. 

So getting 63% of the theoretical maximum database throughput is pretty good.


-Mark

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] SMP scaling

2005-03-18 Thread Mark Rae
On Fri, Mar 18, 2005 at 10:38:24AM -0500, Tom Lane wrote:
 Hey, that looks pretty sweet.  One thing this obscures though is whether
 there is any change in the single-client throughput rate --- ie, is 1.00
 better or worse for CVS tip vs 8.0.1?

Here are the figures in queries per second.

Clients1  2  3  4  6  8 12 16 
32 64
---
AMD64 pg-8.0.1  6.80  12.71  18.82  22.73  18.58  17.48  17.56  17.81
AMD64 pg-20050316   6.80  13.23  19.32  25.09  24.56  24.93  25.20  25.09
IA64  pg-8.0.1  3.72   7.32  10.81  14.21  10.81  10.85  10.92  11.09
IA64  pg-20050316   3.99   7.92  11.78  15.46  15.17  15.09  15.41  15.58
Altix pg-8.0.1  3.66   7.37  10.89  14.53  21.47  26.47  27.47  20.28  
17.12  18.66
Altix pg-20050316   3.83   7.55  10.98  14.10  20.27  26.47  34.50  37.88  
38.45  38.12

So, it didn't make any difference for the Opteron, but the two 
Itanium machines were 5% and 7% faster respectively.

-Mark

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] SMP scaling

2005-03-18 Thread Mark Rae
On Fri, Mar 18, 2005 at 01:31:51PM -0500, Tom Lane wrote:
 BTW, although I know next to nothing about NUMA, I do know that it is
 configurable to some extent (eg, via numactl).  What was the
 configuration here exactly, and did you try alternatives?  Also,
 what was the OS exactly?  (I've heard that RHEL4 is a whole lot better
 than RHEL3 in managing NUMA, for example.  This may be generic to 2.6 vs
 2.4 Linux kernels, or maybe Red Hat did some extra hacking.)

The Altix uses a 2.4.21 kernel with SGI's own modifications
to support up to 256 CPUs and their NUMALink hadware.
(Some of which has become the NUMA code in the 2.6 kernel)

Even with the numa support, which makes sure any memory allocated by 
malloc or the stack ends up local to the processor which originally 
called it, and then continues to schedule the process on that CPU,
there is still the problem that all table accesses* go through 
the shared buffer cache which resides in one location.
[* is this true in all cases?]

I was about to write a long explaination about how the only way 
to scale out to this size would be to have separate buffer caches in
each memory domain, and this would then require some kind of cache
coherency mechanism. But after reading a few bits of documentation, 
it looks like SGI already have a solution in the form of 
symmetric data objects. 

In particular, the symmetric heap, an area of shared memory 
which is replicated across all memory domains with the
coherency being handled in hardware.

So it looks like all that might be needed is to replace the
shmget calls in src/backend/port with the equivalent SGI functions.

-Mark

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [GENERAL] prelimiary performance comparison pgsql vs mysql

2005-03-15 Thread Mark Rae
On Mon, Mar 14, 2005 at 06:04:01PM -0500, Chris Browne wrote:
 [EMAIL PROTECTED] (Mark Rae) writes:
 Clients  1 2 3 4 6 812163264   
 128
 --
 mysql-4.1.1   1.00  1.41  1.34  1.16  0.93  1.03  1.01  1.00  0.94  0.86  
 0.80
 pg-7.4.1  0.65  1.27  1.90  2.48  2.45  2.50  2.48  2.51  2.49  2.39  
 2.38
 
 Could you elaborate on what the measures are here?  I don't quite
 follow what 0.8 means as compared to 2.38.

The figures are generated from a representative set of ~500 queries which
each client issues in a random order. The clients all connect in parallel
and the elapsed time taken for them all to complete is recorded.

The numbers given above are these times converted into relative throughput
figures. The baseline being a single mysql client, a performance of
1.00 is equivalent to an average of 5.82 queries per second.

i.e. 
with a single client postgres runs at 65% the speed of mysql

by the time 3 clients are running, postgres is getting through the
queries 1.90/1.34=1.42 times faster

and once we get to 32 clients, mysql is tripping over itself and is
actually running slower than if the clients ran one after another.


Looking at how the database scales,
i.e. all figures are relative to the speed of a single client connection.

Clients  1 2 3 4 6 812163264   128
--
Theoretical   1.00  2.00  3.00  4.00  4.00  4.00  4.00  4.00  4.00  4.00  4.00
mysql-4.1.1   1.00  1.41  1.34  1.16  0.93  1.03  1.01  1.00  0.94  0.86  0.80
pg-7.4.1  1.00  1.94  2.91  3.79  3.75  3.82  3.80  3.84  3.82  3.66  3.64

The theoretical line being is how much we should expect things to scale
given that this is a 4 cpu machine.


-Mark

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [GENERAL] prelimiary performance comparison pgsql vs mysql

2005-03-15 Thread Mark Rae
On Tue, Mar 15, 2005 at 12:51:03PM -0800, Jeff Davis wrote:
 Be careful assuming that. DB benchmarks are hard to do in a general
 sense. His results probably indicate a general trend, but you should
 test your application yourself to get a real result. His pattern of SQL
 queries might be very different from yours.

Very true. 

You may have noticed that I had a very low query rate of 5.8 queries
per second, because some of the queries have 12 tables to join and 
take about 20s to run. This tends to work in postgres' favour.
If you have lots have simple queries, it will be better for mysql
and the break even point will be higher. 


Also, while on the subject of scaling. I had the opportunity
to try postgres on a 16CPU Altix and couldn't get it to scale
more than about 4x, whereas Oracle got up to about 12x faster

I assume this is because of the NUMA architecture. I was also
told that Oracle had made no special optimizations to accomodate it.

My guess is that because postgres allocates all its shared
buffers as a contiguous chunk, it puts all the load on one
memory bank.
Oracle on the other hand, seems to use lots of smaller regions
which would probably be spread throughout the physical memory.

Perhaps one of the developers could comment on how difficult
it would be to change the shared buffer handling to use multiple
segments. As I'd definitely be willing to give it a go.

-Mark

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] prelimiary performance comparison pgsql vs mysql

2005-03-15 Thread Mark Rae
On Tue, Mar 15, 2005 at 06:46:50PM -0500, Bruce Momjian wrote:
 Mark Rae wrote:
  Also, while on the subject of scaling. I had the opportunity
  to try postgres on a 16CPU Altix and couldn't get it to scale
  more than about 4x, whereas Oracle got up to about 12x faster
  
 We have had some major SMP improvements in current CVS.  Were you
 testing that or 8.0.X?

It tried it with 8.0.0rc3, and had previously tried a 7.4 version

-Mark


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] prelimiary performance comparison pgsql vs mysql

2005-03-15 Thread Mark Rae
On Tue, Mar 15, 2005 at 07:00:25PM -0500, Bruce Momjian wrote:
 Oh, you have to try CVS HEAD or a nightly snapshot.  Tom made a major
 change that allows scaling in SMP environments.

Ok, I'll give it a try in the next couple of days when there is
some free time available on the machine.

-Mark

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [GENERAL] prelimiary performance comparison pgsql vs mysql

2005-03-14 Thread Mark Rae
On Mon, Mar 14, 2005 at 06:52:58AM -0500, Rick Schumeyer wrote:
 Below are some PRELIMINARY results in comparing the performance of pgsql and
 mysql.
 ...
 I have not yet done any testing of transactions, multiple concurrent
 processes, etc.
 

I would say that doing the concurrency tests is probably the most
important factor in comparing other databases against MySQL, as
MySQL will almost always win in single-user tests.

E.g. here are some performance figures from tests I have done in the past.
This is with a 6GB databse on a 4CPU Itanium system running a mixture of
read-only queries, but it is fairly typical of the behaviour I have seen. 
The Oracle figures also scaled in a similar way to postgres.

Clients   1 2 3 4 6 812163264   128
---
mysql-4.1.11.00  1.41  1.34  1.16  0.93  1.03  1.01  1.00  0.94  0.86  0.80
pg-7.4.1   0.65  1.27  1.90  2.48  2.45  2.50  2.48  2.51  2.49  2.39  2.38

-Mark

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] PostgreSQL 8.0.0 Release Candidate 4

2005-01-10 Thread Mark Rae
On Fri, Jan 07, 2005 at 12:04:00PM -0400, Marc G. Fournier wrote:
 A current list of *known* supported platforms can be found at:
   http://developer.postgresql.org/supported-platforms.html
 We're always looking to improve that list, so we encourage anyone that is 
 running a platform not listed to please report on any success or failures 
 with Release Candidate 4.

I can confirm the following two architectures successfully
completed the regression tests, running
$ gmake MAX_CONNECTIONS=10 check

Machine:   SGI Altix 350
Processor: Itanium2 
OS:SGI ProPack 3SP1 for Linux, Build 301r3-0407280007
Compiler:  gcc 3.2.3

Machine:   HP/Compaq ES45
Processor: Alpha ev68
OS:Compaq Tru64 UNIX V5.1B (Rev. 2650)
Compiler:  Compaq C v6.5-011


-Mark

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings