Re: [PERFORM] [pgsql-advocacy] MySQL+InnoDB vs. PostgreSQL test?
Um, wrong. We don't lock rows for SELECT. Unless you meant something else? Am I not following you?

I mean a row-level shared read lock, i.e. a lock that says you can read, but you cannot delete. It's what Postgres needs to alleviate its foreign key trigger deadlock problems.

Chris
Re: [PERFORM] [pgsql-advocacy] MySQL+InnoDB vs. PostgreSQL test?
Chris,

> > Which does a shared lock on a row as opposed to a write lock, hence
> > avoiding nasty foreign key deadlocks...
>
> Um, wrong. We don't lock rows for SELECT.

Unless you meant something else? Am I not following you?

--
-Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] [pgsql-advocacy] MySQL+InnoDB vs. PostgreSQL test?
Out of interest, what is it about this particular task that's so hard?

Keeping track of multiple lockers in a fixed amount of disk space.

Why not look at how InnoDB does it? Or is that not applicable?
Re: [PERFORM] [pgsql-advocacy] MySQL+InnoDB vs. PostgreSQL test?
Christopher Kings-Lynne <[EMAIL PROTECTED]> writes:

>> No, but Chris is correct that we could do with having some kind of
>> shared lock facility at the row level.

> Out of interest, what is it about this particular task that's so hard?

Keeping track of multiple lockers in a fixed amount of disk space.

regards, tom lane
Re: [PERFORM] [pgsql-advocacy] MySQL+InnoDB vs. PostgreSQL test?
Um, wrong. We don't lock rows for SELECT.

No, but Chris is correct that we could do with having some kind of shared lock facility at the row level.

Out of interest, what is it about this particular task that's so hard? (Not that I could code it myself.) But surely you can use the same sort of thing as the FOR UPDATE code path?

Chris
Re: [PERFORM] [pgsql-advocacy] MySQL+InnoDB vs. PostgreSQL test?
Josh Berkus <[EMAIL PROTECTED]> writes:

>> Hey at least I noticed that InnoDB has one essential feature we don't:
>>
>> SELECT ... IN SHARE MODE;
>>
>> Which does a shared lock on a row as opposed to a write lock, hence
>> avoiding nasty foreign key deadlocks...

> Um, wrong. We don't lock rows for SELECT.

No, but Chris is correct that we could do with having some kind of shared lock facility at the row level.

regards, tom lane
Re: [PERFORM] [pgsql-advocacy] MySQL+InnoDB vs. PostgreSQL test?
Chris,

> Hey at least I noticed that InnoDB has one essential feature we don't:
>
> SELECT ... IN SHARE MODE;
>
> Which does a shared lock on a row as opposed to a write lock, hence
> avoiding nasty foreign key deadlocks...

Um, wrong. We don't lock rows for SELECT.

--
-Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] [pgsql-advocacy] MySQL+InnoDB vs. PostgreSQL test?
Seriously, I am tired of this kind of question. You gotta get bold enough to stand up in a "meeting" like that and say "guys, you can ask me how this compares to Oracle ... but if you're seriously asking me how this compares to MySQL, call me again when you've done your homework".

Hey, at least I noticed that InnoDB has one essential feature we don't:

SELECT ... IN SHARE MODE;

Which does a shared lock on a row as opposed to a write lock, hence avoiding nasty foreign key deadlocks...

Chris
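To make the deadlock Chris is referring to concrete, here is a sketch of the pattern as I understand it. The table and column names are invented for illustration; the point is only that the 7.4-era foreign key triggers take the equivalent of an exclusive SELECT ... FOR UPDATE lock on the referenced row:

    -- hypothetical schema, purely for illustration
    CREATE TABLE parent (id int PRIMARY KEY);
    CREATE TABLE child  (id int, parent_id int REFERENCES parent (id));
    -- (assume parent rows 100 and 200 already exist)

    -- session 1
    BEGIN;
    INSERT INTO child VALUES (1, 100);   -- FK check row-locks parent 100
    -- session 2
    BEGIN;
    INSERT INTO child VALUES (2, 200);   -- FK check row-locks parent 200
    -- session 1
    INSERT INTO child VALUES (3, 200);   -- blocks, waiting for session 2
    -- session 2
    INSERT INTO child VALUES (4, 100);   -- blocks on session 1: deadlock

With a shared row lock behind the FK check (InnoDB spells the statement SELECT ... LOCK IN SHARE MODE), both sessions could hold the parent locks at the same time and neither insert would block.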
Re: [PERFORM] [pgsql-advocacy] MySQL+InnoDB vs. PostgreSQL test?
On Tue, 3 Feb 2004, Christopher Kings-Lynne wrote:

> > One more thing that annoyed me. If you started a process, such as a
> > large DDL operation, or heaven forbid, a cartesian join (what? I never
> > do that!).
>
> I believe InnoDB also has O(n) rollback time. eg. if you are rolling
> back 100 million row changes, it takes a long, long time. In PostgreSQL
> rolling back is O(1)...

Actually, it takes significantly longer to roll back than to roll forward, so to speak, so if inserting 10,000 rows took 5 minutes, it could take upwards of 30 times as long to roll back. This is from the docs:

http://www.mysql.com/documentation/mysql/bychapter/manual_Table_types.html#InnoDB_tuning

Point 8:

# Beware of big rollbacks of mass inserts: InnoDB uses the insert buffer to save disk I/O in inserts, but in a corresponding rollback no such mechanism is used. A disk-bound rollback can take 30 times the time of the corresponding insert. Killing the database process will not help because the rollback will start again at the database startup. The only way to get rid of a runaway rollback is to increase the buffer pool so that the rollback becomes CPU-bound and runs fast, or delete the whole InnoDB database.
Re: [PERFORM] [pgsql-advocacy] MySQL+InnoDB vs. PostgreSQL test?
> One more thing that annoyed me. If you started a process, such as a
> large DDL operation, or heaven forbid, a cartesian join (what? I never
> do that!).

I believe InnoDB also has O(n) rollback time, e.g. if you are rolling back 100 million row changes, it takes a long, long time. In PostgreSQL rolling back is O(1)...

Chris
Re: [PERFORM] Increasing number of PG connections.
On Mon, 2 Feb 2004, Kevin Barnard wrote:

> On 2 Feb 2004 at 13:58, scott.marlowe wrote:
>
> > What do you mean at 2 GB? Is that how much is in kernel cache plus
> > buffer, plus used, plus etc??? Could you give us the top of top output to
> > make sure? If most of that is kernel cache, then that's fine.
>
> 2GB was total system memory. We upgraded to 4GB prior to increasing the
> number of connections.

Oh, OK. I thought you meant the system was using 2 gigs of RAM for PostgreSQL.

> Here's the top of top:
>
> 16:14:17 up 2 days, 16:15, 1 user, load average: 7.60, 6.56, 4.61
> 730 processes: 721 sleeping, 9 running, 0 zombie, 0 stopped
> CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
>            total    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
>            cpu00    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
>            cpu01    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
>            cpu02    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
>            cpu03    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
> Mem:  3747644k av, 3298344k used,  449300k free,  0k shrd, 147880k buff
>                    2158532k active, 760040k inactive
> Swap: 1048088k av,       0k used, 1048088k free          2262156k cached

When you have a high load but low CPU usage, you are usually I/O bound.

> The DB is pretty close to max connections at this point in time. I don't
> know why CPU shows 0% in every bucket. It looks like I can increase the
> number of connections a little from here. This is a fairly standard Fedora
> install. It's using version 2.4.22 of the kernel. Postgres is a compiled
> version of 7.4.1.

On this machine you could probably handle even more. What I want is to get your page return times down enough so you don't need to increase the number of connections. I.e. if you've got 2 second response times and you drop those to 0.2 seconds, then you won't need as many processes to handle the load (theoretically... :-)

> > My experience has been that individual postgresql backends only weigh in
> > at a megabyte at most, and they share buffers, so 700 connections can be
> > anywhere from 300 meg to 1 gig; the rest would be buffer memory. It's not
> > a good idea to give up too much to shared buffers, as the database isn't
> > as good at caching as the kernel.
>
> OK, I take this as I should keep shared buffers around 2x connections then,
> correct?

Not really. What happens is that if the shared buffers are so large that they are as large as, or god forbid, larger than the kernel cache, then the kernel cache becomes less effective. The general rule of thumb is 25% of memory, or 256 megs, whichever is less. The real test is that you want enough shared_buffers so that all the result sets currently being smooshed up against each other in joins, sorts, etc. can fit in PostgreSQL's shared buffers, or at least the buffers can hold a fair chunk of them. So the number of buffers can be anywhere from a few thousand up to several tens of thousands, sometimes even higher. But for most tuning you won't need to be above 32768, which is 256 megs of RAM.

> > What do you have in postgresql.conf? sort_mem, shared_buffers, etc???
>
> Here is what I have that is not set from the defaults:
>
> max_connections = 700
> shared_buffers = 1500
> sort_mem = 512
> random_page_cost = 2
> stats_start_collector = true
> stats_command_string = true
> stats_block_level = true
> stats_row_level = true

> > sort_mem can be a real killer if it lets the processes chew up too much
> > memory. Once sort_mem gets high enough to make the machine start swapping
> > it is doing more harm than good being that high, and should usually be
> > lowered a fair bit.
>
> I dropped it down to 512 as you can see. Should I be running with all of the
> stats on?
>
> I am no longer using pg_autovacuum. I seem to be getting better results with
> an hourly VACUUM ANALYZE.

Seeing as how top shows 2262156k of kernel cache, you can afford to give up a fair bit more than 512k per sort. I generally run 8192 (8 meg), but I don't handle 700 simultaneous connections. Try running it a little higher, 2048, 4096, etc., and see if that helps. Note you can change sort_mem and just do a pg_ctl reload to make the change, without interrupting service, unlike shared_buffers, which requires a restart.

> > How many disks in your RAID5? The more the better. Is it hardware with
> > battery backed cache? If you write much to it it will help to have
> > battery backed cache on board. If it's a megaraid / LSI board, get the
> > megaraid2 driver, it's supposedly much faster.
>
> 4 disk IBM ServeRAID 5i with battery backed cache.

Do you have the cache set to write back or write through? Write through can be a performance killer. But I don't think your RAID is the problem; it looks to me like postgresql is doing a lot of I/O. When you run top, d
Re: [PERFORM] Increasing number of PG connections.
On 2 Feb 2004 at 13:58, scott.marlowe wrote:

> What do you mean at 2 GB? Is that how much is in kernel cache plus
> buffer, plus used, plus etc??? Could you give us the top of top output to
> make sure? If most of that is kernel cache, then that's fine.

2GB was total system memory. We upgraded to 4GB prior to increasing the number of connections.

Here's the top of top:

16:14:17 up 2 days, 16:15, 1 user, load average: 7.60, 6.56, 4.61
730 processes: 721 sleeping, 9 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
           cpu00    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
           cpu01    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
           cpu02    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
           cpu03    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
Mem:  3747644k av, 3298344k used,  449300k free,  0k shrd, 147880k buff
                   2158532k active, 760040k inactive
Swap: 1048088k av,       0k used, 1048088k free          2262156k cached

The DB is pretty close to max connections at this point in time. I don't know why CPU shows 0% in every bucket. It looks like I can increase the number of connections a little from here. This is a fairly standard Fedora install. It's using version 2.4.22 of the kernel. Postgres is a compiled version of 7.4.1.

> My experience has been that individual postgresql backends only weigh in at
> a megabyte at most, and they share buffers, so 700 connections can be
> anywhere from 300 meg to 1 gig; the rest would be buffer memory. It's not
> a good idea to give up too much to shared buffers, as the database isn't
> as good at caching as the kernel.

OK, I take this as I should keep shared buffers around 2x connections then, correct?

> What do you have in postgresql.conf? sort_mem, shared_buffers, etc???

Here is what I have that is not set from the defaults:

max_connections = 700
shared_buffers = 1500
sort_mem = 512
random_page_cost = 2
stats_start_collector = true
stats_command_string = true
stats_block_level = true
stats_row_level = true

> sort_mem can be a real killer if it lets the processes chew up too much
> memory. Once sort_mem gets high enough to make the machine start swapping
> it is doing more harm than good being that high, and should usually be
> lowered a fair bit.

I dropped it down to 512 as you can see. Should I be running with all of the stats on?

I am no longer using pg_autovacuum. I seem to be getting better results with an hourly VACUUM ANALYZE.

> How many disks in your RAID5? The more the better. Is it hardware with
> battery backed cache? If you write much to it it will help to have
> battery backed cache on board. If it's a megaraid / LSI board, get the
> megaraid2 driver, it's supposedly much faster.

4 disk IBM ServeRAID 5i with battery backed cache.

> You may find it hard to get postgresql to use any more memory than you
> have, as 32 bit apps can only address 2 gigs anyway, but the extra can
> certainly be used by the kernel as cache, which will help.

Isn't that only true for each individual process space? Shouldn't each process have access to at most 2GB? If each backend is in its own process space, is this really a limit, since all of my queries are pretty small?

I have been monitoring the system as it gets up to load. For most of the time the system sits around 100-300 connections. Once it ramps up, it ramps up hard. Top starts cycling at 0 and 133% CPU for irq, softirq and iowait. The system stays at 700 connections until users give up. I can watch bandwidth utilization drop to almost nothing right before the DB catches up.

--
Kevin Barnard
Speed Fulfillment and Call Center
[EMAIL PROTECTED]
214-258-0120
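Pulling the numbers from this exchange into one place, here is a rough sketch of the kind of incremental postgresql.conf change being suggested. The values are illustrative starting points derived from the advice above, not recommendations for any particular workload:

    # postgresql.conf -- illustrative values only
    max_connections  = 700
    shared_buffers   = 16384   # 16384 x 8k pages = 128 MB, inside the "25% of RAM or 256 MB" rule of thumb
    sort_mem         = 2048    # 2 MB (2048 kB) per sort; try 2048, then 4096, watching that the box never swaps
    random_page_cost = 2

As noted above, sort_mem (unlike shared_buffers and max_connections) is picked up by a plain "pg_ctl reload", so it can be adjusted step by step without interrupting service.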
Re: [PERFORM] Increasing number of PG connections.
On Mon, 2 Feb 2004, Kevin Barnard wrote:

> I am running a Dual Xeon hyperthreaded server with 4GB RAM and RAID-5. The
> only thing running on the server is Postgres running under Fedora. I have
> a 700 connection limit.
>
> The DB is set up as a backend for a very high volume website. Most of the
> queries are simple, such as logging accesses, user login verification, etc.
> There are a few bigger things such as reporting, but for the most part each
> transaction lasts less than a second. The connections are not persistent
> (I'm using pg_connect in PHP).
>
> The system was at 2 GB with a 400 connection limit. We ran into problems
> because we hit the limit of connections during high volume.

What do you mean at 2 GB? Is that how much is in kernel cache plus buffer, plus used, plus etc??? Could you give us the top of top output to make sure? If most of that is kernel cache, then that's fine.

My experience has been that individual postgresql backends only weigh in at a megabyte at most, and they share buffers, so 700 connections can be anywhere from 300 meg to 1 gig; the rest would be buffer memory. It's not a good idea to give up too much to shared buffers, as the database isn't as good at caching as the kernel.

What do you have in postgresql.conf? sort_mem, shared_buffers, etc???

sort_mem can be a real killer if it lets the processes chew up too much memory. Once sort_mem gets high enough to make the machine start swapping it is doing more harm than good being that high, and should usually be lowered a fair bit.

How many disks in your RAID5? The more the better. Is it hardware with battery backed cache? If you write much to it, it will help to have battery backed cache on board. If it's a megaraid / LSI board, get the megaraid2 driver; it's supposedly much faster.

You may find it hard to get postgresql to use any more memory than you have, as 32 bit apps can only address 2 gigs anyway, but the extra can certainly be used by the kernel as cache, which will help.
Re: inserting large number of rows was: Re: [PERFORM] Increasing
I must have missed this post when it was made earlier, so pardon the noise if my suggestion has already been made.

Unlike MySQL (and possibly other database servers), PostgreSQL is faster when inserting inside a transaction, depending on the method by which you are actually adding the records. In my own experience (generating a list of INSERT statements from a perl script and using psql to execute them), the difference in performance was incredibly dramatic when I added a "BEGIN WORK" at the beginning and a "COMMIT WORK" at the end.

scott.marlowe wrote:

> On Mon, 2 Feb 2004, Qing Zhao wrote:
>
> > I am new here. I have a question related to this in some way.
> >
> > Our web site needs to upload a large volume of data into Postgres at a
> > time. The performance deteriorates as the number of rows becomes larger.
> > When it reaches 2500 rows, it never comes back to the GUI. Since the tests
> > were run through the GUI, my suspicion is that it might be caused by the
> > way the application server talks to the Postgres server, the connections,
> > etc. What might be the factors involved here? Does anyone know?
>
> Actually, I'm gonna go out on a limb here and assume two things:
>
> 1. you've got lotsa fk/pk relationships set up.
> 2. you're analyzing the table empty before loading it up.
>
> What happens in this instance is that the analyze on an empty, or nearly
> so, table means that during the inserts, postgresql thinks you have only a
> few rows. At first, this is fine, as pgsql will seq scan the tables to make
> sure there is a proper key in both. As the number of rows increases, the
> planner needs to switch to index scans but doesn't, because it doesn't know
> that the number of rows is increasing.
>
> Fix: insert a few hundred rows, run analyze, check to see if the explain
> for inserts is showing index scans or not. If not, load a few more hundred
> rows, analyze, rinse, repeat.
>
> Also, look for fk/pk mismatches, i.e. an int4 field pointing to an int8
> field. That's a performance killer, so if the pk/fk types don't match, see
> if you can change your field types to match and try again.

--
Bill Moran
Potential Technologies
http://www.potentialtech.com
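As a minimal sketch of what that change looks like (the table and file names here are invented), the generated SQL just gains a transaction wrapper so psql stops committing after every single INSERT:

    -- inserts.sql, as generated by a script along the lines described above
    BEGIN WORK;
    INSERT INTO access_log (user_id, url) VALUES (1, '/index.html');
    INSERT INTO access_log (user_id, url) VALUES (2, '/login.php');
    -- ... many more INSERTs ...
    COMMIT WORK;

    -- loaded with something like: psql -f inserts.sql mydb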
Bulk Record upload (was Re: [PERFORM] Increasing number of PG connections)
On Monday 02 February 2004 19:39, Qing Zhao wrote:

> I am new here. I have a question related to this in some way.

Hmm - no real connection I can see - might have been better to start a new thread rather than replying to this one. Also, it is usually considered best practice not to quote large amounts of the previous message if you're not replying to it.

> Our web site needs to upload a large volume of data into Postgres at a
> time. The performance deteriorates as the number of rows becomes larger.
> When it reaches 2500 rows, it never comes back to the GUI. Since the tests
> were run through the GUI, my suspicion is that it might be caused by the
> way the application server talks to the Postgres server, the connections,
> etc. What might be the factors involved here? Does anyone know?

You don't really give us enough information. What GUI are you talking about? How are you loading this data - as a series of INSERT statements, a text file with separators, from Access/MySQL, etc.?

In general, the fastest way to add a large number of rows is via the COPY sql command. Next best is to batch your inserts together into larger transactions of say 100-1000 inserts.

Two other things to be aware of are: use of VACUUM/ANALYZE and configuration tuning (see http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php).

PG shouldn't have a problem with inserting a few thousand rows, so I suspect it's something to do with your application/GUI setup.

Hope that helps; if not, try turning on statement logging for PG and then we can see what commands your GUI is sending.

--
Richard Huxton
Archonet Ltd
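A hedged sketch of the two loading styles Richard mentions; the table, columns, and file path are invented for illustration. Note that COPY ... FROM a file path reads the file on the server, while psql's \copy does the same thing from the client side:

    -- fastest: bulk load from a tab-separated file
    COPY uploads (id, name, amount) FROM '/tmp/uploads.dat';

    -- next best: batch ordinary INSERTs into larger transactions
    BEGIN;
    INSERT INTO uploads (id, name, amount) VALUES (1, 'first', 10.00);
    -- ... a few hundred to a thousand rows per transaction ...
    COMMIT;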
inserting large number of rows was: Re: [PERFORM] Increasing number of PG connections.
On Mon, 2 Feb 2004, Qing Zhao wrote:

> I am new here. I have a question related to this in some way.
>
> Our web site needs to upload a large volume of data into Postgres at a
> time. The performance deteriorates as the number of rows becomes larger.
> When it reaches 2500 rows, it never comes back to the GUI. Since the tests
> were run through the GUI, my suspicion is that it might be caused by the
> way the application server talks to the Postgres server, the connections,
> etc. What might be the factors involved here? Does anyone know?

Actually, I'm gonna go out on a limb here and assume two things:

1. you've got lotsa fk/pk relationships set up.
2. you're analyzing the table empty before loading it up.

What happens in this instance is that the analyze on an empty, or nearly so, table means that during the inserts, postgresql thinks you have only a few rows. At first, this is fine, as pgsql will seq scan the tables to make sure there is a proper key in both. As the number of rows increases, the planner needs to switch to index scans but doesn't, because it doesn't know that the number of rows is increasing.

Fix: insert a few hundred rows, run analyze, check to see if the explain for inserts is showing index scans or not. If not, load a few more hundred rows, analyze, rinse, repeat.

Also, look for fk/pk mismatches, i.e. an int4 field pointing to an int8 field. That's a performance killer, so if the pk/fk types don't match, see if you can change your field types to match and try again.
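A short sketch of the load / analyze / check loop being described, with invented table names; the EXPLAIN here merely approximates the lookup the FK trigger performs against the referenced table:

    -- after loading a few hundred rows:
    ANALYZE parent;
    ANALYZE child;

    -- does the referenced-key lookup use the index yet, or still a seq scan?
    EXPLAIN ANALYZE SELECT 1 FROM parent WHERE id = 42;

    -- and check that the FK and PK column types match
    -- (e.g. both int4, not an int4 column referencing an int8 key)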
Re: [PERFORM] [pgsql-advocacy] MySQL+InnoDB vs. PostgreSQL test?
Josh,

I evaluated MySQL + InnoDB briefly for a project once. I didn't get very far because of some severe limitations in MySQL.

I had to import all of the data from an existing database (MS SQL). One of the tables was about 8 million rows, 10 fields, and had 5 indexes. I found it quite impossible to import into MySQL. I would import the data into a table with no indexes, then perform a bunch of manipulation on it (I wasn't just converting from MS SQL, but also needed to alter quite a bit of the structure). After the manipulation, I would drop some columns and build the indexes. It took MySQL over 4 days to do this!

What I found out was that any DDL change to a table in MySQL actually does this: create a new table, copy all of the data over, then drop the old table and rename the new one. Whenever I added a new index, MySQL would go through the process of rebuilding each previous index. Same thing when adding or dropping columns. I could not find a way to import all of the data in a reasonable amount of time. For comparison, it took less than 45 minutes to import all of the data into PostgreSQL (that's ALL of the data, not just that one table).

Needless to say (but I'll say it anyway :-), I didn't get any farther in my evaluation; there was no point.

One more thing that annoyed me: if you started a process, such as a large DDL operation, or heaven forbid, a cartesian join (what? I never do that!), there's no way to cancel it with InnoDB. You have to wait for it to finish. Hitting ctrl+c in their command line tool only kills the command line tool; the process continues. Even if you stop the database and restart it (including with a hard boot), it will pick right up where it left off and continue. That proved to be way too much of a pain for me.

Disclaimer: I'm not a real MySQL expert, or anything. There could be ways of getting around this, but after two weeks of trying, I decided to give up. It only took me a few hours to build the requisite PostgreSQL scripts and I never looked back.

Adam Ruth

On Feb 2, 2004, at 10:21 AM, Josh Berkus wrote:

> Folks,
>
> I've had requests from a couple of businesses to see results of informal
> MySQL+InnoDB vs. PostgreSQL tests. I know that we don't have the setup to
> do full formal benchmarking, but surely someone in our community has gone
> head-to-head on your own application?
>
> --
> -Josh Berkus
> Aglio Database Solutions
> San Francisco
Re: [PERFORM] [pgsql-advocacy] MySQL+InnoDB vs. PostgreSQL test?
On Mon, 2004-02-02 at 12:21, Josh Berkus wrote:

> Folks,
>
> I've had requests from a couple of businesses to see results of informal
> MySQL+InnoDB vs. PostgreSQL tests. I know that we don't have the setup to
> do full formal benchmarking, but surely someone in our community has gone
> head-to-head on your own application?

We have the setup to do informal benchmarking via OSDL, but AFAIK MySQL doesn't conform to any of the dbt benchmarks...

Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
Re: [PERFORM] Increasing number of PG connections.
I am new here. I have a question related to this in some way.

Our web site needs to upload a large volume of data into Postgres at a time. The performance deteriorates as the number of rows becomes larger. When it reaches 2500 rows, it never comes back to the GUI. Since the tests were run through the GUI, my suspicion is that it might be caused by the way the application server talks to the Postgres server, the connections, etc. What might be the factors involved here? Does anyone know?

Thanks a lot!

Qing

On Feb 2, 2004, at 11:14 AM, Kevin Barnard wrote:

> I am running a Dual Xeon hyperthreaded server with 4GB RAM and RAID-5. The
> only thing running on the server is Postgres running under Fedora. I have
> a 700 connection limit.
>
> The DB is set up as a backend for a very high volume website. Most of the
> queries are simple, such as logging accesses, user login verification, etc.
> There are a few bigger things such as reporting, but for the most part each
> transaction lasts less than a second. The connections are not persistent
> (I'm using pg_connect in PHP).
>
> The system was at 2 GB with a 400 connection limit. We ran into problems
> because we hit the limit of connections during high volume.
>
> 1. Does 400 connections sound consistent with the 2GB of RAM? Does 700
> sound good with 4 GB? I've read a little on optimizing postgres. Is there
> anything else I can do, maybe OS wise, to increase how many connections I
> get before I start swapping?
>
> 2. Are there any clustering technologies that will work with postgres?
> Specifically I'm looking at increasing the number of connections.
>
> The bottom line is since the website launched (middle of January) we have
> increased the number of http connections and increased bandwidth allowances
> by over 10 times. The site continues to grow and we are looking at our
> options. Some of the ideas have been possible DB replication: write to the
> master and read from multiple slaves. Other ideas include increasing
> hardware.
>
> This is the biggest site I have ever worked with. Almost everything else
> fits in a T1 with a single DB server handling multiple sites. Does anybody
> with experience in this realm have any suggestions? Thank you in advance
> for whatever help you can provide.
>
> --
> Kevin Barnard
[PERFORM] pg_stat_activity
Quick question: the full query listed in pg_stat_activity is getting truncated. Does anyone know how I can see the full query in progress?

--
Orion Henry <[EMAIL PROTECTED]>
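For reference, a sketch of how the view is usually read in this era of PostgreSQL; the column names (procpid, current_query) and the '<IDLE>' marker are from memory of the 7.x catalogs, and stats_command_string has to be on for current_query to be populated at all. As far as I know the truncation itself comes from a compile-time limit on the stored command string, so this query will still show only the shortened text:

    -- assumes 7.x-era column names; idle backends show '<IDLE>' here, if I remember right
    SELECT procpid, usename, current_query
    FROM pg_stat_activity
    WHERE current_query <> '<IDLE>';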
Re: [PERFORM] "Overlaping" indexes
On Mon, 2004-02-02 at 13:43, Tomasz Myrta wrote:

> On 2004-02-02 19:30, scott.marlowe wrote:
>
> > Not entirely, since it only has to sort two columns, it will be smaller,
> > and will therefore be somewhat faster.
>
> Can you say something more about it? Will it be enough faster to keep
> them both? Did anyone make such tests?

You can actually come up with test cases where both indexes are useful. The three-column index will have more data to sift through. That said, having both indexes in use means there is less RAM available for cache.

The biggest mistake I see is people doing everything they can to optimize a single query, then they optimize the next query, etc. When you consider the entire set of queries, those two indexes are very likely to slow select throughput down due to increased memory requirements and the system hitting disk a little more often.

It's similar to the mistake of benchmarking a set of 1000-row tables and optimizing memory settings for that, then using that configuration on the 10M-row tables in production.
[PERFORM] Increasing number of PG connections.
I am running a Dual Xeon hyperthreaded server with 4GB RAM and RAID-5. The only thing running on the server is Postgres running under Fedora. I have a 700 connection limit.

The DB is set up as a backend for a very high volume website. Most of the queries are simple, such as logging accesses, user login verification, etc. There are a few bigger things such as reporting, but for the most part each transaction lasts less than a second. The connections are not persistent (I'm using pg_connect in PHP).

The system was at 2 GB with a 400 connection limit. We ran into problems because we hit the limit of connections during high volume.

1. Does 400 connections sound consistent with the 2GB of RAM? Does 700 sound good with 4 GB? I've read a little on optimizing postgres. Is there anything else I can do, maybe OS wise, to increase how many connections I get before I start swapping?

2. Are there any clustering technologies that will work with postgres? Specifically I'm looking at increasing the number of connections.

The bottom line is since the website launched (middle of January) we have increased the number of http connections and increased bandwidth allowances by over 10 times. The site continues to grow and we are looking at our options. Some of the ideas have been possible DB replication: write to the master and read from multiple slaves. Other ideas include increasing hardware.

This is the biggest site I have ever worked with. Almost everything else fits in a T1 with a single DB server handling multiple sites. Does anybody with experience in this realm have any suggestions? Thank you in advance for whatever help you can provide.

--
Kevin Barnard
Re: [PERFORM] "Overlaping" indexes
On Mon, 2 Feb 2004, Tomasz Myrta wrote:

> On 2004-02-02 15:46, Rigmor Ukuhe wrote:
>
> > Hi,
> >
> > I have many indexes somehow overlapping, like:
> > ... btree ("STATUS", "VISIBLE", "NP_ID");
> > ... btree ("STATUS", "VISIBLE");
> >
> > Is the performance gained by the "more exact" index worth the overhead of
> > managing the indexes?
>
> The second (2-column) index is useless - its function is well covered by
> the first one (3 columns).

That really depends on the distribution of the third column. If there are only a couple of values in the third column, no big deal. If each entry is unique, and it's a large table, very big deal.

It is only useful to have a three-column index if you actually use it. If you have an index on (a,b,c) and run a select ordered by b, the index won't get used unless the a part is in the where clause.

The other issue is updates. IT WILL cost more to update two indexes rather than one. Generally, you can drop / re-add the index and use explain analyze on one of your own queries to see if that helps.
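A sketch of the drop / re-test / re-add experiment scott describes, reusing the column names from the original post; the index and table names themselves are invented, since the post doesn't give them:

    -- drop the two-column index and re-test a representative query
    DROP INDEX status_visible_idx;
    EXPLAIN ANALYZE
    SELECT * FROM some_table
    WHERE "STATUS" = 1 AND "VISIBLE" = true;

    -- if the plan or the timings got worse, put it back
    CREATE INDEX status_visible_idx ON some_table ("STATUS", "VISIBLE");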
Re: [PERFORM] "Overlaping" indexes
On 2004-02-02 19:30, scott.marlowe wrote:

> Not entirely, since it only has to sort two columns, it will be smaller,
> and will therefore be somewhat faster.

Can you say something more about it? Will it be enough faster to be worth keeping them both? Has anyone run such tests?

Regards,
Tomasz Myrta
Re: [PERFORM] "Overlaping" indexes
On Mon, 2 Feb 2004, Tomasz Myrta wrote:

> On 2004-02-02 15:46, Rigmor Ukuhe wrote:
>
> > Hi,
> >
> > I have many indexes somehow overlapping, like:
> > ... btree ("STATUS", "VISIBLE", "NP_ID");
> > ... btree ("STATUS", "VISIBLE");
> >
> > Is the performance gained by the "more exact" index worth the overhead of
> > managing the indexes?
>
> The second (2-column) index is useless - its function is well covered by
> the first one (3 columns).

Not entirely: since it only has to sort two columns, it will be smaller, and will therefore be somewhat faster. On the other hand, I've seen a lot of folks create multi-column indexes who didn't really understand how they work in PostgreSQL.
[PERFORM] Mainframe Linux + PostgreSQL
Folks,

Is anyone on this list using PostgreSQL on a mini or mainframe platform? If so, drop me a line. Thanks!

--
-Josh Berkus
Aglio Database Solutions
San Francisco
[PERFORM] MySQL+InnoDB vs. PostgreSQL test?
Folks,

I've had requests from a couple of businesses to see results of informal MySQL+InnoDB vs. PostgreSQL tests. I know that we don't have the setup to do full formal benchmarking, but surely someone in our community has gone head-to-head on your own application?

--
-Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] "Overlaping" indexes
On 2004-02-02 15:46, Rigmor Ukuhe wrote:

> Hi,
>
> I have many indexes somehow overlapping, like:
> ... btree ("STATUS", "VISIBLE", "NP_ID");
> ... btree ("STATUS", "VISIBLE");
>
> Is the performance gained by the "more exact" index worth the overhead of
> managing the indexes?

The second (2-column) index is useless - its function is well covered by the first one (3 columns).

Regards,
Tomasz Myrta
[PERFORM] "Overlaping" indexes
Hi,

I have many indexes that somehow overlap, like:

... btree ("STATUS", "VISIBLE", "NP_ID");
... btree ("STATUS", "VISIBLE");

Is the performance gained by the "more exact" index worth the overhead of managing both indexes?

Rigmor Ukuhe
Finestmedia Ltd
Re: [PERFORM] views?
On Saturday 31 January 2004 04:35, Neil Conway wrote:

> This is called a "materialized view". PostgreSQL doesn't support them
> yet, but most people think it would be a Good Thing to have.

There is a project on gborg (called "mview", IIRC), though I don't know how far it's got - I think it's still pretty new.

--
Richard Huxton
Archonet Ltd
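Until real materialized views (or the gborg project) arrive, the usual hand-rolled substitute is a summary table refreshed on a schedule; a sketch with invented table names:

    -- build the "materialized" result once
    CREATE TABLE sales_summary AS
    SELECT product_id, sum(amount) AS total_amount
    FROM sales
    GROUP BY product_id;

    -- refresh it periodically, e.g. from cron
    BEGIN;
    DELETE FROM sales_summary;
    INSERT INTO sales_summary
    SELECT product_id, sum(amount) FROM sales GROUP BY product_id;
    COMMIT;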