Re: [PERFORM] libgcc double-free, backend won't die

2007-12-17 Thread Bruce Momjian
James Mansion wrote:
 I think you have your head in the ground, but its your perogative.
 *You* might not care, but anyone wanting to use thread-aware libraries
 (and I'm *not* talking about threading in any Postgres code) will
 certainly value it if they can do so with some stability.

I suggest you find out the cause of your problem and then we can do more
research.  Talking about us changing the Postgres behavior from the
report of one user who doesn't even have the full details isn't
productive.

--
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] viewing source code

2007-12-17 Thread Roberts, Jon
Alvaro Herrera pointed out that pg_read_file requires superuser access which
these users won't have so revoking access to the function code should be
possible.

Joshua D. Drake suggested revoking pg_proc but that isn't the source code,
it just has the definition of the functions.  

If it isn't a feature today, what table has the source code in it?  Maybe I
can revoke that.


Jon
 -Original Message-
 From: Jonah H. Harris [mailto:[EMAIL PROTECTED]
 Sent: Friday, December 14, 2007 3:04 PM
 To: Bill Moran
 Cc: Joshua D. Drake; Roberts, Jon; pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] viewing source code
 
 On Dec 14, 2007 2:03 PM, Bill Moran [EMAIL PROTECTED]
 wrote:
  I disagree here.  If they're connecting remotely to PG, they have no
  direct access to the disk.
 
 pg_read_file?
 
 --
 Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
 EnterpriseDB Corporation| fax: 732.331.1301
 499 Thornall Street, 2nd Floor  | [EMAIL PROTECTED]
 Edison, NJ 08837| http://www.enterprisedb.com/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] viewing source code

2007-12-17 Thread Merlin Moncure
On Dec 17, 2007 8:11 AM, Roberts, Jon [EMAIL PROTECTED] wrote:
 Alvaro Herrera pointed out that pg_read_file requires superuser access which
 these users won't have so revoking access to the function code should be
 possible.

 Joshua D. Drake suggested revoking pg_proc but that isn't the source code,
 it just has the definition of the functions.

 If it isn't a feature today, what table has the source code in it?  Maybe I
 can revoke that.

the table is pg_proc.  you have to revoke select rights from public
and the user of interest.  be aware this will make it very difficult
for that user to do certain things in psql and (especially) pgadmin.
it works.

a better solution to this problem is to make a language wrapper for
pl/pgsql that encrypts the source on disk. afaik, no one is working on
th is.  it would secure the code from remote users but not necessarily
from people logged in to the server.  the pg_proc hack works ok
though.

merlin

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] VACUUM FREEZE output more than double input

2007-12-17 Thread Kevin Grittner
 On Fri, Dec 14, 2007 at  5:59 PM, in message [EMAIL PROTECTED],
Tom Lane [EMAIL PROTECTED] wrote: 
 Kevin Grittner [EMAIL PROTECTED] writes:
 Why double writes per read, plus massive writes at checkpoint?
 
 The double writes aren't surprising: freezing has to be WAL-logged,
 and the odds are that each page hasn't been touched since the last
 checkpoint, so the WAL log will include a complete page image.
 So in the steady state where all shared buffers are dirty, the
 per-page cycle is:
   * write out a dirty buffer so it can be reclaimed
   * read in a page
   * modify it to mark tuples frozen
   * write an image of the page to WAL
   * leave the now-dirty page in shared buffers for later writing
 
 The checkpoint spikes would come from trying to flush out all the
 dirty buffers at once.
 
Got it.  Thanks.
 
 You'd expect a bit of a valley after each peak, since the vacuum
 could presumably recycle some buffers without having to flush 'em
 first; but I don't see one in your data.  That may just be because
 the numbers are too noisy, but I kinda suspect that the vacuum is
 dirtying buffers nearly as fast as the bgwriter can clean them,
 leaving not a lot of daylight for a valley.
 
Yeah, the pattern was pretty consistent and without valleys.
 
 8.3 should pretty well eliminate the checkpoint spike in this scenario,
 because vacuum will work in a limited number of shared buffers instead
 of dirtying the whole cache.  But you'll still see 2X writes over reads.
 
Testing 8.3beta4 so far has shown both smoother I/O and better
performance in all respects.  The preliminary post I did where I
thought I saw some regression on loading a pg_dump turned out to be
was an apples to oranges comparison; comparing the same load on
the same hardware and OS, 8.3 wins.  (Kudos to all who worked on
these improvements!)
 
 If this is data that you could re-generate at need, it might make sense
 to turn off full_page_writes during the initial data load and vacuum.
 
Thanks for the suggestions; I'll try that.
 
 I concur with trying to FREEZE all the data while you do this, else
 you'll see the same work done whenever the data happens to slip past
 the auto freeze threshold.
 
Thanks.  I thought that made sense, but I'm still trying to get my
head around some of the dynamics of PostgreSQL and MVCC.  I'll
suggest that as policy here.
 
-Kevin
 


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] viewing source code

2007-12-17 Thread Joshua D. Drake

Roberts, Jon wrote:

Alvaro Herrera pointed out that pg_read_file requires superuser access which
these users won't have so revoking access to the function code should be
possible.

Joshua D. Drake suggested revoking pg_proc but that isn't the source code,
it just has the definition of the functions.  


Actually I suggested using a obfuscation module.



If it isn't a feature today, what table has the source code in it?  Maybe I
can revoke that.


If your pl is perl or plpgsql it will be in the prosrc (pro_src?) column 
in pg_proc.


Joshua D. Drake




Jon

-Original Message-
From: Jonah H. Harris [mailto:[EMAIL PROTECTED]
Sent: Friday, December 14, 2007 3:04 PM
To: Bill Moran
Cc: Joshua D. Drake; Roberts, Jon; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] viewing source code

On Dec 14, 2007 2:03 PM, Bill Moran [EMAIL PROTECTED]
wrote:

I disagree here.  If they're connecting remotely to PG, they have no
direct access to the disk.

pg_read_file?

--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation| fax: 732.331.1301
499 Thornall Street, 2nd Floor  | [EMAIL PROTECTED]
Edison, NJ 08837| http://www.enterprisedb.com/


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] update 600000 rows

2007-12-17 Thread H. Hall

Note: I am resending this because the first never appeared after 40hrs.
HH

[EMAIL PROTECTED] wrote:

Hello

i have a python script to update 60 rows to one table from a csv file in my
postgres database and it takes me 5 hours to do the transaction...

I'm on debian etch with 8.1 postgres server on a 64 bits quad bi opteron.

I have desactived all index except the primary key who is not updated since it's
the reference column of the update too.

When i run this script the server is not used by any other user.

First when i run htop i see that the memory used is never more than 150 MB.
I don't understand in this case why setting shmall and shmmax kernel's
parameters to 16 GB of memory (the server has 32 GB) increase the rapidity of
the transaction a lot compared to a shmall and shmax in (only) 2 GB ?!

The script is run with only one transaction and pause by moment to let the time
to postgres to write data to disk.

If the data were writed at the end of the transaction will be the perfomance
better ? i wan't that in production data regulary writed to disk to prevent
loosinf of data but it there any interest to write temporary data in disk in a
middle of a transaction ???

I'm completely noob to postgres and database configuration and help  are
welcome.

thank


You will get a huge improvement in time if you use batch updates instead
of updating a row at a time. See:

   http://www.postgresql.org/docs/8.2/interactive/populate.html
   and
   http://www.postgresql.org/docs/8.2/interactive/sql-begin.html

You will also get a big improvement if you can turn fsync off during the
update. See:
   http://www.postgresql.org/docs/8.2/interactive/runtime-config-wal.html

You also need to vacuum the table after doing that many updates since pg
does a delete and insert on each update, there will be a lot of holes.

Cheers
HH





---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[PERFORM] Multi-threading friendliness (was: libgcc double-free, backend won't die)

2007-12-17 Thread Craig James

Bruce Momjian wrote:

James Mansion wrote:

I think you have your head in the ground, but its your perogative.
*You* might not care, but anyone wanting to use thread-aware libraries
(and I'm *not* talking about threading in any Postgres code) will
certainly value it if they can do so with some stability.


I suggest you find out the cause of your problem and then we can do more
research.  Talking about us changing the Postgres behavior from the
report of one user who doesn't even have the full details isn't
productive.


I think you're confusing James Mansion with me (Craig James).  I'm the one with 
the unresolved problem.

James is suggesting, completely independently of whether or not there's a bug 
in my system, that a thread-friendly option for Postgres would be very useful.

Don't confuse thread-friendly with a threaded implemetation of Postgres itself. 
 These are two separate questions.  Thread-friendly involves compile/link 
options that don't affect the Postgres source code at all.

Craig

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] Multi-threading friendliness

2007-12-17 Thread James Mansion

Craig James wrote:
Don't confuse thread-friendly with a threaded implemetation of 
Postgres itself.  These are two separate questions.  Thread-friendly 
involves compile/link options that don't affect the Postgres source 
code at all.
Indeed.  I'm specifically not suggesting that Postgres should offer an 
API that can be called from
anything except the initial thread of its process - just that library 
subsystems might want to use
threads internally and that should be OK. Or rather, it should be 
possible to build Postgres
so that its OK.  Even if there's a small slowdown, the benefit of 
running the full JVM or CLR

might outweigh that quite easily *in some circumstances*.

I've also hinted that at some stage you might want to thread some parts 
of the implementation,
but I'm not suggesting that would be an early target.  It seems to me 
sensible to make it
straightforward to take baby steps in that direction in future would be 
a reasonable thing to
do.  As would being friendly to dynamically loaded C++ code.  If you 
create the framework,
(and we're talking the barest of scaffolding) then others can work to 
show the cost/benefit.


I fail to see why this would be a controversial engineering approach.

James


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Evaluation of PG performance vs MSDE/MSSQL 2000 (not 2005)

2007-12-17 Thread Robert Bernabe
Hi All,
Thanks for all the help here. Sorry for the late update but we've found our 
problem and fixed it already. Prior to looking at the translated code more  
intently, I wanted to make sure that our environmental settings were acceptable 
and the various emails from members have confirmed that...

In a nutshell it seems that MS SQL allows bad T-SQL code by optimizing and 
ignoring redundant/useless from and where clauses in an update statement 
whereas plpgsql will execute exactly what the code is asking it to do...

We had several update instances in the T-SQL code that looked like this :

update _tbl_tmp2
set LongBackPeriod = (select count (EPeriod) from _tbl_tmp1 where 
_tbl_tmp1.Row = _tbl_tmp2.Row);
--
from _tbl_tmp2 tmp2, _tbl_tmp1 tmp1
where tmp2.Row = tmp1.Row;
---

In T-SQL, the performance is the same whether the last two lines are there or 
not...

In plpgsql, this is not the case the from and where clauses are not necessary 
and probably creates an internal (rather useless and time consuming) inner join 
in plpgsql which accounts for the original performance issue.

I'm happy (actually ecstatic) to report that Win2kPro + PG performance is 
slightly faster than  Win2kPro + MSSQL/MSDE. 

Linux(FC7) + PG 8.x performance seems to be 3x faster than Win2KPro + 
MSSQL/MSDE for our stored functions. 

Thanks for all the help! Am a believer now. :) 





Re: [PERFORM] viewing source code

2007-12-17 Thread Kris Jurka



On Mon, 17 Dec 2007, Merlin Moncure wrote:


the table is pg_proc.  you have to revoke select rights from public
and the user of interest.  be aware this will make it very difficult
for that user to do certain things in psql and (especially) pgadmin.
it works.

a better solution to this problem is to make a language wrapper for
pl/pgsql that encrypts the source on disk. afaik, no one is working on
th is.  it would secure the code from remote users but not necessarily
from people logged in to the server.  the pg_proc hack works ok
though.



Another enhancement that would improve this situation would be to 
implement per column permissions as the sql spec has, so that you could 
revoke select on just the prosrc column and allow clients to retrieve the 
metadata they need.


Kris Jurka

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings