Re: [PERFORM] libgcc double-free, backend won't die
James Mansion wrote: I think you have your head in the ground, but its your perogative. *You* might not care, but anyone wanting to use thread-aware libraries (and I'm *not* talking about threading in any Postgres code) will certainly value it if they can do so with some stability. I suggest you find out the cause of your problem and then we can do more research. Talking about us changing the Postgres behavior from the report of one user who doesn't even have the full details isn't productive. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] viewing source code
Alvaro Herrera pointed out that pg_read_file requires superuser access which these users won't have so revoking access to the function code should be possible. Joshua D. Drake suggested revoking pg_proc but that isn't the source code, it just has the definition of the functions. If it isn't a feature today, what table has the source code in it? Maybe I can revoke that. Jon -Original Message- From: Jonah H. Harris [mailto:[EMAIL PROTECTED] Sent: Friday, December 14, 2007 3:04 PM To: Bill Moran Cc: Joshua D. Drake; Roberts, Jon; pgsql-performance@postgresql.org Subject: Re: [PERFORM] viewing source code On Dec 14, 2007 2:03 PM, Bill Moran [EMAIL PROTECTED] wrote: I disagree here. If they're connecting remotely to PG, they have no direct access to the disk. pg_read_file? -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation| fax: 732.331.1301 499 Thornall Street, 2nd Floor | [EMAIL PROTECTED] Edison, NJ 08837| http://www.enterprisedb.com/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] viewing source code
On Dec 17, 2007 8:11 AM, Roberts, Jon [EMAIL PROTECTED] wrote: Alvaro Herrera pointed out that pg_read_file requires superuser access which these users won't have so revoking access to the function code should be possible. Joshua D. Drake suggested revoking pg_proc but that isn't the source code, it just has the definition of the functions. If it isn't a feature today, what table has the source code in it? Maybe I can revoke that. the table is pg_proc. you have to revoke select rights from public and the user of interest. be aware this will make it very difficult for that user to do certain things in psql and (especially) pgadmin. it works. a better solution to this problem is to make a language wrapper for pl/pgsql that encrypts the source on disk. afaik, no one is working on th is. it would secure the code from remote users but not necessarily from people logged in to the server. the pg_proc hack works ok though. merlin ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] VACUUM FREEZE output more than double input
On Fri, Dec 14, 2007 at 5:59 PM, in message [EMAIL PROTECTED], Tom Lane [EMAIL PROTECTED] wrote: Kevin Grittner [EMAIL PROTECTED] writes: Why double writes per read, plus massive writes at checkpoint? The double writes aren't surprising: freezing has to be WAL-logged, and the odds are that each page hasn't been touched since the last checkpoint, so the WAL log will include a complete page image. So in the steady state where all shared buffers are dirty, the per-page cycle is: * write out a dirty buffer so it can be reclaimed * read in a page * modify it to mark tuples frozen * write an image of the page to WAL * leave the now-dirty page in shared buffers for later writing The checkpoint spikes would come from trying to flush out all the dirty buffers at once. Got it. Thanks. You'd expect a bit of a valley after each peak, since the vacuum could presumably recycle some buffers without having to flush 'em first; but I don't see one in your data. That may just be because the numbers are too noisy, but I kinda suspect that the vacuum is dirtying buffers nearly as fast as the bgwriter can clean them, leaving not a lot of daylight for a valley. Yeah, the pattern was pretty consistent and without valleys. 8.3 should pretty well eliminate the checkpoint spike in this scenario, because vacuum will work in a limited number of shared buffers instead of dirtying the whole cache. But you'll still see 2X writes over reads. Testing 8.3beta4 so far has shown both smoother I/O and better performance in all respects. The preliminary post I did where I thought I saw some regression on loading a pg_dump turned out to be was an apples to oranges comparison; comparing the same load on the same hardware and OS, 8.3 wins. (Kudos to all who worked on these improvements!) If this is data that you could re-generate at need, it might make sense to turn off full_page_writes during the initial data load and vacuum. Thanks for the suggestions; I'll try that. I concur with trying to FREEZE all the data while you do this, else you'll see the same work done whenever the data happens to slip past the auto freeze threshold. Thanks. I thought that made sense, but I'm still trying to get my head around some of the dynamics of PostgreSQL and MVCC. I'll suggest that as policy here. -Kevin ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] viewing source code
Roberts, Jon wrote: Alvaro Herrera pointed out that pg_read_file requires superuser access which these users won't have so revoking access to the function code should be possible. Joshua D. Drake suggested revoking pg_proc but that isn't the source code, it just has the definition of the functions. Actually I suggested using a obfuscation module. If it isn't a feature today, what table has the source code in it? Maybe I can revoke that. If your pl is perl or plpgsql it will be in the prosrc (pro_src?) column in pg_proc. Joshua D. Drake Jon -Original Message- From: Jonah H. Harris [mailto:[EMAIL PROTECTED] Sent: Friday, December 14, 2007 3:04 PM To: Bill Moran Cc: Joshua D. Drake; Roberts, Jon; pgsql-performance@postgresql.org Subject: Re: [PERFORM] viewing source code On Dec 14, 2007 2:03 PM, Bill Moran [EMAIL PROTECTED] wrote: I disagree here. If they're connecting remotely to PG, they have no direct access to the disk. pg_read_file? -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation| fax: 732.331.1301 499 Thornall Street, 2nd Floor | [EMAIL PROTECTED] Edison, NJ 08837| http://www.enterprisedb.com/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] update 600000 rows
Note: I am resending this because the first never appeared after 40hrs. HH [EMAIL PROTECTED] wrote: Hello i have a python script to update 60 rows to one table from a csv file in my postgres database and it takes me 5 hours to do the transaction... I'm on debian etch with 8.1 postgres server on a 64 bits quad bi opteron. I have desactived all index except the primary key who is not updated since it's the reference column of the update too. When i run this script the server is not used by any other user. First when i run htop i see that the memory used is never more than 150 MB. I don't understand in this case why setting shmall and shmmax kernel's parameters to 16 GB of memory (the server has 32 GB) increase the rapidity of the transaction a lot compared to a shmall and shmax in (only) 2 GB ?! The script is run with only one transaction and pause by moment to let the time to postgres to write data to disk. If the data were writed at the end of the transaction will be the perfomance better ? i wan't that in production data regulary writed to disk to prevent loosinf of data but it there any interest to write temporary data in disk in a middle of a transaction ??? I'm completely noob to postgres and database configuration and help are welcome. thank You will get a huge improvement in time if you use batch updates instead of updating a row at a time. See: http://www.postgresql.org/docs/8.2/interactive/populate.html and http://www.postgresql.org/docs/8.2/interactive/sql-begin.html You will also get a big improvement if you can turn fsync off during the update. See: http://www.postgresql.org/docs/8.2/interactive/runtime-config-wal.html You also need to vacuum the table after doing that many updates since pg does a delete and insert on each update, there will be a lot of holes. Cheers HH ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PERFORM] Multi-threading friendliness (was: libgcc double-free, backend won't die)
Bruce Momjian wrote: James Mansion wrote: I think you have your head in the ground, but its your perogative. *You* might not care, but anyone wanting to use thread-aware libraries (and I'm *not* talking about threading in any Postgres code) will certainly value it if they can do so with some stability. I suggest you find out the cause of your problem and then we can do more research. Talking about us changing the Postgres behavior from the report of one user who doesn't even have the full details isn't productive. I think you're confusing James Mansion with me (Craig James). I'm the one with the unresolved problem. James is suggesting, completely independently of whether or not there's a bug in my system, that a thread-friendly option for Postgres would be very useful. Don't confuse thread-friendly with a threaded implemetation of Postgres itself. These are two separate questions. Thread-friendly involves compile/link options that don't affect the Postgres source code at all. Craig ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Multi-threading friendliness
Craig James wrote: Don't confuse thread-friendly with a threaded implemetation of Postgres itself. These are two separate questions. Thread-friendly involves compile/link options that don't affect the Postgres source code at all. Indeed. I'm specifically not suggesting that Postgres should offer an API that can be called from anything except the initial thread of its process - just that library subsystems might want to use threads internally and that should be OK. Or rather, it should be possible to build Postgres so that its OK. Even if there's a small slowdown, the benefit of running the full JVM or CLR might outweigh that quite easily *in some circumstances*. I've also hinted that at some stage you might want to thread some parts of the implementation, but I'm not suggesting that would be an early target. It seems to me sensible to make it straightforward to take baby steps in that direction in future would be a reasonable thing to do. As would being friendly to dynamically loaded C++ code. If you create the framework, (and we're talking the barest of scaffolding) then others can work to show the cost/benefit. I fail to see why this would be a controversial engineering approach. James ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Evaluation of PG performance vs MSDE/MSSQL 2000 (not 2005)
Hi All, Thanks for all the help here. Sorry for the late update but we've found our problem and fixed it already. Prior to looking at the translated code more intently, I wanted to make sure that our environmental settings were acceptable and the various emails from members have confirmed that... In a nutshell it seems that MS SQL allows bad T-SQL code by optimizing and ignoring redundant/useless from and where clauses in an update statement whereas plpgsql will execute exactly what the code is asking it to do... We had several update instances in the T-SQL code that looked like this : update _tbl_tmp2 set LongBackPeriod = (select count (EPeriod) from _tbl_tmp1 where _tbl_tmp1.Row = _tbl_tmp2.Row); -- from _tbl_tmp2 tmp2, _tbl_tmp1 tmp1 where tmp2.Row = tmp1.Row; --- In T-SQL, the performance is the same whether the last two lines are there or not... In plpgsql, this is not the case the from and where clauses are not necessary and probably creates an internal (rather useless and time consuming) inner join in plpgsql which accounts for the original performance issue. I'm happy (actually ecstatic) to report that Win2kPro + PG performance is slightly faster than Win2kPro + MSSQL/MSDE. Linux(FC7) + PG 8.x performance seems to be 3x faster than Win2KPro + MSSQL/MSDE for our stored functions. Thanks for all the help! Am a believer now. :)
Re: [PERFORM] viewing source code
On Mon, 17 Dec 2007, Merlin Moncure wrote: the table is pg_proc. you have to revoke select rights from public and the user of interest. be aware this will make it very difficult for that user to do certain things in psql and (especially) pgadmin. it works. a better solution to this problem is to make a language wrapper for pl/pgsql that encrypts the source on disk. afaik, no one is working on th is. it would secure the code from remote users but not necessarily from people logged in to the server. the pg_proc hack works ok though. Another enhancement that would improve this situation would be to implement per column permissions as the sql spec has, so that you could revoke select on just the prosrc column and allow clients to retrieve the metadata they need. Kris Jurka ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings