Re: [PERFORM] SSD database benchmarks
On 29.04.2008, at 12:55, Greg Smith wrote:

> This is the best write-up I've seen yet on quantifying what SSDs are
> good and bad at in a database context:
> http://www.bigdbahead.com/?p=37

They totally missed "mainly write" applications, which is what most of my applications are. Reads in an OLTP setup typically come from a cache (or, as in our case, from an index like Solr), while the writes go through to the database. So you might get some decent IO from the SSD right after the database has started up with cold caches, but as long as your cache hit ratio is good, it doesn't matter anymore after a couple of minutes.

Nevertheless it's an interesting development.

cug
Re: [PERFORM] Training Recommendations
On 02.12.2007, at 06:30, Merlin Moncure wrote:

> I've been dying to know if anyone has ever done PostgreSQL training
> at 'the big nerd ranch'.

There are a couple of reviews floating around the web:

http://www.linux.com/articles/48870
http://www.linuxjournal.com/article/7847

I was in the course too (out of interest), but as I'm with Big Nerd Ranch, I don't want to say anything here about the course.

cug
Re: [PERFORM] LIKE search and performance
On 23.05.2007, at 09:08, Andy wrote:

> I have a table with varchar and text columns, and I have to search
> through these texts in the whole table. An example would be:
>
> SELECT * FROM table WHERE name like '%john%' or street like '%srt%'
>
> Anyway, the query planner always does a seq scan on the whole table
> and that takes some time. How can this be optimized or done in
> another way to be faster?

The problem is that normal indexes cannot be used for "contains" queries. If you need full-text search capabilities, you have to take a look at tsearch2 or an external search engine like Lucene.

cug
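As a rough illustration of the tsearch2 route - a sketch only, with made-up table and column names, assuming the tsearch2 contrib module is already installed in the database:

-- maintain a tsvector column and index it with GiST
-- (a trigger to keep fti current on insert/update is omitted here)
ALTER TABLE addresses ADD COLUMN fti tsvector;
UPDATE addresses
   SET fti = to_tsvector('default',
             coalesce(name, '') || ' ' || coalesce(street, ''));
CREATE INDEX addresses_fti_idx ON addresses USING gist (fti);

-- word searches can now use the index instead of a seq scan
SELECT * FROM addresses WHERE fti @@ to_tsquery('default', 'john');

Note that this matches whole words (lexemes), not arbitrary substrings, so it replaces '%john%'-style searches only when whole words are what you are really after.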
Re: [PERFORM] Postgres Benchmark Results
On 21.05.2007, at 23:51, Greg Smith wrote:

> The standard pgbench transaction includes a select, an insert, and
> three updates.

I see. I didn't know that, but it makes sense.

> Unless you went out of your way to turn it off, your drive is caching
> writes; every Seagate SATA drive I've ever seen does by default.
> "1062 tps with 3-4 clients" just isn't possible with your hardware
> otherwise.

Btw: it wasn't my hardware in this test!

cug
Re: [PERFORM] Postgres Benchmark Results
On 21.05.2007, at 15:01, Jim C. Nasby wrote:

> I'd be willing to bet money that the drive is lying about
> commits/fsync. Each transaction committed essentially requires one
> revolution of the drive with pg_xlog on it, so a 15kRPM drive limits
> you to 250 TPS.

Yes, that's right, but if a lot of the transactions are selects, there is no entry in the pg_xlog for them, and most of the data can come from the cache - read from memory, which is blazingly fast compared to any disk. And this was a pgbench test - I don't know what the benchmark really does, but if I remember correctly it is mostly reading.

cug
Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4
On 18.05.2007, at 10:21, Kenneth Marshall wrote:

> It is arguable that updating the DB software version in an enterprise
> environment requires exactly that: check all production queries on
> the new software to identify any issues. In part, this is brought on
> by the very tuning that you performed against the previous software.
> Restore the 8.1 DB into 8.2. Then run the queries against both
> versions to evaluate functioning and timing.

And it is always a good idea to do this in a "clean room environment", aka a test server, and to set the logging in PostgreSQL to log all queries longer than xx ms. If you first install 8.1 on the test machine, do a test run and then upgrade to 8.2, you can compare the results from the tests and find the queries that got slower or faster quite easily.

cug
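In postgresql.conf that kind of logging is a one-liner; the 250 ms threshold below is just an arbitrary example standing in for the "xx":

# log every statement that runs longer than 250 milliseconds
log_min_duration_statement = 250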
Re: [PERFORM] Slow Postgresql server
On 12.04.2007, at 15:58, Jason Lustig wrote:

> Wow! That's a lot to respond to. Let me go through some of the
> ideas... First, I just turned on autovacuum, I forgot to do that. I'm
> not seeing a major impact though. Also, I know that it's not optimal
> for a dedicated server.

Hmm, why not? Have you recently vacuumed your db manually so it gets cleaned up? Even a vacuum full might be useful if the db is really bloated.

> It's not just for postgres, it's also got our apache server on it.
> We're just getting started and didn't want to make the major
> investment right now in getting the most expensive server we can get.

Hmmm, but more RAM would definitely make sense, especially in that scenario. It really sounds like your machine is swapping to death. What does the system say about memory usage?

> Some of the queries are definitely making an impact on the speed. We
> are constantly trying to improve performance, and part of that is
> reassessing our indexes and denormalizing data where it would help.
> We're also doing work with memcached to cache the results of some of
> the more expensive operations.

Hmmm, that hurts you even more, as it uses RAM. I really don't think at the moment that it has something to do with PG itself, but with not enough memory for what you want to achieve.

What might help is connection pooling, so that not so many processes are created for the requests. It depends on your "middleware" what you can do about that. pgpool might be an option.

cug
Re: [PERFORM] Slow Postgresql server
On 12.04.2007, at 08:59, Ron wrote:

> 1= Unless I missed something, the OP described pg being used as a
> backend DB for a webserver.

Yep.

> I know the typical IO demands of that scenario better than I
> sometimes want to. :-(

Yep. Same here. ;-)

> 2= 1GB of RAM + effectively 1 160GB HD = p*ss poor DB IO support.

Absolutely right. It depends a little bit on the DB and website layout and on the actual requirements, but yes - it's not really a kick-ass machine ...

> Completely agree that at some point the queries need to be examined
> (ditto the table schema, etc), but this system is starting off in a
> Bad Place for its stated purpose IME. Some minimum stuff is obvious
> even w/o spending time looking at anything beyond the HW config.

Depends. As I said - if the whole DB fits into the remaining space, and a lot of website backend DBs do, it might just work out. But this seems not to be the case: either the site is chewing on seq scans all the time, which would cause the I/O, or it is bound by the lack of memory and swaps the whole time ... He has to find out.

cug
Re: [PERFORM] Slow Postgresql server
On 12.04.2007, at 07:26, Ron wrote:

> You need to buy RAM and HD.

Before he does that, wouldn't it be more useful to find out WHY he has so much IO?

Have I missed it, or has nobody suggested finding the slow queries? (When they produce much IO, they might be slow even with a high shared memory setting.)

So, my first idea is to turn on query logging for queries longer than xy milliseconds, "explain analyse" these queries and see whether a lot of seq scans are involved, which would explain the high IO.

Just an idea, perhaps I missed that step somewhere in the discussion ...

But yes, it might also be that the server is swapping - that's another thing to find out.

cug
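The "explain analyse" step would look roughly like this - a sketch; the table and filter are made up for illustration:

EXPLAIN ANALYSE SELECT * FROM orders WHERE customer_id = 42;

-- a line like
--   Seq Scan on orders (cost=...) (actual time=... rows=... loops=1)
-- in the output means the whole table is read for every such query;
-- an index on orders(customer_id) would turn that into an index scan.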
Re: [PERFORM] Large objects performance
On 04.04.2007, at 08:03, Alexandre Vasconcelos wrote:

> We have an application that signs documents and stores them
> somewhere. The file sizes may vary from Kb to Mb. Developers are
> arguing about the reasons to store files directly on the operating
> system's file system or in the database, as large objects. My boss is
> considering file system storage, because he is concerned about
> integrity and backup/restore corruptions. I'd like to know some
> reasons to convince them to store these files in PostgreSQL,
> including integrity, and of course, performance. I would like to know
> the disadvantages of file system storage as well.

It is not directly PostgreSQL related, but this might give you something to think about:

http://en.wikibooks.org/wiki/Programming:WebObjects/Web_Applications/Development/Database_vs_Filesystem

cug
Re: [PERFORM] scalability problem
On 30.03.2007, at 19:18, Christopher Browne wrote:

> 2. There are known issues with the combination of Xeon processors and
> PAE memory addressing; that sort of hardware tends to be *way* less
> speedy than the specs would suggest.

That is not true in general, as the current series of processors (Woodcrest and the like) is also called Xeon. You probably mean the Pentium IV-era Xeons.

cug
Re: [PERFORM] Performance of count(*)
On 22.03.2007, at 11:53, Steve Atkins wrote:

> As long as you're ordering by some row in the table then you can do
> that in straight SQL.
>
> select a, b, ts from foo where (stuff) and foo > X order by foo limit 10
>
> Then, record the last value of foo you read, and plug it in as X the
> next time around.

This only works if you have unique values in foo. Otherwise you might get "batch breaks" inside a list of rows with equal values for foo (see the sketch below for a common fix).

But: a web application that needs state and doesn't maintain it by itself (or inside the dev toolkit) is imho broken by design. How should the database store a "state" for a web app? It's only possible on the web app side: either the app is stateless - and so are the queries to the database; they have to be re-evaluated for every request, as the requests might come from totally different sources (users, ...) - or it is stateful and has to maintain the state itself, because only the app developer knows what information is needed for the "current state". This is why all web application toolkits have a "session" concept.

> I think the problem is more that most web developers aren't very good
> at using the database, and tend to fall back on simplistic, wrong,
> approaches to displaying the data. There's a lot of monkey-see,
> monkey-do in web UI design too, which doesn't help.

Sure. That is the other problem ... ;-) But, and I think this is much more important: most toolkits today free you from using the database directly and from writing lots and lots of lines of SQL code that instantly break when you switch the storage backend. It's just a question of the angle you look at it from.

cug
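A common fix for the non-unique case is to add the primary key as a tie-breaker - a sketch with made-up table and column names, where :last_ord and :last_id stand for the values remembered from the previous page. The row-value comparison needs 8.2 or later for SQL-standard semantics; on older releases spell it out as "ord > :last_ord OR (ord = :last_ord AND id > :last_id)":

SELECT a, b, ts
  FROM foo
 WHERE (ord, id) > (:last_ord, :last_id)
 ORDER BY ord, id
 LIMIT 10;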
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 05.03.2007, at 19:56, Alex Deucher wrote:

> Yes, I started setting that up this afternoon. I'm going to test that
> tomorrow and post the results.

Good - that may or may not give some insight into the actual bottleneck. You never know, but it seems to be one of the easiest things to find out ...

cug
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 01.03.2007, at 13:40, Alex Deucher wrote:

> I read several places that the SAN might be to blame, but testing
> with bonnie and dd indicates that the SAN is actually almost twice as
> fast as the scsi discs in the old sun server. I've tried adjusting
> just about every option in the postgres config file, but performance
> remains the same. Any ideas?

As mentioned last week: did you actually try to use the local drives for speed testing? It might be that the SAN introduces latency, especially for random access, that you don't see on local drives.

cug
Re: [PERFORM] strange performance regression between 7.4 and 8.1
On 02.03.2007, at 14:20, Alex Deucher wrote:

> Ah OK. I see what you are saying; thank you for clarifying. Yes, the
> SAN is configured for maximum capacity; it has large RAID 5 groups.
> As I said earlier, we never intended to run a DB on the SAN, it just
> happened to come up, hence the configuration.

So why not dump the data once, import it into a PG instance configured to use local discs (or even ONE local disc - you should have the 16GB you gave as the size of the db free on the local machine, right?), and test whether the problem is with PG connecting to the SAN. Then you have one factor less to consider after all your changes.

Maybe it's just that something in the chain from PG to the actual HD spindles kills your random access performance for fetching the actual rows.

cug
Re: [PERFORM] Seqscan/Indexscan still a known issue?
On 27.01.2007, at 00:35, Russell Smith wrote:

> Guess 1 would be that your primary key is int8, but I can't be
> certain that is what's causing the problem.

Why could that be a problem?

cug
Re: [PERFORM] New to PostgreSQL, performance considerations
On 13.12.2006, at 19:03, Ron wrote:

> What I find interesting is that so far Guido's C2D Mac laptop has
> gotten the highest values by far in this set of experiments, and no
> one else is even close.

This might be because I have tested with fsync=off, as my internal harddrive would otherwise be the limiting factor and the results wouldn't be really helpful. Perhaps it's still the IO system, I don't know. I can try to reproduce the tests as closely as possible again. Perhaps I had different settings on something, but I doubt that. The new Core-series CPUs from Intel are extremely fast with PostgreSQL.

> Anyone got a 2.33 GHz C2D box with a decent HD IO subsystem more
> representative of a typical DB server hooked up to it?

I also now have an Xserve with two dual-core Xeons and two SAS drives (15k Seagates) in a mirrored RAID here. I will do some testing tomorrow.

Btw: I always compare only to my own results so I have something comparable - same test, same scripts, same db version, same operating system and so on. The rest is just pure interest.

cug
Re: [PERFORM] New to PostgreSQL, performance considerations
On 12.12.2006, at 02:37, Michael Stone wrote:

> Can anyone else reproduce these results? I'm on similar hardware
> (2.5GHz P4, 1.5G RAM) and my test results are more like this:

I'm on totally different hardware/software (MacBook Pro 2.33GHz C2D) and I can't reproduce the results. I have played with a lot of settings in the CFLAGS, including -march and -O3 and -O2 - there is no significant difference in the tests. With fsync=off I get around 2100 tps on average with all the different settings I have tested. I tried to keep the rest of the setup as similar as possible to the one Daniel described.

It might be that the crappy Pentium 4 needs some special handling, but I can't get the Core 2 Duo in my laptop to produce different tps numbers with the different optimizations.

Btw: the best results were 2147 with -march=i686 and 2137 with -march=nocona. Both with -O3.

cug
Re: [PERFORM] shared_buffers > 284263 on OS X
On 27.11.2006, at 17:05, AgentM wrote:

> There is a known unfortunate limitation on Darwin for SysV shared
> memory which, incidentally, does not afflict POSIX or mmap'd shared
> memory.

Hmmm. The article from Chris you have linked to does not mention the size of the memory segment you can allocate. Nevertheless - if you compile a 32-bit binary, there is the limitation Brian mentioned. You can easily simulate this with a small C program that allocates memory: compiled as a 64-bit binary - no problem; compiled as 32-bit - crash.

cug
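Such a test program could look roughly like this - a sketch only; it allocates until malloc() gives up and reports the total. Compile it once 32-bit and once 64-bit (e.g. gcc -m32 vs. gcc -m64) to see the difference:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const size_t chunk = 1024 * 1024;  /* request 1 MB at a time */
    size_t total_mb = 0;

    /* allocate until the allocator refuses; the memory is never freed
       on purpose - the process exits anyway. The 64 GB cap makes sure
       a 64-bit build terminates instead of running into swap. */
    while (total_mb < 65536 && malloc(chunk) != NULL)
        total_mb++;

    printf("max alloc = %zu MB\n", total_mb);
    return 0;
}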
[PERFORM] Platform comparison (lies, damn lies and benchmarks)
Hi.

After I had my hands on an Intel MacBook Pro (2 GHz Core Duo, 1GB RAM), I made some comparisons between the machines I have here at the company. For the ease of it and the simple way of reproducing the tests, I took pgbench for the test.

Configurations:

1. PowerMac G5 with two 1.8 GHz CPUs (not a dual core), 1.25GB RAM, Mac OS X Tiger 10.4.8, single S-ATA harddrive, fsync on ("G5 Mac OS X" below)
2. The PowerMac G5 from above, but with Yellow Dog Linux 4.1 ("G5 YD Linux")
3. MacBook Pro, 2GHz Core Duo, 1GB RAM, Mac OS X Tiger 10.4.8, internal harddrive (5400 rpm, 120GB) ("MBP")

PostgreSQL version is 8.2beta3, compiled with the same settings on all platforms; on Mac OS X Spotlight was turned off; same memory settings on all platforms (320MB of shmmax on Mac OS X, 128MB shared_buffers for PostgreSQL).

Here we go:

Results with 2 concurrent connections:
G5 Mac OS X: 495
G5 YD Linux: 490 - 520
MBP: 1125

Results with 10 concurrent connections:
G5 Mac OS X: 393
G5 YD Linux: 410 - 450
MBP: 1060

Results with 50 concurrent connections:
G5 Mac OS X: 278
G5 YD Linux: 232
MBP: 575

Results with 90 concurrent connections:
G5 Mac OS X: 210
G5 YD Linux: 120
MBP: 378

The tests were taken with:

[EMAIL PROTECTED] ~]$ for n in `seq 0 9`; do pgbench -U postgres -c 10 -t 100 benchdb; done | perl -nle '/tps = (\d+)/ or next; $cnt++; $tps += $1; END{ $avg = $tps/$cnt; print $avg }'

Yesterday a friend had a chance to test with a 2.16GHz MacBook Pro Core 2 Duo (Mac OS X, 5400 rpm 160GB internal harddrive):

10 connections: ~1150 tps
50 connections: ~640 tps

To quantify the performance hit from the harddrive, we also tested with fsync off:

10 connections: ~1500 tps
50 connections: ~860 tps

The G5 with fsync off had only 5% more performance, so the harddrive didn't have such a high impact on the performance there.

Okay, nothing really special so far, but interesting enough. I only wanted to share the results with you.

cug
Re: [PERFORM] shared_buffers > 284263 on OS X
On 27.11.2006, at 08:04, Guido Neitzer wrote:

> But, be aware of another thing here: as far as I have read about 64
> bit applications on the G5, these apps are definitely slower than
> their 32 bit counterparts (I'm currently on the train, so I can't be
> more precise here without Google ...). Was it something with not
> enough registers in the CPU? Something like that ... So it might be
> that the 64 bit version is able to use more shared memory but is
> slower than the 32 bit version, and you come out with the same
> performance. Nobody knows ...

Some information about that:

<http://www.geekpatrol.ca/2006/09/32-bit-vs-64-bit-performance/>

So, the impact doesn't seem too high. It seems to depend on the usage pattern whether 32 bit with less RAM and slightly higher performance might be faster than 64 bit with more shared memory and slightly lower performance.

cug
Re: [PERFORM] shared_buffers > 284263 on OS X
On 27.11.2006, at 00:25, Jim C. Nasby wrote:

> Got any data about that you can share? People have been wondering
> about cases where drastically increasing shared_buffers makes a
> difference.

I have tried to compile PostgreSQL as a 64-bit application on my G5 but wasn't successful. But I must admit that I'm not a C programmer at all. I know enough to work with source packages and configure/make, but not enough to work with the errors I got from the compile. And as I'm extremely busy right now, I can't follow the trail and learn more about it.

Perhaps someone with more knowledge can take a look at it.

cug
Re: [PERFORM] shared_buffers > 284263 on OS X
On 27.11.2006, at 04:20, Brendan Duddridge wrote:

> I think the main issue is that we can't seem to get PostgreSQL
> compiled for 64 bit on OS X on an Xserve G5. Has anyone done that?
> We have 8 GB of RAM on that server, but we can't seem to utilize it
> all. At least not for the shared_buffers setting.

One VERY ugly idea: if you have your stuff in more than one db, let two PostgreSQL installations run on the same machine, put some databases on the one and the rest on the other installation (on different ports and with different data directories, of course), and give each of them the 2GB of shared memory you'd like. That way you could use 50% of the available RAM. I don't know whether Mac OS X itself is able to handle a larger amount of shared memory, but I believe it can.

But nevertheless this is only a very ugly workaround for a problem that shouldn't exist. The correct way would be to get a 64-bit binary of PostgreSQL - which I wasn't able to create.

But be aware of another thing here: as far as I have read about 64-bit applications on the G5, these apps are definitely slower than their 32-bit counterparts (I'm currently on the train, so I can't be more precise here without Google ...). Was it something with not enough registers in the CPU? Something like that ... So it might be that the 64-bit version is able to use more shared memory but is slower than the 32-bit version, and you come out with the same performance. Nobody knows ...

cug
Re: [PERFORM] Postgres scalability and performance on windows
On 23.11.2006, at 23:37, Gopal wrote:

> hared_buffers = 20000 # min 16 or max_connections*2, 8KB each

If this is not a copy & paste error, you should add the "s" at the beginning of the line. Also you might want to set this to a higher number. You are setting about 20000 * 8KB = 160MB, which might be a bit too small if you do a lot of queries spread over the whole dataset. I don't know how well the memory management on Windows handles this, but you can give it a try.

> effective_cache_size = 82728 # typically 8KB each

Hmm. I don't know what the real effect of this might be, as the docs state:

"This parameter has no effect on the size of shared memory allocated by PostgreSQL, nor does it reserve kernel disk cache; it is used only for estimation purposes."

You should try optimizing your shared_buffers to cache more of the data.

> But postgres has everything spread across 10-15 processes, with each
> process using about 10-30MB, not nearly enough to cache all the data
> and ends up doing a lot of disk reads.

It's not so easy. PostgreSQL maintains a shared buffer which is accessible by all processes for reading. On a Unix system you can see this in the output of top - I don't know how this works on Windows.

> In any case I cannot believe that having 15-20 processes running on
> windows helps. Why not spawn threads instead of processes, which
> might be far less expensive and more efficient. Is there any way of
> doing this?

Because it brings you a whole lot of other problems? And because PostgreSQL is not "made for Windows"? PostgreSQL runs very well on Linux, BSD, Mac OS X and others. The Windows version is quite young.

But before you blame things on PostgreSQL, you should give more information about the query itself.

> My question is, should I just accept the performance I am getting as
> the limit on windows or should I be looking at some other params that
> I might have missed?

Post the "explain analyse select ..." output here. That might help in understanding why you get such a high CPU load.

cug
PostgreSQL with 64 bit (was: Re: [PERFORM] shared_buffers > 284263 on OS X)
On 18.11.2006, at 19:44, Guido Neitzer wrote:

> It might be that you hit an upper limit in Mac OS X:
>
> [galadriel: memtext ] cug $ ./test
> test(291) malloc: *** vm_allocate(size=2363490304) failed (error code=3)
> test(291) malloc: *** error: can't allocate region
> test(291) malloc: *** set a breakpoint in szone_error to debug
> max alloc = 2253 M
>
> Compiled with 64 bit support, the test program doesn't produce that
> error.

I have now tried to compile PostgreSQL as a 64-bit binary on Mac OS X but wasn't able to do so. I'm running against a wall with my attempts, but I must admit that I'm not an expert on that low-level C stuff.

I tried setting the CFLAGS env variable to '-mpowerpc64 -mcpu=970 -m64', but with that I'm not able to compile PostgreSQL on my G5.

Has someone hints for that?

cug
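For the record, this is the kind of invocation I mean - a sketch of the attempt, not a working recipe (it is exactly what failed for me on the G5, and the LDFLAGS part is my guess at what the link step would additionally need):

# attempted 64-bit build on a PowerPC G5
CFLAGS='-mpowerpc64 -mcpu=970 -m64' LDFLAGS='-m64' ./configure
make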
Re: [PERFORM] shared_buffers > 284263 on OS X
On 19.11.2006, at 04:13, Brian Wipf wrote:

> It certainly is unfortunate if Guido's right and this is an upper
> limit for OS X. The performance benefit of having high shared_buffers
> on our mostly read database is remarkable.

I hate to say it, but if you want the best performance out of PostgreSQL, Mac OS X (Server) isn't the best OS to achieve this. This might change in the future (who knows), but currently you get more out of Linux. Brendan might have some of my old benchmarks. We wrote a couple of mails about that a couple of months ago.

If you're interested, I can run a pgbench benchmark on my desktop machine in the company, comparing Mac OS X Tiger to Yellow Dog Linux with 8.1.5 and 8.2beta3. If I remember correctly, I have YDL installed on a second hard drive, and it should take only a couple of minutes to install the latest PostgreSQL release. So there is no need for you to do the testing of YDL on your Xserves without knowing pretty much for sure that it will bring you some benefit.

As far as I remember, I got around 50% to 80% better performance with Linux on the same machine with the same settings, but that was in times when I hardly knew anything about optimizing the OS and PostgreSQL for OLTP performance.

Some hints from what I have learned in the past about PostgreSQL on Mac OS X / Apple machines:

- Turn off Spotlight on all harddrives on the server (in /etc/hostconfig).
- Use the latest compilers (gcc) and PostgreSQL versions (I'm sure you do ... ;-)).
- If you need the highest possible performance, use Linux instead of Mac OS X for the DB server. :-/

I know that some of the tips don't help with your current setup. Perhaps the switch to Linux on the DB machines might help, but I don't know how well Linux works with the Xserve RAID you have. It might bring you some headaches - I don't know; perhaps you can find opinions on the net.

Regarding the memory test: I also tried it on Leopard, and it seems that the problem persists. Perhaps someone from Apple can say something about that. We might ask on the Darwin list. I'll post some results tomorrow.

cug
Re: [PERFORM] shared_buffers > 284263 on OS X
Hi.

I've sent this out once already, but I think it didn't make it through the mail server ... I don't know why. If it is a double post - sorry for it.

Brian Wipf <[EMAIL PROTECTED]> wrote:

> I'm trying to optimize a PostgreSQL 8.1.5 database running on an
> Apple G5 Xserve (dual G5 2.3 GHz w/ 8GB of RAM), running Mac OS X
> 10.4.8 Server.
>
> The queries on the database are mostly reads, and I know a larger
> shared memory allocation will help performance (also by comparing it
> to the performance of the same database running on a SUSE Linux box,
> which has a higher shared_buffers setting).
>
> When I set shared_buffers above 284263 (~ 2.17 GB) in the
> postgresql.conf file, I get the standard error message when trying to
> start the db:

It might be that you hit an upper limit in Mac OS X:

[galadriel: memtext ] cug $ ./test
test(291) malloc: *** vm_allocate(size=2363490304) failed (error code=3)
test(291) malloc: *** error: can't allocate region
test(291) malloc: *** set a breakpoint in szone_error to debug
max alloc = 2253 M

That seems near the size you found to work. I don't really know much about that, but it seems you just can't allocate more than a bit over 2GB. So be careful with my non-existing knowledge about that ... ;-)

cug
Re: [PERFORM] Opteron vs. Xeon "benchmark"
On 9/23/06, Dave Cramer <[EMAIL PROTECTED]> wrote:

> 1) The database fits entirely in memory, so this is really only
> testing CPU, not I/O, which should be taken into account IMO

I don't think this is really the reason that MySQL broke down at ten or more concurrent connections. The RAM might be a factor, but I doubt it in this case, as the result reflects exactly what we have seen in similar tests: MySQL performs quite well on easy queries with low concurrency.

We don't have that case very often in my company ... we have at least ten to twenty connections to the db performing statements, and we have some fairly complex statements running very often.

Nevertheless - a benchmark is a benchmark. Nothing else. We prefer PostgreSQL for other reasons than higher performance (which it has for lots of situations).

cug
Re: [PERFORM] Opteron vs. Xeon "benchmark"
I find the benchmark much more interesting in comparing PostgreSQL to MySQL than Intel to AMD. It might be as biased as other "benchmarks", but it clearly shows something that a lot of PostgreSQL users have always thought: MySQL gives up on concurrency ... it just doesn't scale well.

cug

On 9/23/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> Yep. From what I understand, Intel is 8 to 10 times the size of AMD.
> It's somewhat amazing that AMD even competes, and excellent for us,
> the consumers, that they compete well, ensuring that we get very fast
> computers for amazingly low prices. But Intel isn't crashing down any
> time soon. Perhaps they became a little lazy and made a few mistakes.
> AMD is forcing them to clean up. May the competition continue... :-)
>
> Cheers,
> mark
Re: [PERFORM] PostgreSQL and sql-bench
Hi.

Do you compare apples to apples - InnoDB tables to PostgreSQL? Are all needed indexes available, and are you sure about that? What about fsync? Does the benchmark insert a lot of rows? Have you tested placing the WAL on a separate disk? Is PostgreSQL logging more stuff? Another thing: have you analyzed the tables? Have you tested higher shared_buffers?

And the last thing: there are lies, damn lies and benchmarks. What does a benchmark, which might be optimized for one DB, tell you about your own db workload? There are so many things that can go wrong with a benchmark if you don't have real knowledge of how to optimize both DBMS that it is just worthless to use it anyway.

PostgreSQL outperforms MySQL in our environment in EVERY situation needed by the application.

So, does the benchmark represent your workload? Does the benchmark result say anything about your own situation? Or is this all just for the sake of running a benchmark?

cug

On 9/21/06, yoav x <[EMAIL PROTECTED]> wrote:

> Hi
>
> After upgrading DBI and DBD::Pg, this benchmark still picks MySQL as
> the winner (at least on Linux RH3 on a Dell 1875 server with 2
> hyperthreaded 3.6GHz CPUs and 4GB RAM). I've applied the following
> parameters to postgres.conf:
>
> max_connections = 500
> shared_buffers = 3000
> work_mem = 10
> effective_cache_size = 30
>
> Most queries still perform slower than with MySQL. Is there anything
> else that can be tweaked or is this a limitation of PG or the
> benchmark?
>
> Thanks.
Re: [PERFORM] Poor performance on seq scan
Because there is no MVCC information in the index.

cug

2006/9/12, Piotr Kołaczkowski <[EMAIL PROTECTED]>:

> On Tuesday 12 September 2006 12:47, Heikki Linnakangas wrote:
> > Laszlo Nagy wrote:
> > > I made another test. I created a file with the identifiers and
> > > names of the products:
> > >
> > > psql# \o products.txt
> > > psql# select id, name from product;
> > >
> > > Then I can search using grep:
> > >
> > > grep "Mug" products.txt | cut -f1 -d\|
> > >
> > > There is a huge difference. This command runs within 0.5 seconds.
> > > That is, at least 76 times faster than the seq scan. It is the same
> > > if I vacuum, backup and restore the database. I thought that the
> > > table is stored in one file, and the seq scan would actually be
> > > faster than grepping the file. Can you please tell me what am I
> > > doing wrong? I'm not sure if I can increase the performance of a
> > > seq scan by adjusting the values in postgresql.conf. I do not like
> > > the idea of exporting the product table periodically into a txt
> > > file and searching with grep. :-)
> >
> > Are there any other columns besides id and name in the table? How big
> > is products.txt compared to the heap file?
> >
> > > Another question: I have a btree index on product(name). It
> > > contains all product names and the identifiers of the products.
> > > Wouldn't it be easier to seq scan the index instead of seq scanning
> > > the table? The index is only 66MB, the table is 1123MB.
> >
> > Probably, but PostgreSQL doesn't know how to do that. Even if it did,
> > it depends on how many matches there are. If you scan the index and
> > then fetch the matching rows from the heap, you're doing random I/O
> > to the heap. That becomes slower than scanning the heap sequentially
> > if you're going to get more than a few hits.
>
> Why fetch rows from the heap if ALL required data are in the index
> itself? Why look at the heap at all?
>
> This is the same performance problem in PostgreSQL I noticed when
> doing some "SELECT count(*)" queries. Look at this:
>
> explain analyze select count(*) from transakcja where data >
> '2005-09-09' and miesiac >= (9 + 2005 * 12) and kwota < 50;
>
> Aggregate (cost=601557.86..601557.87 rows=1 width=0)
>   (actual time=26733.479..26733.484 rows=1 loops=1)
>   -> Bitmap Heap Scan on transakcja (cost=154878.00..596928.23
>      rows=1851852 width=0) (actual time=9974.208..18796.060
>      rows=1654218 loops=1)
>        Recheck Cond: ((miesiac >= 24069) AND (kwota < 50::double precision))
>        Filter: (data > '2005-09-09 00:00:00'::timestamp without time zone)
>        -> Bitmap Index Scan on idx_transakcja_miesiac_kwota
>           (cost=0.00..154878.00 rows=556 width=0)
>           (actual time=9919.967..9919.967 rows=1690402 loops=1)
>             Index Cond: ((miesiac >= 24069) AND (kwota < 50::double precision))
> Total runtime: 26733.980 ms
> (7 rows)
>
> The actual time retrieving tuples from the index is less than 10
> seconds, but the system executes a needless heap scan that takes up an
> additional 16 seconds.
>
> Best regards,
> Peter
Re: [PERFORM] Postgres speed problem - solved!
On 13.06.2006, at 12:33, Ruben Rubio Rey wrote:

> Seems autovacuum is working fine. Logs are reporting that it is being
> useful. But server load is high. Is there any way to stop
> "autovacuum" if server load is very high?

Look at the cost settings for vacuum and autovacuum. From the manual:

"During the execution of VACUUM and ANALYZE commands, the system maintains an internal counter that keeps track of the estimated cost of the various I/O operations that are performed. When the accumulated cost reaches a limit (specified by vacuum_cost_limit), the process performing the operation will sleep for a while (specified by vacuum_cost_delay). Then it will reset the counter and continue execution.

The intent of this feature is to allow administrators to reduce the I/O impact of these commands on concurrent database activity. There are many situations in which it is not very important that maintenance commands like VACUUM and ANALYZE finish quickly; however, it is usually very important that these commands do not significantly interfere with the ability of the system to perform other database operations. Cost-based vacuum delay provides a way for administrators to achieve this."

cug
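In postgresql.conf this could look like the following sketch - the numbers are only examples to illustrate the mechanism, not recommendations; the autovacuum_* variants exist as of 8.1 and override the plain settings for the autovacuum daemon:

vacuum_cost_delay = 20              # sleep 20 ms whenever the cost limit is hit
vacuum_cost_limit = 200             # accumulated I/O cost that triggers the sleep
autovacuum_vacuum_cost_delay = 20   # same mechanism, applied to autovacuum
autovacuum_vacuum_cost_limit = 200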
Re: [PERFORM] Postgres speed problem - solved?
On 13.06.2006, at 8:44, Ruben Rubio Rey wrote:

> Tonight the database has been vacuum full'ed and reindexed (the
> database does this every night). Now it's working fine. Speed is as
> expected. I'll be watching that sql ... Maybe the problem only exists
> when the database is busy, or maybe it's solved ...

Depending on the usage pattern, the nightly reindex / vacuum analyse is suboptimal. If you have high insert/update traffic, your performance will decrease over the day and will only be good in the morning hours - and I hope this is not what you intend to have.

Autovacuum is the way to go if you have "changing content", perhaps combined with a vacuum analyse on a nightly or weekly schedule. We do this weekly.

cug
Re: [PERFORM] Benchmark PostgreSQL 8.1.4 MySQL 5.0.20 and Oracle
On 18.05.2006, at 12:42, Olivier Andreotti wrote:

> I use prepared statements for all requests. Each transaction is about
> 5-45 requests.

This may lead to bad plans (at least with 8.0.3 this was the case) ... I had the same problem a couple of months ago, and I switched from prepared statements with bound values to statements with "inlined" values. This:

SELECT t0.aktiv, t0.id, t0.ist_teilnehmer, t0.nachname, t0.plz, t0.vorname
FROM public.dga_dienstleister t0
WHERE t0.plz like ?::varchar(256) ESCAPE '|'

withBindings: 1:"53111"(plz)

has changed in my app to:

SELECT t0.aktiv, t0.id, t0.ist_teilnehmer, t0.nachname, t0.plz, t0.vorname
FROM public.dga_dienstleister t0
WHERE t0.plz like '53111' ESCAPE '|'

The problem was that the planner wasn't able to use an index with the first version, because it just didn't know enough about the actual query. It might be that you run into similar problems. An easy way to test this may be to set the protocolVersion in the JDBC driver connection URL to "2":

jdbc:postgresql://127.0.0.1/Database?protocolVersion=2

cug
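In JDBC code the workaround is nothing more than that URL parameter - a minimal sketch, with made-up connection details and credentials; under protocol v2 the driver interpolates the bound value into the SQL on the client side, so the server plans a complete literal query:

import java.sql.*;

public class ProtocolV2Demo {
    public static void main(String[] args) throws Exception {
        // explicit driver loading, as needed by pre-JDBC4 drivers
        Class.forName("org.postgresql.Driver");
        Connection conn = DriverManager.getConnection(
            "jdbc:postgresql://127.0.0.1/Database?protocolVersion=2",
            "user", "secret");
        // behaves like the "inlined" statement above
        PreparedStatement st = conn.prepareStatement(
            "SELECT id FROM dga_dienstleister WHERE plz LIKE ?");
        st.setString(1, "53111");
        ResultSet rs = st.executeQuery();
        while (rs.next()) {
            System.out.println(rs.getInt("id"));
        }
        rs.close();
        st.close();
        conn.close();
    }
}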
Re: [PERFORM] Performance decrease
On 20.04.2006, at 18:10, Radovan Antloga wrote:

> I have once or twice a month an update on many records (~6000), but
> not more than that. I did not expect PG to have problems with
> updating 15800 records.

It has no problems with that. We have a database where we often update/insert rows with about one hundred columns. No problem so far. Performance is in the sub-10ms range. The whole table has about 10 records.

Do you wrap every update in a separate transaction? I do commits every 200 updates for bulk updates.

cug
Re: [PERFORM] Problem with LIKE-Performance
On 18.04.2006, at 17:16, Tarabas (Manuel Rorarius) wrote:

> Is there any way to speed the LIKEs up with a different locale than C,
> or to get an ORDER BY in a different locale although using the
> default C locale?

Sure. Just create the index with:

create index <index-name> on <table> (<column> varchar_pattern_ops);

Then you can use something like:

select * from <table> where <column> like 'Something%';

Remember that an index can't be used for queries with '%pattern%'.

cug
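A concrete (hypothetical) instance of the above, just to make the pattern visible:

create index person_name_index on person (name varchar_pattern_ops);

-- left-anchored patterns can now use the index:
select * from person where name like 'Something%';

-- still a seq scan, because the pattern is not left-anchored:
select * from person where name like '%omething%';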
Re: [PERFORM] Decide between Postgresql and Mysql (help of
On 30.03.2006, at 23:31, PFC wrote:

> (why do you think I don't like Java ?)

Because you haven't used a good framework/toolkit yet? Come on, the language doesn't really matter these days; it's all about frameworks, toolkits, libraries, interfaces and so on.

But, nevertheless, this has nothing to do with a decision between PostgreSQL and MySQL. They can both be accessed by a myriad of programming languages, so the decision may (and should) be based on other things.

cug
Re: [PERFORM] count(*) performance
On 27.03.2006, at 21:20, Brendan Duddridge wrote:

> Does that mean that even though autovacuum is turned on, you still
> should do a regular vacuum analyze periodically?

It seems that there are situations where autovacuum does not do a really good job.

However, in our application I have made a stupid design decision which I want to change as soon as possible. I have a "visit count" column in one of the very large tables, so updates are VERY frequent. I've just checked and saw that autovacuum does a great job with that.

Nevertheless I have set up a cron job to do a standard vacuum every month. I've used vacuum full only once, after I did a bulk update of about 200.000 rows ...

cug
Re: [PERFORM] Query plan from hell
On 24.03.2006, at 23:54, PFC wrote:

> bookmark_delta contains very few rows but is inserted/deleted very
> often ... the effect is spectacular! I guess I'll have to vacuum
> analyze this table every minute ...

What about using autovacuum?

cug
Re: [PERFORM] Process Time X200
On 10.03.2006, at 10:11, NbForYou wrote:

> So the only solution is to ask my webhost to upgrade its postgresql?

Seems to be. The question is: will they do that? You are the customer. If they don't, go to another provider.

> After all a license fee is required for commercial use. And running a
> webhosting service is a commercial use.

No license fee is required for any use of PostgreSQL. Read the license:

"Permission to use, copy, modify, and distribute this software and its documentation for any purpose, without fee, and without a written agreement is hereby granted, provided that the above copyright notice and this paragraph and the following two paragraphs appear in all copies."

A commercial license is needed for MySQL, not for PostgreSQL.

cug
Re: [PERFORM] t1000/t2000 sun-servers
On 06.03.2006, at 21:10, Jignesh K. Shah wrote:

> Like migrate all your postgresql databases to one T2000. You might
> see that your average response time may not be faster, but it can
> probably handle all your databases migrated to one T2000. In essence,
> your single thread performance will not speed up on a Sun Fire T2000,
> but you can certainly use it to replace all your individual
> postgresql servers in your organization, or see higher scalability in
> terms of the number of users handled with one Sun Fire T2000 server.

How good is a pgbench test for evaluating things like this? I have used it to compare several machines, operating systems and PostgreSQL versions - but it was more or less just out of curiosity. The real evaluation was made with "real life tests" - mostly scripts which also tested the application server itself. But as it is easy to compare several machines with pgbench, I just did the tests, and they were interesting and reflected the real world better than I had expected from a "benchmark".

So, personally, I'm interested in a simple pgbench test - perhaps with some more (> 50) clients simulated ...

cug
Re: [PERFORM] Order by behaviour
On 23.12.2005, at 15:35, Carlos Benkendorf wrote:

> I appreciate your suggestion, but I think I'm misunderstanding
> something: the select statement should return about 150.000 rows -
> why 5 rows?

I looked at the wrong lines of the explain statement. Sorry, my fault. With that many rows, I doubt that my workaround will do anything good ... :-/ I was just a little bit too fast ... I've been looking at too many different "explain ..." (or similar) statements in the last weeks.

Other idea: have you tried sorting the rows in memory? Is that faster? Now, looking more closely at the explain result, it seems to me that the sorting takes most of the time:

Sort (cost=201296.59..201663.10 rows=146602 width=897)
     (actual time=9752.555..10342.363 rows=167710 loops=1)

How large are the rows returned by your query? Do they fit completely in memory during the sort? If PostgreSQL starts switching to temp files ... There was a discussion on that topic a few weeks ago.

Perhaps this may help - from the manual on work_mem (integer):

"Specifies the amount of memory to be used by internal sort operations and hash tables before switching to temporary disk files. The value is specified in kilobytes, and defaults to 1024 kilobytes (1 MB). Note that for a complex query, several sort or hash operations might be running in parallel; each one will be allowed to use as much memory as this value specifies before it starts to put data into temporary files. Also, several running sessions could be doing such operations concurrently. So the total memory used could be many times the value of work_mem; it is necessary to keep this fact in mind when choosing the value. Sort operations are used for ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash joins, hash-based aggregation, and hash-based processing of IN subqueries."

cug
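Raising it just for the session that runs the big sort is the least invasive way to test this - a sketch, where 65536 KB (64 MB) is an arbitrary example value:

SET work_mem = 65536;  -- in kilobytes: 64 MB, for this session only
-- then re-run the slow query and compare the sort timing in EXPLAIN ANALYZE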
Re: [PERFORM] Order by behaviour
On 23.12.2005, at 13:34, Carlos Benkendorf wrote:

> For some implementation reason, in 8.0.3 the query returns the rows
> in the correct order even without the order by, but in 8.1.1 the
> implementation probably changed, and the rows are not returned in the
> correct order.

You will never be sure to get rows in a specific order without an "order by". I don't know why PG is faster without the ordering; perhaps others can help with that so you don't need a workaround like this:

If you can't force PostgreSQL to perform better on the ordered query, what about retrieving only the primary keys for the rows you want, unordered, in a subquery, and using a "where primaryKey in (...) order by ..." statement that orders just the five rows? Like this:

select * from mytable where pk in (select pk from mytable where ...) order by ...;

I don't know whether the query optimizer will flatten this query, but you can try it.

cug
Re: [PERFORM] 15,000 tables
On 01.12.2005, at 17:04, Michael Riess wrote:

> No. Our database contains tables for web content management systems.
> The server hosts approx. 500 cms applications, and each of them has
> approx. 30 tables.

Just for my curiosity: do the "about 30 tables" have similar schemas, or do they differ much? We have a small CMS system running here, where I have all the information for all clients in tables with relationships to a client table. But I assume you are running a pre-built CMS which is not designed for "multi-client ability", right?

cug
Re: [PERFORM] VERY slow after many updates
On 19.11.2005, at 13:05, Alex Wang wrote:

> Yes, it's a "queue" table. But I did not perform many inserts/deletes
> before it became slow. After inserting 10 records, I just do
> get/update continuously.

When PostgreSQL updates a row, it creates a new row with the updated values. So you should be aware that the DB gets bigger and bigger when you only update your rows. Vacuum full reclaims that used space. The concepts are described in detail in chapter 12 of the manual.

cug
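For a heavily updated queue table, that boils down to roughly this maintenance pattern - a sketch, with "queue_table" as a made-up name:

-- routine: marks dead row versions as reusable; cheap, can run often
VACUUM ANALYZE queue_table;

-- occasional: actually shrinks the files, but takes an exclusive lock
VACUUM FULL queue_table;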
Re: [PERFORM] Prepared statement not using index
On 12.09.2005, at 14:38, Dave Cramer wrote:

> The difference between the 7.4 driver and the 8.0.3 driver is that
> the 8.0.3 driver is using server-side prepared statements and binding
> the parameter to the type in setXXX(n, val).

It would be a good idea if this were configurable. I found my solution (use the JDBC2 driver with protocolVersion=2), but how long will this work?

cug
Re: [PERFORM] Index not used with prepared statement
On 11.09.2005, at 11:03, Andreas Seltenreich wrote:

> I'm not perfectly sure, but since the index could only be used with a
> subset of all possible parameters (the pattern for LIKE has to be
> left-anchored), I could imagine the planner has to avoid the index in
> order to produce a universal plan (the thing behind a prepared
> statement).

Hmm. Now I get it. So I have to make sure that my framework doesn't produce a PreparedStatement but builds a complete statement string instead. Weird.

> Is there a reason you are using the LIKE operator at all? IMO using
> the =-operator instead in your example should produce an "index-using
> prepared statement".

Yes, you are right, but then I can't pass anything like '45%' to the query - it would just return nothing. I use the LIKE because I build the queries on the fly and add a % at the end where necessary. And, to be clear: this was a minimal example; most of my queries are generated by a framework. This one was only an example to test the behaviour.

Okay, I had problems with understanding prepared statements on the client and the server side. What I thought was that I get a PreparedStatement from JDBC which inserts the values into the string, and that this complete statement is executed on the server side.

cug
[PERFORM] Index not used with prepared statement
Hi.

I have a performance problem with prepared statements (JDBC prepared statements).

This query:

PreparedStatement st = conn.prepareStatement(
    "SELECT id FROM dga_dienstleister WHERE plz like '45257'");

does use an index.

This query:

String plz = "45257";
PreparedStatement st = conn.prepareStatement(
    "SELECT id FROM dga_dienstleister WHERE plz like ?");
st.setString(1, plz);

does NOT use an index.

As both should in the end result in exactly the same statement, the index should be used all the time. I have to set protocolVersion=2 and use the JDBC2 driver to get it working (but then the statements are created like in the first query, so that is no solution, only a workaround).

I'm not sure whether this is a bug (I think it is) or a problem of understanding. Known problem?

I have tried PG 8.0.1, 8.0.3 and 8.1beta with these JDBC drivers:

- postgresql-8.0-312.jdbc2.jar --> okay with protocolVersion=2 in the URL
- postgresql-8.0-312.jdbc3.jar --> not okay, whatever I do

I'm on Mac OS X, if that matters.

cug
[PERFORM] Prepared statement not using index
Hi.

I have an interesting problem with the JDBC drivers. When I use a select like this:

SELECT t0.aktiv, t0.id, t0.ist_teilnehmer, t0.nachname, t0.plz, t0.vorname
FROM public.dga_dienstleister t0
WHERE t0.plz like ?::varchar(256) ESCAPE '|'

withBindings: 1:"53111"(plz)

the existing index on the plz column is not used. When I run the same select with a concrete value, the index IS used.

I use PostgreSQL 8.0.3 on Mac OS X and the JDBC driver 8.0-312 JDBC 3.

After a lot of other things, I tried using a 7.4 driver, and with this, the index is used in both cases.

Why can this happen? Is there a setting I might have missed? Something I do wrong?

cug