Re: [GENERAL] Question about hosting and server grade
On Wed, 25 Mar 2009 13:19:12 -0600 Scott Marlowe scott.marl...@gmail.com wrote:

> Spend your money on more RAM (32G isn't much more than 16G, and I've seen it make a world of difference on our servers). Spend it on disks: the number of disks is often more important than RPM etc. Spend it on fast RAID controllers with battery-backed cache. Then consider upgrading your CPUs. We have 8 Opteron cores in our servers and 12-disk RAID-10s under a very fast RAID controller, and we are still I/O bound, not CPU bound.
> [snip]
> But all of this depends on the type of workload your db has to do. If you're running memory-hungry select queries, focus on more memory. If you're running lots and lots of little queries with a mix of update, insert, delete and select, focus on the drives / controller. If you're running queries that require a lot of CPU, then focus more on that.

Could IO load show up as apparent CPU load?

I mean, I've got a pretty small DB. It should fit nearly all in RAM... or at least, after 1 day of load I can see the box may use 50K of swap. Anyway, when I update the main table (~1M rows and a GIN index) I can see the CPU reaching its limit. The most frequent updates involve 5K-20K changed records. Under normal workload the most intensive queries run in 200ms with few exceptions, and the BIG table is mostly read-only access.

It would be nice if the updates were a bit faster, since I'm still forced to do them during working hours... because people on the other side are still convinced it is not worth cleaning rubbish at the source, so sometimes updates fail on inconsistent data.

Unfortunately, I can add RAM and disks, but all the CPU sockets are used. The box has 2 old Xeon HT at 3.2GHz. It's on RAID5 (not my choice) on a decent controller and has 4GB of RAM.

-- 
Ivan Sergio Borgonovo
http://www.webthatworks.it

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] HINT: Please REINDEX it?
I am seeing this in my log:

ERROR: index connect_idx contains unexpected zero page at block 208
HINT: Please REINDEX it.

What does this mean? And how do I reindex it?

Regards,

BTJ

-- 
Bjørn T Johansen
b...@havleik.no
---
Someone wrote:
"I understand that if you play a Windows CD backwards you hear strange Satanic messages"
To which someone replied:
"It's even worse than that; play it forwards and it installs Windows"
Re: [GENERAL] [ADMIN] Can we load all database objects in memory?
Iñigo Martinez Lasala wrote:
>> Hi All, I have a database of 10GB. My database server has 16GB of RAM. Is there a way that I can load all the database objects into memory? Thanks for your time and taking a look at this question. Thanks, Deepak
>
> Increase the effective_cache_size parameter. An effective_cache_size=11GB should be more than enough.

Sorry, but that is quite wrong. effective_cache_size only affects the planner, not memory utilization.

You can use shared_buffers=11GB to give PostgreSQL enough memory to buffer all of a 10GB database in memory. But that does not automatically put all the tables in memory.

Yours,
Laurenz Albe
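To make the distinction concrete, a postgresql.conf sketch (the numbers are illustrative assumptions for a 16GB machine as described above, not tuning advice):

```
# postgresql.conf -- illustrative values for a 16GB server, ~10GB database
shared_buffers = 11GB         # real allocation; takes effect only after a
                              # server restart (may also need kernel SHMMAX raised)
effective_cache_size = 12GB   # planner's estimate of OS+PG caching; allocates nothing
```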
Re: [GENERAL] Weird encoding behavior
Leonardo M. Ramé wrote:
> Hi, I'm experiencing weird behavior when storing Latin characters on a PostgreSQL 8.3.1.876 server. The database is Latin1-encoded, and it has been working since September 2008; it wasn't updated or replaced since its first installation.
> The weird part is that sometimes the characters are stored correctly, but sometimes they aren't (always by the same program). The field type is varchar(30), and for example the text NUÑEZ is stored as NU?EZ.
> The data comes from an external application in an XML file (also Latin1); a Delphi service then parses the XML and creates the Insert/Update statements to store the data in the database. I tried to reproduce the bug by sending XML files with 'Ñ' to the service, but the data is stored correctly. There's also a front end that allows users to see/edit the data in a more user-friendly way. Again, I checked by inserting records with 'Ñ' using this front end, and they are also stored correctly.
> Has anyone faced the same problem? Any workaround?

Well, there *must* be one client that stores wrong data...

As a first step, can you find out the code point of the character that is represented as ? in your E-Mail? Something like

SELECT ascii(substr('NU?EZ', 3, 1));

except that instead of the string literal you substitute the column containing the bad value.

Yours,
Laurenz Albe
Re: [GENERAL] intermittant performance problem
On Mar 25, 2009, at 5:09 PM, Mike Charnoky wrote:
> Due to the nature of the sampling (need to limit using several parameters with a WHERE clause), I can't just generate random numbers to select data that I need. Looks like I am stuck using ORDER BY RANDOM(). The only other option at this point seems to be to implement TABLESAMPLE, probably starting with the academic work that Neil Conway published (http://neilconway.org/talks/hacking/)

I'm not sure I answered in this thread before, but ORDER BY RANDOM() is not the only way to get x random rows out of n rows. Calculating random numbers isn't particularly cheap, so doing that n times is going to cost a fair amount of CPU cycles, and it requires a sequential scan over the table. If you search the archives you're bound to stumble on a solution I suggested before that only needs to call random() x times (instead of n times). It still does a sequential scan (I think there's currently no way around that unless quasi-random results are acceptable to you). My solution involves a scrollable cursor that is used to navigate to x random rows in the (otherwise unsorted) n rows in the result set.

I tried putting that functionality into a pl/pgsql function, but pl/pgsql doesn't (yet) support the MOVE FORWARD ALL statement, which you need to get the upper boundary of the random row number (it equals the number of rows in the result set). An alternative solution is to wrap your query in SELECT COUNT(*) FROM (...) AS resultset or something similar, but in that case the query (which does a seqscan) has to be performed twice! Maybe other PL languages fare better there, but from the looks of it not even C functions can perform MOVE FORWARD ALL, so I guess they won't.

My last attempt used that approach, but it's obviously not optimal. I'd much prefer to feed my function a query or a refcursor instead of a string containing a query. Feeding it a string makes me itch. Anyway, here's how far I got. It is in a usable state and I'm interested how it performs on a real data set compared to ordering by random() or other solutions.

[Attachment: sample.sql]

It's at the moment probably more efficient to not use a stored procedure but query the cursor from your application instead (saves one of the two seqscans). That has its own disadvantages of course. I've used something like that (as a function in our PHP application) on a medium-sized data set before, and it performed adequately.

Alban Hertroys

-- 
If you can't see the forest for the trees, cut the trees and you'll see there is no forest.
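For readers wanting the shape of that cursor-jumping approach, here is a rough Python sketch (the names are made up; a real version would use MOVE FORWARD ALL to learn n and MOVE ABSOLUTE to jump, as described above):

```python
import random

def sample_rows(resultset, x):
    """Pick x random rows from an unsorted result set by jumping to x
    random offsets, calling the RNG x times instead of once per row.
    The list stands in for a scrollable cursor; an index access plays
    the role of MOVE ABSOLUTE."""
    n = len(resultset)                            # MOVE FORWARD ALL reports this
    offsets = random.sample(range(n), min(x, n))  # x distinct random positions
    # Visit offsets in order, as a real cursor would scroll forward.
    return [resultset[k] for k in sorted(offsets)]

rows = [("row", i) for i in range(1000)]
picked = sample_rows(rows, 10)
```

The key saving over ORDER BY RANDOM() is that the RNG runs x times rather than n, though the sequential scan underneath remains.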
[GENERAL] How to compile a 32 bit version of postgres on a x64 machine.
It looks like most avenues for high availability with postgres are not available if one of the machines is 64-bit and the other 32-bit. Somebody on this list suggested I install a 32-bit version of postgres on my x64 machine.

What's the best way to handle this? Should I compile it fresh? Install the 32-bit binaries? Can I just copy the binaries from the secondary? I presume I am going to have to drop all the databases and reload them, of course.

Here is the uname -a output of both machines.

The primary: 2.6.27-11-server #1 SMP Thu Jan 29 20:13:12 UTC 2009 x86_64 GNU/Linux
The secondary: 2.6.27-11-server #1 SMP Thu Jan 29 20:19:41 UTC 2009 i686 GNU/Linux
[GENERAL] idle connection timeout
Hey folks,

Is there any potential idle connection timeout on the server side? I've got a C++ client here, using pqxx, that waits in CLOSE_WAIT, which would suggest that the back end called close() on the connection. If there is such an idle timeout, how long is it set to by default, and can it be set by the user? The server version in this case is 8.1.4.

ta.

-- 
GJ
Re: [GENERAL] Weird encoding behavior
> Well, there *must* be one client that stores wrong data...

You are right, I'll ask someone on site to look at *each* client hunting for the root of the problem. It must be a Windows Regional Settings issue or something similar.

> As a first step, can you find out the code point of the character that is represented as ? in your E-Mail? Something like
>
> SELECT ascii(substr('NU?EZ', 3, 1));
>
> except that instead of the string literal you substitute the column containing the bad value.

I did what you suggested, and it responds with 63 when the string is NU?NEZ and 209 when it's NUÑEZ.

Thanks,
Leonardo.
Re: [GENERAL] idle connection timeout
Grzegorz Jaśkiewicz gryz...@gmail.com writes:
> is there any potential idle connection timeout on server side ?

No. You might be getting bit by an idle timeout somewhere else, such as in a NAT router between the client and server.

regards, tom lane
Re: [GENERAL] Parallel Query Processing
aravind chandu said the following on 03/25/2009 04:24 PM:
> Hello, I have a few questions related to parallel query processing. Can you guys tell me how to implement parallel query processing in a postgresql database? Thanks, Avin.

You could run parallel queries across multiple servers using pgpool-II.

~Cory Coager
mailto:ccoa...@davisvision.com
Re: [GENERAL] log shipping from 64 bit linux server to a 32 bit linux server
Tatsuo Ishii wrote:
> I know what you think:-) Problem is, he asks for high availability, which means no SPOF and minimum downtime. For the purpose, I suppose pgpool-HA (actually heartbeat) + pgpool-II + Slony-I might work, but I'm not sure heartbeat works with a 32/64-bit combo.

Heartbeat does work with a 32/64-bit combo. However, CARP is another (simpler) option as well. Additionally, if you were going with that solution, you're likely using a VIP, so adding the complexity of pgpool isn't really required either... for better performance you could use pgBouncer (if you really needed connection pooling). In any case, your HA solution (Heartbeat/CARP) could definitely trigger a Slony switchover to minimize downtime, but you might end up losing some transactions due to the asynchronous nature of Slony.

Though, as others have intimated, the best solution out there for you (if you want synchronous HA) is to get another 64-bit system for your secondary, or even install a 32-bit OS on your 64-bit system, so you have two 32-bit systems. In that case you'd probably want to add DRBD into the mix for your active WALs.

-- 
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC 27560
919-463-0999/877-258-8987
http://www.otg-nc.com
Re: [GENERAL] idle connection timeout
2009/3/26 Tom Lane t...@sss.pgh.pa.us:
> You might be getting bit by an idle timeout somewhere else, such as in a NAT router between the client and server.

ok, thanks tom.

-- 
GJ
Re: [GENERAL] Weird encoding behavior
Leonardo M. Ramé wrote:
> Hi, I'm experiencing weird behavior when storing Latin characters on a PostgreSQL 8.3.1.876 server. The database is Latin1-encoded, and it has been working since September 2008; it wasn't updated or replaced since its first installation. The weird part is that sometimes the characters are stored correctly, but sometimes they aren't (always by the same program). The field type is varchar(30), and for example the text NUÑEZ is stored as NU?EZ.

You need to be sure that the client_encoding variable is set to whatever the client is actually using. Some client programs may be using UTF8, and thus they should do SET client_encoding TO utf8 on connection start; others may be using Win-1252, Latin1, or other encodings. If you declare each correctly when the connection is established, the server will do the right conversion for you automatically. If you fail to do that, you will have a mess of characters, just like you do now.

BTW it is very advisable to update to 8.3.7. It has a lot of bug fixes since 8.3.1.

-- 
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
[GENERAL] ltree and full text search questions
Hello,

(sorry for this long mail)

I have started to evaluate ltree and tsearch (first on Windows with PG version 8.3.7) and I would appreciate some clarification. The first part deals with ltree, where I have some questions; the second part is a concept to combine ltree with the full text search capabilities. Here I'd like to get your opinion on whether I'm on the right track or not. Moreover, I will probably need my own parser for the full text search, and I would be very grateful if anybody could provide me with some C code example that is similar to my requirement.

And kudos for ltree and the full text search. I'm impressed by their performance :-)

I would like to use these two functionalities to store and analyze paths (the model behind the data is not a tree, but a dense directed graph).

A) ltree

A ltree path would look like this: ...2_456.7_3425.1_23.9_231 where each node is a combination of 2 pieces of information: class_id_item_id. If my idea works well, I may also like to put some more information in each node, which would hence be a micropath of fixed length (3 or 4). With such a model, I could offer global path analysis on the classes, and drill-down possibilities on the items. I came to this idea as the documentation of ltree says that a node is a list of words separated by the '_' character.
A short example:

--drop table paths ;
create table paths (id int, path ltree);
insert into paths(id,path)values(1, '1_11.2_13.3_10.4_13');
insert into paths(id,path)values(2, '1_12.4_15.3_11.4_10.15_14.1_11');
insert into paths(id,path)values(3, '1_11.2_13.3_10.4_10');
insert into paths(id,path)values(4, '1_12.4_15.3_11.3_10.13_14.13_14');
insert into paths(id,path)values(5, '1_11.2_13.3_10.2_13');
insert into paths(id,path)values(6, '1_12.4_15.3_11.1_10.12_14.1_11');
insert into paths(id,path)values(7, '127_1235');

--now I can e.g. retrieve all items that have a path from class 2 to class 4 using prefix matching:
select id from paths where path ~ '*.2_*.*.4_*.*'::lquery
--(1 3)

And now my questions and comments:

1) From the module description, I first thought that the '_' character had a special meaning, but it just seems to be an extra allowed character besides [a-zA-Z0-9]. Am I correct, or is '_' defined internally as a separator for indexing/searching ltree data?

2) The documentation says that the length of a label path is limited to 65kB. This is the max number of nodes, not the size of the string. Moreover, when you try to define a larger path, it is silently cut without notice or error (it seems that the first 65kB are just dropped, but I'm not sure about this). Maybe it would be a good thing to add this to the documentation.

3) GiST index:

create index path_gist_idx on paths using gist (path);
=> ERROR: index row requires 621840 bytes, maximum size is 8191

So it seems that GiST indexes can only be used for paths that do not exceed 8kB in size, which is much less than the 65kB limit in depth. Is this correct or am I missing a point? (I also have PostGIS 1.3.5 installed. Could this be an issue?)

B) Full text search
===

ltree offers a prefix search on the nodes, but no suffix search, which I need to look for a given item_id. So my idea is to combine ltree with full text search.
If a node has a fixed format like class_item, I could use full text search indexing to look for paths containing a given item. My first step was to check the default parser, which works really badly in this case:

SELECT alias, description, token FROM ts_debug('1_12.4_15.3_11.1_10');

 alias |   description    | token
-------+------------------+-------
 uint  | Unsigned integer | 1
 blank | Space symbols    | _
 float | Decimal notation | 12.4
 blank | Space symbols    | _
 float | Decimal notation | 15.3
 blank | Space symbols    | _
 float | Decimal notation | 11.1
 blank | Space symbols    | _
 uint  | Unsigned integer | 10

So I guess I need my own parser, which should return something like this (a specific parser that has to know the exact pattern and meaning of a node):

 alias |   description    | token
-------+------------------+-------
 class | class_id         | 1
 blank | Space symbols    | _
 item  | item_id          | 12
 blank | Space symbols    | .
 class | class_id         | 4
 blank | Space symbols    | _
 item  | item_id          | 15
 blank | Space symbols    | .
 class | class_id         | 3

As already said, I would be very grateful if anybody could provide me with some C code example that is similar to my requirement. Having this, I would then define different search configurations:

CREATE TEXT SEARCH CONFIGURATION ltclass;
CREATE TEXT SEARCH CONFIGURATION ltitem;
ALTER TEXT SEARCH CONFIGURATION ltclass DROP
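I can't offer the C parser itself, but to make the desired behaviour concrete, here is a sketch in Python of the tokenization logic such a parser would implement (the aliases class/item/blank follow the table above; the class_item node format is assumed):

```python
import re

# Each ltree node is "class_item"; '.' separates nodes, '_' separates
# the class_id from the item_id within a node.
NODE_RE = re.compile(r"(\d+)_(\d+)")

def tokenize(path):
    """Emit (alias, token) pairs the way the custom parser would."""
    tokens = []
    for i, node in enumerate(path.split(".")):
        if i:
            tokens.append(("blank", "."))        # inter-node separator
        m = NODE_RE.fullmatch(node)
        if m:
            tokens.append(("class", m.group(1)))
            tokens.append(("blank", "_"))
            tokens.append(("item", m.group(2)))
        else:
            tokens.append(("item", node))        # node without class prefix
    return tokens
```

Running it on the first two nodes of the sample path reproduces the second table above, which is the contract the C parser would have to fulfil.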
Re: [GENERAL] Weird encoding behavior
Leonardo M. Ramé wrote:
>> As a first step, can you find out the code point of the character that is represented as ? in your E-Mail? Something like
>>
>> SELECT ascii(substr('NU?EZ', 3, 1));
>>
>> except that instead of the string literal you substitute the column containing the bad value.
>
> I did what you suggested, and it responds with 63 when the string is NU?NEZ and 209 when it's NUÑEZ.

63 is indeed a question mark. Since such a conversion would not be done by PostgreSQL, something else must convert Ñ to ?N *before* it gets into PostgreSQL...

Yours,
Laurenz Albe
Re: [GENERAL] Question about hosting and server grade
On Thu, Mar 26, 2009 at 09:06:03AM +0100, Ivan Sergio Borgonovo wrote:
> Could IO load show up as apparent CPU load?

I may not be interpreting you correctly; but, as I understand it, if your IO subsystem is too slow then your CPUs are going to be idling. So if your CPUs are sitting at 100% utilisation then you're CPU bound and not IO bound.

If your dataset mainly fits in RAM then SELECTs are always going to be CPU (or RAM-to-cache bandwidth) bound. You'll always be waiting for your disks when you modify data, so if you consider your UPDATEs too slow you should look at what's going on in your system while they're happening.

-- 
Sam  http://samason.me.uk/
Re: [GENERAL] log shipping from 64 bit linux server to a 32 bit linux server
On Thu, Mar 26, 2009 at 02:08:33PM +1300, Tim Uckun wrote:
> On Thu, Mar 26, 2009 at 2:05 PM, Tatsuo Ishii is...@postgresql.org wrote:
> > > According to the documentation it's not possible to log ship from a 64 bit server to a 32 bit server.
> >
> > I think the doc is quite correct.
>
> So what is the best way to accomplish a failover from a 64 bit machine to a 32 bit machine?

What about running a 32bit build of PG on the 64bit machine?

-- 
Sam  http://samason.me.uk/
Re: [GENERAL] HINT: Please REINDEX it?
On Thu, Mar 26, 2009 at 09:00:53AM +0100, Bjørn T Johansen wrote:
> ERROR: index connect_idx contains unexpected zero page at block 208
> HINT: Please REINDEX it.
>
> What does this mean?

It means something bad has happened to an index. If you're running an old version of PG then you may be affected by a known bug, or you could have bad hardware. Did the computer have its power turned off unexpectedly? If things aren't set up right then that could affect things. I've just had a quick search and you seemed to be running 8.3.5 before; there don't seem to be any index-related changes since then, so maybe your hardware isn't doing what it's told to!

> And how do I reindex it?

Just type:

REINDEX INDEX connect_idx;

in psql.

-- 
Sam  http://samason.me.uk/
Re: [GENERAL] Weird encoding behavior
Albe Laurenz laurenz.a...@wien.gv.at writes:
> Leonardo M. Ramé wrote:
>> I did what you suggested, and it responds with 63 when the string is NU?NEZ and 209 when it's NUÑEZ.
>
> 63 is indeed a question mark. Since such a conversion would not be done by PostgreSQL, something else must convert Ñ to ?N *before* it gets into PostgreSQL...

Yeah, I think this destroys the theory that it's due to a wrong choice of client_encoding setting. What you'd be likely to get from that is a "character can't be translated" kind of error, not silent substitution of a question mark. The damage must be getting done on the client side.

regards, tom lane
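As an illustration of that point, here is one plausible client-side mechanism (an assumption about what the Delphi client might be doing, not a diagnosis): lossy transcoding to a character set that lacks Ñ replaces it with a literal question mark, which is exactly code point 63:

```python
# Simulate a client that transcodes with lossy replacement before the
# data ever reaches the server: 'Ñ' (code point 209 in Latin1/Unicode)
# silently becomes '?' (code point 63), with no conversion error raised.
text = "NUÑEZ"
mangled = text.encode("ascii", errors="replace")

print(ord("Ñ"), mangled)  # the two code points observed in the thread
```

Note this happens without any error, matching the silent corruption reported, whereas a server-side encoding mismatch would normally fail loudly.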
Re: [GENERAL] Question about hosting and server grade
On Thursday 26 March 2009, Ivan Sergio Borgonovo m...@webthatworks.it wrote:
> Could IO load show up as apparent CPU load?

It would show up as CPU busy in iowait state. If the CPU is actually busy it would show mostly in user state, some in system.

-- 
"Even a sixth-grader can figure out that you can't borrow money to pay off your debt"
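Tools like vmstat and top derive those user/system/iowait percentages from the kernel's cumulative counters in /proc/stat; a minimal sketch of the arithmetic (the sample counter values below are made up):

```python
def iowait_fraction(cpu_line):
    """Return the fraction of CPU time spent in iowait, given a
    /proc/stat 'cpu' line.  Linux field order: user nice system idle
    iowait irq softirq [steal ...]; values are cumulative ticks."""
    fields = [int(v) for v in cpu_line.split()[1:]]
    total = sum(fields)
    iowait = fields[4] if len(fields) > 4 else 0
    return iowait / total if total else 0.0

# Made-up counters for illustration: a box stuck waiting on its disks
# shows a large iowait share while user/system stay modest.
sample = "cpu 4705 150 1120 16250 2850 20 45"
frac = iowait_fraction(sample)
```

In practice you would diff two readings a second apart, which is exactly what vmstat's wa column does.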
Re: [GENERAL] Parallel Query Processing
On Wed, 2009-03-25 at 15:52 -0600, Scott Marlowe wrote:
> On Wed, Mar 25, 2009 at 2:24 PM, aravind chandu avin_frie...@yahoo.com wrote:
>> Hello, I have a few questions related to parallel query processing. Can you guys tell me how to implement parallel query processing in a postgresql database?
>
> Do you mean one query being parallelized, or multiple queries running at once?

PostgreSQL provides no capability to parallelize one query into multiple processes.

I have a tool that will allow you to manually parallelize a query (and get the right answer). That is useful for certain larger queries. I'll be publishing that a bit more visibly in the next few months.

-- 
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [GENERAL] Parallel Query Processing
On Thu, Mar 26, 2009 at 05:04:29PM +, Simon Riggs wrote:
> PostgreSQL provides no capability to parallelize one query into multiple processes.
>
> I have a tool that will allow you to manually parallelize a query (and get the right answer). That is useful for certain larger queries. I'll be publishing that a bit more visibly in the next few months.

Is this snapshot cloning? If so, thanks very much for publishing it! :) If not, I'm really curious as to what it is :)

On the subject of snapshot cloning, I can see, at least in broad brush strokes, how this would work for read queries, but I'm curious how (or if) it could work for writes like, for example, updates of many partitions at once. Could it?

Cheers, David.

-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [GENERAL] Parallel Query Processing
On Thu, 2009-03-26 at 10:12 -0700, David Fetter wrote:
> Is this snapshot cloning?

Basically, yes.

> On the subject of snapshot cloning, I can see, at least in broad brush strokes, how this would work for read queries, but I'm curious how (or if) it could work for writes like, for example, updates of many partitions at once. Could it?

Yes, I think so, since most writes involve first reading the data.

-- 
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
[GENERAL] commit performance anomaly
I've been tracking the performance of our DB query statements across a number of fairly high-volume pg clusters for several years (combined 2700 tps, ~1.3TB). Last year, we started migrating off HP-UX IA64 servers running pg 8.1.x onto Linux quad-core x86_64 blade servers running pg 8.3.x on a high-grade SAN. Our average overall query performance has improved by a very pleasant ~75%. But I'm curious why 'commit' statements (as well as certain update statements) seem to have actually degraded (1ms vs 5ms on avg, 2ms vs 14ms in the 95th percentile, etc). Any ideas?

TIA.

Ed
Re: [GENERAL] commit performance anomaly
On Thu, 26 Mar 2009, Ed L. wrote:
> But I'm curious why 'commit' statements (as well as certain update statements) seem to have actually degraded (1ms vs 5ms on avg, 2ms vs 14ms in the 95th percentile, etc).

When you commit something, it writes information to the write-ahead log (WAL) and then issues an fsync call to the filesystem involved to confirm that the data has made it onto disk, which usually percolates down into a physical write to disk or a write cache. It sounds like your newer system doesn't have as fast a path between writes and a useful cache here.

Since you mentioned moving onto a SAN, I wouldn't be surprised if much of the additional latency is just overhead moving bits over whatever its I/O interface is. I hope you didn't drink too much SAN vendor Kool-Aid and connect with iSCSI or something similarly slow (I doubt that, because you're only seeing medium-scale degradation). Direct-connected disks always have lower latency for short writes; the only time a SAN can pull ahead of them is when you're doing something that can utilize a lot of disks at once.

You might be able to tune the SAN to optimize for faster write performance though. SAN vendors seem completely incompetent at producing out-of-the-box tunings that work well for database use (I feel a RAID5 rant brewing).

-- 
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
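Since a commit is essentially a small WAL write followed by fsync(), one way to compare the two systems is to time write+fsync directly on each storage path. A rough, hypothetical sketch (file name, payload size and trial count are arbitrary assumptions):

```python
import os
import tempfile
import time

def avg_fsync_latency(path, trials=50, payload=b"x" * 512):
    """Average seconds per write+fsync pair -- a crude stand-in for the
    per-commit WAL flush cost on whatever device holds `path`."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(trials):
            os.write(fd, payload)
            os.fsync(fd)  # force data to stable storage (or the write cache)
        return (time.perf_counter() - start) / trials
    finally:
        os.close(fd)

with tempfile.TemporaryDirectory() as tmp:
    latency = avg_fsync_latency(os.path.join(tmp, "wal-probe"))
```

Run it once on a path backed by direct-attached disks and once on the SAN volume; a large gap in the averages points at the extra I/O round trip rather than at PostgreSQL.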
Re: [GENERAL] HINT: Please REINDEX it?
On Thu, 26 Mar 2009 15:32:46 + Sam Mason s...@samason.me.uk wrote:
> On Thu, Mar 26, 2009 at 09:00:53AM +0100, Bjørn T Johansen wrote:
>> ERROR: index connect_idx contains unexpected zero page at block 208
>> HINT: Please REINDEX it.
>>
>> What does this mean?
>
> It means something bad has happened to an index. If you're running an old version of PG then you may be affected by a known bug or you could have bad hardware. Did the computer have its power turned off unexpectedly, as if things are set up right then this could affect things.

Not any power problem, but I did have a disk problem; guess that might be the cause...

> I've just had a quick search and you seemed to be running 8.3.5 before, there don't seem to be any index related changes since then so maybe your hardware isn't doing what it's told to!

Almost correct... I am running 8.3.6 now... :)

>> And how do I reindex it?
>
> Just type:
>
> REINDEX connect_idx;
>
> in psql.

Thx, did that and now the error message is gone...

BTJ
Re: [GENERAL] log shipping from 64 bit linux server to a 32 bit linux server
> What about running a 32bit build of PG on the 64bit machine?

How would one go about doing something like this?
Re: [GENERAL] log shipping from 64 bit linux server to a 32 bit linux server
On Thu, Mar 26, 2009 at 3:32 PM, Tim Uckun timuc...@gmail.com wrote:
>> What about running a 32bit build of PG on the 64bit machine?
>
> How would one go about doing something like this?

Compiling with the proper compiler flags (gcc's -m32, rather than -march) I believe. It's been like 5 years since I had to mess with such things, so google it a bit first.
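To make that concrete: a hypothetical build recipe, not runnable as-is (the --prefix, package requirements and flags are assumptions; on gcc, -m32 selects the 32-bit target while -march only picks a CPU model):

```shell
# Hypothetical 32-bit source build on a 64-bit Linux host.  Assumes gcc
# multilib support plus 32-bit dev packages for libc/zlib/readline.
CC="gcc -m32" ./configure --prefix=/opt/pgsql32
make
make install

# Then initdb a fresh cluster with the 32-bit binaries -- on-disk formats
# differ between 32- and 64-bit builds, so dump from the old cluster and
# reload into the new one.
```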
Re: [GENERAL] how to specify the locale that psql uses
When you install PostgreSQL on XP, the installer asks for the encoding/locale at install time. Following the link below will give you more of an idea:

http://www.postgresql.org/docs/7.2/static/multibyte.html

If you want to change the locale of an existing database: you can't. The workaround is to create another cluster with the initdb command using the desired locale.

If you want to create a database with a different encoding, you can. Execute the following command:

Command:     CREATE DATABASE
Description: create a new database
Syntax:
CREATE DATABASE name
    [ [ WITH ] [ OWNER [=] dbowner ]
           [ TEMPLATE [=] template ]
           [ ENCODING [=] encoding ]
           [ TABLESPACE [=] tablespace ]
           [ CONNECTION LIMIT [=] connlimit ] ]

--
Thanks & Regards,
Dhaval Jaiswal
EnterpriseDB
www.enterprisedb.com

From: Kent Tong k...@cpttm.org.mo
To: pgsql-general@postgresql.org
Sent: Thursday, January 22, 2009 2:06:53 PM
Subject: [GENERAL] how to specify the locale that psql uses

Hi,

I am running a Chinese edition of XP. When I start psql, it is probably trying to display Chinese, and maybe it gets the encoding wrong: it displays garbage in its console. Is there any way to tell it to just use English instead? In Linux, one can do something like

LC_ALL=en psql

but it doesn't seem to work on the Windows port of PostgreSQL.

Thanks for any help!

-- 
Kent Tong
Wicket tutorials freely available at http://www.agileskills2.org/EWDW
Axis2 tutorials freely available at http://www.agileskills2.org/DWSAA
[GENERAL] Enumerating a row set
Hi all,

Is there a function similar to Python's enumerate() [1]? Searching the docs didn't reveal any relevant builtin, but I hope it's doable in pgsql. Ideally I'd like a function that can be used as:

SELECT e.i, e.col1, e.col2
FROM enumerate(some_table, 'i') e
LIMIT 10

 i | col1 | col2
---+------+------
 0 | ...  | ...
 1 | ...  | ...
   | ...  | ...
 9 | ...  | ...

Also, ideally it should work on any rowset (e.g. a nested select), not just on concrete tables.

Thanks in advance,
George

[1] http://docs.python.org/library/functions.html#enumerate
Re: [GENERAL] how to avoid that a postgres session eats up all the memory
Generally, an out-of-memory error occurs when there is not enough memory to handle the query. When a query executes without using a standard plan it can consume a lot of memory, and if the shared buffers are not enough to handle it, an error like "out of memory" is thrown. It is good practice to run EXPLAIN on a query before executing it; it will tell you which plan will be used.

A workaround for the DELETE ... query is to write a pl/pgsql function, which can execute faster than this query: instead of a sequential scan, it can jump directly to the rows of table.bar_id matching the condition you give.

---
Thanks & Regards
Dhaval Jaiswal
EnterpriseDB
www.enterprisedb.com

From: Clemens Schwaighofer clemens.schwaigho...@tequila.jp
To: pgsql-general@postgresql.org
Sent: Thursday, January 22, 2009 2:21:47 PM
Subject: [GENERAL] how to avoid that a postgres session eats up all the memory

Hi,

I just literally trashed my test server with one delete statement, because the backend used up all its memory and started to swap like crazy. My delete looked like this:

DELETE FROM table_foo
WHERE bar_id IN (
    SELECT DISTINCT bar_id
    FROM bar
    LEFT JOIN ... LEFT JOIN ... LEFT JOIN ...
    WHERE table.bar_id IS NULL AND ...
);

So basically it runs a select to see which entries do not have any reference data, and then it should delete them all.

Now, my question is: how can I set up postgres so it does not use up all the memory and make the server useless? How can I set it up so it dies with "out of memory" before I have to restart the server?

--
[ Clemens Schwaighofer -=:~ ]
[ IT Engineer/Web Producer/Planning/Manager ]
[ E-Graphics Communications SP Digital ]
[ 6-17-2 Ginza Chuo-ku, Tokyo 104-8167, JAPAN ]
[ Tel: +81-(0)3-3545-7706 Fax: +81-(0)3-3545-7343 ]
[ http://www.e-gra.co.jp ]
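One knob that may be what the original poster is after (not mentioned in the reply above) is work_mem, which caps memory per sort/hash operation. A sketch, combining it with the "EXPLAIN first" advice; the 16MB value is illustrative, not a recommendation:

```sql
-- Inspect the plan without running the (destructive) statement:
EXPLAIN
DELETE FROM table_foo
WHERE bar_id IN (SELECT DISTINCT bar_id FROM bar);

-- Cap per-sort/per-hash memory for this session, so large operations
-- spill to disk instead of exhausting RAM:
SET work_mem = '16MB';
```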
Re: [GENERAL] Enumerating a row set
On Mar 26, 2009, at 3:42 PM, George Sakkis wrote:

> Hi all,
>
> Is there a function similar to Python's enumerate() [1]? Searching the docs didn't reveal any relevant builtin, but I hope it's doable in pgsql. Ideally I'd like a function that can be used as:
>
> SELECT e.i, e.col1, e.col2
> FROM enumerate(some_table, 'i') e
> LIMIT 10
>
> Also, ideally it should work on any rowset (e.g. a nested select), not just on concrete tables.

You're looking for what's called rownum in some other databases. You can do it in postgresql with a temporary sequence, sometimes at least:

abacus=# create temporary sequence bar;
CREATE SEQUENCE
abacus=# select setval('bar', 1, false);
 setval
--------
      1
(1 row)

abacus=# select nextval('bar'), baz from foo;
 nextval | baz
---------+--------
       1 | red
       2 | yellow
       3 | blue

Cheers,
Steve
Re: [GENERAL] Is there a meaningful benchmark?
On Fri, Mar 20, 2009 at 9:27 AM, Scott Ribe scott_r...@killerbytes.com wrote:

> That's why I was looking for a more balanced benchmark that exercises said capabilities.

OK, here's the thing. I will give you *one* sample issue to think about, as an illustration of the kinds of differences there are.

> - PostgreSQL uses MVCC instead of row or page locking, which means only writer-vs-writer locks, which means in many situations less contention and better throughput as your concurrent load goes up.

It's also important to point out that writers don't necessarily block other writers, as long as they're operating on different ranges of the data set. You can have dozens of writers streaming data in with differing primary keys, all running together. Then, of course, you have the ability to have readers block writers using FOR UPDATE, which turns a reader into a writer-to-(possibly)-be.

A lot of older dbs had locking by the page, not by the record, or, as with MyISAM, by the whole table. Page locking led to records that shared pages locking each other needlessly. Table locking leads to even longer queues forming under heavy write load. This does NOT happen in pgsql if you're updating / inserting independent records.

One of the reasons postgres scales so well is that it keeps writes "cheap", in that they don't have to interact with anything other than writes to the same records. If you've got millions of records and thousands being updated, then writes not blocking writes, combined with record-level locking (versus page-level, or worse, table-level locking), means pg can handle pretty high concurrent write loads on the right hardware and still maintain good throughput on reads. And concurrent write load is what usually cripples a server.
Re: [GENERAL] how to specify the locale that psql uses
On Fri, Mar 27, 2009 at 9:35 AM, Dhaval Jaiswal bablu_postg...@yahoo.com wrote:

> When you install postgreSQL over XP at the same time it asks for encoding locale. Set at the same time. Follow the below link will give you more idea. http://www.postgresql.org/docs/7.2/static/multibyte.html

Please update yourself with the latest version, as it's 8.3.7 these days and you are still years and years behind with the version 7.2 documentation :)

> If you want to change the locale of existing database. You can't. The workaround for that you can create another cluster with initdb command with supported locale.

I don't think the person asking the question ever asked about changing the encoding of the database. Please make sure to read the question at least; it was about changing the locale for psql. In order to do that:

For server messages:
Set the configuration parameter lc_messages to the value you would like for messages coming back from the server. It can be done both ways, per session and permanently. For a session, use:

set lc_messages = ...

Otherwise, set it in your postgresql.conf file and reload.

For client messages:
For client programs like psql, change the current locale in the environment where you are starting up psql. I am not sure about Windows, but on Linux you can do this by setting the LANG environment variable.

--
Shoaib Mir
http://shoaibmir.wordpress.com/
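Putting the server-message part together, a minimal sketch (the exact locale name depends on what the operating system provides):

```sql
-- Server messages, for the current session only:
SET lc_messages = 'en_US.UTF-8';

-- Or permanently, in postgresql.conf (followed by a reload):
--     lc_messages = 'en_US.UTF-8'
```

For psql itself on Windows, the rough equivalent of `LC_ALL=en psql` would be setting the variable before launching, e.g. `set LANG=en` in cmd.exe — though, as noted above, it's not certain the Windows port honors it.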
Re: [GENERAL] Is there a meaningful benchmark?
On 26/03/2009 23:10, Scott Marlowe wrote:

> It's also important to point out that writers don't necessarily block other writers, as long as they're operating on different ranges of the data set. You can have dozens of writers streaming data in with differing primary keys all running together.

Do you have to do anything special to have this happen - e.g. table partitioning? Or does it just happen automagically based on the primary key?

Ray.

--
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
r...@iol.ie
Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals
Re: [GENERAL] difference between current_timestamp and now() in quotes
current_timestamp is a reserved keyword of PostgreSQL. When you execute it within single quotation marks, it is treated as a string; that is the reason it throws an error. now() is a built-in function; you can use it with or without single quotation marks. For more information, refer to the link below:
http://www.postgresql.org/docs/8.1/static/sql-keywords-appendix.html

--
Thanks & Regards
Dhaval Jaiswal
EnterpriseDB
www.enterprisedb.com

From: Grzegorz Jaśkiewicz gryz...@gmail.com
To: GENERAL pgsql-general@postgresql.org
Sent: Thursday, January 22, 2009 7:22:21 PM
Subject: [GENERAL] difference between current_timestamp and now() in quotes

test2=# create table dupa(a timestamp, b serial);
NOTICE: CREATE TABLE will create implicit sequence "dupa_b_seq" for serial column "dupa.b"
CREATE TABLE
test2=# insert into dupa(a) select current_timestamp from generate_series(1,100);
INSERT 0 100
test2=# insert into dupa(a) select 'current_timestamp' from generate_series(1,100);
ERROR: date/time value "current" is no longer supported
LINE 1: insert into dupa(a) select 'current_timestamp' from generate...
                                   ^
test2=# insert into dupa(a) select 'now()' from generate_series(1,100);
INSERT 0 100

Any ideas why the difference?

--
GJ
Re: [GENERAL] Is there a meaningful benchmark?
On Thu, Mar 26, 2009 at 5:23 PM, Raymond O'Donnell r...@iol.ie wrote:

> On 26/03/2009 23:10, Scott Marlowe wrote:
>> It's also important to point out that writers don't necessarily block other writers, as long as they're operating on different ranges of the data set. You can have dozens of writers streaming data in with differing primary keys all running together.
>
> Do you have to do anything special to have this happen - e.g. table partitioning? Or does it just happen automagically based on the primary key?

No, it's pretty much automatic. Pgsql creates new records for every update or insert, so there's no weird locking on the original records to make it slow. Everything just goes in the WAL and gets flushed out to disk later. Setting up commit siblings helps on some loads. That's about it. No rocket science or triggers really needed for lots of writes at the same time.
[GENERAL] Best way to store a threaded message list/tree in SQL
Hi guys - I'm looking for the best way to store a set of posts, as well as comments on those posts, in SQL. Imagine a design similar to a Wall on Facebook, where users can write posts on their wall and other users can comment on those posts. I need to be able to display all wall posts as well as the comments.

When I first started out, I came up with a table such as:

CREATE TABLE wallposts (
    id       uuid NOT NULL,
    posted   timestamp NOT NULL,
    userid   uuid NOT NULL,
    posterid uuid NOT NULL,
    parentid uuid,            -- null on original posts
    comment  text NOT NULL
);

id is unique; parentid is null on original posts and points to an id if the row is a comment on an existing post. Easy enough, and super fast to insert new data. However, doing a select which would return me:

POST 1
  COMMENT 1
  COMMENT 2
POST 2
  COMMENT 1
  COMMENT 2

regardless of which order the rows existed in the database proved to be extremely difficult. I obviously can't just order by date, as someone might comment on post 1 after post 2 has been posted. If I do a LEFT JOIN to get the parent post on all rows and then sort by that date first, all the original posts group together, as they'd have a value of null.

Then I got this idea:

CREATE TABLE wallposts (
    id           uuid NOT NULL,
    threadposted timestamp,
    posted       timestamp,
    ...
    comment      text
);

On an original post, threadposted and posted would be the same. On a comment, threadposted would be the time the original post was posted and posted would be the time the comment on that thread was posted. Now I can just do:

select * from wallposts order by threadposted, posted;

This works great; however, one thing irks me. If two people create a post at the same time, comments on the two posts would get munged together, as they'd have the same timestamp. I could use ticks instead of a datetime, but still the accuracy is only 1/1000 of a second.
I could also set up a unique constraint on (threadposted, posted), which makes inserts a bit more expensive; but if I had multiple database servers in a farm, the chance of a collision is still there. I almost went ahead with this anyway, since the chances of this happening are extremely small, but I wanted to see if I could eat my cake and still have it too - mostly for my own educational curiosity.

A third solution would be to store this data in the form of a graph. Each node would have a v-left and v-right pointer. I could order by left, which would traverse the tree in the order I need. However, every time someone inserts a comment I'd have to rebalance the whole tree. This would create a ton of row locking, and all sorts of problems if the site was very busy. Plus, it's kind of extreme and also causes replication problems. So I tossed this idea quickly.

I also thought about just storing the original posts and then serializing the comments in a binary form, since who cares about individual comments? This would be very fast; however, if a user wants to delete their comment or append a new comment to the end, I have to deserialize this data, modify the structure, then serialize it back and update the row. If a bunch of people are commenting on the same post at the same time, I might have random issues with that.

So here's what I eventually did. I query for all the posts ordered by date entered. In the middleware layer, I loop through the recordset and create a stack of original posts; each node on the stack points to a linked list of comments. When I come across an original post, I push a new node onto the stack, and when I come across a comment I add a node to the linked list. I organize this in memory so I can traverse the recordset once and have O(n). After I create the in-memory representation of the wall, I traverse through this data structure again and write out HTML.
This works great and has super fast inserts and super fast selects, and no weird row-locking issues; however, it's a bit heavier on my presentation layer and requires me to build an in-memory representation of the user's wall to move stuff around so it's in the right order. Still, I believe this is the best approach I've found so far. I thought I'd check with some other SQL experts and see if 1) Postgres has some super nifty tree functions that I've never heard of, and 2) there's actually a way to do this using joins or unions or something which would still be performant with millions of users.

Thoughts?

Mike
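For what it's worth, a recursive CTE (available from PostgreSQL 8.4) can return each thread in display order in a single query. A sketch against the first wallposts table above, assuming parentid is NULL on original posts:

```sql
-- Sketch (PostgreSQL 8.4+): walk each thread down from its root post,
-- accumulating a sortable path of timestamps as we descend.
WITH RECURSIVE thread AS (
    SELECT id, parentid, comment, posted,
           ARRAY[posted] AS sort_path      -- root post's own timestamp
    FROM wallposts
    WHERE parentid IS NULL                 -- original posts
    UNION ALL
    SELECT w.id, w.parentid, w.comment, w.posted,
           t.sort_path || w.posted         -- append the comment's timestamp
    FROM wallposts w
    JOIN thread t ON w.parentid = t.id
)
SELECT id, parentid, comment, posted
FROM thread
ORDER BY sort_path;
```

Ordering by an array of timestamps keeps each thread contiguous (a prefix sorts before its extensions), though it inherits the same timestamp-collision caveat discussed above.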
Re: [GENERAL] Enumerating a row set
On Thu, Mar 26, 2009 at 06:42:45PM -0400, George Sakkis wrote:

> Hi all,
>
> Is there a function similar to Python's enumerate() [1]? Searching the docs didn't reveal any relevant builtin, but I hope it's doable in pgsql. Ideally I'd like a function that can be used as:
>
> SELECT e.i, e.col1, e.col2
> FROM enumerate(some_table, 'i') e
> LIMIT 10
>
> Also, ideally it should work on any rowset (e.g. a nested select), not just on concrete tables.

In PostgreSQL 8.4, you'll be able to do:

SELECT row_number() OVER (ORDER BY col1) AS i, e.col1, e.col2, ...
FROM ...

Until then, there are some ugly, fragile workarounds with generate_series() and/or temp sequences.

Cheers,
David.

--
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
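A concrete form of that 8.4 query, against the hypothetical some_table from the question, might look like this — note it works on a nested select, which covers the "any rowset" requirement:

```sql
-- PostgreSQL 8.4+: number the rows of an arbitrary rowset.
SELECT row_number() OVER (ORDER BY col1) - 1 AS i,  -- subtract 1 to start
       col1, col2                                   -- at 0, like enumerate()
FROM (SELECT col1, col2 FROM some_table) sub
LIMIT 10;
```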
Re: [GENERAL] Enumerating a row set
On Thu, Mar 26, 2009 at 06:42:45PM -0400, George Sakkis wrote:

> Hi all,
>
> Is there a function similar to Python's enumerate() [1]? Searching the docs didn't reveal any relevant builtin, but I hope it's doable in pgsql.

I found this via Google:
http://www.depesz.com/index.php/2007/08/17/rownum-anyone-cumulative-sum-in-one-query/

--
Guy Rouillier
Re: [GENERAL] Best way to store a threaded message list/tree in SQL
Mike,

connectby() is your friend here. Do a search on tablefunc in the help file.

> CREATE TABLE wallposts (
>     id       uuid NOT NULL,
>     posted   timestamp NOT NULL,
>     userid   uuid NOT NULL,
>     posterid uuid NOT NULL,
>     parentid uuid NOT NULL,
>     comment  text NOT NULL
> )

SELECT *
FROM connectby('wallposts', 'id', 'parentid', 'entrypoint', 0, '~')
  AS t(keyid text, parent_keyid text, level int, branch text);

HTH,
Johan Nel
Durban, South Africa.