Re: [GENERAL] Deletion Challenge
> test=> delete from cash_journal
>        where ARRAY[click, cash_journal_id] NOT IN
>              (select max(ARRAY[click, cash_journal_id])
>               from cash_journal
>               group by fairian_id);
> DELETE 7

For what it's worth, we've run into *severe* performance issues using IN() when there are a large number of values (e.g. more than 10,000) in conjunction with a complex query. Materializing the values in a WITH clause (CTE) and joining against that doesn't seem to carry anything like a similar performance penalty.
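For illustration, a minimal sketch of that rewrite, assuming the intent is to keep only the latest (click, cash_journal_id) row per fairian_id; table and column names are taken from the quoted query:

    WITH keep AS (
        SELECT fairian_id,
               max(ARRAY[click, cash_journal_id]) AS max_key
        FROM cash_journal
        GROUP BY fairian_id
    )
    DELETE FROM cash_journal j
    USING keep k
    WHERE j.fairian_id = k.fairian_id
      AND ARRAY[j.click, j.cash_journal_id] <> k.max_key;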
Re: [GENERAL] Deletion Challenge
On Saturday, December 05, 2015 11:08:05 AM Berend Tober wrote:
> WITH max_click AS (
>     SELECT cash_journal.fairian_id,
>            max(cash_journal.click) AS click
>     FROM cash_journal
>     GROUP BY cash_journal.fairian_id
> )
> DELETE FROM cash_journal j
> USING max_click b
> WHERE j.fairian_id = b.fairian_id
>   AND j.click < b.click;
>
> WITH max_journal_id AS (
>     SELECT cash_journal.fairian_id,
>            cash_journal.click,
>            max(cash_journal.cash_journal_id) AS cash_journal_id
>     FROM cash_journal
>     GROUP BY cash_journal.fairian_id, cash_journal.click
> )
> DELETE FROM cash_journal j
> USING max_journal_id b
> WHERE j.fairian_id = b.fairian_id
>   AND j.click = b.click
>   AND j.cash_journal_id < b.cash_journal_id;

Although I couldn't be sure if this would provide atomicity, I'd merge these into one query like:

    WITH max_click AS (
        SELECT cash_journal.fairian_id,
               max(cash_journal.click) AS click
        FROM cash_journal
        GROUP BY cash_journal.fairian_id
    ),
    max_journal_id AS (
        SELECT cash_journal.fairian_id,
               cash_journal.click,
               max(cash_journal.cash_journal_id) AS cash_journal_id
        FROM cash_journal
        GROUP BY cash_journal.fairian_id, cash_journal.click
    ),
    delete_journal1 AS (
        DELETE FROM cash_journal j
        USING max_click b
        WHERE j.fairian_id = b.fairian_id
          AND j.click < b.click
        RETURNING *, 'journal1'::varchar AS source
    ),
    delete_journal2 AS (
        DELETE FROM cash_journal j
        USING max_journal_id b
        WHERE j.fairian_id = b.fairian_id
          AND j.click = b.click
          AND j.cash_journal_id < b.cash_journal_id
        RETURNING *, 'journal2'::varchar AS source
    )
    -- and then, to find out what happened:
    SELECT * FROM delete_journal1
    UNION ALL
    SELECT * FROM delete_journal2;
Re: [GENERAL] Permissions, "soft read failure" - wishful thinking?
On Monday, December 14, 2015 05:25:16 PM Adrian Klaver wrote:
> > FOLLOWUP QUESTION: is there a way to ask the query planner what
> > tables/fields were output in a database result?
>
> Just dawned on me, are you asking if EXPLAIN can output more detailed
> information?

Ha ha, in another post, I just explained that the idea for the follow-up question came from EXPLAIN ANALYZE. Yes, the idea being to see if there is a way to ask PG what tables/fields were used to produce a specific result, field by field, and then squelch those fields in our DB abstraction layer rather than in the DB directly.

We're being asked to satisfy some pretty strict guarantees of data privacy that were unanticipated when designing our product. Adding strict permissions now would be an expensive proposition.
Re: [GENERAL] Permissions, "soft read failure" - wishful thinking?
On Monday, December 14, 2015 05:20:52 PM Adrian Klaver wrote:
> > FOLLOWUP QUESTION: is there a way to ask the query planner what
> > tables/fields were output in a database result?
>
> I am not following, that would be in the query output would it not? A
> more detailed explanation of what you want to see would be helpful.

The problem is that permissions were originally designed to be handled in the ORM (and they are, just fine), and read-level permissions were set up at the page level (it's a web-based application), so you can either access the page or you can't.

But in order to satisfy a large client, they want the guarantee of permissions at the field level. The problem is that our product has been in development for a LONG time (it started when PHP3 was king of the hill), and if we did that, large, complex queries would break in many horrible ways. For example, it's typical to create a WITH-clause (CTE) table from a 5-table query, then link to that two or three times over in order to develop a complex relationship with the data, and then output the result.

I may be wishing for magic, but would it be possible to ask the DB server where each field of a result got its data from? (So we could squelch the output as it passes through our DB abstraction layer.) EXPLAIN ANALYZE *almost* does this, thus the thought.
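For what it's worth, EXPLAIN with the VERBOSE option comes close to the provenance question above: it prints an "Output:" line for each plan node, showing which relation/alias each result column comes from. A minimal sketch, with hypothetical table and column names:

    EXPLAIN (VERBOSE)
    SELECT c.name, s.ssn
    FROM customers c
    JOIN sensitive s ON s.customer_id = c.id;

    -- The plan's top node includes something like:
    --   Output: c.name, s.ssn
    -- i.e., per-column provenance, though only per plan node, not per
    -- result row, and it has to be parsed out of the plan text.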
[GENERAL] Permissions, "soft read failure" - wishful thinking?
Is there a way to set PG field-level read permissions so that a denial doesn't cause the query to bomb, but instead nullifies the fields for which permission is denied?

In our web-based app, we have a request to implement granular permissions: table/field-level permissions. E.g.: userX can't read customers.socialsecurity in any circumstance. We'd like to implement DB-level permissions; so far, we've been using an ORM to manage CRUD permissions.

This is old hat, but our system has a large number of complex queries that immediately break if *any* field permission fails. So implementing this for customers could be *very* painful.

Is there a way to let the query succeed, but nullify any fields where read permissions fail? (Crossing fingers.) We'd be watching the PG logs to identify problem queries in this case.
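There's no built-in "soft failure" mode for column privileges (a denied column makes the whole query error out), but one common workaround is a view that nullifies the protected columns for unprivileged users. A minimal sketch, assuming a hypothetical "trusted" role and the customers.socialsecurity column mentioned above:

    CREATE VIEW customers_masked AS
    SELECT id,
           name,
           CASE WHEN pg_has_role(current_user, 'trusted', 'MEMBER')
                THEN socialsecurity
                ELSE NULL
           END AS socialsecurity
    FROM customers;

Queries against the view always succeed; the column simply comes back NULL for users outside the role.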
[GENERAL] converting in() clause into a with prefix?
I have a horribly-performing query similar to the one below, and I'd like to convert it to use a "WITH mytable AS ( ... )" prefix without having to re-architect my code. For some reason, a WITH clause seems to generally run much faster than an IN() sub-clause for identical results (in about 1/4th the time).

Is there a PG native function that can convert the list format of an IN() clause to row-level results in a WITH clause? I see the array* functions, but they seem to work with arrays like "array[1,2,3]", and unnest seems to drill right through nested arrays and flatten every single element to a new row, regardless of depth. E.g., the following two lines are equivalent:

    select unnest(array[1,2,2,3]);
    select unnest(array[array[1,2],array[2,3]]);

I'd expect the latter to put out two rows as

    1, 2
    2, 3

Thanks for your input; clarifying pseudo-code examples below (PHP). We're running 9.4.4 on CentOS 6.

Ben

    // DESIRED END RESULT PSEUDO CODE
    $query = "
        WITH mytable AS ( unnest(" . $in . ", school_id, building_id) )
        SELECT id, name
        FROM mytable
        JOIN classes ON (
            mytable.school_id = classes.school_id
            AND mytable.building_id = classes.building_id
        )";

    // CURRENT CODE EXAMPLE (PHP)
    $query = "
        SELECT id, name
        FROM classes
        WHERE (classes.school_id, classes.building_id) IN (" . $in . ")";

// EXAMPLE RESULT (small list)
SELECT id, name FROM classes WHERE (classes.school_id, classes.building_id) IN ((291,189),(291,192),
(291,191),(291,195),(291,198),(291,197),(291,188),(291,184),(291,199),
(291,193),(291,185),(291,194),(291,-2),(291,186),(291,183),(291,187),
(291,200),(291,190),(291,196),(291,202),(200,455),(200,433),(200,442),
(200,440),(200,445),(200,438),(200,437),(200,450),(200,452),(200,459),
(200,456),(200,432),(200,441),(200,460),(200,446),(200,447),(200,448),
(200,449),(200,434),(200,443),(200,-2),(200,444),(200,457),(200,458),
(200,451),(200,454),(200,453),(246,-1),(246,190),(246,186),(246,188),
(246,195),(246,196),(246,197),(246,192),(246,-2),(246,184),(246,189),
(246,193),(246,194),(246,198),(246,191),(246,199),(246,187),(246,185),
(246,183),(246,200),(63,-1),(63,269),(63,267),(63,264),(63,258),(63,126),
(63,270),(63,262),(63,127),(63,134),(63,263),(63,271),(63,272),(63,265),
(63,-2),(63,259),(63,260),(63,266),(9,-1),(9,283),(9,192),(9,266),(9,276),
(9,313),(9,294),(9,311),(9,298),(9,299),(9,300),(9,312),(9,310),(9,263),
(9,272),(9,305),(9,314),(9,152),(9,284),(9,277),(9,260),(9,271),(9,278),
(9,315),(9,287),(9,289),(9,316),(9,288),(9,290),(9,259),(9,258),(9,269),
(9,279),(9,265),(9,273),(9,280),(9,64),(9,264),(9,281),(9,317),(9,304),
(9,303),(9,322),(9,267),(9,282),(9,318),(9,306),(9,295),(9,319),(9,301),
(9,302),(9,43),(9,268),(9,275),(9,-2),(9,262),(9,296),(9,297),(9,320),(9,286),
(9,274),(9,307),(9,308),(9,321),(9,270),(9,261),(9,309),(9,292),(9,293),
(9,285),(9,291),(94,-1),(94,280),(94,285),(94,268),(94,265),(94,278),(94,283),
(94,287),(94,288),(94,289),(94,284),(94,281),(94,110),(94,262),(94,259),
(94,273),(94,270),(94,266),(94,258),(94,269),(94,274),(94,263),(94,290),
(94,272),(94,264),(94,260),(94,279),(94,261),(94,286),(94,275),(94,277),
(94,291),(94,109),(94,-2),(94,257),(111,-1),(111,452),(111,486),(111,404),
(111,456),(111,455),(111,485),(111,484),(111,483),(111,457),(111,441),
(111,468),(111,430),(111,442),(111,446),(111,490),(111,440),(111,466),
(111,491),(111,449),(111,454),(111,461),(111,492),(111,469),(111,465),
(111,458),(111,426),(111,443),(111,493),(111,488),(111,475),(111,481),
(111,482),(111,474),(111,477),(111,478),(111,476),(111,479),(111,480),
(111,471),(111,463),(111,450),(111,470),(111,494),(111,447),(111,448), (111,495),(111,-2),(111,473),(111,496),(111,429),(111,487),(111,497), (111,167),(111,444),(111,451),(111,431),(111,445),(111,166),(111,453), (111,467),(111,462),(111,464),(111,472),(111,489),(334,-1),(334,188), (334,189),(334,203),(334,201),(334,200),(334,193),(334,192),(334,191), (334,202),(334,185),(334,190),(334,204),(334,186),(334,187),(334,197), (334,199),(334,205),(334,184),(334,194),(334,-2),(334,196),(334,183), (334,195),(334,206),(334,198),(334,207),(201,-1),(201,445),(201,442), (201,446),(201,468),(201,458),(201,455),(201,456),(201,469),(201,454), (201,453),(201,466),(201,474),(201,443),(201,441),(201,440),(201,447), (201,470),(201,460),(201,459),(201,461),(201,471),(201,465),(201,463), (201,448),(201,451),(201,472),(201,457),(201,-2),(201,462),(201,464), (201,452),(201,449),(201,450),(201,473),(201,467),(201,475),(62,-1),(62,279), (62,280),(62,294),(62,281),(62,282),(62,285),(62,274),(62,299),(62,300), (62,290),(62,291),(62,289),(62,273),(62,286),(62,194),(62,295),(62,275), (62,-2),(62,292),(62,301),(62,196),(62,195),(62,296),(62,276),(62,284), (62,287),(62,297),(62,288),(62,277),(62,298),(62,278),(188,-1),(188,443), (188,446),(188,449),(188,453),(188,454),(188,455),(188,456),(188,450), (188,445),(188,
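To illustrate the conversion being asked for, here are two hedged sketches using the school_id/building_id pairs above. The first relies on 9.4's multi-argument unnest, passing the keys as two parallel arrays; the second works on any version by treating a VALUES list as an inline table:

    -- (a) 9.4+: unnest two parallel arrays into (school_id, building_id) rows
    WITH mytable(school_id, building_id) AS (
        SELECT * FROM unnest(ARRAY[291,291,200]::int[],
                             ARRAY[189,192,455]::int[])
    )
    SELECT c.id, c.name
    FROM mytable
    JOIN classes c USING (school_id, building_id);

    -- (b) any version: join against a VALUES list
    SELECT c.id, c.name
    FROM (VALUES (291,189), (291,192), (200,455)) AS m(school_id, building_id)
    JOIN classes c ON c.school_id   = m.school_id
                  AND c.building_id = m.building_id;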
Re: [GENERAL] Postgresql 9.4 and ZFS?
On Wednesday, September 30, 2015 03:49:44 PM Keith Fiske wrote:
> We've run postgres on ZFS for years with great success (first on
> OpenSolaris, now on OmniOS, and I personally run it on FreeBSD). The
> snapshotting feature makes upgrades on large clusters much less scary
> (snapshot and revert if it goes bad) and being able to bring a snapshot
> backup up as a clone to restore an accidentally dropped table is great.

Somebody mentioned some trouble running it with ZFS on Linux, which is exactly how we're planning our rollout. (We're a RHEL/CentOS shop.) Have you tried that config, and has it worked for you?

> Others have given a lot of great advice as far as system tuning. Only other
> thing I can add is you definitely do want your data directory on its own
> pool. But I recommend putting the actual data in a folder under that pool
> (possibly by major version name). For example if your pool is
>
> /data/postgres
>
> Create a folder under that directory to actually put the data:
>
> mkdir /data/postgres/9.4
>
> This allows pg_upgrade's --link option to work during major upgrades since
> you can't have an upgrade destination on a different filesystem. Just make
> a 9.5 directory in the same spot when the time comes around. With ZFS
> snapshots available, there's really no reason not to use the --link option
> to greatly speed up upgrades.

Recently, the PGDG RPMs provided by PostgreSQL have done something similar by moving from /var/lib/pgsql/ to (e.g.) /var/lib/pgsql/9.4, and we've followed suit, trying to keep things "stock" where possible. Our intent is to make /var/lib/pgsql a filesystem in a pool containing no other filesystems, with SSD-based VDEVs that aren't shared for any other purpose.
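A minimal sketch of that layout on ZFS on Linux (pool and dataset names are assumptions):

    # one dataset dedicated to PostgreSQL, mounted at the stock location
    zfs create -o mountpoint=/var/lib/pgsql tank/pgsql

    # the actual data directory lives *inside* the dataset, one per major
    # version, so pg_upgrade --link never crosses a filesystem boundary
    mkdir /var/lib/pgsql/9.4
    mkdir /var/lib/pgsql/9.5    # when the time comes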
Re: [GENERAL] Postgresql 9.4 and ZFS?
On Wednesday, September 30, 2015 09:58:08 PM Tomas Vondra wrote:
> On 09/30/2015 07:33 PM, Benjamin Smith wrote:
> > On Wednesday, September 30, 2015 02:22:31 PM Tomas Vondra wrote:
> >> I think this really depends on the workload - if you have a lot of
> >> random writes, CoW filesystems will perform significantly worse than
> >> e.g. EXT4 or XFS, even on SSD.
> >
> > I'd be curious about the information you have that leads you to this
> > conclusion. As with many (most?) "rules of thumb", the devil is quite
> > often in the details.
>
> A lot of testing done recently, and also experience with other CoW
> filesystems (e.g. BTRFS explicitly warns about workloads with a lot of
> random writes).
>
> >>> We've been running both on ZFS/CentOS 6 with excellent results, and
> >>> are considering putting the two together. In particular, the CoW
> >>> nature (and subsequent fragmentation/thrashing) of ZFS becomes
> >>> largely irrelevant on SSDs; the very act of wear leveling on an SSD
> >>> is itself a form of intentional thrashing that doesn't affect
> >>> performance since SSDs have no meaningful seek time.
> >>
> >> I don't think that's entirely true. Sure, SSD drives handle random I/O
> >> much better than rotational storage, but it's not entirely free and
> >> sequential I/O is still measurably faster.
> >>
> >> It's true that the drives do internal wear leveling, but it probably
> >> uses tricks that are impossible to do at the filesystem level (which is
> >> oblivious to internal details of the SSD). CoW also increases the amount
> >> of blocks that need to be reclaimed.
> >>
> >> In the benchmarks I've recently done on SSD, EXT4 / XFS are ~2x
> >> faster than ZFS. But of course, if the ZFS features are interesting
> >> for you, maybe it's a reasonable price.
> >
> > Again, the details would be highly interesting to me. What memory
> > optimization was done? Status of snapshots? Was the pool RAIDZ or
> > mirrored vdevs? How many vdevs? Was compression enabled? What ZFS
> > release was this? Was this on Linux, Free/Open/Net BSD, Solaris, or
> > something else?
>
> I'm not sure what you mean by "memory optimization" so the answer is
> probably "no".

I mean the full gamut: Did you use an L2ARC? Did you use a dedicated ZIL? What was arc_max set to? How much RAM was installed on the machine? How did you set up PG? (PG defaults are historically horrible for higher-RAM machines.)

> FWIW I don't have much experience with ZFS in production, all I have is
> data from benchmarks I've recently done exactly with the goal to educate
> myself on the differences of current filesystems.
>
> The tests were done on Linux, with kernel 4.0.4 / zfs 0.6.4. So fairly
> recent versions, IMHO.
>
> My goal was to test the file systems under the same conditions and used
> a single device (Intel S3700 SSD). I'm aware that this is not a perfect
> test and ZFS offers interesting options (e.g. moving ZIL to a separate
> device). I plan to benchmark some additional configurations with more
> devices and such.

Also, did you try with/without compression? My information so far is that compression significantly improves overall performance.

> > A 2x performance difference is almost inconsequential in my
> > experience, where growth is exponential. 2x performance change
> > generally means 1 to 2 years of advancement or deferment against the
> > progression of hardware; our current, relatively beefy DB servers
> > are already older than that, and have an anticipated life cycle of at
> > least another couple years.
>
> I'm not sure I understand what you suggest here. What I'm saying is that
> when I do a stress test on the same hardware, I do get ~2x the
> throughput with EXT4/XFS, compared to ZFS.

What I'm saying is only what it says on its face: a 50% performance difference is rarely enough to make or break a production system; performance/capacity reserves of 95% or more are fairly typical, which means the difference between 5% utilization and 10%. Even if latency rose by 50%, that's typically the difference between 20 ms and 30 ms, not enough that anybody'd notice over the 'net for a SOAP/REST call, even if it's enough to make you want to optimize things a bit.

> > // Our situation // Lots of RAM for the workload: 128 GB of ECC RAM
> > with an on-disk DB size of ~150 GB. Pretty much, everything runs
> > straight out of RAM cache, with only writes hitting disk. SMART
> > reports 4/96 read/write ratio.
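For reference, hedged examples of the knobs being asked about, as configured on ZFS on Linux (pool and device names are hypothetical):

    # L2ARC: add an SSD as a read-cache vdev
    zpool add tank cache /dev/disk/by-id/nvme-fast0

    # dedicated ZIL/SLOG: move synchronous-write logging off the data vdevs
    zpool add tank log mirror /dev/disk/by-id/ssd-a /dev/disk/by-id/ssd-b

    # arc_max: cap the ARC, e.g. at 32 GB (persist via zfs_arc_max in
    # /etc/modprobe.d/zfs.conf)
    echo 34359738368 > /sys/module/zfs/parameters/zfs_arc_max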
Re: [GENERAL] Postgresql 9.4 and ZFS?
On Wednesday, September 30, 2015 02:22:31 PM Tomas Vondra wrote:
> I think this really depends on the workload - if you have a lot of
> random writes, CoW filesystems will perform significantly worse than
> e.g. EXT4 or XFS, even on SSD.

I'd be curious about the information you have that leads you to this conclusion. As with many (most?) "rules of thumb", the devil is quite often in the details.

> > We've been running both on ZFS/CentOS 6 with excellent results, and
> > are considering putting the two together. In particular, the CoW
> > nature (and subsequent fragmentation/thrashing) of ZFS becomes
> > largely irrelevant on SSDs; the very act of wear leveling on an SSD
> > is itself a form of intentional thrashing that doesn't affect
> > performance since SSDs have no meaningful seek time.
>
> I don't think that's entirely true. Sure, SSD drives handle random I/O
> much better than rotational storage, but it's not entirely free and
> sequential I/O is still measurably faster.
>
> It's true that the drives do internal wear leveling, but it probably
> uses tricks that are impossible to do at the filesystem level (which is
> oblivious to internal details of the SSD). CoW also increases the amount
> of blocks that need to be reclaimed.
>
> In the benchmarks I've recently done on SSD, EXT4 / XFS are ~2x faster
> than ZFS. But of course, if the ZFS features are interesting for you,
> maybe it's a reasonable price.

Again, the details would be highly interesting to me. What memory optimization was done? Status of snapshots? Was the pool RAIDZ or mirrored vdevs? How many vdevs? Was compression enabled? What ZFS release was this? Was this on Linux, Free/Open/Net BSD, Solaris, or something else?

A 2x performance difference is almost inconsequential in my experience, where growth is exponential. A 2x performance change generally means 1 to 2 years of advancement or deferment against the progression of hardware; our current, relatively beefy DB servers are already older than that, and have an anticipated life cycle of at least another couple years.

// Our situation //

Lots of RAM for the workload: 128 GB of ECC RAM with an on-disk DB size of ~150 GB. Pretty much everything runs straight out of RAM cache, with only writes hitting disk. SMART reports a 4/96 read/write ratio.

Query load: constant, heavy writes and heavy use of temp tables in order to assemble very complex queries. Pretty much the "worst case" mix of reads and writes, with an average daily peak of about 200-250 queries/second.

16-core XEON servers, 32 HT "cores". SAS 3 Gbps. CentOS 6 is our O/S of choice.

Currently, we're running Intel 710 SSDs in a software RAID 1 without trim enabled, and we're generally happy with the reliability and performance we see. We're planning to upgrade storage soon (since we're over 50% utilization) and, in the process, bring in the magic goodness of snapshots/clones from ZFS.
Re: [GENERAL] Postgresql 9.4 and ZFS?
On Tuesday, September 29, 2015 10:35:28 AM John R Pierce wrote:
> On 9/29/2015 10:01 AM, Benjamin Smith wrote:
> > Does anybody here have any recommendations for using PostgreSQL 9.4
> > (latest) with ZFS?
>
> For databases, I've always used mirrored pools, not raidz*.
> put pgdata in its own zfs file system in your zpool. on that dedicated
> zfs, set the blocksize to 8k.

Based on my reading of the ZFS on Linux FAQ entry "HowDoesZFSonLinuxHandleAdvacedFormatDrives", would that be -o ashift=13? E.g.: 2^13 = 8192.
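A hedged note on the distinction here: ashift is a per-vdev property fixed at pool creation time that describes the physical sector size (2^ashift bytes), while the 8k "blocksize" advice for a PostgreSQL dataset normally refers to the per-dataset recordsize property, matching PostgreSQL's 8k pages. A sketch, with hypothetical pool/device names:

    # ashift matches the drive's sector size at pool creation
    # (ashift=12, i.e. 4K, is typical for modern SSDs)
    zpool create -o ashift=12 tank mirror /dev/sda /dev/sdb

    # the 8k blocksize for pgdata is set per dataset, at any time
    zfs create tank/pgdata
    zfs set recordsize=8k tank/pgdata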
[GENERAL] Postgresql 9.4 and ZFS?
Does anybody here have any recommendations for using PostgreSQL 9.4 (latest) with ZFS?

We've been running both on ZFS/CentOS 6 with excellent results, and are considering putting the two together. In particular, the CoW nature (and subsequent fragmentation/thrashing) of ZFS becomes largely irrelevant on SSDs; the very act of wear leveling on an SSD is itself a form of intentional thrashing that doesn't affect performance, since SSDs have no meaningful seek time.

It would seem that PGCon 2013 even had a workshop on it!

https://www.pgcon.org/2013/schedule/events/612.en.html

The exact configuration we're contemplating is either (3x 400 GB RAIDZ1) or (4x 400 GB RAIDZ2) with Intel Enterprise SATA3 SSDs, with default (lz4) compression enabled.

If this is a particularly good or bad idea, I'd like to hear it, and why?

Thanks,

BenP
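For concreteness, a hedged sketch of the two layouts being weighed (pool and device names are hypothetical):

    # 3x 400 GB in RAIDZ1 (single-drive redundancy):
    zpool create tank raidz1 /dev/sda /dev/sdb /dev/sdc

    # ...or 4x 400 GB in RAIDZ2 (double-drive redundancy):
    # zpool create tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd

    # lz4 compression is enabled per dataset:
    zfs create -o compression=lz4 -o recordsize=8k tank/pgdata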
Re: [GENERAL] Recommendations for SSDs in production?
On Thursday, November 03, 2011 10:59:37 AM you wrote:
> There's a pretty varied mix of speed, durability, and price with any
> SSD based architecture, but the two that have proven best in our
> testing and production use (for ourselves and our clients) seem to be
> Intel (mostly 320 series iirc), and Fusion-IO. I'd start with looking
> at those.

This is *exactly* the type of feedback that I've been looking for - thanks! The Fusion-IO looks to be in the "if you have to ask about prices, you probably can't afford it" range, although getting a million IOPS is damned impressive.

I'm surprised a bit by the Intel 320 reference, since it's typically touted as a consumer device, but we've consequently decided to give the 710 a shot, since it's basically a 320 with some reliability upgrades, and see how it goes.

I will post my results here later, time permitting. Thanks again.
Re: [GENERAL] Recommendations for SSDs in production?
On Wednesday, November 02, 2011 01:01:47 PM Yeb Havinga wrote:
> Could you tell a bit more about the sudden death? Does the drive still
> respond to queries for smart attributes?

Just that. It's almost as if somebody physically yanked them out of the machine, after months of 24x7 perfect performance. A cold reboot seems to restore order for a while, but the drives die again similarly, fairly soon after a failure like this.

From what I can tell, SMART is not worth much with SSDs.

-Ben
Re: [GENERAL] Recommendations for SSDs in production?
On Wednesday, November 02, 2011 11:39:25 AM Thomas Strunz wrote:
> I have no idea what you do but just the fact that you bought ssds to
> improve performance means it's rather high load and hence important.

Important enough that we back everything up hourly. Because of this, we decided to give the SSDs a try.

> Using a consumer drive for that IMHO is not the best idea. I know a lot
> about ssds but just in consumer space. Intel has a good reputation in
> terms of reliability but they are not the fastest.

Which is what we're trying next: the X25-E. The 710s apparently have 1/5th the rated write endurance, without much of a speed increase, so they don't seem like such an exciting product.

> I guess go Intel
> route or some other crazy expensive enterprise stuff.

It's advice about some of that "crazy expensive enterprise" stuff that I'm seeking. I don't mind spending some money if I get to keep up this level of performance, but I'm also not looking to make somebody's private plane payment, either.
[GENERAL] Recommendations for SSDs in production?
Well,

After reading several glowing reviews of the new OCZ Vertex 3 SSD last spring, we did some performance testing in dev on RHEL6 (CentOS). The results were nothing short of staggering: complex query results returned in 1/10th the time as a pessimistic measurement, and system loads dropped from 2+ to 0.1 or less. Wow.

So after months of using this SSD without any issues at all, we tentatively rolled it out to production, and had blissful, sweet beauty until about 2 weeks ago; now we are running into sudden-death scenarios. We have an excellent backup system, so the damage is reduced to road bumps, but we are looking for a longer-term solution that doesn't compromise performance too much.

The config is super-basic; basically no tuning at all was done:

    # fdisk /dev/NNN
    # mke2fs -j $partn
    # mount $partn /var/lib/pgsql
    # rsync -vaz /var/lib/pgsql.old /var/lib/pgsql
    # service postgresql start

I don't mind spending some money. Can anybody comment on a recommended drive in real-world use? After some review I found:

1) Micron P300 SSD: claims excellent numbers; can't find them for sale anywhere.

2) Intel X25-E: good reputation, significantly slower than the Vertex 3. We're buying some to reduce downtime.

3) OCZ "Enterprise": reviews are mixed.

4) Kingston "Enterprise" drives appear to be identical to consumer drives with a different box.

5) STEC drives are astronomically expensive. (E.g.: "You're kidding, right?")

6) Corsair consumer drives are getting excellent reviews; Aberdeen Inc recommended them in use with RAID 1.

7) Seagate Pulsar drives: the XT.2 drives are expensive SLC but I can't find a vendor; the Pulsar .2 drives are more available, but I'm having trouble finding reviews other than rehashed press releases.

Thanks!
Re: [GENERAL] SSDs with Postgresql?
On Sunday, April 17, 2011 01:55:02 AM Henry C. wrote:
> On Thu, April 14, 2011 18:56, Benjamin Smith wrote:
> > After a glowing review at AnandTech (including DB benchmarks!) I decided
> > to spring for an OCZ Vertex 3 Pro 120 for evaluation purposes. It cost
> > about $300 with shipping, etc and at this point, won't be putting any
> >
> > Considering that I sprang for 96 GB of ECC RAM last spring for around
> > $5000, even if I put the OCZ drives in pairs w/RAID1, I'd still come out
> > well ahead if it allows me to put off buying more servers for a year or
> > two.
>
> Exactly. Be aware of the risks, plan for failure and reap the rewards.

Just curious what your thoughts are with respect to buying SSDs and mirroring them with software RAID 1. (I use Linux/CentOS.)

-Ben
Re: [GENERAL] pgsql 9.0.1 table corruption
On Friday, April 15, 2011 09:50:57 AM Tom Lane wrote:
> If you simply unpacked the tar archive and started a postmaster on that,
> you'd be pretty much guaranteed to get a corrupt database. The tar
> archive is not a valid snapshot by itself --- you have to replay
> whatever WAL was generated during the archiving sequence in order to get
> to a consistent database state.

I have, more than once, "moved" a PG instance from one machine to another with the following sequence, without apparent issue. Is there anything I'm missing and/or need to be concerned with?

    1) service postgresql stop
    2) rsync -vaz /var/lib/pgsql root@newserver:/var/lib/pgsql
    3) ssh root@newserver
    4) (edit postgresql.conf, set IP addresses, memory, etc.)
    5) service postgresql start

This is a "done at 10 PM on Friday night" kind of process.

-Ben
Re: [GENERAL] SSDs with Postgresql?
After a glowing review at AnandTech (including DB benchmarks!) I decided to spring for an OCZ Vertex 3 Pro 120 for evaluation purposes. It cost about $300 with shipping, etc and at this point, won't be putting any

Considering that I sprang for 96 GB of ECC RAM last spring for around $5000, even if I put the OCZ drives in pairs w/RAID1, I'd still come out well ahead if it allows me to put off buying more servers for a year or two.

-Ben

On Thursday, April 14, 2011 02:30:06 AM Leonardo Francalanci wrote:
> have a look at
>
> http://postgresql.1045698.n5.nabble.com/Intel-SSDs-that-may-not-suck-td4268261.html
>
> It looks like those are "safe" to use with a db, and aren't that expensive.
[GENERAL] SSDs with Postgresql?
The speed benefits of SSDs as benchmarked would seem incredible. Can anybody comment on SSD benefits and problems in real-life use?

I maintain some 100 databases on 3 servers, with 32 GB of RAM each and an extremely rich, complex schema (300+ normalized tables). I was wondering if anybody here could comment on the benefits of SSDs in similar high-demand, rich-schema situations?

-Ben
Re: [GENERAL] Web Hosting
Try this: http://lmgtfy.com/?q=web+hosting+postgresql

On Sunday, March 06, 2011 11:33:01 am Eduardo wrote:
> At 17:24 06/03/2011, you wrote:
> > On 3/5/2011 4:08 PM, matty jones wrote:
> > > I already have a domain name but I am looking for a hosting company
> > > that I can use PG with. The few I have contacted have said that
> > > they support MySQL only and won't give me access to install what I
> > > need, or they want way too much. I don't need a dedicated host, which
> > > so far seems the only way this will work; all the companies I have
> > > researched so far that offer shared hosting or virtual hosting only
> > > use MySQL. I will take care of the setup and everything myself, but
> > > I have already written my code using PG/PHP and I have no intention
> > > of switching.
> > >
> > > Thanks.
> >
> > http://hub.org/
>
> +1
Re: [GENERAL] pg_dump makes our system unusable - any way to pg_dump in the middle of the day? (postgres 8.4.4)
I'd also add: run pgtune against your server's config. It made a *dramatic* difference for us.

On Friday, February 25, 2011 05:26:56 am Vick Khera wrote:
> On Thu, Feb 24, 2011 at 6:38 PM, Aleksey Tsalolikhin wrote:
> > In practice, if I pg_dump our 100 GB database, our application, which
> > is half Web front end and half OLTP, at a certain point slows to a
> > crawl and the Web interface becomes unresponsive. I start getting
> > check_postgres complaints about the number of locks and query lengths.
> > I see locks around for over 5 minutes.
>
> I'd venture to say your system does not have enough memory and/or disk
> bandwidth, or your Pg is not tuned to make use of enough of your
> memory. The most likely thing is that you're saturating your disk
> I/O.
>
> Check the various system statistics from iostat and vmstat to see what
> your baseline load is, then compare that when pg_dump is running. Are
> you dumping over the network or to the local disk as well?
Re: [GENERAL] Massively Parallel transactioning?
On Wednesday, August 18, 2010 04:58:08 pm Joshua D. Drake wrote:
> Well if you are just using it for updates to the schema etc... you
> should only need to launch a single connection to each database to make
> those changes.

And that's exactly the problem. On each server, we have at least dozens of active databases - one for each client. Opening a connection for each database starts to become problematic.

This problem is now solved with prepared transactions.
Re: [GENERAL] Massively Parallel transactioning?
On Wednesday, August 18, 2010 08:40:21 pm Adrian von Bidder wrote:
> Heyho!
>
> On Thursday 19 August 2010 01.32:06 Benjamin Smith wrote:
> > This way we can be sure that either all the databases are in sync, or
> > that we need to roll back the program patch/update.
>
> I guess this might be more a hack than a solution: do the updates in
> batches and use 2pc: first connect to batches of databases, but instead of
> commit, you "prepare to commit". Prepared commits like this are
> persistent across connections, so you can come back later and commit or
> roll back.
>
> Note that such prepared commits will block (some) stuff and use resources
> (not sure how many) before they are finally committed or rolled back, so
> you'll want to make sure they don't stick around too long.

I can't see how this would be a hack; it's EXACTLY what I'm looking for! So often I find that when limits in Postgres get in my way, it's because I don't understand Postgres well enough.

Much kudos to all of the Postgres team!
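A minimal sketch of the two-phase pattern being described, run once per database (the transaction identifier is arbitrary, and max_prepared_transactions must be set above zero on each server):

    BEGIN;
    -- ... schema changes for this database ...
    ALTER TABLE testtable ADD COLUMN address varchar;
    PREPARE TRANSACTION 'schema_update_42';  -- persists across disconnects

    -- later, once every database has prepared successfully:
    COMMIT PREPARED 'schema_update_42';

    -- or, if any database failed to prepare:
    ROLLBACK PREPARED 'schema_update_42';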
Re: [GENERAL] Clustering, parallelised operating system, super-computing
On Thursday, May 13, 2010 11:51:08 pm Brian Modra wrote:
> Maybe the best way to solve this is not to do automatic distribution
> of the data, but rather to provide tools for implementing distributed
> references and joins.

Here's my vote! I'd *LOVE* it if I could do a simple cross-database join (without the ugliness of dblink); it would be just awesome. Two beers for cross-database foreign keys...

We already use dblink extensively, with a wrapper, for various reporting functions. Since the cross-database queries are in the minority, it does function as load balancing, even if the cross-joined queries aren't so balanced.
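For context, a hedged sketch of the dblink-style join being wished away here (connection string, table, and column names are hypothetical; dblink ships as a contrib module):

    SELECT l.id, l.name, r.total
    FROM local_accounts l
    JOIN dblink('dbname=reporting host=db2',
                'SELECT account_id, sum(amount) FROM ledger GROUP BY 1')
           AS r(account_id integer, total numeric)
        ON r.account_id = l.id;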
[GENERAL] Massively Parallel transactioning?
Is there a way to update a number of databases hosted on a single server without opening a separate psql connection to each database?

We have a cluster of servers hosting an application on Postgres. Right now, we have dozens of databases per server, enough that we're starting to have problems with our update process.

When we release updates, we have a migrate script within our update process that runs all the database schema updates for all our clients. The way that it works is to open a transaction on all the databases concurrently, run the commands in sequence on the databases within the transactions, and then commit them all (or roll back if there was a problem). This way we can be sure that either all the databases are in sync, or that we need to roll back the program patch/update.

So far, it's been a dream, but now, as we continue to grow, we're starting to reach connection limits per server. Short of raising the number of simultaneous connections, is there a way to run all the transactions for a single server, for all databases within it, on a single (or small) number of connections?

I've tried the following:

    # ATTEMPT 1
    $ psql -U postgres template1 -h server1
    template1=# begin transaction;
    BEGIN
    template1=# create table testtable (name varchar);
    CREATE TABLE
    template1=# \c somedatabase
    psql (8.4.4, server 8.4.0)
    SSL connection (cipher: DHE-RSA-AES256-SHA, bits: 256)
    You are now connected to database "somedatabase".
    somedatabase=# rollback;
    NOTICE:  there is no transaction in progress
    ROLLBACK
    somedatabase=# \c template1
    template1=# rollback;
    NOTICE:  there is no transaction in progress
    ROLLBACK

    # ATTEMPT 2
    $ psql -U postgres template1 -h server1
    template1=# alter table somedatabase.testtable add address varchar;
    ERROR:  cross-database references are not implemented: "somedatabase.public.students"

Is there a better way?
Re: [GENERAL] Code tables, conditional foreign keys?
"A deep unwavering belief is a sure sign that you're missing something." -- Unknown I had no intention of sparking an ideological discussion. I read Joe's article reference previously - a simple case for using a normalized database. I would pretty much agree with his assessment from beginning to end. However, he really doesn't address my scenario at all. I'm not trying to mash my database together into a single table, I'm trying to deal with the fact that we have hundreds of tables with nearly identical syntax, but further, deal with the concept of "code tables". See our product has to work in many venues, and each venue has their own set of code-table data that they'd like to support. Worse, they often use similar values for the different things, so the "natural key" is just not natural. Sometimes I've seen venues "re-using" the old code table value from previous years to mean new things in current/future years. Yes, this is a bad, bad, bad idea but it was still there and it's still my job to deal with it. Surrogate keys are used to make sure that 15 to mean "BS College Degree" in venue A aren't confused with 15 to mean "No High School Education" in another venue. They cover a similar value, EG: applicant's educational level. Some values don't translate at all, (EG: differing representations of vocational arts) so using our own code table set and then translating doesn't work consistently, either. So we have multiple, distinct sets of data to be used within a single field. Either that, or we create massive data tables with every possible different set of otherwise similar data, each of which has a foreign key to a table with a slightly different name, which is, far and away, even uglier. (EG: applicants.ca_edlevel, applicants.or_edlevel applicants.nv_edlevel, applicants.southca_edlevel...) educational level is one example, there are hundreds that we have to deal with! So back to the first question: is there a way to have a conditional foreign key? On Saturday 23 May 2009 17:22:36 Lew wrote: > Conrad Lender wrote: > > I didn't intend any disrespect to Joe Celko. I have read a number of his > > articles, which tend to be well written and informative. Last year, when > > I posted to comp.databases asking for advice on whether to refactor that > > table, he wrote "You will have to throw it all out and start over with a > > relational design", "Throw away the idiot who did the EAV. This is not a > > good design -- in fact, it is not a design at all", and "This is basic > > stuff!!" Then he copied the same EAV example that was linked earlier by > > Rodrigo, claiming that "someone like me" had suggested it. With all the > > respect I have for Mr. Celko, that was hardly helpful, as that example > > and the situation I had described were quite different. It also did not > > encourage me to follow his advice and start from scratch (and fire my > > boss, who was the mentioned "idiot"). > > If we fired every boss who actually is an idiot there would be about half > the number of bosses. > > All kidding aside, why is the boss specifying a database architecture? > That is not the boss's job. > > > I understand the problems that can arise from bad design choices, and I > > know that Celko is vehemently opposed to anything that resembles EAV, > > For good reasons. > > > but I felt that in our case "throwing it all away" would be excessive. > > Perhaps not. I had a situation some years ago where a supervisor would not > let me normalize a database and consequently the project nearly failed. 
> Fortunately, the company assigned a new team lead/project manager who did > the normalization or it would have been a disaster. Trying to make a bad > approach work is often, if not always, more expensive than replacing it > with a good approach. > > > We had safeguards to ensure referential integrity, and keeping the > > values in the same table allowed us to let users manage them all with > > the same form. So I guess it's like Stefan Keller said in a different > > thread today: "Know when to break the rules." > > Managing all the values in the same form is not intrinsically connected to > whether one stores the values in an EAV layout. > > Telling oneself that one should know when to break the rules is not the > same as knowing when to break the rules. They are the rules for good > reason. > > All I'm saying is that EAV is a very problematic approach. I've been on > projects that tried to use it, and while that didn't make me an expert on > the matter by any means, it gave me some cause to trust Mr. Celko's opinion > on the matter. > > -- > Lew > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clea
Re: [GENERAL] Aggregate Function to return most common value for a column
I've used this same concept in subqueries for a very long time. Doing this allows me to "dive in" and get other values from the joined table, rather than just the thing that we're getting the most of.
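For illustration, a hedged sketch of the subquery form of "most common value" (table and column names are hypothetical):

    -- most common order status per customer, alongside other customer columns
    SELECT c.id,
           c.name,
           (SELECT o.status
            FROM orders o
            WHERE o.customer_id = c.id
            GROUP BY o.status
            ORDER BY count(*) DESC
            LIMIT 1) AS most_common_status
    FROM customers c;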
[GENERAL] Code tables, conditional foreign keys?
I have some questions about the best way to use foreign keys in complex schemas. It's becoming cumbersome to manage a large set of foreign keys - is there a better way?

// FOUNDATIONAL //

Let's say that you want to keep addresses, and one of the values that you need to keep is the state. So you have two tables defined:

    create table states (
        state varchar unique
    );
    create table customers (
        ...
        state varchar not null references states(state),
        ...
    );

If you want to be a bit more "pure", you might do it like this:

    create table states (
        id serial primary key,
        state varchar(2),
        description varchar
    );
    create table customers (
        ...
        states_id integer not null references states(id),
        ...
    );

So far, so good. But when you have a large number of fields with foreign key references, you end up with a bazillion reference tables, all with very similar layouts. E.g.:

    create table customer_types (
        id serial primary key,
        title varchar(4),
        description varchar
    );
    create table customer_taxcode (
        id serial primary key,
        title varchar(4),
        description varchar
    );
    ...
    create table customers (
        ...
        customer_types_id integer not null references customer_types(id),
        customer_taxcode_id integer not null references customer_taxcode(id),
        ...
    );

Getting the appropriate codes from all these different tables becomes cumbersome, just because there are SO MANY tables to get these values from. So the next idea is to create a master set of code tables and foreign key to there, but this has its own set of problems. E.g.:

    create table codetables (
        id serial primary key,
        tablename varchar unique not null
    );
    create table codevalues (
        id serial primary key,
        codetables_id integer not null references codetables(id),
        value varchar not null,
        unique (codetables_id, value)
    );
    create table customers (
        customer_types_id integer not null references codevalues(id),
        customer_taxcode_id integer references codevalues(id)
    );

How do you know that customer_taxcode_id references the correct set of code values? You could use a dual foreign key, but then you have to have a field for each and every code table you reference. E.g.:

    insert into codetables (id, tablename) values (33, 'customertypes');
    insert into codevalues (codetables_id, value) values (33, 'Gubbmint');

    create table customers (
        customer_types_id integer not null,
        customer_taxcode_id integer,
        custtypes integer not null default 33, -- the codetables.id for customer types
        taxcodes integer not null default 34,  -- the codetables.id for taxcodes
        foreign key (custtypes, customer_types_id)
            references codevalues (codetables_id, id),
        foreign key (taxcodes, customer_taxcode_id)
            references codevalues (codetables_id, id)
    );

This also becomes cumbersome. Is there a better way? Is there some way to do this not covered in the docs? What would be ideal is to treat the reference something like a join - it might be something like:

    create table customers (
        ...
        customer_taxcode_id integer not null references codevalues(id)
            ON codevalues.codetables_id = 33, -- hypothetical syntax
        ...
    );

Thanks!

-Ben
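There's no conditional-FK syntax like that last sketch, but the dual-key idea above can be made to work as written if the referenced pair is declared unique and the constant column is pinned with a CHECK. A hedged sketch, reusing the names above:

    create table codevalues (
        id serial primary key,
        codetables_id integer not null references codetables(id),
        value varchar not null,
        unique (codetables_id, value),
        unique (codetables_id, id)   -- composite target for the FKs below
    );

    create table customers (
        customer_taxcode_id integer not null,
        taxcode_table integer not null default 34 check (taxcode_table = 34),
        foreign key (taxcode_table, customer_taxcode_id)
            references codevalues (codetables_id, id)
    );

The CHECK makes the "conditional" part declarative: customer_taxcode_id can only ever point at codevalues rows belonging to code table 34.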
Re: [GENERAL] Is this good spec for a PostgreSQL server?
On Wednesday 19 September 2007, Bjørn T Johansen wrote:
> It's a Dell server with the following spec:
>
> PE2950 Quad-Core Xeon E5335 2.0GHz, dual
> 4GB 667MHz memory
> 3 x 73GB SAS 15000 rpm disk
> PERC 5/I Integrated controller card (8 ports, 256MB cache, battery backup) x 6 backplane

Asking "is this a good database server?" is a meaningless question without more information. I have an ancient 500 MHz Pentium III that runs a lightweight Postgres database excellently, but I wouldn't recommend it for enterprise duty!

I've admin'd a few Dell servers, and consistently ran into minor driver niggles. They often pick hardware that isn't supported in the source kernel tree, though to their credit, they DO usually provide appropriate drivers. In one case, it was an ethernet driver that was unsupported by my distro (RedHat/CentOS). There were sources available that I could recompile, and I did, and it worked fine, but it was sure a pain to have to recompile it every time a new kernel came out, and there was no way to test whether or not the recompile "took" until the reboot - and a reboot is the WORST way to test an ethernet driver when you are administering remotely.

Personally, I prefer generic, white-box solutions, like a Tyan reference system, or maybe a SuperMicro. They tend to be conservative in their hardware choices, they're quite reliable, very solid performers, and for the price of one "on brand" server, you can get two white-box systems and have a hot failover on site.

I have 4x quad-core Opteron 1U rackmounts that I've been blissfully happy with: 2x 300 GB 10k SCSI (software RAID 1), 4 GB of RAM, dual Gb NICs. I can pull any one of the RAID 1 drives out of any machine, plug it into any other machine, and have a working, booted system in < 5 minutes. No driver headaches, no hassle, with excellent reliability under load. (Knocks on wood.)

Each person picks their favorite blend of poison, I guess.

-Ben
Re: [GENERAL] Stuck on Foreign Keys
On Wednesday 19 September 2007, Chester wrote:
> Hi
>
> I have a question regarding foreign keys, I just cannot get it to create
> them for me. I must be doing something wrong but I have no idea what
> that might be :)
>
> I have a table "clients"
>
> clientID (primary)
> ticode
> Firstname
> SecondName
>
> I have a second table "titles"
>
> ticode (primary)
> Title
>
> I am trying to create a foreign key on ticode in the "clients" table, as below:
>
> ALTER TABLE clients ADD CONSTRAINT the_title FOREIGN KEY (ticode)
> REFERENCES titles (ticode);
>
> I keep getting this error
>
> ERROR: insert or update on table "clients" violates foreign key
> constraint "the_title"
> DETAIL: Key (ticode)=( ) is not present in table "titles".

This foreign key constraint requires that every instance of clients.ticode have a corresponding (unique) titles.ticode. But you don't have that - there are records in clients with a ticode value that's not found in titles.ticode.

Cheers!

-Ben
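A hedged helper for locating the offending rows before (re)adding the constraint; here the DETAIL line suggests ticode contains blank values:

    SELECT c.clientid, c.ticode
    FROM clients c
    LEFT JOIN titles t ON t.ticode = c.ticode
    WHERE t.ticode IS NULL
      AND c.ticode IS NOT NULL;  -- NULL ticodes don't violate the FK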
[GENERAL] Postgresql and SSL
I'm using the 8.1 RPMs for CentOS and so far, it's been great. Now, I'm going to enable SSL. I had no trouble with the instructions in the documentation for server-only certificates, and verified that psql (Linux) acknowledges the SSL connection. But I am stumped as to how to create a client certificate that's enforced!

I tried the instructions found at

http://marc.info/?l=tomcat-user&m=106293430225790&w=2

and used the "ca.pem" created there as the postgres root.crt, and although the PG daemon no longer complains that it couldn't find root.crt, it also doesn't require a client certificate to be installed in order to access with psql.

Any pointers for somebody who is NOT an SSL guru? (Like myself!)

-Ben
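A hedged sketch of the client-certificate side (file names follow the linked instructions; the CA key/cert are assumed to be ca.key/ca.pem, the same CA the server uses as its root.crt):

    # client key and signing request
    openssl genrsa -out postgresql.key 2048
    openssl req -new -key postgresql.key -out postgresql.csr

    # sign the request with the CA the server trusts
    openssl x509 -req -in postgresql.csr -CA ca.pem -CAkey ca.key \
            -CAcreateserial -out postgresql.crt -days 365

    # libpq looks for the client certificate and key here:
    #   ~/.postgresql/postgresql.crt
    #   ~/.postgresql/postgresql.key   (must not be world-readable)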
Re: [GENERAL] How to enforce uniqueness when NULL values are present?
On Saturday 10 March 2007, Christian Schröder wrote:
> Let's assume that the values in this table are some limits that are
> given for different data (identified by the id). Some of the limits are
> only valid after a given date, whereas other limits are valid all the
> time. How would you put this information into one or more tables? Of
> course, I could use a special date to indicate that a limit is valid all
> the time (e.g. 1970-01-01), but I don't think that this is better design
> than representing this with a NULL value. Or I could split the data into
> two different tables, one with the date column and one without. But then
> I had to work with two tables with more or less the same meaning.
> Wouldn't it be quite strange to model the same entities (the limits)
> with two tables?
- SNIP -
> 1. If a record with a given id and a null value in the date field
>    exists, no other record with the same id is allowed.
> 2. If multiple records with the same id exist, they must have
>    different values in the date field and none of them must have a
>    null value in this field.

Seems to me that this is what you are looking for:

    TABLE listofids:
      Column  |  Type   | Modifiers
    ----------+---------+--------------
     id       | integer | not null
     hasdates | bool    | default null
    unique(id, hasdates)

    TABLE listofidsdates:
        Column    |  Type   | Modifiers
    --------------+---------+------------------------------------
     listofids_id | integer | not null REFERENCES listofids(id)
     date         | date    | not null
    unique(listofids_id, date)

When there are dates, set listofids.hasdates = null. Otherwise, set it to true. Does this seem most properly normalized? (It's how I would do it!) How could this be done better?

-Ben
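A hedged single-table alternative (table name assumed): a regular unique constraint covers the dated rows, and a partial unique index covers the "valid all the time" rows, since a plain UNIQUE treats NULLs as distinct:

    CREATE TABLE limits (
        id   integer NOT NULL,
        date date,              -- NULL means "valid all the time"
        UNIQUE (id, date)       -- rule 2: distinct dates per id
    );
    -- at most one NULL-date row per id:
    CREATE UNIQUE INDEX limits_one_nodate ON limits (id) WHERE date IS NULL;

Note this still permits a NULL-date row to coexist with dated rows for the same id, so fully enforcing rule 1 would need a trigger or the two-table design above.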
Re: [GENERAL] server vendor recommendation
I've purchased a number of systems (a dozen or so) from AVADirect:

http://www.avadirect.com

Their prices are excellent, their hardware is solid quality, and their service is median. This is a discount shop, so don't expect lightning support. But you can buy three fast AVA systems of top-notch quality for the price of a single HP or Sun server with "enterprise support". Those sound like pretty good numbers if you are technically inclined.

I especially like their Tyan-based 1U Opterons with 10k SCSI drives - almost all of the systems I've purchased from them have been in this configuration, and all of these have worked very well for me. I have two right now being torture-tested in the other room: quad-core Opterons, 4 GB ECC RAM, dual 10k SCSI - great DB servers. Sweet!

I like to buy two identical drives with the storage I need, and use Linux kernel software RAID 1. This gives me good performance, redundancy, and compatibility that doesn't tie me to a specific chipset or RAID card. There is room for 4 bays on the front, so I can add two more drives later if needed.

My O/S of choice is CentOS 4: http://www.centos.org

The only problem I've had with them is that I bought a couple of SuperMicro 1U systems in a similar configuration and had issues, but so far, they've been resolved equitably.

-Ben

PS: No, I don't work for them in any way.

On Wednesday 07 March 2007, snacktime wrote:
> Any recommendations for vendors that can build custom servers?
> Specifically opteron based with scsi raid.
>
> Chris
Re: [GENERAL] Converting 7.x to 8.x
On Tuesday 23 January 2007 13:55, Carlos wrote:
> What would be the faster way to convert a 7.4.x database into an 8.x
> database? A dump of the database takes over 20 hours so we want to convert
> the database without having to do a dump and restore.

You've probably already accounted for this, but make sure you've tried your options for loading the database. Using the long ("INSERT") form vs. COPY can make a *huge* performance difference. (Hours vs. seconds, in some cases!)

-Ben
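For illustration, the difference is just a pg_dump flag (database name hypothetical):

    pg_dump mydb > dump_copy.sql              # default: COPY blocks, fast to restore
    pg_dump --inserts mydb > dump_insert.sql  # one INSERT per row, much slower to restore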
Re: [GENERAL] sequence increment jumps?
On Thursday 25 January 2007 09:53, Douglas McNaught wrote:
> Nature of the beast. Sequence increments aren't rolled back on
> transaction abort (for performance and concurrency reasons), so you
> should expect gaps.

Behavior long ago noted and accounted for. But I've always wondered why this was so? Is there a specific reason for this behavior?

-Ben
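A quick demonstration of the behavior in question: nextval() is never undone, because rolling it back would force every transaction touching the sequence to wait on every other one.

    CREATE TABLE t (id serial, v text);

    BEGIN;
    INSERT INTO t (v) VALUES ('a');  -- consumes sequence value 1
    ROLLBACK;                        -- the row is gone; the sequence advance is not

    INSERT INTO t (v) VALUES ('b');  -- gets id 2: a permanent gap at 1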
Re: [GENERAL] Postgresql.conf
Andreas,

Would you mind explaining what you mean by "localized object names" and why they might be bad? Or where I might go to learn more?

Thanks,

-Ben

On Tuesday 23 January 2007 07:38, Tino Wildenhain wrote:
> A. Kretschmer wrote:
> > On Tue, 23.01.2007, at 10:12:13 -0500, Brandon Aiken wrote:
> >> Out of curiosity, has the COUNT(*) with no WHERE clause slowness been
> >> fixed in 8.x? Or is it still an issue of "there's no solution that
> >> won't harm aggregates with WHERE clauses"?
> >
> > I will try it:
> >
> > scholl=# \timing
> > Timing is on.
> > scholl=# select count(1) from bde_meldungen ;
> >                               ^^^^^^^^^^^^^
> localized object names ;)))
>
> Tino
Re: [GENERAL] Partitioning Vs. Split Databases - performance?
On Thursday 21 December 2006 14:41, Joshua D. Drake wrote: >You should read up on schemas and how they work. Plus the > addition of schemas and table spaces means you can have infinite scaling > within the confines of your hardware itself.

Ok, so I'd like you to correct me if I'm wrong:

1) Schemas operate within a database. A schema is analogized as a filesystem directory in the docs, except that you can't recurse schemas.

2) A database runs on one machine, with the following addenda: A) Slony lets you copy that database to another system, B) pgpool lets you duplicate the database if you're real careful around updating with unique IDs and aggregate functions, C) you can essentially do cross-machine RAID so that if your primary DB gets hosed, you can fire up the backup machine and continue working. D) PGCluster, a synchronous clustering solution, appears to be new/unstable, doesn't appear to be current to 8.x, and takes a performance hit on writes.

3) Thus, any service that splits up a database (EG: a schema) is subject to all the limitations outlined in #2.

Did I miss anything? -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
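One wrinkle on point 1: schemas plus search_path mean existing queries don't have to be rewritten with schema prefixes. A minimal sketch, with hypothetical names:

  CREATE SCHEMA cust1;
  CREATE TABLE cust1.dates (d date);
  SET search_path TO cust1, public;
  SELECT * FROM dates;  -- resolves to cust1.dates for this session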
Re: [GENERAL] Partitioning Vs. Split Databases - performance?
On Thursday 21 December 2006 11:47, Ron Johnson wrote: > This gives you linear growth potential, since if your current box > gets over-utilized, buy a 2nd box and move some of the databases to it. So far, I'm inclined to go this way, due to the option for linear scaling. > >> 2) Copy out the data specific to a customer and load into separate tables > >> (with slightly different names, EG table "dates" becomes "cust1_dates") > >> and > >> use data partitioning to help with performance as needed. > > Definitely *not* scalable. And *very* messy. Yech. Scales better than present, methinks, but still not the best idea. I'd have to revisit all my queries to make sure that they use the correct tablename. > > 3) Put each customer in their own schema/namespace which resides within > > its own table space. > > > > Then you can move customers wherever you need in terms of IO. How is that functionally different than using a separate database? What's the advantage here? I don't *need* to restrict myself to one database, and doing this does require that I revisit 100% of the SQL queries to make sure that I'm referencing the right schema. This solution seems to have the same problems as using dynamic tablenames. > Splitting like mentioned in these three tactics means that you've > now got 2x as many tables. Add more customers and you've got that > many more tables. Perfect candidate for "schema drift". > > If each table has cust_id in it, then you could: > > 4) retain 1 database and partition each table on cust_id. > > http://www.postgresql.org/docs/8.1/interactive/ddl-partitioning.html Intriguing idea, and one that I might do in the future. However, I have 170 normalized tables defined at present. The task of auditing each of these tables for the constraint ranges sounds somewhat nightmarish. Not all the tables have the cust_id record defined - some are implicit. > >> Given the same physical hardware, which one is likely to perform better? Does > >> it make any difference? Does using separate databases use more RAM than a > >> single database with a bunch of different tables? > > Config files are global, so I doubt it. > > >> Company is growing rapidly, so growth room is important... > > Then go for Option 1. My conclusion, too. Another poster mentioned schema drift, and that's a real concern, but we're already updating the schema through a script which could be extended to update all databases, not just one, so I feel this problem would be minor to non-existent. Thanks! -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
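For the record, option 4 in the 8.1 style the linked docs describe is inheritance plus CHECK constraints - the auditing work mentioned above is exactly writing one CHECK per partition. A sketch, with a hypothetical table:

  CREATE TABLE orders (cust_id integer NOT NULL, total numeric);
  CREATE TABLE orders_cust1 (CHECK (cust_id = 1)) INHERITS (orders);
  CREATE TABLE orders_cust2 (CHECK (cust_id = 2)) INHERITS (orders);
  SET constraint_exclusion = on;
  SELECT * FROM orders WHERE cust_id = 1;  -- scans only orders_cust1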
[GENERAL] Partitioning Vs. Split Databases - performance?
I'm breaking up a database into several sets of data with similar layout. (we currently have multiple customers using a single database and tableset, we're splitting it out to give us more "wiggle room") It seems that there are basically two ways to proceed: 1) Copy out the data specific to a customer and load into a separate database for that customer, or 2) Copy out the data specific to a customer and load into separate tables (with slightly different names, EG table "dates" becomes "cust1_dates") and use data partitioning to help with performance as needed. Given the same physical hardware, which one is likely to perform better? Does it make any difference? Does using separate databases use more RAM than a single database with a bunch of different tables? Company is growing rapidly, so growth room is important... -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] Performance of outer joins?
I have a situation that can be summarized to the following:

-- day in 20061215 format
Create table calendar ( day integer unique not null );

Create table customers (
  id serial unique not null,
  name varchar,
  address varchar
);

Create table deliveries (
  customers_id integer not null references customers(id),
  calendar_day integer not null references calendar(day),
  delivered bool not null default false,
  unique(customers_id, calendar_day)
);

Imagine tens of thousands of customers, a few million deliveries. A query that's structurally similar to the following query is rather slow. It's taking over 30 seconds, climbing fast on reasonable HW. (SMP Opteron, 10k SCSI, 4 GB RAM) If I remove the outer join, performance is < 1 second.

SELECT
  customers.id as customers_id,
  customers.name AS customers_name,
  calendar.day AS calendar_day,
  CASE
    WHEN (deliveries.delivered IS NULL) THEN 'n/a'
    WHEN (deliveries.delivered=TRUE) THEN 'yes'
    ELSE 'no'
  END AS delivered
FROM customers
JOIN calendar ON (
  -- GIVE A CALENDAR OF POSSIBLE DAYS FOR DELIVERIES
  calendar.day < 20061201 AND calendar.day >= 20060101
)
LEFT OUTER JOIN deliveries ON (
  customers.id=deliveries.customers_id
  AND deliveries.calendar_day=calendar.day
);

What can I do to improve the performance of this oft-used query? Is there a better way to do this, or am I doomed to looping thru results and parsing the results in code? Thanks, -Ben -- "I kept looking around for somebody to solve the problem. Then I realized I am somebody" -Anonymous ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Postgres Team: Thank You All
On Wednesday 20 September 2006 18:59, Brian Maguire wrote: > I just wanted to let you know how impressed and pleased I have been with > postgres over the past 5 years. The timeliness and quality of the releases > are always robust and stable. Every release has a very nice mix of admin, > performance, platform, and feature adds. The support of the listservs is, > bar none, better than any commercial support I have ever purchased. I can't remember a > time I did not get 3-5 answers to any question I have ever had. I must also > note that the new website design and organization have added very nice > polish to the project.

Just found this while searching my email list. I'm CTO of a small (but growing!) software company that uses PG exclusively, via the RPMs that come with CentOS. Under loads heavy and light, with virtually no administration overhead, PG chugs along and "just works". *round of applause* -Ben

> Cheers and thank you all, > > Brian > > > > > > > ---(end of broadcast)--- > TIP 1: if posting/reading through Usenet, please send an appropriate >subscribe-nomail command to [EMAIL PROTECTED] so that your >message can get through to the mailing list cleanly > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > > -- "I kept looking around for somebody to solve the problem. Then I realized I am somebody" -Anonymous ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] off topic - web shop
On Monday 11 September 2006 11:30, stig erikson wrote: > Hi. > We are looking to open a small web shop. I looked around to see if there are any open source web shops. > Can anyone recommend any web shop system (free or non-free)? I'd guess you're looking for OSCommerce. (Sucks, but less so than most other free alternatives) Good luck! -Ben > ---(end of broadcast)--- > TIP 4: Have you searched our list archives? > >http://archives.postgresql.org > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- "I kept looking around for somebody to solve the problem. Then I realized I am somebody" -Anonymous ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] Restricting access to rows?
We have a growing ASP-hosted application built on PHP/Postgres 8.1, and are getting requests from clients to manipulate the databases more directly. However, the structure of our databases prevents this from happening readily. Assume I have two tables configured thusly: create table customers ( id serial unique not null, name varchar not null ); create table widgets ( customers_id integer not null references customers(id), name varchar not null, value real not null default 0 ); insert into customers (name) values ('Bob'); insert into customers (name) values ('Jane'); insert into widgets (customers_id, name, value) VALUES (1, 'Foo', 100); insert into widgets (customers_id, name, value) VALUES (1, 'Bar', 50); insert into widgets (customers_id, name, value) VALUES (2, 'Bleeb', 500); This leaves us with two customers, Bob who has two widgets worth $150, and Jane with one widget worth $500. How can I set up a user so that Bob can update his records, without letting Bob update Jane's records? Is it possible, say with a view or some other intermediate data type? Thanks, -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
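This question predates row-level security, but on modern PostgreSQL (9.5 and later) the per-customer restriction can be declared in the database itself. A hedged sketch against the tables above, assuming a hypothetical accounts table mapping database roles to customers and an existing role named bob:

  CREATE TABLE accounts (customers_id integer references customers(id), dbuser name);
  ALTER TABLE widgets ENABLE ROW LEVEL SECURITY;
  CREATE POLICY own_rows ON widgets
    USING (customers_id = (SELECT customers_id FROM accounts WHERE dbuser = current_user));
  GRANT SELECT, UPDATE ON widgets TO bob;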
[GENERAL] Checking for Foreign Keys constraining a record?
I have a customer table (very important) and have numerous fields in other tables FK to the serial id of the customer table. There's an option to delete a customer record, but it has to fail if any records are linked to it (eg: invoices) in order to prevent the books from getting scrambled. I want to be able to determine in advance whether or not a record is "deletable" before displaying the button to delete the record. If it's not deletable, it should say so before the user hits the button.

But, the only way that I've been able to find out if the customer record is deletable is to begin a transaction, try to delete it, check to see if it worked, and then roll back the transaction. This causes my error logger to log errors every time somebody looks at a customer record, and (I'm sure) is not very efficient. Is there a way to ask the database: "Are there any FK constraints that would prevent this record from being deleted?" Thanks, -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 6: explain analyze is your friend
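The list of constraints that could block the delete is available from the system catalogs, which at least avoids guessing table by table. A sketch (works on 8.x and later):

  -- every foreign key pointing at customers
  SELECT conrelid::regclass AS referencing_table, conname
  FROM pg_constraint
  WHERE contype = 'f'
    AND confrelid = 'customers'::regclass;

The application can then issue one EXISTS probe per referencing table for the id in question, instead of a delete-and-rollback.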
[GENERAL] Select first ten of each category?
I'm stumped on this one... I have a table defined thusly: create table items ( id serial, category integer not null references category(id), name varchar not null, price real, unique(category, name)); It has a LARGE number of entries. I'd like to grab the 10 most expensive items from each category in a single query. How can this be done? Something like Select items.* FROM items where id IN ( select firstTen(id) FROM items group by category ORDER BY price DESC ) ORDER BY price desc; But I've not found any incantation to make this idea work... -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
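Window functions (PostgreSQL 8.4 and later, so not available when this was asked) make this a one-liner. A sketch against the items table above:

  SELECT id, category, name, price
  FROM (SELECT i.*, row_number() OVER (PARTITION BY category ORDER BY price DESC) AS rn
        FROM items i) ranked
  WHERE rn <= 10
  ORDER BY category, price DESC;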
Re: [GENERAL] Advantages of PostgreSQL over MySQL 5.0
On Wednesday 22 March 2006 03:06, Jimbo1 wrote: > Hello there, > > I'm a freelance Oracle Developer by trade (can almost hear the boos now > ;o)), and am looking into developing my own Snowboarding-related > website over the next few years. Anyway, I'm making some decisions now > about the site architecture, and the database I'm going to need is > obviously included. If my site works out, I'm expecting reasonably > heavy traffic, so want a database that I'm confident can cope with it.

I've built many sites based on PostgreSQL. Originally, like most, I started with MySQL, but after I discovered PG in about 2000, I switched all development to it, and have never looked back. I have "enterprise" systems developed with PostgreSQL with 500 users, 50 online at a time, > 100 database tables. Although the data sample is still not that impressive (71 MB sql file with pg_dump), the database itself is quite complex, with multiple foreign keys in a single table being the norm. It's just been a dream. It's solid, reliable, and virtually always behaves as expected.

My only caveat is that occasionally, you really have to watch the use of indexes. I had one query (nasty, with 7-8 tables involved in a combined inner->outer->inner join) that was taking some 20 seconds to execute. Just changing the order of some of the tables in the query, without logically changing the result at all, dropped that time down to < 50 ms!

> Regarding MySQL, I've been put off by Oracle's recent purchase of > InnoDB and realise this could badly impact the latest version of the > MySQL database. I can almost hear Larry Ellison's laughter from here > (allegedly)! I've also been put off by the heavy marketing propaganda > on the MySQL website.

Perhaps the single thing I most like about PostgreSQL is the feeling that "it can't be taken away from me". The license is sufficiently open, and the product is sufficiently stable, that I don't ever wonder if I'm "compliant" or "paid up", nor do I wonder if my growth will be particularly limited anywhere in the foreseeable future.

> "With MySQL, customers across all industries are finding they can > easily handle nearly every type of database workload, with performance > and scalability outpacing every other open source rival. As Los Alamos > lab (who uses MySQL to manage their terabyte data warehouse) said, "We > chose MySQL over PostgreSQL primarily because it scales better and has > embedded replication.".".

PostgreSQL has replication, as well. From what I've read, it's probably about on par with MySQL in terms of manageability and reliability. But, truthfully, having dealt with database replication, it's a PAIN IN THE ARSE and very unlikely to be worth it. In fact, systems that I've worked on that included replication are generally less reliable than those that simply do a dump/copy every hour or two, due to the increased management headaches and niggling problems that invariably seem to occur. Consider replication if the cost of a full-time DB Admin is justified by saving perhaps a few hours of uptime per year. If so, go for it. Be honest about it - most people grossly overestimate the actual cost of a few hours of downtime every other year.

> If any PostgreSQL devotees on this group can comment on the above and > its accuracy/inaccuracy, I'd really appreciate it.

PG does constraints wonderfully. Its performance is midline with simple schemas. It handles very complex schemas wonderfully, and, with a little tuning, can make very effective use of memory to speed performance. My $0.02. Cheers!
-Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[GENERAL] Updating a sequential range of unique values?
How can I update a range of constrained values in order, without having to resubmit a query for every single possibility? I'm trying to create a customer-specific sequence number, so that, for each customer, the number starts at one, and continues, 1, 2, 3, 4, 5... etc. with no values skipped. (This is necessary, as the record is used to sort values, and the order can be changed by the customer) Here's sample code that demonstrates my question:

create table snark (custid integer not null, custseq integer not null, unique(custid, custseq));
insert into snark (custid, custseq) VALUES (1, 2);
insert into snark (custid, custseq) VALUES (1, 4);
insert into snark (custid, custseq) VALUES (1, 3);
insert into snark (custid, custseq) VALUES (1, 1);
begin transaction;
DELETE FROM snark WHERE custid=1 AND custseq=2;
UPDATE snark SET custseq=custseq-1 WHERE custid=1 and custseq>2;

This generates an error! ERROR: duplicate key violates unique constraint "snark_custid_key"

I've tried putting an "order by" clause on the query:

UPDATE snark SET custseq=custseq-1 WHERE custid=1 and custseq>2 ORDER BY custseq ASC;

But that got me nowhere. Also, I can't defer the enforcement of the constraint, as, according to the manual, this only works for foreign keys. Any ideas where to go from here? -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
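One classic workaround is to renumber in two passes through values that cannot collide with the existing ones, e.g. their negatives:

  BEGIN;
  DELETE FROM snark WHERE custid = 1 AND custseq = 2;
  UPDATE snark SET custseq = -(custseq - 1) WHERE custid = 1 AND custseq > 2;  -- park at negatives
  UPDATE snark SET custseq = -custseq WHERE custid = 1 AND custseq < 0;        -- flip back
  COMMIT;

Each row only ever holds a value no other row has, so the unique constraint is satisfied at every step.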
Re: [GENERAL] Postgresql Segfault in 8.1
// FIXED // Tom, thank you so much for your help! Now running 8.1.2, the query now works quickly and properly. -Ben On Wednesday 25 January 2006 13:17, Benjamin Smith wrote: > Version: postgresql-8.1.0-4.c4 > > I'll have to see about getting an update... > > Thanks a TON, > > -Ben > > On Wednesday 25 January 2006 13:11, you wrote: > > Benjamin Smith <[EMAIL PROTECTED]> writes: > > > Aha, yep. Sorry: > > > Program received signal SIGSEGV, Segmentation fault. > > > 0x0043c82c in heap_modifytuple () > > > (gdb) bt > > > #0 0x0043c82c in heap_modifytuple () > > > #1 0x0043c8f5 in slot_getattr () > > > #2 0x0047a50a in FormIndexDatum () > > > #3 0x004ebee3 in ExecInsertIndexTuples () > > > #4 0x004e5265 in ExecutorRun () > > > #5 0x00564312 in FreeQueryDesc () > > > #6 0x00565287 in PortalRun () > > > #7 0x00560f8b in pg_parse_query () > > > #8 0x00562e0e in PostgresMain () > > > #9 0x0053d316 in ClosePostmasterPorts () > > > #10 0x0053ea59 in PostmasterMain () > > > #11 0x005033c3 in main () > > > > Oh, so this is happening during index entry creation? (The reference to > > heap_modifytuple is misleading, but in a debug-symbol-free backend it's > > not so surprising.) > > > > This suddenly looks a whole lot like a known bug: > > http://archives.postgresql.org/pgsql-hackers/2005-11/msg01016.php > > > > Which version did you say you were using exactly? That bug is fixed > > in 8.1.1 ... > > > > regards, tom lane > > > > -- > > This message has been scanned for viruses and > > dangerous content by MailScanner, and is > > believed to be clean. > > > > -- > "The best way to predict the future is to invent it." > - XEROX PARC slogan, circa 1978 > > ---(end of broadcast)--- > TIP 5: don't forget to increase your free space map settings > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- "I kept looking around for somebody to solve the problem. Then I realized I am somebody" -Anonymous -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Postgresql Segfault in 8.1
Version: postgresql-8.1.0-4.c4 I'll have to see about getting an update... Thanks a TON, -Ben On Wednesday 25 January 2006 13:11, you wrote: > Benjamin Smith <[EMAIL PROTECTED]> writes: > > Aha, yep. Sorry: > > Program received signal SIGSEGV, Segmentation fault. > > 0x0043c82c in heap_modifytuple () > > (gdb) bt > > #0 0x0043c82c in heap_modifytuple () > > #1 0x0043c8f5 in slot_getattr () > > #2 0x0047a50a in FormIndexDatum () > > #3 0x004ebee3 in ExecInsertIndexTuples () > > #4 0x004e5265 in ExecutorRun () > > #5 0x00564312 in FreeQueryDesc () > > #6 0x00565287 in PortalRun () > > #7 0x00560f8b in pg_parse_query () > > #8 0x00562e0e in PostgresMain () > > #9 0x0053d316 in ClosePostmasterPorts () > > #10 0x0053ea59 in PostmasterMain () > > #11 0x005033c3 in main () > > Oh, so this is happening during index entry creation? (The reference to > heap_modifytuple is misleading, but in a debug-symbol-free backend it's > not so surprising.) > > This suddenly looks a whole lot like a known bug: > http://archives.postgresql.org/pgsql-hackers/2005-11/msg01016.php > > Which version did you say you were using exactly? That bug is fixed > in 8.1.1 ... > > regards, tom lane > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Postgresql Segfault in 8.1
Tom, Since we host customer data, I have to get OK from the company attorney before I can give you a full "howto create". I've been unable to recreate it without a full database dump. I'm waiting for a call back on that. I also can't recreate it on IA32. I tried to replicate the issue on a uniproc P4/32, but it worked fine there, so it does seem to be something specific about the fact that it's either X86/64 or that it's dual proc. The production server has 4GB of ECC RAM. I can consistently create the problem by dumping and reloading the database to a different PG database, and running it there, so AFAICT I'm not bugging anybody when I run this query. In the meantime, I found the "debuginfo" rpm, and installed it without a hitch. Luckily, it seems to "take effect" without having to restart the PG daemon. (which is busy serving 10-20 people at any given moment...) Again, here's the output from gdb. This looks a bit more useful, I hope this helps! Program received signal SIGSEGV, Segmentation fault. slot_deform_tuple (slot=0xa669c8, natts=36) at heaptuple.c:1262 1262off = att_addlength(off, thisatt->attlen, tp + off); (gdb) bt #0 slot_deform_tuple (slot=0xa669c8, natts=36) at heaptuple.c:1262 #1 0x0043c8f5 in slot_getattr (slot=0xa669c8, attnum=36, isnull=0x7fbfffde87 "") at heaptuple.c:1367 #2 0x0047a50a in FormIndexDatum (indexInfo=0xa66b60, slot=0xa669c8, estate=0xa61190, values=0x7fbfffdf10, isnull=0x7fbfffdef0 "") at index.c:962 #3 0x004ebee3 in ExecInsertIndexTuples (slot=0xa669c8, tupleid=0xa6efc4, estate=0xa61190, is_vacuum=0 '\0') at execUtils.c:925 #4 0x004e5265 in ExecutorRun (queryDesc=Variable "queryDesc" is not available. ) at execMain.c:1437 #5 0x00564312 in ProcessQuery (parsetree=Variable "parsetree" is not available. ) at pquery.c:174 #6 0x00565287 in PortalRun (portal=0xa5ed70, count=9223372036854775807, dest=0xa596f8, altdest=0xa596f8, completionTag=0x7fbfffe380 "") at pquery.c:1076 #7 0x00560f8b in exec_simple_query ( query_string=0xa440e0 "INSERT INTO lcclasses (id, schoolyear, modified, entrydate, creator, status, name, location, city, maxclasssize, prerequisites, cost, costnote, coursecode, section, credits, whytake, materialsnote, te"...) at postgres.c:1014 #8 0x00562e0e in PostgresMain (argc=4, argv=0xa0cca0, username=0xa0cc60 "cworksdev") at postgres.c:3168 #9 0x0053d316 in ServerLoop () at postmaster.c:2852 #10 0x0053ea59 in PostmasterMain (argc=5, argv=0x9ea510) at postmaster.c:943 #11 0x005033c3 in main (argc=5, argv=0x9ea510) at main.c:256 (gdb) continue Continuing. Program terminated with signal SIGSEGV, Segmentation fault. The program no longer exists. 
## Postgresql.conf listen_addresses = '127.0.0.1' port = 5432 max_connections = 96 shared_buffers=25 temp_buffers = 1 max_prepared_transactions = 0 work_mem = 1024 # min 64, size in KB maintenance_work_mem = 16384# min 1024, size in KB max_stack_depth = 9240 redirect_stderr = on# Enable capturing of stderr into log log_directory = 'pg_log'# Directory where log files are written log_truncate_on_rotation = on # If on, any existing log file of the same log_rotation_age = 1440 # Automatic rotation of logfiles will log_rotation_size = 0 # Automatic rotation of logfiles will autovacuum = on autovacuum_naptime = 600 lc_messages = 'en_US.UTF-8' # locale for system error message lc_monetary = 'en_US.UTF-8' # locale for monetary formatting lc_numeric = 'en_US.UTF-8' # locale for number formatting lc_time = 'en_US.UTF-8' # locale for time formatting add_missing_from = on -Ben On Wednesday 25 January 2006 11:18, you wrote: > Benjamin Smith <[EMAIL PROTECTED]> writes: > > OK, here's the output: > > (gdb) continue > > Continuing. > > > Program received signal SIGSEGV, Segmentation fault. > > 0x0043c82c in heap_modifytuple () > > (gdb) > > > // not very hopeful, I'd think // > > You forgot the "bt" part ... although I'm not sure we'd learn a whole > lot more without debug symbols. > > > Can I get a RHES/CENTOS PG 8.1 RPM with the symbols enabled? > > Current Red Hat practice is to put the debug symbols into separate > "debuginfo" RPMs. Hopefully you can find the debuginfo RPM wherever > you got the postgres RPM from. > >
Re: [GENERAL] Postgresql Segfault in 8.1
OK, here's the output: (gdb) continue Continuing. Program received signal SIGSEGV, Segmentation fault. 0x0043c82c in heap_modifytuple () (gdb) // not very hopeful, I'd think // Can I get a RHES/CENTOS PG 8.1 RPM with the symbols enabled? Unfortunately, I don't think I can give out a dump of the DB (heavily constrained) because of private customer information... and the query works *FINE* with different datasets. There's something specific about THIS QUERY that's causing the failure. I'm going to try to get this to fail on another system that's not in production use, though it's a uniprocessor P4. -Ben On Wednesday 25 January 2006 07:52, you wrote: > Benjamin Smith <[EMAIL PROTECTED]> writes: > > What's the best way to do this? Take PG down (normally started as a service) > > and run directly in a single-user mode? > > No, just start a psql session in one window, then in another window > determine the PID of the backend process it's connected to, and attach > gdb to that process. Something like > > ps auxww | grep postgres: > ... eyeball determination of correct PID ... > gdb /path/to/postgres-executable PID > gdb> continue > > Now, in the psql window, do what's needed to provoke the crash. gdb > should trap at the instant of the segfault and give you another gdb> > prompt. Type "bt" to get the backtrace, then "q" to disconnect. > > regards, tom lane > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Postgresql Segfault in 8.1
Thanks, What's the best way to do this? Take PG down (normally started as a service) and run directly in single-user mode? I've never really worked with gdb... -Ben

On Tuesday 24 January 2006 17:27, you wrote: > > What information do you need to help figure this out? > > Reproduce it with gdb attached to the backend process and post the > backtrace... You may need to recompile PG with debugging symbols to > get the most info. -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] Postgresql Segfault in 8.1
I'm running PostgreSQL 8.1 on CentOS 4.2, dual proc Athlon 64 w/4 GB RAM. I'm trying to get a PHP app to work, but the failure happens when the command is copy/pasted into psql. Trying to run a large insert statement, and I get:

server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed.

# in /var/log/messages, I see
Jan 24 17:00:04 kepler kernel: postmaster[26185]: segfault at 2516d728 rip 0043c82c rsp 007fbfffddd0 error 4

The insert statement is long, but doesn't seem to violate anything strange - no weird characters, and all the fields have been properly escaped with pg_escape_string(). What information do you need to help figure this out? -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Putting restrictions on pg_dump?
Good ideas, all. But what about keeping things like check constraints, foreign keys, etc.? Hmmm... maybe, if I dumped the entire DB schema, with no data, and then looped thru the tables, creating a temp table (as you describe) with a funky name (such as TABLEaBcDeFgH_U) and then pg_dumping that, and then using a regex to rename the table in the output... (eg /TABLE\s+TABLEaBcDeFgH_U/TABLE customers/) Ugh. I was hoping there was a cleaner way... -Ben

On Wednesday 04 January 2006 23:35, you wrote: > On Wed, 4 Jan 2006 21:00:25 -0800, Benjamin Smith <[EMAIL PROTECTED]> wrote: > > Is there a way to put a limit on pg_dump, so that it doesn't dump ALL data, > > but that matching a particular query? > > > > Something like: > > > > pg_dump -da --attribute-inserts -t "customers" \ > > --matching-query="select * from customers where id=11"; > > > > I'd like to selectively dump information from a query, but using the output > > format from pg_dump so that it can be used to create a (partial) database. > > > > Can this sort of thing be done? > > Not directly with pg_dump. > > You could create a table (create table customers_1 as select * from > customers where id=11) and dump that but remember to change the > tablename in the dump file or after loading it. You dont get any > pk/fk/indexes on the table definition. > > You could also use copy to stdout/stdin. > > eg dump > psql -d dbname -c "create temp table dump as select * from customers > where id=11; copy dump to stdout;" >dumpfile > > eg restore > psql -d newdb -c "copy customers from stdin" > You might need to play around with supplying username/password. > > klint. > > +---+-+ > : Klint Gore: "Non rhyming: > : EMail : [EMAIL PROTECTED] : slang - the: > : Snail : A.B.R.I.: possibilities : > : Mail University of New England : are useless" : > : Armidale NSW 2351 Australia : L.J.J. : > : Fax : +61 2 6772 5376 : : > +---+-+ > -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
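The pipeline sketched above, end to end, with throwaway names (assumes the scratch table name appears nowhere else in the dump, so the rename is safe):

  pg_dump -s mydb > schema.sql
  psql -d mydb -c "CREATE TABLE scratch_cust AS SELECT * FROM customers WHERE id = 11"
  pg_dump -a --attribute-inserts -t scratch_cust mydb | sed 's/scratch_cust/customers/g' > partial.sql
  psql -d mydb -c "DROP TABLE scratch_cust"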
[GENERAL] Putting restrictions on pg_dump?
Is there a way to put a limit on pg_dump, so that it doesn't dump ALL data, but that matching a particular query? Something like: pg_dump -da --attribute-inserts -t "customers" \ --matching-query="select * from customers where id=11"; I'd like to selectively dump information from a query, but using the output format from pg_dump so that it can be used to create a (partial) database. Can this sort of thing be done? -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Anyone doing a 8.1.0-ia64-RHEL4-as.rpm ?
I'm using CentOS 4.2 on a dual-opteron, but I'd guess it'd probably work for IA64... Try putting this somewhere in your /etc/yum.repos.d/... #additional packages that extend functionality of existing packages [centosplus] name=CentOS-$releasever - Plus baseurl=http://mirror.centos.org/centos/$releasever/centosplus/$basearch/ gpgcheck=1 enabled=1 includepkgs=postgresql* compat-postgresql-libs* gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-centos4 You'll need to import the RPM-GPG-KEY first, and make sure you backup your postgres. Simply yum update from PG 7.x to 8.x will have some nasty surprises without a dump/reload. Then, use yum... -Ben On Wednesday 04 January 2006 03:04, DANTE ALEXANDRA wrote: > Hello, > > Is anyone working on an 8.1.0 RPM for IA64 on Red Hat Enterprise Linux 4 > AS ? > On the web site : > http://www.postgresql.org/ftp/binary/v8.1.1/linux/rpms/redhat/, I have > found a "rhel-as 4" version for RHEL4 but I don't think it is a version > for IA64. > > Could someone explain me how generate this rpm ? > > Thank you for your help. > Regards, > Alexandra DANTE > > ---(end of broadcast)--- > TIP 4: Have you searched our list archives? > >http://archives.postgresql.org > -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Performance woes
Wow! I did exactly what you suggested, and played with the ordering for about 20-30 minutes. After 10 minutes or so, I'd gotten the response time down to 1700 ms from 2200 ms. Moving the join conditions up into the "FROM" clause, and dropping the "WHERE" clause altogether allowed me to reorder the statements easily without having to worry about rethinking all the logic. And, this dropped the query time from between 2.2-30 seconds all the way down to just 55-ish ms, without any new indexes! What's more, the improvement came from a move of a block I thought more or less unimportant! // tries to put jaw back into mouth // -Ben On Monday 12 December 2005 16:11, you wrote: > Benjamin Smith <[EMAIL PROTECTED]> writes: > > The example that I gave was a small one to illustrate my understanding of > > multiple foreign keys, indexes and how they work together. (or don't) The > > actual query is quite a bit bigger and nastier. I've considered breaking it > > up into smaller pieces, but this query has been extensively tested and > > debugged. It's on a complex schema carefully designed to meet some very > > demanding requirements. > > What you probably need to do is rethink the join order. As coded, the > planner has no freedom to change the join order, which means it's up to > you to get it right. In particular it seems a bad idea to be processing > the join to enrollments last when that table is the best-constrained > one. Instead of "enrollments, stmoduleobjass LEFT JOIN lots-o-stuff" > consider "enrollments JOIN stmoduleobjass ON relevant-join-conditions > LEFT JOIN lots-o-stuff". Likewise for lcregistrations vs lcclasses. > > regards, tom lane > -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
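Schematically, the rewrite looks like this (table names hypothetical). With every condition attached to its JOIN ... ON clause, the FROM list can be reshuffled freely, and - per the quoted advice - the join order as written is what the planner executes:

  -- before: join conditions in WHERE, entangled with the filter logic
  SELECT e.id, m.score
  FROM enrollments e, modules m
  WHERE e.id = m.enrollment_id AND e.year = 2005;

  -- after: each join condition lives in FROM; reordering tables is a cut-and-paste job
  SELECT e.id, m.score
  FROM enrollments e
  JOIN modules m ON (m.enrollment_id = e.id)
  WHERE e.year = 2005;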
Re: [GENERAL] Performance woes
id = 18) AND ((schoolyear)::text = '2005 - 2006'::text)) -> Hash (cost=5.19..5.19 rows=12 width=20) (actual time=0.102..0.102 rows=10 loops=1) -> Bitmap Heap Scan on tr_summary (cost=2.04..5.19 rows=12 width=20) (actual time=0.043..0.063 rows=10 loops=1) Recheck Cond: (scope_id = 18) -> Bitmap Index Scan on unique_rollover (cost=0.00..2.04 rows=12 width=0) (actual time=0.027..0.027 rows=10 loops=1) Index Cond: (scope_id = 18) Total runtime: .063 ms ** Configuration: I tried tweaking shared_buffers, but adding more/less did nothing to improve performance. Current values: Dual proc Opteron 2.0 Ghz, 4 GB ECC RAM. 10k SCSI drives, software RAID 1 Centos 4.2 (Redhat ES clone) PostgreSQL 8.1, 64 bit, loaded with RPMs from the PG website for Redhat ES. max_connections 64 shared_buffers 25 temp_buffers 1 max_prepared_transactions = 0 work_mem 1024 maintenance_work_mem = 16384 max_stack_depth 9240 autovacuum on autovacuum_naptime 600 Side note: When I add indexes or change table definitions to try to get PG to use indexes, performance tanks instantly to very, poor. (> 30 seconds query time) But, when I run the query a few times, and then run vacuum analyze, it snaps back down to the 2-ish second range. -Ben On Saturday 10 December 2005 11:50, Stephan Szabo wrote: > > On Sat, 10 Dec 2005, Benjamin Smith wrote: > > > A few questions: > > > > 1) Let's assume that I have some multipile foreign keys, and I join on three > > values. For example: > > > > Create table gangsters ( > > name varchar not null, > > birthdate integer not null, > > shirtnumber integer not null, > > primary key (name, birthdate, shirtnumber); > > > > create table children ( > > father_name varchar not null, > > father_bd integer not null, > > father_shirtnumber integer not null, > > birthdate integer not null, > > name varchar not null, > > foreign key (father_name, father_bd, father_shirtnumber) REFERENCES > > gangsters(name, birthdate, shirtnumber) > > ); > > > > We have two table declarations, each with implicit indexes: > > 1) table gangsters has a primary_key index on name, birthdate, shirtnumber. > > 2) children has an implicit index on father_name, father_bd, > > father_shirtnumber. (right?) > > AFAIK, not unless you create one. > > Explain analyze output for the query would probably be useful as well. > > ---(end of broadcast)--- > TIP 9: In versions below 8.0, the planner will ignore your desire to >choose an index scan if your joining column's datatypes do not >match > -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] Performance woes
I have a small company growing fast, selling a product based largely on Postgres. We have a rapidly growing database with (currently) 117 tables. Our primary server, an aging system based around an IDE, 2.4 GHz P4, is being replaced by a new, 10k SCSI, dual-proc AMD opteron/64, with 4 GB of ECC RAM. At just about every task, the newer machine just blows away the P4. Rebuilding the database happens in < 1 minute, instead of nearly 1/2 hour! Copying GB of data files is blazing fast!

But, the real money shot is a single query. It's big, nasty, and complex, and hit pretty hard. It's not IO bound, it is clearly CPU bound. I've allocated up to 3 GB of RAM for pg, and tweaked the shmmax and shared_buffers. And, it's exactly as fast on the new, dual-proc Opteron as the aging P4. 2.2 seconds. It's literally within 1 ms time! (2,206 ms vs 2,207 ms) Throwing more RAM at it makes no difference. WTF??!?!?

A few questions: 1) Let's assume that I have some multiple foreign keys, and I join on three values. For example:

Create table gangsters (
  name varchar not null,
  birthdate integer not null,
  shirtnumber integer not null,
  primary key (name, birthdate, shirtnumber));

create table children (
  father_name varchar not null,
  father_bd integer not null,
  father_shirtnumber integer not null,
  birthdate integer not null,
  name varchar not null,
  foreign key (father_name, father_bd, father_shirtnumber) REFERENCES gangsters(name, birthdate, shirtnumber)
);

We have two table declarations, each with implicit indexes: 1) table gangsters has a primary_key index on name, birthdate, shirtnumber. 2) children has an implicit index on father_name, father_bd, father_shirtnumber. (right?)

If I were to join gangsters and children, EG:

Select gangsters.name AS father, gangsters.birthdate AS father_bd, children.name AS kid_name
from gangsters, children
where gangsters.name=children.father_name
AND gangsters.birthdate = children.father_bd
AND gangsters.shirtnumber=children.father_shirtnumber;

Wouldn't this use the indexes implicitly created in the primary_key and foreign key constraints? -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
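On point 2: a REFERENCES clause does not create any index on the referencing table - only the unique/primary-key side gets one automatically - so the children side of this join has no index unless one is added by hand. A sketch against the tables above:

  CREATE INDEX children_father_idx
      ON children (father_name, father_bd, father_shirtnumber);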
[GENERAL] Errors upgrading from 7.3 to 8.1
Currently running Postgres 7.3.9 32bit on Fedora Core 1, and upgrading to PG 8.1 64bit on CentOS 4. When I load the file, psql -U dbname < dbname.sql I get this error: ERROR: invalid UTF-8 byte sequence detected near byte 0x96 when inserting fields that seem to contain HTML. What could be causing this? My understanding is that pg_dump should properly escape things so that I'm not trying to dump/load things improperly.

The dumps are made (on the PG 7.3 server) with: pg_dump -d -f $OUTPUT.pgsql $db and are restored (on the new 8.1 server) with: psql -U $db -e < $OUTPUT.pgsql -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
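Byte 0x96 is the Windows-1252 en dash, which a 7.3 SQL_ASCII database happily stored but a UTF-8 8.1 database rejects. Assuming the old data really is Windows-1252/Latin-1, one common fix is to recode the dump before loading it:

  iconv -f WINDOWS-1252 -t UTF-8 dbname.pgsql > dbname.utf8.pgsql
  psql -U dbname -e < dbname.utf8.pgsql

(Alternatively, a SET client_encoding line at the top of the dump lets the server do the conversion instead.)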
Re: [GENERAL] SQL injection
Prepared statements are the way to go. I developed my own prepared statements methodology (I called it "SafeQuery") some time back before discovering that others had done it. It's so nice, since I've not worried about SQL injection for YEARS. Sample of my API:

<?php
// query placeholders in [brackets] match the keys of the $todb array
$sql = "SELECT * FROM users WHERE username=[username] AND password=[password]";
$todb = array('username'=>$_REQUEST['username'], 'password'=>$_REQUEST['password']);
if (!$res=$MDB->SafeQuery($sql, $todb)) return Error("Database query failure");
?>

SafeQuery checks: 1) That the variables in the query (in brackets) and in the input array all match up. 2) Runs pg_escape_string on all elements in $todb; 3) Copy/Pastes strings from the array into the query. 4) Runs query against DB, returns results from pg_exec(); -Ben

On Tuesday 01 November 2005 05:27, Kevin Murphy wrote: > Can some knowledgeable person set the record straight on SQL injection, > please? I thought that the simple answer was to use prepared statements > with bind variables (except when you are letting the user specify whole > chunks of SQL, ugh), but there are many people posting who either don't > know about prepared statements or know something I don't. > > Thanks, > Kevin Murphy > > P.S. I don't use PHP, but google informs me that PHP definitely has > prepared statement options: PEAR::DB, PDO in 5.X+, etc. > > ---(end of broadcast)--- > TIP 2: Don't 'kill -9' the postmaster > -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] Dumb Questions - upgrade notes?
I'm running 7.3, and considering the upgrade to 8.1 to make use of multiple indexes. Where are the upgrade notes from 7.3 -> 7.4, and from 7.4 -> 8.x, so that I can see what impact this would have on my app? I can't seem to find them... Thanks -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Frequency of Analyze?
Wow. Does it really produce the expected (probably dramatic, in my case!) improvement in performance? (I'll be trying it out anyway..., but I'd love your feedback) How stable is it? Looks like I have a PG upgrade in my near future... -Ben On Friday 28 October 2005 14:51, you wrote: > > 3) What's the best way to handle indexes when only 1 index is used per table > > in a query, but there are several due to the various constraints on it? Is > > there a way to benefit from all of these other indexes somehow? Is there a > > definitive, detailed book for optimizing PG queries? > > 8.1 can utilize multiple indexes per table. -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[GENERAL] Frequency of Analyze?
I have a rapidly growing database with a very complex schema, and I'm looking to improve performance. It's typical to have 2-4 foreign keys in each table, and there are currently 113 tables, and queries with 5-10 tables with combined inner/outer joins are pretty typical. (I avoid subqueries anywhere I can) So far, my guiding philosophy has been "constrain everything" with primary keys, unique, foreign keys and the like, relying on the implicit indexes and the query planner to handle things, and so far, it's worked very well.

The database has grown in total size (calculated by the size of a pg_dump) 25% in the past month, and the growth rate seems to be accelerating. (yikes!) We are buying new equipment now, but I'd still like to optimize as well as possible. A few questions:

1) Any problem with running "Analyze" hourly via cron?

2) Does "Vacuum analyze" also do the actions performed by "Analyze"?

3) What's the best way to handle indexes when only 1 index is used per table in a query, but there are several due to the various constraints on it? Is there a way to benefit from all of these other indexes somehow? Is there a definitive, detailed book for optimizing PG queries?

-Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
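On question 1: ANALYZE takes only light locks and is cheap relative to VACUUM, so an hourly cron run is a common pattern. A sketch, with a hypothetical database name:

  # crontab: re-sample planner statistics at the top of every hour
  0 * * * *  psql -d mydb -c 'ANALYZE;' >/dev/null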
Re: [GENERAL] Why different execution times for different instances for the
I ran into something like this once, where a complex update occurred inside a transaction. When the update happened, I saw what you describe - the DB hung, and the load average shot out thru the roof until I restarted the PG daemon. The query otherwise worked fine, but only failed with this specific set of input data, about which nothing was unusual. (and now long since gone) I never found out exactly what it was that caused the problem, but rewording the query to something functionally identical but differently structured solved it. I've never seen the problem since. Wish I had more to say that might help identify the actual problem. I'm using Postgres 7.3.9 on Fedora Core 1. -Ben On Tuesday 25 October 2005 13:33, Oliver Elphick wrote: > On Tue, 2005-10-25 at 13:14 -0700, Kishore B wrote: > > Hi , > > I am Kishore doing freelance development of J2EE applications. > > > > > > We switched to use Postgresql recently because of the advantages it > > has over other commercial databases. All went well untill recently, > > untill we began working on an application that needs to maintain a > > huge database > > > > > > > > We are executing a single query that returned very fast on the first > > instance. But when I executed the same query for multiple times, it is > > giving strange results. It is not coming back. > > > > When I checked with the processes running in the system, I observed > > that multiple instances of postmaster are running and all of them are > > consuming very high amounts of memory. I could also observe that they > > are sharing the memory in a uniform distribution across them. > > > > Can you please let me know if you have experienced the same and how do > > you resolved it. > > > > Thank you, > > Kishore. > > No, I haven't seen anything like that. > > I am forwarding this mail to the general enquiries list, in case anyone > else can help. > > -- > Oliver Elphick olly@lfix.co.uk > Isle of Wight http://www.lfix.co.uk/oliver > GPG: 1024D/A54310EA 92C8 39E7 280E 3631 3F0E 1EC0 5664 7A2F A543 10EA > >Do you want to know God? http://www.lfix.co.uk/knowing_god.html > > > ---(end of broadcast)--- > TIP 5: don't forget to increase your free space map settings > -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[GENERAL] Dumb question about count()
I'm sure I've done this before, but for some reason, my main noodle is drawing a blank. Assume we have three tables defined thusly:

create table classrooms ( id serial unique not null, name varchar );
create table seats ( classrooms_id integer not null references classrooms(id), position varchar );
create table students( classrooms_id integer not null references classrooms(id), name varchar );

Now, I want to get a result like:

classroom | students | seats
101A | 0 | 25
101B | 22 | 30
102A | 11 | 0
... etc.

Something somewhat akin to:

select classrooms.name, count(students.classrooms_id) AS students, count(seats.classrooms_id) AS seats
from classrooms, students, seats
where classrooms.id=students.classrooms_id
and classrooms.id=seats.classrooms_id

Except that it doesn't return the 0s for seats/students. Why can't I recall/find how to do this particular join? -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
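One way to get the zero rows - and to avoid the three-way join multiplying students against seats - is a pair of correlated subselects, sketched against the tables above:

  SELECT c.name AS classroom,
         (SELECT count(*) FROM students s WHERE s.classrooms_id = c.id) AS students,
         (SELECT count(*) FROM seats t WHERE t.classrooms_id = c.id) AS seats
  FROM classrooms c;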
[GENERAL] Deleting a rule?
I wrote a rule a while back that, due to the software being extended, now needs to be deleted. How do I drop a rule? DELETE FROM pg_rules WHERE rulename='foo'; doesn't seem to cut it... -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
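For the record, rules have their own DDL command; deleting from pg_rules (which is just a system view) can't work. Assuming the rule lives on a hypothetical table mytable:

  DROP RULE foo ON mytable;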
Re: [GENERAL] Postgres vs Firebird?
Based on the extensive feedback here, as well as other information from other websites found since asking here, I've decided that I'm still, very happily, a PG user. No significant issues to date - PG has "just worked" for me for 5 years now, and the frustrating limitations (EG: alter table drop field, row size limits) functionally disappeared some time ago. It's always good to reevaluate where you are now to make sure you're not doing something stupid. Replication is the only thing that remains, and there appears to be commercially viable solutions for that, as well, though we've not yet implemented it. We're still in a very heavy product development phase, and we have a pretty good backup solution. When will PG replication come ready "out of the box" with the RH RPMs? Anyway, once again, Postgres is my friend... -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 8: explain analyze is your friend
[GENERAL] Postgres vs Firebird?
As a long-time user of Postgres, (First started using it at 7.0) I'm reading recently that Firebird has been taking off as a database. Perhaps this is not the best place to ask this, but is there any compelling advantage to using Firebird over Postgres? We have a large database (almost 100 tables of highly normalized data) heavily loaded with foreign keys and other constraints, and our application makes heavy use of transactions. I say this as my company's growth has been exponential, showing no sign of letting up soon, and I'm reviewing clustering and replication technologies so that we can continue to scale as nicely as we have to date with our single server. (now with a load avg around .30 typically) -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[GENERAL] Lost in Foreign Key land
Ok, I have a stupid-simple table:

create table files ( id serial unique not null, mime varchar not null, name varchar not null );

Idea being that I can save values from a file upload into a table, and use it throughout my application. This gives me a central repository to look for files of a particular name, etc. while allowing multiple files with the same name to be uploaded. It might be used something like this:

create table personalinfo ( name varchar, address varchar, resume integer not null references files(id) );

But, I want to keep it "clean" - meaning that if the file is orphaned (isn't used anywhere), I want to delete it. I asked a related question a few months ago, and never understood the responses. (See thread "Search for restricting foreign keys") I just spent the last few hours trying to make sense of Florian Pflug's query from an email dated 1/25/05 and reading up on the attribute tables, and I am lost. I'm sure it's very logical, and I'm just as sure that the logic, for now, escapes me.

What I'd like to be able to do is get a list of files table id fields that have no values tied to them. If I execute "delete from files;", it won't delete them, because of foreign keys that refer to one or more of the files records. How can I get a list of files records with no foreign key records referencing the id field, without having to join on every single table that refers to files(id)? (now maybe a dozen, and growing) Something like "select id from files where id not in (select references to files.id)"; -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
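Without dynamic SQL there is still one NOT EXISTS clause per referencing table, but the orphan check itself is straightforward, sketched against the two tables above:

  SELECT f.id
  FROM files f
  WHERE NOT EXISTS (SELECT 1 FROM personalinfo p WHERE p.resume = f.id);
  -- add one NOT EXISTS per table that references files(id);
  -- the list of those tables can itself be generated from pg_constraint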
[GENERAL] pseudo-serial values in dual primary key?
Is it possible to have the equivalent of a serial data type in a table, sub-categorized? Assume the following:

create table categories (id serial, title varchar);

Now, I want to create an entries table, and by default, count serially by category, so that category 1 has entries.sequence of 1, 2, 3, and so does category 2. (where sequence= 1, 2, 3...) Something like:

create table entries (
  categories_id integer not null references categories(id),
  sequence default max(entries.sequence WHERE categories_id=this.categories_id),
  primary key (categories_id, sequence)
);

I'm not sure about the semantics of this, but I want sequence to start at 1, and count up, for its category as defined by categories_id. I already know that I can enforce the uniqueness of categories_id and sequence with the primary key, and I could just write some more app code to do a query to get the max value of sequence where categories_id=$categories_id, but can this be done without adding a bunch of extra application code? -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
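A sketch of the no-extra-app-code version, computing the next per-category value inside the INSERT itself (assumes entries(categories_id integer, sequence integer) as above; note that two concurrent sessions can read the same max(), so this needs a retry on unique-violation or a lock on the category row):

  INSERT INTO entries (categories_id, sequence)
  SELECT 1, coalesce(max(sequence), 0) + 1
  FROM entries
  WHERE categories_id = 1;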
[GENERAL] Checking for schedule conflicts
Given the tables defined below, what's the easiest way to check for schedule conflicts? So far, the only way I've come up with is to create a huge, multi-dimensional array in PHP, with a data element for every minute of all time taken up by all events, and then check for any of these minutes to be set as I go through all the records. (ugh!) But, how could I do this in the database?

I'd like to see something like "select count(*) FROM events, sched WHERE sched.date=$date AND events.id=sched.events_id ... GROUP BY date, start" And here's where I get stumped. You can't group by start or end because we need to check if they OVERLAP any other records on the same date. Ideas?

-- Sometimes, recurring events
create table events ( id serial not null primary key, title varchar );

-- date=YYYYMMDD, start/"end": HHMM (24-hour)
create table sched ( events_id integer not null references events(id), date integer not null, start integer not null, "end" integer not null );

insert into events (title) VALUES ('Tuesday Pickup');
insert into sched (events_id, date, start, "end") VALUES (1, 20050308, 0900, 1300);
insert into sched (events_id, date, start, "end") VALUES (1, 20050315, 0900, 1300);

-- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 8: explain analyze is your friend
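The overlap test itself can stay in SQL: two intervals on the same date conflict when each starts before the other ends. A sketch that reports each clashing pair once (it only pairs rows from different events):

  SELECT a.events_id AS event_a, b.events_id AS event_b, a.date
  FROM sched a
  JOIN sched b ON a.date = b.date
              AND a.events_id < b.events_id
              AND a.start < b."end"
              AND b.start < a."end";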
[GENERAL] PG and OpenSSI?
Has anybody had any experience using PG with OpenSSI (Single System Image) for a high-availability cluster? http://openssi.org/cgi-bin/view?page=openssi.html Is this feasible? Possible? Easier/harder than other PG clustering solutions? -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[GENERAL] Search for restricting foreign keys
Is there a way in PG 7.3, given a field, to find out what other tables & records are linked to it via a foreign key? EG:

create table cities (id serial primary key, title varchar not null);
insert into cities(title) values ('San Fransisco');
insert into cities(title) values ('Los Angeles');
create table stores (id serial primary key, city integer not null references cities(id), title varchar);
insert into stores(city, title) values (1, 'North City');
insert into stores(city, title) values (2, 'Central District');
insert into stores (city, title) values (1, 'Beachfront');

Given the above, I'd want to know all the tables/records that relate to id 1, San Fransisco, and get a result something like:

table | primary key
stores | 1
stores | 3

Does such functionality exist in PG? Isn't it already doing essentially this when attempting to delete a record with other records linked to it? Currently, I do this by attempting to delete a record in a transaction, and trap the error - it's a terrible way to do this, and sometimes I'm already in a transaction. -Ben -- "The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[GENERAL] Tracking back foreign keys?
In one of my apps, I have an "images manager" that's become unmanageable. I'd like to be able to easily track key dependencies. Let's say I have a set of tables like this:

   -- Images first, since the other two tables reference it
   create table Images (
       id serial unique not null,
       filename varchar not null,
       mime varchar not null
   );

   create table Customers (
       id serial unique not null primary key,
       name varchar not null,
       address varchar not null,
       image integer references images(id)
   );

   create table Vendors (
       id serial unique not null primary key,
       name varchar not null,
       vendor_account varchar,
       picture integer references images(id)
   );

I know that in the images table I have lots of cruft, "dead wood", but when I delete from images, is there a "nice" way of finding out what dependencies there are? Something like:

   Select pg_table.name from pg_table where pg_field references images.id ?

How else do I put it? The output I'd like would be something like:

   images.id / tablename / table.primary key
   11 / Vendors / 14
   12 / Customers / 9

Can this be done?

-Ben

-- 
"I kept looking around for somebody to solve the problem. Then I realized I am somebody" -Anonymous

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
   http://www.postgresql.org/docs/faqs/FAQ.html
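With only a handful of referencing tables, the desired report can be built directly, one UNION ALL branch per referencing table; a sketch against the schema above (the catalogs can list those tables automatically, but the row lookups must still name each table):

   SELECT i.id AS image_id, 'Customers' AS tablename, c.id AS pk
   FROM images i JOIN customers c ON c.image = i.id
   UNION ALL
   SELECT i.id, 'Vendors', v.id
   FROM images i JOIN vendors v ON v.picture = i.id
   ORDER BY image_id;

And the "dead wood" itself - rows in images that nothing references:

   SELECT i.id, i.filename
   FROM images i
   LEFT JOIN customers c ON c.image = i.id
   LEFT JOIN vendors v   ON v.picture = i.id
   WHERE c.id IS NULL AND v.id IS NULL;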
Re: [GENERAL] Non-aggregate values attached to aggregates?
Thanks much for your help! It took a few tries to get what it was all about, but I got it.

On Thursday 16 December 2004 14:09, Michael Fuhr wrote:
> > Dates are kept as 'YYYYMMDD', e.g. 20041114 for Nov 14, 2004.
>
> Why not use a DATE type? You can reformat it with to_char() if
> need be.

Not a bad idea. I used int because it was simple and "good enough". My app is written in PHP, and by using strtotime() I can trap typo errors before they get near the database (database errors are frequently difficult to display in a layperson-readable format). E.g.:

   $sql = "INSERT INTO enrollments (date) values ('[date]')";
   $pass = array('date' => $_ENROLL['date']);
   if (!$DB->SafeQuery($sql, $pass))
       return error("Database Error: ".$DB->Error());

If there's an error here, it'd say something like:

   ERROR: Bad date external representation 'mya 11 2004'

Compare with:

   $sql = "INSERT INTO enrollments (date) VALUES [date]";
   if (($date = strtotime($_REQUEST['date'])) == -1)
       return error("I'm sorry, but ".$_REQUEST['date']." does not appear to be a valid date. Please fix and try again");
   if (!$DB->SafeQuery($sql, array('date' => $date)))
       return error("Database Error: ".$DB->Error());

which is much more "friendly" to the end user.

BTW: What is to_char, and why doesn't this work in PG 7.3.x?

   select to_char(datefield) from TableX;

-Ben

-- 
"The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
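For reference, the one-argument call fails because to_char() always takes a format string as its second argument. A sketch of formatting the integer dates above via to_date (syntax verified against modern PostgreSQL rather than 7.3 specifically; TableX and datefield are the names from the question):

   -- int 20041114 -> text '20041114' -> date -> formatted text 'Nov 14, 2004'
   select to_char(to_date(datefield::text, 'YYYYMMDD'), 'Mon DD, YYYY') from TableX;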
[GENERAL] Non-aggregate values attached to aggregates?
I have a list of students and a list of enrollment records, and I'm trying to get a list of students with their most recent enrollment/disenrollment dates.

   create table students (id serial primary key, name varchar);
   create table enrollments (
       students_id integer not null references students(id),
       start integer not null,
       finish integer not null default 0
   );

   insert into students (name) VALUES ('johnny');
   insert into enrollments (students_id, start, finish) VALUES (1, 20030901, 20040530);
   insert into enrollments (students_id, start, finish) VALUES (1, 20040901, 0);

The student enrolled last year and is currently enrolled. If a student is currently enrolled, the finish date is 0. Dates are kept as 'YYYYMMDD', e.g. 20041114 for Nov 14, 2004.

I want to be able to export the student name, most recent enrollment date, and disenrollment date. I've successfully gotten the student name and most recent enrollment date, but never the associated exit date.

This returns the most recent enrollment date:

   select students.name, max(enrollments.start) as start
   from students, enrollments
   where enrollments.students_id=students.id
   group by students.name;

Now, to get the exit date, I've tried:

   select students.name, max(enrollments.start) as start, finish
   from students, enrollments
   where enrollments.students_id=students.id
     AND max(enrollments.start)=enrollments.start
   group by students.name, enrollments.finish;

which results in "ERROR: Aggregates not allowed in WHERE clause", and also:

   select students.name, max(enrollments.start) as start, finish
   from students, enrollments
   where enrollments.students_id=students.id
   group by students.name, enrollments.finish
   having enrollments.start=max(enrollments.start);

which returns "ERROR: Attribute enrollments.start must be GROUPed or used in an aggregate function".

How can this be done? Can it be done?

-Ben

-- 
"The best way to predict the future is to invent it." - XEROX PARC slogan, circa 1978

---(end of broadcast)---
TIP 6: Have you searched our list archives?
   http://archives.postgresql.org
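This greatest-row-per-group problem has a standard shape: compute each student's max(start) in a subquery, then join back to enrollments to pull the finish value from that same row. A sketch against the tables above:

   -- if a student somehow had two rows sharing the same max start,
   -- both would be returned
   SELECT s.name, e.start, e.finish
   FROM students s
   JOIN enrollments e ON e.students_id = s.id
   JOIN (SELECT students_id, max(start) AS max_start
         FROM enrollments
         GROUP BY students_id) latest
     ON latest.students_id = e.students_id
    AND latest.max_start = e.start;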