Re: [GENERAL] Config for fast huge cascaded updates
On Mon, Jun 26, 2017 at 07:26:08PM -0700, Joshua D. Drake wrote:
> Alternatively, and ONLY do this if you take a backup right before hand, you
> can set the table unlogged, make the changes and assuming success, make the
> table logged again. That will great increase the write speed and reduce wal
> segment churn.

Note that this is not for just that table, but for all of the implicated ones because of the CASCADE statements. It sounds like the OP is basically rewriting a significant chunk of the entire database, so nothing is going to be super fast: all those CASCADEs have to fire and all those other tables need to be updated too.

> However, if that fails, the table is dead. You will have to reload it from
> backup.

Right, and that goes for all the affected tables.

Best regards,

A

--
Andrew Sullivan
a...@crankycanuck.ca

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
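For the archives, the approach being discussed might look like the sketch below. Table names are hypothetical; note that ALTER TABLE ... SET UNLOGGED exists only in PostgreSQL 9.5 and later, and that a foreign key on a permanent (logged) table may not reference an unlogged table, which dictates the ordering:

```sql
-- Backup first: if the server crashes while a table is UNLOGGED, its
-- contents are truncated at recovery and must be reloaded.
ALTER TABLE child  SET UNLOGGED;   -- referencing table first
ALTER TABLE parent SET UNLOGGED;   -- then the referenced table

-- The expensive part: fires ON UPDATE CASCADE into child, but without
-- WAL traffic for either table.
UPDATE parent SET id = id + 1000000;

ALTER TABLE parent SET LOGGED;     -- referenced table first on the way back
ALTER TABLE child  SET LOGGED;     -- SET LOGGED rewrites each table into WAL once
```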
Re: [GENERAL] Config for fast huge cascaded updates
On Tue, Jun 27, 2017 at 10:17:49AM +1200, Craig de Stigter wrote:
> We're doing a large migration on our site which involves changing most of
> the primary key values. We've noticed this is a *very* slow process.

Indeed.

Does the database need to be online when this is happening? If it were me, I'd try to find a way to dump it, modify the data in a dump file, and then reload it. I think that'd be faster.

Another way you might try, if you need to be online while doing this, is to write the whole thing into a new SQL schema. Make the mods you need. When you think you're close to done, put a trigger in the "old schema" to update data in the new schema, then do a last pass to catch anything you missed in the interim, then cut your application over to the new schema (update the search_path, force everything to disconnect and reconnect, and when they reconnect they have the new data in place).

A variation on this technique is also useful for gradual roll-out of new features -- you don't have to upgrade everything at once, and you have a natural rollback strategy (but you need a more complicated set of triggers that keeps the two schemas in sync during the cutover period).

This second approach isn't faster -- it's hard on I/O and disk space -- but it keeps you up, and you can do the changes at a leisurely pace. Just make sure you have the I/O and space before you do it :)

Hope that helps,

A

--
Andrew Sullivan
a...@crankycanuck.ca
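The new-schema technique described here can be sketched roughly as follows. All schema, table, and column names and the key-remapping expression are hypothetical, and the trigger covers inserts only for brevity; updates and deletes need the same treatment:

```sql
-- Phase 1: copy into a new schema, remapping the keys.
CREATE SCHEMA migration;
CREATE TABLE migration.accounts (LIKE public.accounts INCLUDING ALL);
INSERT INTO migration.accounts (id, name)
SELECT id + 1000000, name FROM public.accounts;

-- Phase 2: keep the copy current while the old schema stays live.
CREATE FUNCTION public.mirror_accounts() RETURNS trigger AS $$
BEGIN
  INSERT INTO migration.accounts (id, name)
  VALUES (NEW.id + 1000000, NEW.name);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER accounts_mirror AFTER INSERT ON public.accounts
FOR EACH ROW EXECUTE PROCEDURE public.mirror_accounts();

-- Phase 3: after the catch-up pass, cut over by repointing the
-- search_path and forcing clients to reconnect.
ALTER DATABASE mydb SET search_path = migration, public;
```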
Re: [GENERAL] browser interface to forums please?
On Sat, Mar 25, 2017 at 04:49:33AM +, Yuri Budilov wrote: > Can these forums be moved to internet ? Last I looked, SMTP generally works on the Internet. You mean "the web", which is a part of but not all of the Internet. > So hard to follow, so hard to search for historical answers. Why do you find it hard to follow? The list is pretty well-ordered and tends to thread nicely, so you can use any threaded mail reader (or reader in threaded mode) and you should have some success. I note from the headers on your mail that you appear to be using both MS Exchange and MS Outlook. I haven't looked since the early 2000s, so things might be better, but I recall Outlook having really bad threading support at the time. Maybe you need to switch to gmail for reading the postgres lists. It threads ok, and meets your stated requirement of being in a browser. As for historical answers, I'm not exactly sure what you want but you will find at https://www.postgresql.org/list/ a "search archives" function and an "advanced" search function at https://www.postgresql.org/search/?m=1. It's as good as most online forum search tools I've ever used, though not as good as Google. Which, I note, works well too. > We really need to be able to post via browser. Why? What does "post via browser" get you that sending an email doesn't? I can think of somethign it does _not_ get you, however, and that is the attention of some of the key contributors to Postgres, who appear to work mostly in a mode where email makes things easy for them and logging into a new forum tool makes things harder. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Postgres HA
Hi,

On Wed, Feb 22, 2017 at 10:58:10PM +, Dylan Luong wrote:
> I am looking at options to improve our high availability.

I suspect the central question you have to answer is, "What do you mean by 'improve'?" Do you want to increase the ability to retrieve data? Decrease the potential for data loss? Shorten the recovery time to read/write availability? And so on. The answers to these different questions will determine which trade-off you need to make. And rest assured, there is absolutely no solution in the world -- not even a really expensive commercial one -- that requires no trades. Distributing data reliably with ACID semantics and no data loss or corruption or loss in write throughput is not possible, at least today. You have to pick which poison you want :)

A

--
Andrew Sullivan
a...@crankycanuck.ca
Re: [GENERAL] fun fact about temp tables
On Fri, Aug 05, 2016 at 06:02:08PM +0300, Grigory Smolkin wrote:
> But it`s temporary table so it`s equal to saying 'I don`t care about this
> data' and I can get 'out of disk space' regardless of using temporary
> tables.
>
> What are we winning here?

Surely, that the transaction operates in a predictable way? A temp table doesn't say, "I don't care about this data"; it says, "I don't care about this data over the long haul." I've had lots of data go through temp tables that I really, really wanted to get into some other place later, and it'd suck if the transaction failed halfway through because it turns out there's nowhere to put the data I've just staged.

A

--
Andrew Sullivan
a...@crankycanuck.ca
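The staging pattern being described might be sketched like this (table names are hypothetical):

```sql
BEGIN;
-- The rows are transient, but the transaction semantics are not: if the
-- final INSERT cannot complete, the whole thing rolls back predictably.
CREATE TEMP TABLE staging (id integer, payload text) ON COMMIT DROP;
INSERT INTO staging VALUES (1, 'loaded from elsewhere');
-- The part we really care about: moving the staged data somewhere durable.
INSERT INTO warehouse SELECT id, payload FROM staging;
COMMIT;
```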
pg_upgrade and not working (was Re: [GENERAL] Uber migrated from Postgres to MySQL)
I bet there are some people around here who could help you troubleshoot. Just sayin'. (I'm not among them.)

A

On Thu, Jul 28, 2016 at 09:34:05AM +1200, Patrick B wrote:
> We can't use the pg_upgrade in our 3TB database just does not work..
> that's the main reason we're still using 9.2.

--
Andrew Sullivan
a...@crankycanuck.ca
Re: [GENERAL] Uber migrated from Postgres to MySQL
On Wed, Jul 27, 2016 at 02:33:54PM -0500, Kevin Grittner wrote:
> Until you get to the end of the upgrade and *start the cluster
> under the new version* you can fall back to the old version.

Yeah, but to be fair, a lot of well-funded businesses (note what started this discussion) are pico-managed by people way up the stack who want a bite-sized answer. If you do not have a happy story for "What if we're 48 hours into the upgrade and discover some critical bug corner case that makes us need to roll back?" then you're going to lose them. Never mind that such cases practically never happen (if you have a 48-hour-old bug in an Internet system today, you have an emergency bugfix, not a rollback).

A great deal of practical delivery of technology involves managing the expectations of management who do not understand what they are asking for and basically want a glib happy answer. As people delivering such things, we must find a glib happy answer that does not get us fired if it turns out to be false. The poor story Postgres has about downgrade, _even if it's a stupid problem_, is a problem. It might not be worth fixing because it's a stupid problem. But one has to face the critique in its own terms.

A

--
Andrew Sullivan
a...@crankycanuck.ca
Re: [GENERAL] Uber migrated from Postgres to MySQL
On Wed, Jul 27, 2016 at 01:58:25PM -0400, Rakesh Kumar wrote:
> I am surprised PG does not even allow minor version rollback.

It almost never happens that a minor version (N.M.x, where x is the minor) requires an upgrade step at all. Change your binaries and you're done. Catalogue incompatibility has historically been the basis for something becoming a major version upgrade. (I can recall a couple of bugs where you had to tickle the catalogues, so it's not exactly true that they're never incompatible, but it's incredibly rare.)

Best regards,

A

--
Andrew Sullivan
a...@crankycanuck.ca
Re: [GENERAL] Uber migrated from Postgres to MySQL
On Wed, Jul 27, 2016 at 04:51:42PM +0100, Geoff Winkless wrote:
> technical reasons. Most developers will harp on at their boss about how
> terrible their current database is and how performs
> much better. Eventually one of two things happens: either a) those
> developers end up in a position where their direct boss is in a position to
> make the change and he or she doesn't understand how much time and money it
> will actually take to change; or b) commercial considerations dictate the
> change.

In a different context, someone suggested to me that Postgres advocates sounded to him too often like FreeBSD advocates complaining about Linux, and I'm afraid there is a certain truth to that. Given the discussion in the post in question, the decision to use MySQL appears to have been well-justified:

1. They'd decided to use a NoSQL database and ditch relational systems, because shards.
2. They wanted an MVCC engine behind the above.
3. They wanted SQL semantics on top of this MVCC-enabled filesystem layer.

That sounds just like MySQL+InnoDB to me. Once you've already decided on (1), the rest of it flows pretty naturally, and Postgres is probably not your choice. You can dismiss any of 1-3 as commercial or political advocacy, but while I happen to think they're a somewhat questionable set of goals, they're not obviously stupid, and competent people of good will could disagree about them.

At the same time, there really are two serious problems with Postgres under heavy write loads. The first is that Postgres's focus on readers' speed and convenience means you have to take the hit somewhere, so writers take it instead. (The other side of the disk-layout description in the blog post is that, under MySQL, secondary index use is more expensive for readers than it is in Postgres. The post acknowledges that, but of course most important secondary indexing is useless under sharding anyway, since you have to select from shards; so they won't care.) I/O storms on Postgres are a major source of pain for large operators, and the tools for understanding them are somewhat primitive, because many of them depend on underlying OS features and tools.

The second is the upgrade-by-replica-and-fallback-plan problem. It's really an issue. There is a reason that, back in the cloudy past, we designed Slony to be able to replicate to and from any supported version of Postgres: Afilias needed to be able to upgrade without a lot of down time and with the ability to roll back if we had to, because that was our contractual obligation. This has always been a large gap, and when it was raised in the past the answer was, "Well, Slony can already do that, so use it." It wasn't too satisfying then, and it's not much more satisfying now. :)

> better invested in employing one of the commercial PG companies to improve
> the specific postgres problems they found.

I think the two big problems laid out above are deep architectural ones. I'm not sure these are the sort of improvement you can buy without getting the community on board.

> For what it's worth, from what I've read uber are a company whose very
> business plan relies on them taking things that they don't deserve while
> they treat customers and employees with similar levels of arrogance.

Nothin' for nothin', but I don't think it helps Postgres to attack others' business plans -- whatever one thinks of them -- as part of an argument about why Postgres is the right tool for a given job.

Best regards,

A

--
Andrew Sullivan
a...@crankycanuck.ca
Re: [GENERAL] MediaWiki + PostgreSQL is not ready for production?
On Tue, Jul 19, 2016 at 12:55:09PM +0300, Achilleas Mantzios wrote:
> It seems we have made already a verdict about mysql's code migrated to
> PostgreSQL being slow

Experience with designed-for-MySQL code that is modified to go through a so-called abstraction layer suggests that it will be. There are several anti-patterns in database use and access that are ubiquitous under MySQL because of early limitations in MySQL. Those uses are often optimised away in MySQL. Nevertheless,

> we accept it, it is far more productive having one dedicated small project
> for this mysql2pgsql conversion rather than N dedicated small teams for
> every mysql client out there.

…I don't think anyone is telling you, "Don't build this." You should do what you like with your time :)

A

--
Andrew Sullivan
a...@crankycanuck.ca
Re: [GENERAL] MediaWiki + PostgreSQL is not ready for production?
On Tue, Jul 19, 2016 at 12:37:10PM +0300, Achilleas Mantzios wrote:
> Better to run, even slowly, than not run at all, or require special porting
> team for every mysql client out there.

I'm not sure I agree. If you teach every naïve user that, when they compare Postgres to MySQL, MySQL always wins, what you teach them is "Postgres performance sucks."

Best regards,

A

--
Andrew Sullivan
a...@crankycanuck.ca
Re: [GENERAL] Thoughts on "Love Your Database"
On Tue, May 03, 2016 at 11:11:06PM -0500, Guyren Howe wrote:
> I've long been frustrated with how most web developers I meet have no idea
> how to use an SQL database properly. I think I'm going to write a book called
> Love Your Database, aimed at web developers, that explains how to make their
> apps better by leveraging the power of SQL in general, and Postgres in
> particular.

For whatever it's worth, more than 10 years ago I reviewed (for a publisher) a book along these lines. To my knowledge, it never got published, though I thought it was quite good.

The problem is that most development on the web does not, as far as I can tell, regard the development as what I'd think of as a real database-backed system. An awful lot of web systems, in particular, treat the database as a fancy and expensive but somewhat portable filesystem. (This is even more true now that SQL isn't as ubiquitous as it once was.)

I still think this is worth promoting, but it seems to me the way to do it is to create some sort of buzz around these ideas at various developer community events. There are three things that I think would help make this work for more developers:

1. This will make your application faster. (This needs to be proved -- I agree with the "benchmarks" point in the original post.)

2. This will make your time to ship/continuous integration/time to fix bugs shorter. This is harder to prove, but it's an important value for most developers (and developer managers).

3. This is way cooler than $x.

(3) is a distant third, but my experience of managing less-experienced developers who go to conferences suggests that it's a good way to get people interested.

I think there's something that we're going to have to accept, however, and that's that there are way more application coders than there are people who really get database systems. Fixing this problem requires years of effort.

Best regards,

A

--
Andrew Sullivan
a...@crankycanuck.ca
Re: [GENERAL] How to manually force a transaction wraparound
On Sat, Apr 30, 2016 at 02:16:48PM +1200, Thomas Munro wrote:
> Take a look at the script repro-bogus-subtrans-error-wraparound.sh
> from this email:

I wonder whether something like this (and other "beat it up" kinds of tools) ought to be in contrib/ or in some sort of well-known project location. It seems that they're the sort of thing that could be useful to DBAs and data service managers who are trying to prove that Postgres is Just Fine.

Maybe this should go to -advocacy? (I no longer subscribe there and can't contribute to the work, but I can report that I've been in the sort of long, boring speculative conversation that could have been shut down quickly with this kind of data.)

A

--
Andrew Sullivan
a...@crankycanuck.ca
On the right tool (was Re: [GENERAL] Proper relational database?)
Hi,

On Sun, Apr 24, 2016 at 12:55:48PM +1000, da...@andl.org wrote:
> But there is goodness there, and NoSQL is now just as hard to replace.

Indeed, I wasn't trying to make some point-and-laugh argument about NoSQL. I was just observing that, as with many new techniques, some of the uses haven't really been thought out carefully. (It doesn't help that at least one of the early "successes" worked way better in theory than in practice, and that whole businesses have been taken down by the failure modes. "Oooh, can't resolve conflict! Oh well, throw it all away!" is not a great way to store critical business data.)

New technologies are hard. Some regard Brooks's _The Mythical Man-Month_ and Spolsky's "Things You Should Never Do, Part I" as saying different things. But I think they're in deep agreement on a key point: understanding why the old approach is there is way harder than figuring out that old approach; so there's a natural tendency to replace rather than to understand and build further. In Brooks, this leads to the communications death, which is one of the ways that adding more people to a late project makes it later. In Spolsky, it yields the straightforward observation that reading code is harder than writing it. In both cases, though, the point is that careful software development management is considerably harder than it seems. I think that those two works -- along with _The Soul of a New Machine_ -- impart certain basic things you really need to internalise to see why so many large software projects are more about people's egos than about actually making stuff better. None of them says, "Don't do new things." But all militate towards understanding what you're throwing away before you start work.

In, I think, 2003 or 2004 I read an article in _CACM_ [1] that said (in my reading) that Google proved CAP was true and that we had to get over ourselves (I'm exaggerating for effect). As a data guy, I found this both troubling and influential, and I've thought about it a lot since. The thing I found compelling about it was the observation that Google's approach to consistency was way better than good enough, so one shouldn't care too much about durability or consistency. The thing that bothered me was the obvious counter-examples. I came to believe that the point I understood was obviously true in its domain, and dangerously false in other cases.

In retrospect, it is obviously true that, if you understand your domain well enough, many data handling techniques could be appropriate. But that's also _only_ true if you understand your domain well enough: applying the wrong techniques to your data can be seriously harmful, too. This explains why various NoSQL techniques are so powerful in some ways and yet often so frustrating to data people. It explains why the most successful distributed database ever is the DNS, which is the wrong tool for nearly every job yet fabulously successful in its job. And it's an excellent way to organise thinking about how to pick the right technology for a given data situation. For if you pick the wrong one, you might find you've left a lot of the value in a data set practically inaccessible. You don't need perfect foresight. But attending a little to what value there is in your data can yield great dividends.

We shape our tools and then our tools shape us [2]. But in the software world, we must be more mindful than ever that we understand our tools -- the shapes that they take and that they make. Historicism in software is no vice. It is the path by which we learn to make new mistakes, as opposed to the same mistake over again.

[1] Darned if I can find the article, but I confess some scepticism that my original reading was what the authors intended. Doesn't matter for these purposes! :)

[2] Apparently, Marshall McLuhan didn't say this; instead, his tribune John Culkin, SJ, said it. It's still an excellent point, whoever made it.

Best regards,

A

--
Andrew Sullivan
a...@crankycanuck.ca
Re: [GENERAL] Proper relational database?
On Thu, Apr 21, 2016 at 01:36:54PM -0700, Guyren Howe wrote:
> It's an enormous tragedy that all the development effort that has
> gone into NoSQL database has pretty much all gotten it wrong: by all
> means throw out SQL, but not the relational model with it. They're
> all just rehashing the debate over hierarchical storage from the
> 70s. Comp Sci courses should feature a history class.

This turns out to be true in many areas of language design, multi-user system security, virtually everything to do with networking, and application deployment. I was at an IETF meeting some years ago where someone talking about "Internet of Things" stuff was going on at length about how nobody around the IETF really understood constrained systems. Standing behind him at the mic was an assortment of grey-bearded men who'd worked directly on the original IMPs (which were 16-bit Honeywells that ran at like 5MHz and had, IIRC, 16K words of memory).

It's also true that crappy interfaces that are good enough stick around anyway. The UNIX Haters' Handbook is full of evidence of how much less good UNIX was, but even Apple gave in. Also, many of the historical compromises turn out, once you start to try to make different ones, to be more obviously defensible. Most of the NoSQL trend was not a hatred of SQL the language but a carelessness about the relational syntax, or a view that promises about consistency are dumb. Then the first credit card number gets lost in an eventually-consistent system, and people suddenly understand viscerally why transaction semantics are so hard.

Best regards,

A

--
Andrew Sullivan
a...@crankycanuck.ca
Re: [GENERAL] How to Qualifying or quantify risk of loss in asynchronous replication
On Wed, Mar 16, 2016 at 10:40:03PM +0100, otheus uibk wrote:
> Somehow, the documentation misleads (me) to believe the async replication
> algorithm at least guarantees WAL records are *sent* before responding
> "committed" to the client. I now know this is not the case. *grumble*.
>
> How can I help make the documentation clearer on this point?

Well, I never had the understanding you apparently do, but you're right that it's important to be clear. If there were an additional sentence, "Note that, in any available async option, the client can receive a message that data is committed before any replication of the data has commenced," would that help?

Best regards,

A

--
Andrew Sullivan
a...@crankycanuck.ca
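For reference, the commit-reply behaviour being asked about is governed by synchronous_commit. A sketch of the main settings (check the documentation for your version, since the exact guarantees vary):

```ini
# postgresql.conf (sketch)
synchronous_commit = off           # reply may precede even the local WAL flush
#synchronous_commit = local        # wait for the local WAL flush only;
                                   # standbys are not consulted
#synchronous_commit = remote_write # with synchronous standbys: wait until
                                   # the standby has received/written the WAL
#synchronous_commit = on           # with synchronous standbys: wait until
                                   # the standby has flushed the WAL
```

In plain asynchronous replication (no synchronous_standby_names), none of these settings makes the server wait for the WAL to be sent to a standby, which matches the behaviour described in the thread.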
Re: [GENERAL] enum bug
On Sun, Mar 13, 2016 at 10:20:05PM -0400, Melvin Davidson wrote:
> The point is, they are an archaic data type and it's a hell of a lot easier
> to use Foreign Keys to insure integrity.

So don't use them? Nobody, surely, is forcing you to use enums.

I recall when enums were added. I recall thinking at the time that they were a bad idea and that you should do such things properly with CHECK constraints and so on. But people wanted them because lots of (IMO poor) designs rely on them. That seems like a good reason to me to keep them around, and not to use them :)

Best regards,

A

--
Andrew Sullivan
a...@crankycanuck.ca
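The foreign-key alternative to an enum, as a sketch with hypothetical names:

```sql
-- A lookup table plays the role of the enum's value list.
CREATE TABLE order_status (
    status text PRIMARY KEY
);
INSERT INTO order_status VALUES ('pending'), ('shipped'), ('cancelled');

CREATE TABLE orders (
    id     integer PRIMARY KEY,
    status text NOT NULL REFERENCES order_status (status)
);

-- Unlike ALTER TYPE ... ADD VALUE on an enum, adding or retiring an
-- allowed value is plain DML on the lookup table.
INSERT INTO order_status VALUES ('refunded');
```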
Re: [GENERAL] regarding table migration from sql to postgres with runmtk.sh
On Wed, Mar 09, 2016 at 06:56:38PM +0530, Durgamahesh Manne wrote:
> Hi sir
> i got following erro when i ran runmtk.sh
>
> initially i placed the jtds driver related to sql in
> /opt/postgresplus/edbmtk/lib

Looks like you're using postgresplus, which is EnterpriseDB's fork. But …

> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/postgresql/Driver

… since it can't find the driver, I'd bet that your classpath doesn't contain /opt/postgresplus/edbmtk/lib.

Best regards,

A

--
Andrew Sullivan
a...@crankycanuck.ca
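A sketch of the likely fix. The directory comes from the report above, but whether the PostgreSQL JDBC jar actually lives there is an assumption; check the contents of the lib directory first:

```shell
# Hypothetical fix: put the toolkit's lib directory (and so the JDBC jars
# inside it) on the Java classpath before re-running the migration tool.
CLASSPATH="/opt/postgresplus/edbmtk/lib/*:${CLASSPATH:-}"
export CLASSPATH
echo "CLASSPATH=$CLASSPATH"
# ./runmtk.sh   # then re-run the toolkit
```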
Re: [GENERAL] database corruption
On Fri, Feb 12, 2016 at 07:46:25AM -0500, Bill Moran wrote:
> Long term, you need to fix your hardware. Postgres doesn't corrupt
> itself just because the disks fill up, so your hardware must be lying
> about what writes completed successfully, otherwise, Postgres would
> be able to recover after a restart.

It may not be the hardware. Depending on how vmware is configured, it could just be a setting. Also, something in the OP's message made me think that this was _actually_ a network-attached disk, which can also have such problems. (But in general, I agree.)

A

--
Andrew Sullivan
a...@crankycanuck.ca
Re: [GENERAL] Catalog bloat (again)
On Wed, Jan 27, 2016 at 11:54:37PM +0100, Ivan Voras wrote:
> So, question #1: WTF? How could this happen, on a regularly vacuumed
> system? Shouldn't the space be reused, at least after a VACUUM? The issue
> here is not the absolute existence of the bloat space, it's that it's
> constantly growing for *system* tables.

Some system tables (particularly pg_attribute) are heavily changed by a lot of temp table use. You need to amp up the vacuum frequency on them, and have a lot of workers, or you don't get to them until it's too late.

> Question #2: What can be done about it?

You may end up taking an outage, in effect, because you need to compact them at least once. If you can flip to a replica, that is the easiest way to fix it.

A

--
Andrew Sullivan
a...@crankycanuck.ca
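A sketch of both steps. Superuser access is required, and on some versions changing storage parameters on a system catalog additionally needs allow_system_table_mods, so treat this as an outline to verify against your version rather than a recipe:

```sql
-- Make autovacuum visit pg_attribute much more eagerly than the default.
ALTER TABLE pg_catalog.pg_attribute
    SET (autovacuum_vacuum_scale_factor = 0.01,
         autovacuum_vacuum_threshold = 1000);

-- The one-time compaction: takes an ACCESS EXCLUSIVE lock, i.e. an
-- effective outage while it runs.
VACUUM FULL pg_catalog.pg_attribute;
```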
Re: [GENERAL] Let's Do the CoC Right
On Fri, Jan 22, 2016 at 10:32:10PM -, Greg Sabino Mullane wrote:
> that we do not attempt to "roll our own". Or at the very least, we should
> strive to understand how other communities arrived at their Codes and
> why it is working for them.

This is why I posted all that stuff about what the IETF does some while ago. There is definitely more than one way to do this.

Best regards,

A

--
Andrew Sullivan
a...@crankycanuck.ca
Re: [GENERAL] Auotmated postgres failover
On Thu, Jan 21, 2016 at 11:34:18AM -0800, John R Pierce wrote:
> the most difficult part is reliably determining that A) the master has
> crashed, and B) fencing the failed old master so it doesn't wake up and
> think its still in charge.

And, depending on your workload, C) that you actually want to fail over. I've seen an awful lot of people want automatic failover who also can't afford for the already-committed transactions on the master to be lost. Unless you're running synchronous replication, be sure you have a workload that can actually accept lost writes.

A

--
Andrew Sullivan
a...@crankycanuck.ca
Re: Things to notice (was Re: [GENERAL] Code of Conduct: Is it time?, broken thread I hope)
Hi,

On Mon, Jan 11, 2016 at 10:10:23PM +1300, Gavin Flower wrote:
> The phrase "Mere Male" was title of a column I read in NZ Women's Weekly
> that my mother bought when I was a teenager.

That's nice. I still found it offensive enough in the context to think it worthy of note. (I'm not really one for umbrage-taking, but given the topic I thought it worth calling out.)

> Note that even between England and the USA there is a culture gap.

Indeed, between Canada and the US there's one, too (a gap that I appreciate even more now that I am marooned in New Hampshire). But I think you're missing my point, which is that when one is working on the Internet with an unknown selection of people from widely differing cultures, one needs to be even more sensitive than usual to the possibility of creating a chilly environment. I seem to recall that Josh suggested at the start of this discussion that the lack of a CoC discourages some class of participants. One might wonder whether that is the class one wants, and that decision is certainly past my pay grade. All I was trying to note was that the current conversation about this topic may itself create the very kind of environment people are worried about.

> So my world view might be bigger than yours!

Indeed, it might. And I don't think I was suggesting it was bigger or smaller; there's a reason I elided the attribution, and the "you" in what I wrote was intended in the generic sense. I apologise in case that wasn't clear.

> It is the perceived intention of what one says that is important, not what
> one actually says!

I think that is perhaps a false dichotomy. But I also think I have said enough on this topic, so I shall stop now.

Best regards,

A

--
Andrew Sullivan
a...@crankycanuck.ca
Things to notice (was Re: [GENERAL] Code of Conduct: Is it time?, broken thread I hope)
Someone (never mind who, this isn't intended to be a blame-game message) wrote:
> Am I, as a mere male […] :-)

Even with the smiley, _this_ is the sort of thing that causes discussions to wander into hopeless weeds from which projects cannot emerge. I _know_ it is tempting to make this kind of remark. But it's not cool, it doesn't help, and it is exactly the sort of thing that makes some people think CoCs are needed in the first place.

Suppose you were an uncertain young woman from a culture where men have legal authority over you. Suppose the only interaction with programming peers you get is online. (Yes, I know of at least one such case personally.) This sort of sarcastic remark, smiley or no, causes you a new uncertainty.

Just be sensitive to the fact that the Internet is bigger than your world, however big it is, and things will be better.

I am not a big believer in written-down rules: I think mostly they're a fetishizing of constitutional arrangements like those of the US and Canada (which mostly don't work for those who are not already enfranchised). But we can do something about that by thinking about that possibility much more than we can do something about it by writing down rules. Still, the exercise of writing down rules may help one notice things one wouldn't say to a friend. And I hope we're all friends here.

Best regards,

A

--
Andrew Sullivan
a...@crankycanuck.ca
Re: [GENERAL] Code of Conduct: Is it time?
On Sun, Jan 10, 2016 at 01:44:37PM -0500, Regina Obe wrote: > 1) Are helpful when I ask a question > 2) Stick to the topic > 3) Don't get into petty etiquettes like "Please stop top posting" > and if you really need to - A polite we prefer top posting would do > > 4) Are sensitive to people on other operating systems other than your > preferred. That seems like a pretty good scratch CoC to me. (See my other note about how other communities deal with this.) It's concrete, short, to the point, and a useful thing to point to when some flamewar breaks out over irrelevant stuff. If people want a CoC, I think it should be something like the above. > My other concern about CoCs is I fear someone is going to come and demand > we change Master/Slave to Leader/Follower, because Master is a male term > and Slave is insensitive to grand-children of slaves. If someone did that, it would fall under (2), no? (I note that a recent RFC, of which I am a co-author, about DNS terminology did say that "primary" and "secondary" were to be preferred over "master" and "slave". I didn't personally agree with the claim, but that's what got consensus.) Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
What another group does (was Re: [GENERAL] Code of Conduct: Is it time?)
There is one issue the IETF has that Postgres doesn't really have: some communities (including the IETF) have official community meetings of this or that sort, and there is a different kind of conduct for which one might need a policy when humans physically in the same room are involved. Because Postgres meetings of various kinds are not usually "project" meetings, but rather meetings organized by some group but open to the community, the "meetings" part is not something the Postgres community needs to have consensus about. Instead, those meetings have their own CoC. This seems normal to me: the organizer of a meeting should have such a code, and since the Postgres project is in general not the organizer the project doesn't need to have a set of rules about such meetings. Finally, and separate from all of that, the IETF has a lot of rules around intellectual property. This is mostly because the IETF is a standards organization, and so it publishes documents, and copyright can be tricky under such circumstances. The good news there, of course, is that Postgres doesn't really have this problem either, because of the way a code contribution (which automatically gets the PGDG license) gets distributed. The IETF does have a code about disclosing patent claims, however, and it might be another thing for the Postgres project to think about. We have a lot of these rules, but if you want to have a look at how they work together you should probably start at https://www.ietf.org/about/process-docs.html#rfc.section.2.3. I don't really have an opinion about whether the Postgres project needs a CoC, but I will say that having some of these rules has helped the IETF not be dragged into contentious discussions of acceptable behaviour on some occasions. (Whether someone's behaviour fails to conform to the process documents, of course, causes its own arguments. We have an appeals process for this, which would be hard to graft onto other organizations.)
The other thing I note is that the IETF got most of these documents because someone thought the problem was important enough to write a draft proposal first. As I said in a recent IETF plenary, the organization works partly because at the IETF you don't need anyone's permission to try something; you don't even need forgiveness. The worst that can happen is that people reject the proposal. It always seemed to me that the Postgres project worked in a similar way, so I'd encourage those who think there is a problem to be solved to make a scratch proposal and see whether it flies. It's always easier to discuss a concrete proposal than to try to figure out whether something is a good idea in the abstract. The shorter and easier to understand the proposal is, I think, the more useful it is likely to be. I hope this was useful. If not, please delete and ignore :) Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Session Identifiers
On Sun, Dec 20, 2015 at 11:25:45AM -0600, oleg yusim wrote: > Thanks you very much Melvin, once again, very useful. So, let me see if I > got it right, following configuration should cause my database connection > to terminate in 15 minutes, right? > > tcp_keepalives_idle = 900 > tcp_keepalives_interval=1 > tcp_keepalives_count=3 Only if your psql session ends. Psql is a client program. It keeps its connection to the database alive. In this sense, the vulnerability you're looking at is analogous to the case where someone logs into a UNIX shell and then leaves the shell open. If the system can be compromised such that someone else can get control of that shell, you have a problem. Otherwise, the session can't really be taken over. So, your exposure is exactly as great as the exposure from UNIX process takeover. You can prove to yourself that the process doesn't linger by opening up a TCP connection (or for that matter a UNIX socket connection) and somehow making the containing program fail (e.g. open a psql connection and then sever your connection to the machine that had the shell that initiated the psql session, without properly closing the shell so that the session hangs around). Eventually, the Postgres backend will try to talk to the session and discover it isn't there, and you'll get a termination logged (assuming you have logging turned up that high). A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
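[Editorial sketch] To make the keepalive arithmetic concrete: these settings detect a *dead* peer, not an idle-but-healthy psql, and detection takes roughly idle + interval × count seconds. Using the values from the quoted message, a postgresql.conf fragment would look like:

```
# Detect a dead client after about 900 + 1*3 = 903 seconds (~15 minutes).
tcp_keepalives_idle = 900     # seconds of inactivity before the first probe
tcp_keepalives_interval = 1   # seconds between unanswered probes
tcp_keepalives_count = 3      # unanswered probes before the connection is dropped
```

A healthy client answers the probes, so the connection stays up indefinitely; that is why the settings do not terminate a live session in 15 minutes.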
Re: [GENERAL] Practical Application
On Mon, Dec 14, 2015 at 03:52:41PM +, Gene Poole wrote: > I'm looking for some real world documentation on the use of ora2pg to move > from Oracle DB 11gR2 64-bit to PostgreSQL (i.e. What's the best pgsql version > to use; Is it possible to use a exported copy of the Oracle DB to create a > pgsql image of it; These kinds of questions). > The best version of Postgres to use is the latest stable version, if you can. The second-best version of Postgres to use is the one you already have in production :) If you're just doing development, it might be worth working against the 9.5 beta instead of 9.4.x, depending on your deployment plans. I have used ora2pg, but last time was a couple years ago so my experience is stale. It did a pretty good job for me. You might find that there are some things you end up wanting to modify. What I found worked best for me was to pull the database into a schema to "stage" with and then select from that. (When migrating, I often find it better to look hard at the database schema at the same time. Data types in Oracle and in Postgres are not perfect matches.) Note that even if you end up modifying the underlying "real" schema, you need not necessarily modify your application at the same time. Postgres has updatable views and so on, so from the point of view of the application, you can leave the database unchanged. I like to use this feature, in fact, to do A/B testing on new versions of the application: the old one gets app_schema_a and the new one app_schema_b, so you can roll over gradually and change your schema without a lot of outage hassle and so on. > Does PGSQL function the same as Oracle when using Foreign Keys and Lookup > Tables. Generally, yes. There are some differences. There is an old page at https://wiki.postgresql.org/wiki/Oracle_to_Postgres_Conversion that outlines some of them. I don't think the grammar differences have changed too much. The BLOB discussion is all obsolete. > This is a POC to see if we can cut costs.
I'm sure you can! There are also some firms that can help with migration if you like. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Loggingt psql meta-commands
On Thu, Dec 10, 2015 at 02:52:38PM -0800, John R Pierce wrote: > thats a rather insane bunch of requirements. Reads like a wish list by > academic security researchers. Well, I don't know. Might be a wish list by insurance adjusters who want to minimise liability. At least in the United States of Liability, I think the day is rapidly approaching where people will need insurance against database breaches. Therefore, > https://www.stigviewer.com/stig/database_security_requirements_guide/2015-06-23/finding/V-58123 > > ??!? The database server has no clue about the difference between an > "application that it supports" and a user directly querying. The PSQL > shell, or dbadmin, is an 'application that it supports'. while I agree that there's no way for the RDBMS to tell when it's an attacker mimicking an application's normal connection and query pattern, I think there are some things you could do here that would catch this. For instance, • use strong authentication mechanisms for your clients. Kerberos seems like a good alternative, but TLS (SSL) certificates might do. Log connections and the connection origin. If you get connections for a given user from the wrong place, you know you have a problem. • isolate your users, so that your application (or better, each instance of your application) has an associated user. Your humans are not allowed to log in with this username. Then, queries issued by non-application usernames are your candidate queries. Again, this will not defend against, "Attacker got into my system and subverted the application user." You need to have other lines of defence for that. But depending on your auditor, this might be enough to satisfy the condition. Also, of course, there is the application_name (string) parameter. In principle, you ought to be able to filter on this. Again, won't help you if your application login is somehow compromised. I agree that all of this depends on logging everything and filtering, however.
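[Editorial sketch] The connection-origin and application_name logging described above can be expressed as a postgresql.conf fragment. The parameter names are real settings; the particular prefix format is just one illustrative choice:

```
# Record every connection and disconnection, with enough context to spot
# connections from unexpected hosts or with unexpected application names.
log_connections = on
log_disconnections = on
# %u = user, %d = database, %h = remote host, %a = application_name
log_line_prefix = '%m [%p] user=%u db=%d host=%h app=%a '
```

Filtering the resulting log for application users connecting from the wrong host, or for non-application usernames issuing queries, gives the candidate events discussed above.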
Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] postgres zeroization of dead tuples ? i.e scrubbing dead tuples with sensitive data.
On Wed, Nov 18, 2015 at 04:46:11PM -0500, Melvin Davidson wrote: > I'm still trying to understand why you think someone can access old data but > not current/live data. I don't. It's just another risk. When you're making a list of risks, you need to list them all. It turns out that in Postgres, you have to worry about (1) data that's currently in the database and (2) some data that used to be there but isn't now. > If you encrypt the live data, wouldn't that solve both concerns? I have no idea, because I don't know what the theoretical risk to be mitigated is. It might, sure. The security profiler would still need to add this fact to the list and then ask how the countermeasures mitigate it. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] postgres zeroization of dead tuples ? i.e scrubbing dead tuples with sensitive data.
On Wed, Nov 18, 2015 at 01:38:47PM -0800, Adrian Klaver wrote: > Alright, I was following you up to this. Seems to me deleted data would > represent stale/old data and would be less valuable. If the data that was deleted is sensitive, then the fact that you deleted it but that it didn't actually go away means you can be lulled into complacency about your vulnerability with respect to that data in a way that you're unlikely to be in respect of data you still have (only with new values). Lots of people forget about deleted data once it's deleted. Keep in mind that sometimes people delete data from a system because it's been archived somewhere else or something like that -- not all databases have the totality of all the relevant data in them, but can often represent just "current" data. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] postgres zeroization of dead tuples ? i.e scrubbing dead tuples with sensitive data.
On Wed, Nov 18, 2015 at 03:22:44PM -0500, Tom Lane wrote: > It's quite unclear to me what threat model such a behavior would add > useful protection against. If you had some sort of high-security database and deleted some data from it, it's important for the threat modeller to know whether the data is gone-as-in-overwritten or gone-as-in-marked-free. This is the same reason they want to know whether a deleted file is actually just unlinked on the disk. This doesn't mean one thing is better than another; just that, if you're trying to understand what data could possibly be exfiltrated, you need to know the state of all of it. For realistic cases, I expect that deleted data is usually more important than updated data. But a threat modeller needs to understand all these variables anyway. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Where do I enter commands?
On Sun, Oct 25, 2015 at 07:57:48AM -0700, David Blomstrom wrote: > In phpMyAdmin, I've become accustomed to simply copying existing tables, > then adding, deleting and renaming columns as needed. Oh! Interesting. I suspect you're actually _undermining_ your ability to use the database (because often when you copy a table it's a sign that you're using the database like a spreadsheet, and you're giving up a lot of functionality that way). But, supposing you want to do that, here's the easy way to do it:

-- if you want the same table structure without the data
CREATE TABLE newtable AS SELECT * FROM oldtable WHERE 1=0;

-- if you want the same table with some data
CREATE TABLE newtable AS
  SELECT columns, you, want, in, order
  FROM oldtable [WHERE conditions];

If you want only some columns (or a new column order) but none of the data, use WHERE 1=0 in the latter statement too: that predicate is always false, so it always yields an empty set. Handy trick. > I can see PostgreSQL is going to have a learning curve - hopefully shorter > than the years it took me to learn MySQL - but it looks interesting. It should be much easier. You have the basics from MySQL already. Consistency and rigour are the changes ;-) A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
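[Editorial sketch] One caveat worth adding to the CREATE TABLE AS trick: it copies only column names and types, not defaults, CHECK constraints, or indexes. If you want those copied too, PostgreSQL's LIKE clause is closer to phpMyAdmin's "copy table" operation (table names here are placeholders):

```sql
-- Copies columns plus defaults, CHECK constraints, and indexes; no rows.
CREATE TABLE newtable (LIKE oldtable INCLUDING ALL);

-- Then copy whatever rows you want, if any:
INSERT INTO newtable SELECT * FROM oldtable WHERE some_condition;
```

Note that foreign-key constraints are not copied by LIKE; those have to be re-created explicitly.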
Re: [GENERAL] Where do I enter commands?
On Sat, Oct 24, 2015 at 07:33:15PM -0700, David Blomstrom wrote: > It seems like a command-line tool would be > incredibly tedious when creating tables, modifying them, filling them with > data, etc. For whatever it's worth, I find quite the opposite: once you have the hang of the command line, it is so much more efficient for these things (you can script everything up in your favourite editor) that I find I never go back to the GUI unless I need diagrams and so on. I think the others in the thread are giving you the right instructions, so I've nothing to add on the GUI. But do consider trying out the command line. You'll be surprised at the power you get once the initial learning curve is over. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Very puzzling sort behavior
On Thu, Sep 10, 2015 at 02:54:31PM -0700, Ken Tanzer wrote: > Thanks, but I guess I should have been clearer. Thanks to y'all wonderful > mailing list folks, I get it now as to why the two sorts are not the same. > I'm hoping for practical suggestions or advice about how to get C locale > sorting without having to rewrite all my existing queries. Why not change the collation for the column? http://www.postgresql.org/docs/9.4/interactive/sql-altertable.html > be clinging to futile hope, but is there really no way to specify a > collation for the return value of a function? I don't believe so. I think you need to specify the collation for the data itself. It strikes me that you might be able to simulate this with a materialized view or something like that, which has a different collation than the source table. That seems like it'd be pretty awkward, but if there's some reason you can't use C collation on the source table itself that might work. I guess you could do the same thing with a temporary table inside the function, thereby getting a different collation on the data than in the source table. These both seem like hacks, but if you need a bugfix prior to a real solution they'd give you a path. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
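[Editorial sketch] For concreteness, both the per-column change suggested above and a per-query override look roughly like this (table and column names are hypothetical):

```sql
-- Change the column's collation once, so existing ORDER BY queries sort C-style:
ALTER TABLE clients ALTER COLUMN name TYPE text COLLATE "C";

-- Or override the collation in an individual query, without touching the table:
SELECT name FROM clients ORDER BY name COLLATE "C";
```

The per-query form avoids a table rewrite but requires editing each query, which is exactly what the poster was hoping to avoid; the ALTER TABLE form pays that cost once.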
Re: [GENERAL] Delete rule does not prevent truncate
On Fri, Jul 24, 2015 at 10:24:55AM +0100, Tim Smith wrote: > > If that's how the developers feel about rules, i.e they can't be > bothered to make any changes to the rules code any more, no matter how > minor (e.g. this TRUNCATE issue) Who is this "they"? As I think I suggested in another mail, if you think this is trivial and easy then I think you should propose the patch to solve it. I understand what you're saying; I think the solution is self-evident (add a statement trigger that captures TRUNCATE and DO INSTEAD NOTHING), so I just wouldn't be bothered to fix this. But I suspect things are the way they are partly because nobody proposed or implemented a patch for this behaviour before. The manual is also quite clear in what statements you can write rules about; by implication, other statements are not covered, so I'm not actually sure why you think the manual is misleading. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
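[Editorial sketch] To be precise about the statement-trigger approach mentioned above: a statement-level TRUNCATE trigger (available since 8.4) can only abort the command by raising an error; unlike a DO INSTEAD NOTHING rule, it cannot silently swallow the statement. A sketch, with placeholder names:

```sql
CREATE FUNCTION forbid_truncate() RETURNS trigger AS $$
BEGIN
    RAISE EXCEPTION 'TRUNCATE is not allowed on %', TG_TABLE_NAME;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER no_truncate
    BEFORE TRUNCATE ON protected_table
    FOR EACH STATEMENT EXECUTE PROCEDURE forbid_truncate();
```

After this, TRUNCATE protected_table fails with the exception while DELETE remains subject to whatever rules or triggers already govern it.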
Re: [GENERAL] Delete rule does not prevent truncate
On Thu, Jul 23, 2015 at 12:57:20PM +0100, Tim Smith wrote: > It is important to realize that a rule is really a command transformation > mechanism, or command macro. The transformation happens before the > execution of the command starts. If you actually want an operation that > fires independently for each physical row, you probably want to use a > trigger, not a rule Well, yes, but the discussion of the rules system in earlier manuals was actually, I thought, somewhat more detailed; and it outlined what rules really did, which was alter the command at the parse tree. That's what I think the above is saying also, but it may not be quite as plain. So it's rather more like a statement-level trigger. > Thus, I should not have to use a trigger for TRUNCATE because the "each > row" concept does not apply. Plus it makes perfect sense to want to > transform the truncate command and transform into ignore Well, yes, but really in this case you want a per-statement trigger, and there's not the same distinction in rules, either. I can't believe that people would reject a patch (though you should ask on -hackers, not here); but you asked what was behind the design decision and I told you. But in general, the experience seems to be that triggers are easier to get right (novice or no, _pace_ section 38.7). Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Delete rule does not prevent truncate
On Thu, Jul 23, 2015 at 08:06:19AM +0100, Tim Smith wrote: > What exactly is was the design decision that lead to TRUNCATE being > supported by triggers but not by rules ? There are two things. First, probably the design decision was, "I care about triggers." TRUNCATE was added (I believe) in version 7.0, and even then there was some caution indicated about the use of rules. See for instance http://www.postgresql.org/docs/7.0/static/rules19784.htm. So you might be partly right. But second, it isn't clear what it would mean for TRUNCATE to be supported by rules. Rules do query parse tree rewriting. That is, they rewrite the query on the way through the system before they can possibly have any effect, changing one SQL statement into (effectively) a different one by the time it executes. There is only one possible effect from TRUNCATE, and that is to eliminate all the data in the table. I don't know what rewriting such a query would mean. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Error prone compilation of stored procedure
On Mon, Jul 06, 2015 at 04:52:52AM -0700, pinker wrote: > Another minus is that my colleagues which use to work on oracle think that > postgresql is at least one league below oracle. I find that people who are used to any one tool always point out how some other tool is deficient, even if they're raving about other advantages. This is especially the case when there are trade-offs involved in the way a tool does a thing. My suggestion is to point out that if one hates $feature, one can extend Postgres to make it go away, a capability not available in Oracle at any price. At least, I found that to be useful when talking to Oracle partisans. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] advocating LTS release and feature-train release cycles
On Tue, Jun 02, 2015 at 12:59:14PM +0200, Tomas Vondra wrote: > I disagree. The fact that we have 1 release per year means there's one > deadline, and if you miss it you have to wait another year for the feature > to be available in official release. That's a lot of pressure and > frustration for developers. With more frequent releases, this issue gets > less serious. Of course, it's not a silver bullet (e.g. does not change > review capacity). But it's the second part of this that is the main issue. For the people who are driving features in postgres now are overwhelmingly the most advanced users, who also want rock solid database reliability. After all, the simple use cases (the ones that basically treat the DBMS as an expensive version of a flat filesystem) have been solved for many releases quite well in Postgres. These are the cases that people used to compare with MySQL, and MySQL isn't any better at them any more than Postgres. But Postgres isn't really any better at them than MySQL, either, because the basic development model along those lines is low sophistication and is automatically constrained by round tripping between the application and the database. Anyone who wants to scale for real understands that and has already figured out the abstractions they need. But those are also the people with real data at stake, which is why they picked Postgres as opposed to some eventually-consistent mostly-doesn't-lose-data distributed NoSQL system. The traditional Postgres promise that it never loses your data is important to all those people too. Yet they're pressing for hot new features because it's the nifty database tricks you can do that allow you to continue to build ever-larger database systems. If the model switched to more frequent "feature releases" with less frequent "LTS" releases for stability, one of two things would happen: 1. There'd be pressure to get certain high-value features into the LTS releases. This is in effect the exact issue there is now. 
2. The people who really need high quality and advanced features would all track the latest release anyway, because their risk tolerance is actually higher than they think (or more likely, they're doing the risk calculations wrong). The effect of this would be to put pressure on the intermediate releases for higher quality, which would result in neglect of the quality issues of the LTS anyway. And on top of the above, you'd split the developer community between those working on LTS and those not. Given that the basic problem is "not enough developers to get the quality quite right against the desired features", I don't really see how it helps. As nearly as I can tell, noting that I'm watching almost entirely from the sidelines, what really happened in the case that has everyone worried is that one highly-esteemed developer claimed something and maybe should have relinquished sooner given his workload. That happens; nobody's perfect. It's frustrating, but this is not the only community to have had that issue (cf. Linux kernel, for an approximately infinite series of examples of this). I am not sure that the answer to this is a rejigging of the basic development model. Hard cases make bad law. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] odbc to emulate mysql for end programs
On Tue, Jun 02, 2015 at 01:31:55PM +0200, Thomas Kellerer wrote: > > 2) as program double-quotes the schema,table and column names. > > Don't use quoted identifiers. Neither in Postgres nor in MySQL (or any other > DBMS) I think a better rule of thumb is either always to use them (and spell everything correctly) or never to use them. Where you get in trouble is the case where sometimes identifiers are quoted and sometimes not. (I find the unquoted use more convenient, and I think it's subject to fewer surprises like overloaded identifiers where one has an uppercase in it; but I think that's a matter of taste, and if your system framework quotes for you automatically then you have no choice but to stick with that convention always and everywhere.) This isn't really any different from any other development rule. For instance, in some environments there are various rules about single and double quoting. If you have no conventions imposed across all your developers about when you use which, pretty soon you'll have an unmaintainable mess. And everyone has their favourite story of frustration about indentation style or variable naming convention. This case is no different. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
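[Editorial sketch] The classic surprise with mixed quoting, for the record: unquoted identifiers fold to lowercase in PostgreSQL, so the two styles can name different objects:

```sql
CREATE TABLE "MyTable" (id int);  -- quoted: the name is stored as MyTable

SELECT * FROM MyTable;    -- unquoted: folds to mytable, which does not exist
SELECT * FROM "MyTable";  -- works: matches the stored mixed-case name
```

This is why "always quote" and "never quote" are both workable conventions, while mixing them produces exactly the kind of schema/table/column lookup failures described above.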
Please not on this list (was Re: [GENERAL] Fwd: [pgsql-jobs] PostreSQL Engineer and DBA! Atlanta, GA)
On Wed, May 06, 2015 at 08:29:06PM +0530, Sujit K M wrote: > experts on this. I find this particular job ad to be very offensive in > terms of an expectancy from > any person looking for job. In that case, I suggest you not apply for the job. I can't see how taking up the community's time with crowdsourced editing of job postings is in any way appropriate for the pgsql-general list. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] database migration question between different ubuntus and different postgresql server versions
On Thu, Apr 16, 2015 at 06:14:20PM -0400, Octavi Fors wrote: > at first glance, option 1) seems to me simpler. But does it guarantee > server version upgrade compatibility? Yes. Use the pg_dump from the later postgres, which can read old versions and generate any output needed for the new version. It's just like any other pg_dump otherwise. > Could you/someone please provide an example of commands which I could use? Usually:

pg_dump [connection options] databasename | psql [connection options] databasename

For instance, if you wanted, from the new machine, to dump egdb from the old machine and restore it locally, you could do:

pg_dump -U postgres -h 192.0.2.1 -C egdb | psql -U postgres

I recommend reading the pg_dump (and if you like, pg_dumpall) manuals before proceeding. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] database migration question between different ubuntus and different postgresql server versions
On Thu, Apr 16, 2015 at 02:38:56PM -0700, Adrian Klaver wrote: > Well it is an upgrade from one major version to another, so you have the > following options using Postgres core utilities: And, if you don't want to use core utilities, you can use one of the trigger-based replication systems to move all the data from one to the other. Depending on your uptime requirements and the size of the database, this approach can either be a life saver or a total waste of time and will to live. More often the latter, please be aware. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] 9.4+ partial log-shipping possible?
On Thu, Mar 26, 2015 at 04:32:00PM +0100, Sven Geggus wrote: > > We need to somehow save the state of the osm database to be able to apply > future changes. This is currently done using a few tables in the target > database. However these tables are not needed for map rendering and are > consuming by far the most disk space (still somewhat expensive on SSD)! This doesn't actually solve your problem, but you could mitigate the cost by putting those tables on spinning-rust disks using tablespaces or symlinks or whatever. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Reg: PostgreSQL Server "base/data" recovery
On Thu, Mar 19, 2015 at 07:02:28PM +0900, Prajilal KP wrote: > > When i see check the this file, the file itself exists but the size is "0" > byte. That suggests you have data corruption, and that you need to restore from backup. > The server is writing the whole log in to the mounted network storage, NFS. There are reasons that people get nervous about databases on NFS. Are you ensuring that Postgres fsync() calls (like when COMMIT happens) are not being handled asynchronously? Also, a trivial scan of the release notes in the 9.0.x series shows a number of data corruption fixes since 9.0.4. You should always try to stay on the latest minor release of your version of Postgres. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Basic Question on Point In Time Recovery
On Wed, Mar 11, 2015 at 09:40:09AM -0400, Joseph Kregloh wrote: > Have you looked into Barman? http://www.pgbarman.org/ It does what you > want. You can take a full daily backup and it keeps track of the WAL files > to allow for a PITR. I just had a look at the documentation (and the rest of your mail), and this doesn't actually seem to do what the OP wanted, which is to get PITR _per database_ in the same cluster. Upthread someone suggested a way around this, which is to PITR a cluster to a known-good point and then pg_dump the target database. But if Barman can do this automatically, that'd be cool (it's just not in the docs). Barman does look like a nice convenience package for managing WAL-shipping type backup installations instead of building one's own scripts, so this note isn't intended as a criticism of the package. I'm just not sure it does the thing requested in this case. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
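[Editorial sketch] The upthread workaround (restore the whole cluster to a known-good point, then dump just the one database) looks roughly like this on a 9.x-era cluster. Paths, the timestamp, database name, and hostnames are placeholders; recovery.conf is the pre-12 mechanism:

```
# recovery.conf in the data directory of a throwaway restore of the cluster,
# built from a base backup plus archived WAL:
restore_command = 'cp /archive/%f %p'
recovery_target_time = '2015-03-10 12:00:00'

# Once the scratch cluster has recovered to that point, dump only the one
# client database and load it back into production:
#   pg_dump -h scratch-host clientdb | psql -h prod-host clientdb
```

This gives per-database PITR in effect, at the cost of restoring (and having disk for) an entire scratch copy of the cluster.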
Re: [GENERAL] Basic Question on Point In Time Recovery
On Wed, Mar 11, 2015 at 10:54:56AM +0000, Robert Inder wrote: > But, at least while the system is under rapid development, we also > want to have a way to roll a particular client's database back to a > (recent) "known good" state, but without affecting any other client. > > My understanding is that the WAL files mechanism is installation-wide > -- it will affect all clients alike. Right. It's the WAL, so everything in the relevant Postgres system is involved. > So before we go down that route, I'd like to check that we're not > doing something dopey. No, frequent pg_dumps are indeed hard on I/O. > Is our current "frequent pg_dump" approach a sensible way to go about > things. Or are we missing something? Is there some other way to > restore one database without affecting the others? Slony-I, which is a PITA to administer, has a mode where you can ship logs off and restore them in pieces. The logs are not WAL, but Slony logs (which are produced by triggers and some explicit event writing for schema changes). So they work per-database. Schema changes are really quite involved for Slony, there's overhead resulting from the triggers, and as I said it's rather clunky to administer. But it's been around for some time, it is still actively maintained, and it has this functionality. The PITR tools were, last I checked, pretty primitive. But the tool might work for your case. I don't know whether Bucardo or Londiste (two alternative systems that work on roughly the same principle) have this functionality, but I kind of doubt it since both were designed to get rid of several of the complexities that Slony presented. (Slony had all those complexities because it was trying to offer all this functionality at once.) Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] express composite type literal as text
On Sun, Feb 22, 2015 at 12:18:21PM -0800, Eric Hanson wrote: > Got it. Ok, I'm reporting this as a bug. Is this a bug? Being able to > always express literals as text is a really valuable assumption to be able > to rely on. If I had to guess (I guess someone more authoritative than I will chime in), I suspect this is a side effect of the change a few years ago that removed a very large number of automatic typecasts. Experience with those automatic typecasts revealed that while they were handy lots of the time, when they failed they did really bad things. So the developers dramatically reduced the number of such cases. Some convenience was lost (I still get tripped up from time to time, but I'm not doing Pg work every day), but the overall reliability of things was increased. So I'd say it's probably not a bug. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] #Personal#: Reg: Multiple queries in a transaction
On Thu, Feb 19, 2015 at 11:11:59AM +0530, Medhavi Mahansaria wrote: > > But savepoint concept will not work for me as desired. I don't see why not. > > Case 1: When Q2 fails (we delete the error), i want to continue to Q3 and > commit changes done by Q1 and Q3 once Q3 has executed successfully. > So, Q1; SAVEPOINT foo; Q2; if error then ROLLBACK TO SAVEPOINT FOO; Q3; COMMIT or ROLLBACK; else COMMIT; > Case 2: When Q2 fails, I want it to throw an error. and rollback the changes > made by Q1 and not proceed to Q3 at all. > Q1; SAVEPOINT foo; Q2; if error then ROLLBACK; These both work. The problem is, I think, that you have different rules for "when Q2 fails", and without knowing your exact circumstances I suspect we can't say much more. Indeed, however, it sounds to me like you think these are in the same workflow, but they're not. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
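To make the two flows concrete, here is a minimal sketch of the savepoint pattern described above; the statements stand in for Q1/Q2/Q3, and the table and column names are hypothetical:

```sql
BEGIN;
UPDATE t1 SET x = 1 WHERE id = 10;      -- Q1

SAVEPOINT foo;
UPDATE t2 SET y = 2 WHERE id = 20;      -- Q2
-- Case 1: on error in Q2, discard only Q2's work and carry on to Q3:
--   ROLLBACK TO SAVEPOINT foo;
-- Case 2: on error in Q2, abandon Q1's work as well and stop:
--   ROLLBACK;

INSERT INTO t3 (z) VALUES (3);          -- Q3
COMMIT;
```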
Re: [GENERAL] Hostnames, IDNs, Punycode and Unicode Case Folding
On Tue, Dec 30, 2014 at 12:53:42AM +0000, Mike Cardwell wrote: > > Hmm. How did you get the original, then? > > The "original" in my case, is the hostname which the end user supplied. > Essentially, when I display it back to them, I want to make sure it is > displayed the same way that it was when they originally submitted it. Ah. This gets even better™ for you, then, because whereas in IDNA2003 you can pass it an old fashioned LDH name (letter, digit, hyphen), IDNA2008 treats those as _outside_ the spec. So basically, you first have to get a label and determine whether it is LDH or not (you can do this by checking for any octet outside the LDH range) and then you can decide which way to process it. In IDNA2003, the punycode output from an LDH label turns out always to be the LDH label. The reason for this is that you're supposed to validate that a U-label is really a U-label before registering it in IDNA2008, and lots of perfectly good LDH labels (like EXAMPLE) are not valid under IDNA2008 because of upper case. (If by now you think that maybe it's time for this DNS thing to get replaced, you have company.) > I was unaware of the different versions of IDNA. I basically started using > the Perl module IDNA::Punycode in my project and assumed that this was the > only type. Seems like I need to do some more reading. Yeah, this is all made much harder by the fact that several IDN libraries still do 2003. Here is one that many people are using for IDNA2008: <https://gitorious.org/libidn2/libidn2/source/0d6b5c0a9f1e4a9742c5ce32b6241afb4910cae1:> It's GPLv3, though, which brings its own issues. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Hostnames, IDNs, Punycode and Unicode Case Folding
On Tue, Dec 30, 2014 at 12:18:58AM +, Mike Cardwell wrote: > > This is exactly the same method that we commonly use for performing case > insensitive text searches using lower() indexes. Hmm. How did you get the original, then? If you have the original Unicode version, why don't you switch to IDNA2008 publication rules, which are way more reliable? In that case, you do have a 1:1 lookup and you shouldn't have a problem. If you need variants, then you have a different problem, but that actually can be specified for the much narrower range of UTF-8 permissible under IDNA2008. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Hostnames, IDNs, Punycode and Unicode Case Folding
e IDNA2008 A-label. The reason I hate this is because the lookup is insanely complicated. > It seems to me that Postgres would benefit from a native hostname type and/or > a pair of punycode encode/decode functions. A pair won't work. > And perhaps even a simple unicode case folding function. Unicode case folding is _way_ more complicated than you seem to be thinking here, and importantly has some nasty edge conditions. For instance, the natural uppercase of the lowercase sharp s, ß, that we've been talking about now turns out to be capital sharp S, ẞ (that's U+1E9E in case you can't see it). That is not, however, the uppercase, because the case folding rule in earlier versions of Unicode (which didn't have U+1E9E) was SS, and the stability rules require that things not break across versions. (There are other problems like this. For instance, the upper case of é in French is officially E, and in Québecois is officially É. And then there's the Turkic dotless-i and dotted-i rules.) To do case folding really according to what people expect, you need to be locale sensitive. Since the DNS has no locale information, we couldn't do that in IDNA, so the answer the first time was naïve case folding (along the lines of the Unicode standard caseFold file), and the second time to leave the case folding to the user agent, on the principle that it has a hope of knowing the locale. > With the end result that these return TRUE: > > unicode_case_fold('ß') = 'ss' But that's false. What's really going on there is that the Unicode case fold of ß is SS, and that case folded again is ss. > A native type would also be able to apply suitable constraints, e.g a maximum > length of 253 octets on a punycode-encoded trailing-dot-excluded hostname, a > limit of 1-63 octets on a punycode encoded label, no leading or trailing > hyphens > on a label, etc. You seem to want a bunch of label constraints, not all of which are related to IDNA. 
I think it would be better to break these up into a small number of functions. As it happens, I have a colleague at Dyn who I think has some need of some of this too, and so it might be worth spinning up a small project to try to get generic functions: to_idna2003, to_idna2008, check_ldh, split_labels, and so on. If this seems possibly interesting for collaboration, let me know & I'll try to put together the relevant people. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Hostnames, IDNs, Punycode and Unicode Case Folding
On Mon, Dec 29, 2014 at 11:50:54PM +0000, Mike Cardwell wrote: > > CREATE UNIQUE INDEX hostnames_hostname_key ON hostnames > (lower(punycode_encode(hostname))); This wouldn't work to get the original back if you have any IDNA2003 data, because punycode-encoding the UTF-8 under IDNA2003 and then punycode-decoding it doesn't always result in the same label. See my other message. Did I mention that IDNA is a mess? A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Strange behavior in generate_series(date, date, interval) with DST
On Sun, Dec 07, 2014 at 08:25:48PM -0200, Sérgio Saquetim wrote: > > I wasn't paying attention to the fact that generate_series really expects > for timezone inputs. So when I was passing the upper bound > as '2014-10-20'::DATE, the value was being cast to 2014-10-20 00:00:00-02. > > postgres=# SELECT '2014-10-20'::TIMESTAMPTZ; > timestamptz > > 2014-10-20 00:00:00-02 > (1 row) […] > Using a larger upper bound solved my problem. As a more general lesson, I'd suggest that when you're working with dates your best bet is to do it with your time zone as UTC. If you then want to format the output in the local time zone, you can do that in the outer SELECT with AT TIME ZONE. This isn't because Postgres is going to get this wrong, but because it's far too easy to confuse yourself with those time changes. It makes debugging easier, particularly because the time change only happens twice a year so nobody _ever_ thinks of it when troubleshooting. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
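A sketch of the suggestion above, with 'America/Sao_Paulo' standing in for whatever the local zone is:

```sql
-- generate the series with the session pinned to UTC ...
SET TIME ZONE 'UTC';

-- ... and convert to the local zone only when formatting the output
SELECT d AT TIME ZONE 'America/Sao_Paulo' AS local_time
FROM generate_series('2014-10-18 00:00'::timestamptz,
                     '2014-10-21 00:00'::timestamptz,
                     '1 day') AS s(d);
```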
Re: [GENERAL] Mimicking Oracle SYSDATE
On Tue, Aug 19, 2014 at 07:41:00PM +0530, Sameer Thakur wrote: > We are thinking of building our own version of Oracle's sysdate, in > the form of PostgreSQL extension. I thought that was the point of the statement_timestamp() function? A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
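The built-in timestamp functions differ only in granularity, which is usually the point of wanting a SYSDATE work-alike; a quick comparison:

```sql
SELECT now();                  -- start of the current transaction
SELECT statement_timestamp();  -- start of the current statement
SELECT clock_timestamp();      -- actual wall-clock time, advances even
                               -- within a single statement
```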
Re: [GENERAL] check database integrity
On Sun, Jul 20, 2014 at 02:57:20PM +0200, Torsten Förtsch wrote: > I ran this query in a separate transaction. The memory was freed only > when the backend process exited. > > Is there a way to work around this memory leak? Why do you think it's a memory leak? You asked for the full dataset; you need to have the memory to allocate for it. At least, that's what I'd expect. > expensive locks. Then I could also use it in production. But currently I > need it only to verify a backup. If you need to verify a backup, why isn't pg_dump acceptable? Or is it that you are somehow trying to prove that what you have on the target (backup) machine is in fact production-ready? I guess I don't really understand what you are trying to do. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Alternative to psql -c ?
On Wed, Jun 25, 2014 at 03:37:11PM +0100, James Le Cuirot wrote: > Sorry, you're missing the point. I'm trying not to alter the existing > behaviour of the Chef database cookbook Ah, got it. Sorry, I'm clueless. No, I don't think I have a suggestion, then. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Alternative to psql -c ?
On Wed, Jun 25, 2014 at 03:16:19PM +0100, James Le Cuirot wrote: > Same problem as stdin, the transactional behaviour is different. There > is the --single-transaction option but as the man page says... > > "If the script itself uses BEGIN, COMMIT, or ROLLBACK, this option will > not have the desired effects." Hmm. I've _used_ transactions in such files, I'm pretty sure. You don't need the --single-transaction setting for this, just do the BEGIN; and COMMIT; yourself. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
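For example, a script run with `psql -f script.sql` (without --single-transaction) can manage its own transaction boundaries; the file name and statements here are hypothetical:

```sql
-- script.sql
BEGIN;
UPDATE widgets SET price = price * 1.1;
DELETE FROM widgets WHERE discontinued;
COMMIT;
```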
Re: [GENERAL] Replacing a production db
On Wed, Jun 18, 2014 at 05:05:22PM -0300, Nicolás Lichtmaier wrote: > Where I work we have a scheme of pushing code+db releases to "testing" and > then to "production". Most of our databases use MySQL and I was told they > can just rename the db and it works. We are adopting PostgreSQL for some > new developments and we'd like to do something similar. I've tried loading > the dump in a single transaction, but that has many problems as the > database is fairly big for that (some GBs). Is there a trick I'm missing > here? I guess I don't understand why you'd need to rename the database. What is the problem you're trying to solve? Is the idea that you have changes in the schema in the new database that are not reflected in the old database? If so, what do you do with all the data? Doesn't that have to be in the new schema somehow? One thing you can do with PostgreSQL that you can't do with MySQL is change the schema in a transaction. So you could make the schema changes that way. If the idea is instead to run two schemas in parallel (so that you can have old and new versions of the application running at the same time), what you really want to do is use the schema (or namespace) support in Postgres. Be careful with this, however, as it is easy to make a system so convoluted that nobody can understand it. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
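For what it's worth, a MySQL-style database rename is possible in Postgres too, and transactional DDL covers the in-place case; database and table names below are hypothetical:

```sql
-- MySQL-style swap: works only while no sessions are connected
-- to either database
ALTER DATABASE production RENAME TO production_old;
ALTER DATABASE testing    RENAME TO production;

-- often a better fit: DDL is transactional in Postgres, so a schema
-- change can be applied atomically to the live database instead
BEGIN;
ALTER TABLE orders ADD COLUMN promo_code text;
COMMIT;
```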
Re: [GENERAL] Trigger to convert UNIX time to timestamp without time zone.
On Fri, Jun 06, 2014 at 02:19:50PM +0100, Alberto Olivares wrote: > How can I create a trigger to transform the time from UNIX to timestamp > without time zone every time a new record is inserted into my database? This is in the manual, section 9.8: to_timestamp(double precision) It's always a little confusing when you go to look it up, because it's not with the date and time functions; it's actually treated as a formatting issue. (There's a cross reference, but if you don't know this is just a formatting issue you won't know to follow the reference.) You probably don't need a trigger, just put that in your query. Are you sure you want this without time zone? In my experience, almost every time people think they want "without time zone" they actually don't. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
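A sketch of the no-trigger approach; the events and staging_events tables are hypothetical:

```sql
SELECT to_timestamp(1402060790);             -- yields timestamp with time zone
SELECT to_timestamp(1402060790)::timestamp;  -- cast if you really want it
                                             -- without time zone

-- convert in the INSERT itself instead of a trigger:
INSERT INTO events (created_at)
SELECT to_timestamp(raw_epoch) FROM staging_events;
```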
Re: [GENERAL] openssl heartbleed
On Wed, Apr 09, 2014 at 11:54:43AM -0400, "Gabriel E. Sánchez Martínez" wrote: > self-signed. In light of the heartbleed bug, should we create a new > server certificate and replace all client certificates? My guess is > yes. This depends mostly on what version of openssl you were actually using. If it were me, I'd say yes. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Postgres as In-Memory Database?
On Tue, Apr 08, 2014 at 12:46:37AM +0200, Stefan Keller wrote: > Hi Andrew > > 2014-04-07 23:37 GMT+02:00 Andrew Sullivan wrote: > > (1) this has been discussed many times in the past (...) > > Can you point me to one of these discussions? > > Actually, I browsed once again the mailing list and this is one of the few > posts I found: > "In-Memory Columnar Store" 9.12.13 by knizhnik. I think you have to go back in time further than that. I recall in-memory table pinning being a recurrent topic during the 8.x series. I also seem to recall it being mostly on the hackers list. > I said, that I'd like to discuss things before I code. Ok. I think in the past what has been successful is some basic design combined with a POC or some such, generally discussed on -hackers since that's where all the people who really know the back end hang out. It seems to me that most of the "built in replication" stuff that ended up happening worked that way, and that seems to be a roughly similar size of work to this, but I haven't thought about it too much. > To me it's unclear why design of Postgres should prevent implementation of > "in-memory tables" e.g. as foreign data wrappers (see e.g. white papers > for SQL Server mentioned before). I don't think it does. But new code in the back end isn't free: it presents a future maintenance burden that others may not be willing to pay. These things always have to be traded off. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Postgres as In-Memory Database?
On Mon, Apr 07, 2014 at 10:43:58PM +0200, Stefan Keller wrote: > running out-of-memory [1][2] - i.e. pretty much what has been discussed > here - although little bit reluctantly :-) It is just possible that some of the reluctance is because (1) this has been discussed many times in the past, partly with the arguments you've already seen in this thread, and with much the same results; (2) nobody seems to be saying, "I have $n to spend on this effort and $thesepatches to contribute towards this end along with $thisdesign," but instead to be saying, "It'd be nice if someone else did this work;" and (3) there _are_ several in-memory-only databases on the market, including free-software ones, so it isn't clear what Postgres would contribute, especially since its basic design isn't obviously amenable to this sort of use. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Initial queries of day slow
On Mon, Apr 07, 2014 at 10:32:59AM +0100, Rebecca Clarke wrote: > normally execute promptly, are taking a long time when they are executed > first thing in the morning (after the database has been inactive for > several hours). After the first execution, everything is back to normal. Just guessing, but perhaps because your system's disk buffers have all been blown away, so things that are normally in memory aren't any more. In particular, > A while back I turned autovacuum off and now instead I run a daily cron at > 3am that executes a script which does a VACUUM ANALYZE on each table. this goes through every table in the database, and probably not in the order such that the most-frequently-used tables are last in the set. But also, why did you turn off autovacuum? In the earliest implementations of autovacuum that was sometimes worth doing for very specific workloads, but in more recent releases (9.1.x certainly qualifies) you are much better to tune autovacuum. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
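Rather than a nightly whole-database VACUUM, autovacuum can be re-enabled and then tuned per table; the table name and thresholds below are purely illustrative:

```sql
-- make autovacuum visit this busy table more aggressively than the default
ALTER TABLE busy_table SET (
    autovacuum_vacuum_scale_factor  = 0.05,  -- default is 0.2
    autovacuum_analyze_scale_factor = 0.02   -- default is 0.1
);
```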
Re: [GENERAL] Pagination count strategies
On Thu, Apr 03, 2014 at 12:44:23PM -0300, Leonardo M. Ramé wrote: > Sounds nice, is it possible to modify my "count(*) over()" to what you > suggest?. I think the window_definition inside over() can contain a LIMIT, can't it? I didn't check just now, but I can't think any reason why not. (ISTR when I did this in the past we didn't have window functions, so I simulated it another way.) A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Pagination count strategies
On Thu, Apr 03, 2014 at 10:34:32AM -0300, Leonardo M. Ramé wrote: > > What strategy for showing the total number of records returned do you > recommend?. The best answer for this I've ever seen is to limit the number of rows you're counting (at least at first) to some reasonably small number -- say 5000. This is usually reasonably fast for a well-indexed query, and your pagination can say something like "First n of at least 5000 results", unless you have fewer than 5000 results, in which case you know the number (and the count returned quickly anyway). As you're displaying those first 5000 results, you can work in the background getting a more accurate number. This is more work for your application, but it provides a much better user experience (and you can delay getting the detailed number until the user pages through to the second page of results, so you don't count everything needlessly in case the user just uses the first page, which IME happens a lot). Note that even Google doesn't give you an accurate number -- they just say "about ten trillion" or whatever. Hope that's useful, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
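One way to sketch the capped count (the table and filter are hypothetical):

```sql
-- exact count up to the cap; if it comes back as 5000, display
-- "at least 5000 results" and refine in the background
SELECT count(*) AS capped_count
FROM (SELECT 1 FROM items WHERE category_id = 42 LIMIT 5000) AS t;
```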
Re: [GENERAL] Insert zero to auto increment serial column
On Wed, Apr 02, 2014 at 08:17:12AM -0400, loc wrote: > value, MySQL also works this way. With PostgreSQL I will need to do a lot > of code modification to my Aubit4GL programs, since I will need to either > insert with the key word default or omit the serial column in the insert > statement. Why don't you put a DO INSTEAD trigger or rule (I'd suggest the former) when you put in a 0 to do nextval() instead on the sequence for the column? Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
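A BEFORE INSERT trigger along those lines might look like this; the table, column, and sequence names are hypothetical:

```sql
CREATE FUNCTION zero_means_default() RETURNS trigger AS $$
BEGIN
    IF NEW.id = 0 THEN          -- treat 0 as "give me the next serial value"
        NEW.id := nextval('mytable_id_seq');
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mytable_zero_id
    BEFORE INSERT ON mytable
    FOR EACH ROW EXECUTE PROCEDURE zero_means_default();
```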
Re: [GENERAL] simple update query stuck
On Tue, Apr 01, 2014 at 07:00:16PM -0400, Tom Lane wrote: > one of the clients, in a way that isn't visible to the deadlock detector. > One way for that to happen without any external interconnections is if the > client is waiting for a NOTIFY that will never arrive because the would-be > sender is blocked. I bet the case I was thinking of was the NOTIFY example. That never occurred to me as an explanation, but now that you mention it, it seems quite likely to me. More generally (and for the OP's problem), my experience is that lots of updates against the same rows in an unpredictable order is an excellent way to run into trouble, and long-running transactions are a major source of these problems. Without a more detailed report about what is going on in the present case, I don't think it's going to be possible to diagnose better than has been done. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] simple update query stuck
On Tue, Apr 01, 2014 at 01:37:17PM -0700, Si Chen wrote: > You are right. That was the problem. I tried the query from > http://wiki.postgresql.org/wiki/Lock_Monitoring and found a COMMIT > transaction that was blocking it. > > I restarted postgresql again, and (it seems) everything went back to > normal. Was there another way to unlock the table then? Probably you could have killed one of the queries. But it sounds like what's happening is that you have multiple queries that are all trying to update the same rows in a different order. It may be that none of these is strictly deadlocked, in that no query is waiting on a lock that another query has, but rather is waiting on a lock that another query will release only when _it_ gets a lock that another query has and so on. (Maybe things have gotten better, but in my experience it's possible to set up a chain of locks such that it doesn't look like a deadlock to the detector, but the lock chain is such that no query will ever be able to release.) I suspect you need to get your locks in a consistent order or you'll continue to have this problem. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
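One common way to impose a consistent order is to lock the target rows explicitly, sorted by key, before updating them; the table here is hypothetical:

```sql
BEGIN;
-- every writer locks rows in ascending id order, so no two transactions
-- can acquire the same pair of rows in opposite orders
SELECT 1 FROM accounts WHERE id IN (7, 13, 42)
ORDER BY id FOR UPDATE;

UPDATE accounts SET balance = balance + 1 WHERE id IN (7, 13, 42);
COMMIT;
```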
Re: [GENERAL] Getting sequence-generated IDs from multiple row insert
On Mon, Mar 31, 2014 at 01:34:04PM -0700, David Johnston wrote: > > If order is an implicit property of the source data then you need to > explicitly encode that order during (or before) import. Sure, but the problem the OP had I thought was that the RETURNING clause doesn't guarantee that the rows coming back are in the order they were inserted. This is just a SQL thing. (I guess you could ORDER BY the RETURNING clause, right?) > There are numerous > ways to implement such but except for extremely simple cases PostgreSQL will > not do the appropriate thing automatically in the face of concurrency. It _is_ doing the appropriate thing, though: this is SQL. The rows aren't ordered unless you tell them to be. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Getting sequence-generated IDs from multiple row insert
On Mon, Mar 31, 2014 at 03:28:14PM -0400, Ben Hoyt wrote: > , but I've just hit a case where two sessions each doing a > multiple insert don't use sequential IDs. For example, the range code above > for the first insert gave 2117552...2117829. And the second insert gave > 2117625...2117818. Which are obviously overlapping and is a nasty bug > waiting to happen. Thankfully it caused an IntegrityError further down in > my code so I didn't screw things up. Good thing you caught it. But yes, just from the description it looked like an obvious race to me. Concurrency is hard. > Second, what's the right thing to do here? The first thing I found was > PostgreSQL's "RETURNING" clause, but somewhat frustratingly for this use > case, even that's not guaranteed to return the results in the order you > specified. In SQL, _nothing_ is guaranteed to return in the order you specified. This isn't really a Postgres thing; unless you use ORDER BY, SQL's sets are not ordered. > I need the IDs in insertion order so I can do further processing. This sets off alarm bells for me. What further processing are you doing? Is it possible that you could move that into a single step in the database (maybe with a function or even a trigger) so that the result of your RETURNING really would provide you with what you need? > So currently I've changed my code to use RETURNING and then I'm ordering > the results based on a secondary column that I know the order of. This > works, but seems clunky, so I'm wondering if there's a nicer way. This is probably what I'd do, assuming that "further processing" isn't more data transformation. If it _is_, then I'd do the whole thing in a single step (in the database, once I inserted). A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
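The "order by a secondary column you know" approach can at least be kept inside one statement by wrapping the INSERT in a data-modifying CTE (available from 9.1; names here are hypothetical):

```sql
WITH ins AS (
    INSERT INTO items (seq, name)
    VALUES (1, 'a'), (2, 'b'), (3, 'c')
    RETURNING id, seq
)
SELECT id FROM ins ORDER BY seq;   -- ids in insertion order
```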
Re: [GENERAL] Synchronizing a table that is two different databases : Need to dump a table a insert from db1 and change the insert statements into UPDATE statements
Is there data in db2 that is not in db1, and ought not to be? If not, then I suggest dumping the table from db1, loading it into a different schema in db2, then moving the table in db2 out of the way and the new-table into place, all in one transaction. If you have data in db2 that is _not_ in db1 but that should be there, then the easier way is likely again to load up the table from db1 into a different schema, then perform updates and inserts as needed on the db2 table you're trying to modify. Finally, and I suppose this is obvious, if there are a lot of records involved you may want to do this in hunks so that you can perform the necessary vacuums and so on on the target table, or else have a very long-running transaction. A On Thu, Mar 27, 2014 at 12:16:32PM +0200, Khangelani Gama wrote: > Hi all > > > > Synchronizing a *table* that is in two different databases(e.g *db1 and db2*). > > > > > Please help me with this. I need to dump a table as INSERTS from db1 > (postgres 9.1.2) and change the INSERT statements into UPDATE statements in > order to apply that change in *db2(postgres 8.3.0.112)* which has the same > *table* as *db1*. Where the record does not exist I need to insert that > record. There is more than 1000 INSERTS I need to convert to UPDATES for > the same table. Please help . > > > > *Example:* > > > > Below it's the INSERT from postgres 9.1.2 DATABASE which should convert to > UPDATE statement. 
> > > > INSERT INTO* table* (br_cde, br_desc, br_active, br_closed, grp_cde, > ctry_cde, state_cde, br_addr_line1, br_addr_line2, br_addr_line3, > br_addr_line4, br_addr_cde, br_post_addr_line1, br_post_addr_line2, > br_post_addr_line3, br_post_addr_line4, br_post_addr_cde, br_phone, > tax_cde, br_server_name, br_lang_ind, bureau_cde, br_risk_typ_cde, > br_access_ccd, br_access_bureau, br_network_active, br_desc_short, > br_is_trading, br_is_test, br_is_nomodem, br_is_whs, br_is_merch, > br_is_cco, br_is_central, br_is_merch_active, br_central_brn, br_merch_brn, > br_cco_brn, br_cgo_brn, br_cluster_brn, br_contact, br_tollfree, br_fax, > br_telex, br_email, ops_cde, ops_director, br_cmpy_reg, br_tax_reg, > audit_id, comp_cde, br_batch, br_release_pilot_type, br_is_order_active, > bnk_id, br_class_cde, br_erp_xref, br_to_open, br_to_close, br_comments, > br_is_hire, br_is_voucher, br_is_clear_center, br_is_headoffice, > br_is_nsp_active, usr_pass_history, br_network_protocol, > br_gen_ins_central_brn, br_cpi_ins_central_brn, br_rsc_area, br_rsc_perc, > br_switch_deduc, br_tvlic_dealer_no, br_cco_followup, br_triad_live, > br_do_cust_locking, br_is_rainbow, br_nca_live, br_ncr_reg_no, prov_cde, > br_is_data_archive, br_icu_brn, br_living_expenses_live, br_ab_dorder_live, > br_ab_receipts_active, br_spiv_max_alloc_perc, br_merger_brn, > br_dep_bnk_acc, br_merchant_number, br_goods_ins_live, br_cgs_connection, > whs_cde, br_crm_brn, subscription_active, br_prn_doc_default_active, > br_use_jms, br_sso_active, br_paym8_properties, creditors_supp_no_prefix, > br_block_debtors_obo, ccms_branch, br_is_ccms_live, ceres_conv_flag, > cims_branch, br_is_cims_live, br_accept_sql_releases) VALUES ('9940', > 'ELLERINES CENTRAL - 9940', true, NULL, '1', 'BAF', 'BAF', '209 SMIT > STREET', 'BRA', 'Jgf', 'Jrg', '3025', '209 SMIT STREET', 'BRA', 'Jrg', > NULL, '2017', '(4562) 712 1300', '1', 'dfgike.com', 'ENG', 'ITC', 'L', > false, false, false, 'BATCH - 9940', false, false, false, 
false, false, > false, true, false, '9940', NULL, NULL, '10.52.1.31', '9940', 'DOUG', NULL, > '(4562) 712 1300' ', NULL, NULL, '001', NULL, '1969/02687/07', NULL, 0, > NULL, '9940', NULL, false, NULL, NULL, NULL, NULL, NULL, NULL, false, > false, false, false, false, 30, 'WS2~WS2', '1002', '1002', NULL, NULL, > false, NULL, NULL, false, true, false, true, 'NC', NULL, true, NULL, true, > true, NULL, NULL, NULL, NULL, NULL, true, 'oracle_live', NULL, NULL, true, > NULL, true, false, NULL, NULL, NULL, NULL, NULL, NULL, NULL, false, false); > > > > CONFIDENTIALITY NOTICE > The contents of and attachments to this e-mail are intended for the addressee > only, and may contain the confidential > information of Argility (Proprietary) Limited and/or its subsidiaries. Any > review, use or dissemination thereof by anyone > other than the intended addressee is prohibited.If you are not the intended > addressee please notify the writer immediately > and destroy the e-mail. Argility (Proprietary) Limited and its subsidiaries > distance themselves from and accept no liability > for unauthorised use of their e-mail facilities or e-mails sent other than > strictly for business purposes. -- Andrew Sullivan a...@anvilwalrusden.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
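A sketch of the update-or-insert approach suggested above. The table and key names here are guesses based on the quoted INSERT (br_cde looks like the primary key), and the column list is abbreviated; both UPDATE ... FROM and NOT EXISTS work on the 8.3 target, which predates INSERT ... ON CONFLICT.

```sql
-- First restore the db1 dump into a staging schema on db2:
--   CREATE SCHEMA staging;  then load the table as staging.branch
BEGIN;

-- Update rows that already exist in db2
UPDATE branch b
   SET br_desc   = s.br_desc,
       br_active = s.br_active   -- ... repeat for the remaining columns
  FROM staging.branch s
 WHERE b.br_cde = s.br_cde;

-- Insert rows present in db1 but missing from db2
INSERT INTO branch
SELECT s.*
  FROM staging.branch s
 WHERE NOT EXISTS (SELECT 1 FROM branch b WHERE b.br_cde = s.br_cde);

COMMIT;
```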
Re: [GENERAL] Timezone information
On Fri, Feb 21, 2014 at 12:14:42AM +0530, Dev Kumkar wrote: > These are the two clients which I have currently who communicate with > database. > Do you control the client code? If so, why not set the TimeZone locally when you connect? That's the right way to handle this, really. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Timezone information
On Thu, Feb 20, 2014 at 03:37:20AM +0530, Dev Kumkar wrote: > But better is to set the TimeZone. Now haven't done anything special but > JDBC is working with setting TimeZone and ODBC not. So what should I look > from here now? I think your client should set the TimeZone at connection time. That's going to yield the most predictable behaviour for the users, I suspect. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Timezone information
On Thu, Feb 20, 2014 at 03:22:15AM +0530, Dev Kumkar wrote: > > Hmm. Missed one observation here, created a test table with timestamp > column of type 'default current_timestamp'. > When the query is executed from JDBC then it stores OS specific local time > into this column. Probably the JDBC driver is setting its TimeZone. Really, try it:

    SET TimeZone="UTC";
    SELECT now();
    SET TimeZone="EST5EDT";
    SELECT now();

and so on. Try selecting from your table, too, and you will discover that the time zone of the timestamps changes. If you're used to certain other RDBMSes, this mode of functioning will be unusual, but that really is how it works. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Timezone information
On Thu, Feb 20, 2014 at 02:46:46AM +0530, Dev Kumkar wrote: > Since the database is created on SRC-INDIA the timezone parameter gets set > as "Asia/Calcutta", but when shipped locally then now() would return > incorrect time as per target TGT-USA local time. > No, select now() would return the time in whatever timezone is set, or the timezone that the server defaulted to if there's nothing set by the client. So in your installation, set up the server to use UTC by default and, if you like, set the client's time zone according to locale or whatever when the client connects. The timestamps in the server are not actually "in" a time zone. They're all stored as UTC, and the display is altered according to what your time zone settings are at the time of query. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
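The UTC-storage model is easy to demonstrate in psql; the timestamp and zones below are arbitrary examples:

```sql
-- A timestamptz is stored as UTC; only the display changes with TimeZone
CREATE TEMP TABLE t AS
  SELECT TIMESTAMP WITH TIME ZONE '2014-02-20 12:00:00+00' AS ts;

SET TimeZone = 'UTC';
SELECT ts FROM t;        -- 2014-02-20 12:00:00+00

SET TimeZone = 'Asia/Calcutta';
SELECT ts FROM t;        -- 2014-02-20 17:30:00+05:30, the same instant
```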
Re: [GENERAL] Timezone information
On Thu, Feb 20, 2014 at 02:24:53AM +0530, Dev Kumkar wrote: > Ok but am confused then, as if no timezone parameter is defined in > postgreSQL.conf file then yes UTC time is returned. "Select now()" > but if timezone parameter is set then "Select now()" returns date time as > per timezone. Try issuing SET TimeZone commands. I think you'll find that the client can set whatever time zone it wants. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] How to turn off DEBUG statements from psql commends
Somehow your postgres log statements are getting echoed to the front end. Did you change anything about the postgres (server) configuration file? On Mon, Feb 10, 2014 at 07:43:33AM -0800, peterlen wrote: > We are using PostgreSQL 9.3. Something seems to have changed with our psql > command-line output since we first installed it. When I run commands at my > plsql prompt, I am getting a lot of debug statements which I was not getting > before. I am just trying to find out how to tell psql not to display this > output. As an example, if I were to create a new 'test' schema, the output > looks like: > > gis_demo=# create schema test; > DEBUG: StartTransactionCommand > DEBUG: StartTransaction > DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR, > xid/subid/cid: > 0/1/0, nestlvl: 1, children: > LOG: statement: create schema test; > DEBUG: ProcessUtility > DEBUG: CommitTransactionCommand > DEBUG: CommitTransaction > DEBUG: name: unnamed; blockState: STARTED; state: INPROGR, > xid/subid/cid: > 15099/1/1, nestlvl: 1, children: > CREATE SCHEMA > gis_demo=# > > > Before, all I would get is the CREATE SCHEMA feedback. Not sure what may > have caused this change on our end but does anyone know how to turn it off? > > Thanks - Peter > > > > -- > View this message in context: > http://postgresql.1045698.n5.nabble.com/How-to-turn-off-DEBUG-statements-from-psql-commends-tp5791250.html > Sent from the PostgreSQL - general mailing list archive at Nabble.com. > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general -- Andrew Sullivan a...@anvilwalrusden.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
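If it is the server configuration, the usual suspect is client_min_messages having been set to a debug level. Something along these lines, a sketch, would confirm it and put the defaults back:

```sql
-- See what this session is actually using
SHOW client_min_messages;

-- Quiet things down for the current session
SET client_min_messages = notice;

-- Or restore the defaults in postgresql.conf and reload:
--   client_min_messages = notice
--   log_min_messages = warning
```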
Re: [GENERAL] Re: Drop all overloads of a function without knowing parameter types
On Tue, Feb 04, 2014 at 10:56:28AM -0800, David Johnston wrote: > If you are doing version controlled upgrades you should not be using this > function but during the R&D phase I can imagine it would come in quite > handy. Or add Tom's remarks to a little corner of contrib/, or as Tom suggested, the docs, though I don't have a clear way to state it. One could actually add a reference to Tom's note to the comments section of the docs so that some more thinking could go into how to put words about this in the docs. I agree that the function signature is part of the function. This is strange to people because when you're developing it's normal to think of functionname(args) as the thing you're changing, but in a system that allows overloading like Postgres that's not really true for production. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
On Wed, Jan 29, 2014 at 03:05:23PM +0800, Craig Ringer wrote: > This is a painful issue for a significant group Yes. > They've been taught not to care about the DB and write > "database-agnostic" code, but they're seeing Pg features that're so > useful that they'd like to bend that and start using some Pg features. > Only to find they can't do it without throwing away everything they have. Also yes. This is important. The _whole point_ of ORMs is that they're hiding the underlying details of the database implementation; in practice, this turns out to be lowest common denominator where "lowest" is pretty low. "Database-agnostic" code is, by definition, not going to use database-specific features. The idea that you can have a database-agnostic ORM that can use all the clever features of the underlying database system is just confused. You can't have this both ways, and other strongly-typed database systems don't do a whole lot better with this (often, you will discover that the automatically-generated schemas these ORMs produce use type text everywhere, for exactly this reason). People who insist that this ought to be possible in the general case are saying, "I want a pony." What you might do, however, is generate a bunch of CREATE CAST statements for the implicit casts from text you want. This is dangerous for all the reasons Tom noted, but it might be actually good enough for the ORM cases you're worried about. If you think that, maybe the right answer is to start up a project like "ORM-cast-footgun" or whatever and create the relevant casts in a generalized way. > I guess this comes down to whether the goal is to be like Haskell - > pure, perfect, and used by nobody for anything real - or a pragmatic > tool for real world productive use. That's a fun false dichotomy, but Postgres is a general purpose tool and therefore needs to ship by default with the safest general purpose behaviour. 
Unlike many other systems, however, Postgres actually gives you the power to adjust its behaviour according to your use case, and there's no reason not to use that when appropriate. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
On Tue, Jan 28, 2014 at 03:38:49PM -0800, Mike Christensen wrote: > Oh. The CREATE CAST command. Wow, I was totally unaware of this entire > feature! See, this is why Postgres really is better than you ever think. Just when you're convinced that you have a totally impossible problem, it turns out that one of the brilliant developers has not only thought of it, but has fixed it while you weren't looking. In a large-scale generalized way that doesn't have whirling knives sticking out of it. (Yes, I have been using some other DBMSes recently, and no, I don't like it.) A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC
On Tue, Jan 28, 2014 at 02:55:03PM -0800, Mike Christensen wrote: > I'd be curious as to what types of bugs were caused by these implicit > casts.. Typically, they were cases when there was an ambiguity that the programmer didn't understand, causing applications to blow up in surprising and wonderful ways. There are things you can do if you're really prepared for the gun aimed at your feet. Since you can create casts in Postgres, you can actually add back many of the implicit casts yourself. > Such as implicitly cast iff the intention is not ambiguous I think if the developers could write code that read minds, they'd be working on more profitable enterprises ;-) Best, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
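As a concrete sketch of adding back such a cast: WITH INOUT (available from 8.4 on) converts via the types' text I/O functions, and AS IMPLICIT restores the pre-8.3 behaviour, along with all of the ambiguity risks that led to its removal.

```sql
-- Re-create the old implicit text-to-integer cast.  Use with care:
-- implicit casts are exactly the foot-gun discussed above.
CREATE CAST (text AS integer) WITH INOUT AS IMPLICIT;

-- Mixed-type expressions that previously failed now resolve:
SELECT 2 + '40'::text;
```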
Re: [GENERAL] Fully-automatic streaming replication failover when master dies?
On Thu, Jan 23, 2014 at 03:12:07AM +0400, Dmitry Koterov wrote: > for all the machines). At least MongoDB does the work well, and with almost > zero configuration. Mongo's data guarantees are, um, somewhat less robust than PostgreSQL's. Failover is easy if you don't have to be exactly right. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] vacuum vs pg_repack for clearing bloat?
On Wed, Jan 15, 2014 at 05:37:27PM -0800, Lists wrote: > it's clearing out the cruft that results from creating temp tables, > loading a bunch of data, then dropping the table, either explicitly > or when the connection is terminated. This causes PG disk usage to > climb without causing any change in pg_dump output. Oh. You need to up your autovacuum settings for the system catalog tables. I built a system that did this sort of thing. If your autovacuum settings are correct, this will stabilize. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
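Autovacuum processes the system catalogs along with user tables, so making it run more aggressively also keeps pg_class, pg_attribute, and friends from bloating under heavy temp-table churn. A sketch of the relevant postgresql.conf knobs; the values are illustrative, not recommendations:

```
# postgresql.conf -- hypothetical values; tune against your churn rate
autovacuum = on
autovacuum_naptime = 15s               # check more often than the 1min default
autovacuum_vacuum_scale_factor = 0.05  # vacuum at 5% dead rows, not 20%
autovacuum_vacuum_threshold = 250
```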
Re: [GENERAL] vacuum vs pg_repack for clearing bloat?
On Wed, Jan 15, 2014 at 04:09:28PM -0800, Lists wrote: > Our app makes extensive use of temp tables, and this causes a > significant amount of bloat that can often only be cleared with a Note what Tom Lane said, but why do you have bloat that can only be cleared by vacuum? Why not drop them or whatever (I presume they're not going away because your connection is long lived)? They're supposed to be temporary, after all: cheap and disposable. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
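One way to make temp tables clean up after themselves on a long-lived connection, sketched with hypothetical table and column names:

```sql
-- The table disappears at COMMIT, so nothing lingers to bloat
BEGIN;
CREATE TEMP TABLE scratch (id int, payload text) ON COMMIT DROP;
-- ... load and use the data ...
COMMIT;

-- If you must keep one temp table around and reuse it, prefer
-- TRUNCATE over DELETE: it reclaims the space immediately.
```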
Re: [GENERAL] Possible multiprocess lock/unlock-loop problem in Postgresql 9.2
On Sat, Jan 04, 2014 at 07:07:08PM +0100, Yngve N. Pettersen wrote: > I tried that before, but ran into some issues, IIRC a similar looping > problem as this where queries never ended. I split it up in an attempt to > solve that problem. Pulling the data out into the application and sending it back in won't improve things. Exactly the same number of rows need to be visited, but the way you have it now you have to marshall all the data and ship it to the application too. So it's automatically slower. Indeed, making it slower might have masked your problem. > In the select/update case there is no sorting in the query; there is an > offset/limit clause though, number of records retrieved are currently > restricted to <1 per query (out of 20 million in the active subset). > > SELECT id from queue where state = E'I' and job_id = offset > limit <1-6000> This could be part of the problem. Are the different threads working on different job_ids, or is that the same job_id? If you don't SORT that query before the OFFSET, then the rows will come back in whatever order the system likes. If your random function isn't very good, you could well be selecting the same rows for updating. As you increase the number of workers, the chances for collisions go up -- particularly if you're pulling 6000 things at a go. Remember that the other updating workers are affecting the order in which rows are going to come. > However, in the UPDATE case, the looping processes are all UPDATE queries, > no SELECTs involved. But you said it's all in the same transaction scope. The lock is at transaction scope. Anyway, what I'd do is try to cause the condition and post the pg_locks information. When I've done this in the past, usually the best thing to do is also to have query logs on for everything (don't forget to log the pid!) so you can see what the other transaction you're waiting on touched already. You can usually find the inversion that way. 
Once you see it, it's always obvious what you've done, in my experience (and completely mystifying before that, unfortunately). A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
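A starting point for capturing that pg_locks information, a sketch using the 9.2-era catalog columns:

```sql
-- Pair each ungranted transactionid lock with the session holding it
SELECT w.pid  AS waiting_pid,
       h.pid  AS holding_pid,
       w.locktype,
       w.transactionid
  FROM pg_locks w
  JOIN pg_locks h
    ON h.transactionid = w.transactionid
   AND h.granted
   AND h.pid <> w.pid
 WHERE NOT w.granted;
```

Joining the pids against pg_stat_activity then shows which queries are involved on each side.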
Re: [GENERAL] Possible multiprocess lock/unlock-loop problem in Postgresql 9.2
On Sat, Jan 04, 2014 at 12:14:42PM +0100, Yngve N. Pettersen wrote: > The update query looks like this: > > UPDATE queue SET state = E'S' WHERE state = E'I' AND id IN (<list of integers>) RETURNING id; > > There is a BEGIN/COMMIT wrap around the operation, including the > SELECT query. Am I right that you're doing the SELECT, pulling that up into the application, and then that becomes your <list of integers>? If so, my first advice is, "Don't do that," because you're making the transaction run longer (which is probably part of the problem). Instead, you probably want something like UPDATE ... AND id IN (SELECT ... LIMIT ...) RETURNING id; This won't solve your problem perfectly, though. > My guess is that the active processes get into a lock/unlock loop > regarding the "state" field because the list of ids overlap, and for > some reason, instead of completing the operation according to some > kind of priority order. My guess is that you have a "lock inversion", yes, but it's hard to guess what. You want to look at the pg_locks view to figure out what's blocking what. It seems likely that your case is not strictly a deadlock. A deadlock is a case where transaction A has a lock on something that transaction B needs, and needs to wait for a lock on an object that is locked by transaction B. Neither one could possibly complete, and you get a deadlock detection. There's an additional possibility that is suggested by your description, and that is that it's not locks at all, but that you're running into some kind of system issue. Did you adjust the setting of sort_mem? It's a long shot, but it could be that if there are enough sorts in the SELECT (which you don't show us), you're chewing through a lot of sort_mem. Remember that sort_mem is allocated _per sort_, so it could be that a lot of these allocations fit fine in real memory if only 8 processes are doing it; but if 10 do, you pass your threshold for physical memory and start swapping. 
I wouldn't expect high CPU under that case, though, but high I/O. So I think it's doubtful. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
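A sketch of the single-statement version; the job id and batch size are hypothetical stand-ins:

```sql
-- One statement instead of SELECT-then-UPDATE; rows are locked as they
-- are claimed, and nothing round-trips through the application.
-- (On 9.5 and later, adding SKIP LOCKED after FOR UPDATE would keep
-- workers from colliding on the same rows at all.)
UPDATE queue
   SET state = E'S'
 WHERE id IN (SELECT id
                FROM queue
               WHERE state = E'I'
                 AND job_id = 42        -- hypothetical job id
               ORDER BY id              -- deterministic order reduces deadlocks
               LIMIT 6000
                 FOR UPDATE)
RETURNING id;
```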
Re: [GENERAL] Why the size (PQ_BUFFER_SIZE) of backend send buffers is 8192 ?
On Fri, Jan 03, 2014 at 10:00:42AM -0500, Tom Lane wrote: > know the end-to-end MTU size with any reliability. Well, you could try PMTU discovery, though I agree that it's not great. It also seems pretty low-level for something like the DBMS to be doing. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] bulk loading table via join of 2 large staging tables
On Mon, Dec 30, 2013 at 07:53:06PM -0600, Seb wrote: > Given that the process involves a full join, I'm not sure I can do this > in chunks (say breaking down the files into smaller pieces). Any > suggestions would be greatly appreciated. First, what I would probably do is merge the two files outside Postgres, run sort|uniq > outfile over the whole thing, and then just bulk import that. It's probably going to be faster. But if the files are too big, you'll run out of memory. Are there possibly duplicates _within_ each file, or just between them? If not within, then load the first file into the target table (well, probably with the staging table just so you can get the timestamp sorted out), then create the staging table as you suggest for the second file, but create some indexes and do a WHERE NOT EXISTS to get just the subset from that second table. (This might be faster if you update the staging table with the timestamp first, then create the relevant multicolumn index.) If there are dupes within the file, you can do the same thing except that in the first step, you do SELECT DISTINCT instead. Again, I suspect a multicolumn index is going to be your friend. Anyway, those are two ways I've done this sort of thing in the past. Best regards, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
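The outside-Postgres route might look like this; the file names are stand-ins, and the tiny sample data is only there to make the sketch self-contained:

```shell
# Stand-in inputs; the real ones would be the two large staged files.
printf 'a,1\nb,2\na,1\n' > file1.csv   # contains an internal duplicate
printf 'b,2\nc,3\n'      > file2.csv   # overlaps with file1

# Merge and de-duplicate (within and across files) in one pass
sort -u file1.csv file2.csv > merged.csv
cat merged.csv
# a,1
# b,2
# c,3

# Then bulk-load the result:
#   psql -c "\copy target_table FROM 'merged.csv' CSV"
```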
[GENERAL] Some good news ([i...@arin.net: [arin-announce] ARIN Database Migration Completed])
I thought people'd like to know about this. For those who don't know, ARIN is the Regional Internet Registry for North America and the Caribbean. That is, if you have an IP address and you are operating in that region, you depend on ARIN. Hurray! A - Forwarded message from ARIN - Date: Sat, 14 Dec 2013 17:15:41 -0500 From: ARIN To: arin-annou...@arin.net Subject: [arin-announce] ARIN Database Migration Completed List-Id: ARIN Announcements and Notices We are pleased to announce that ARIN successfully completed migration of its database from Oracle to PostgreSQL. All ARIN systems are now operating normally. We thank you for your patience as we continue to strive to improve our services. Please send any questions, comments, or issues to hostmas...@arin.net. Regards, Mark Kosters Chief Technical Officer American Registry for Internet Numbers (ARIN) ___ ARIN-Announce You are receiving this message because you are subscribed to the ARIN Announce Mailing List (arin-annou...@arin.net). Unsubscribe or manage your mailing list subscription at: http://lists.arin.net/mailman/listinfo/arin-announce Please contact i...@arin.net if you experience any issues. - End forwarded message - -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Case sensitivity
On Wed, Dec 11, 2013 at 04:55:07PM +0530, Dev Kumkar wrote: > creating database. I have been looking at other discussions and doesn't > look like anything of that coming up soon that makes database case > insensitive. You could build lower() indexes on any column you want to search CI and lower() all the input text during searches, in order to avoid any work on the schema. Bit of a kludge, though. Best, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
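A sketch of the lower() approach, with hypothetical table and column names:

```sql
-- Expression index so case-insensitive lookups can still use an index scan
CREATE INDEX users_email_lower_idx ON users (lower(email));

-- Matches 'Bob@Example.COM', 'bob@example.com', etc.
SELECT *
  FROM users
 WHERE lower(email) = lower('Bob@Example.COM');
```

The kludge is that every query has to remember to apply lower() on both sides; miss it once and you get a case-sensitive comparison and a sequential scan.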
Re: [GENERAL] Use environment variables in postgresql.conf
Hi, On Sat, Nov 30, 2013 at 08:43:53PM -0500, imagene...@gmail.com wrote: > The questions are: > > 1. Has var expansion in configuration files been contemplated? > 2. Why not do it? Several years ago on the -hackers list (I couldn't say how many, but I was still supervising the database operations at Afilias so it's at least pre-2006) there was a big fight about environment variables and how to handle them. Feelings ran high, and I seem to recall that one of the participants involved basically pulled back from Postgres work after that because of his frustration. I'd suggest trolling the -hackers archives for "environment variable" during the period (say) from 2003 through 2005. But in answer to (2), basically the reason is that if you have multiple ways of setting a value, it is extremely hard for an operator to figure out what the setting of a running process _actually_ is. Environment variables are especially bad, because they're hard to retrieve from a running session in an interoperable way and they're impossible to inspect if the calling shell is gone. If you say, "But I choose to take that risk," you forget that the project incurs support costs whenever such inconsistencies pop out. > Reasons why it's perhaps useful to change the presumed workflow: > > 1. It's perhaps inconvenient I don't see why. If you're going to write a shell script anyway to set these things, I don't see what the big deal is. > 2. Variables are a fundamental concept for configuration > 3. Moving configuration to os specific scripts defies the DRY (don't repeat > yourself) paradigm You can use include files against a base postgres.conf file to avoid repeating yourself. You're certainly going to have to repeat at least the setting of some variables. > Proposed workflow: > 1. 
Environment initialization, meaning the declaration of environment > variables (in the sense that "env -i" is probably spawned in the OS > specific scripts and is thus quite empty) for "pg_ctl" should be done in a > postgresql specific shell file. Why in the world do you want to involve the shell in an already-complex configuration situation? Particularly since Postgres also runs on Windows? And that the "standard" Bourne shell is either no standard at all, or else sufficiently limited for interoperation that the paradigm effectively becomes "write once, test everywhere"? What you are really saying is, "I have this idea to do something I can already do, only using a different work flow, so someone else should do the development work to support that." I guess if you want this badly enough you could propose the feature over on -hackers. I think it'd be a crazy feature, though. > Thanks for your reply Andrew, however I do not necessarily wish to conform > to arbitrary expectations forced by the current implementation if it is > inconvenient/incomplete. All engineering is a trade-off. In this case, the program is designed with a different mechanism to achieve the same end. It is a bad idea -- because it's a dangerous foot-gun -- to implement a different way to achieve the same ends: it increases code complexity and dramatically increases the risk that one part of the code will be altered without correctly altering the other part. What is inconvenient for you increases the reliability of Postgres, and I humbly suggest that your evaluation of convenience in this case is anyway incomplete. Best, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Use environment variables in postgresql.conf
On Sat, Nov 30, 2013 at 06:54:31PM -0500, imagene...@gmail.com wrote: > Is it possible to use environment variables in postgresql.conf and perhaps > related configuration files? > > For example, I would like to use an env variable to specify the data > directory. On top of what John Pierce says in the thread, I would very strongly urge you not to do this. It makes troubleshooting on failure notoriously hard. Use the mechanisms the start-up files do, I suggest. That said, see section 18 of the manual -- in particular, 18.1.3 in the 9.3 manual ("Other ways to set parameters"). I recommend against all that on the basis of some somewhat painful experience, but you can do it that way. If you're determined to do it, I _strongly_ suggest using several different postgresql.conf files and then only using the config_file parameter to control this. At least then you can look at what the settings were when you have a crashed or otherwise troublesome system. Best, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
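A sketch of that per-file layout; the paths are hypothetical:

```
# /etc/postgresql/node1.conf -- this host's config
include '/etc/postgresql/common.conf'   # shared base settings

# host-specific overrides; later settings win over included ones
port = 5433
data_directory = '/srv/pgdata/node1'

# Then point the server at the right file at startup:
#   postgres -c config_file=/etc/postgresql/node1.conf
```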
Re: [GENERAL] Any advantage of using SSL with a certificate of authority?
On Tue, Nov 26, 2013 at 02:18:58PM -0500, Vick Khera wrote: > Using self-signed certs you can give them longevity of 10+ years, so never > have to worry about them again :) Unless of course you turn out to have a weak algorithm and, say, No Such Agency decides to take up residence on your network. (It's not clear that CAs are any protection against that either, though, of course.) In general, 10+ years is probably too short a time to be using a cert unless you are completely certain to whom it could be exposed. (Some would argue that if you had that certainty, you might not need TLS/SSL anyway. I guess I'd respond that you could use TLS anyway because it would help in case of a network compromise.) Best, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] restore crashes PG on Linux, works on Windows
On Tue, Nov 26, 2013 at 02:48:34PM +, Albe Laurenz wrote: > I beg your pardon, but Windows-1252 has nothing to do with Unicode Sorry, you're quite right, I'm having a brain fade (I meant ISO 8859-1, of course). The point I wanted to make, however, is that the collation often causes trouble with UTF-8 encoding. I liked your conversion suggestion, however, in your other mail. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] restore crashes PG on Linux, works on Windows
On Tue, Nov 26, 2013 at 09:25:17AM -0500, Chris Curvey wrote: > > CREATE DATABASE "TestDatabase" WITH TEMPLATE = template0 ENCODING = 'UTF8' > LC_COLLATE = 'English_United States.1252' LC_CTYPE = 'English_United > States.1252'; Just guessing, but I bet the collation is what hurts, just because that collation causes problems generally. Maybe you could get them to ditch that in favour of something else? LC_COLLATE = en_US.UTF-8 would perhaps be better. (The background for my guess: on your Linux box UTF-8 is likely the normal local encoding, but on Windows that isn't true, and 1252 is _almost_ but not quite Unicode. This bites people generally in internationalization.) A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pg_xlog is getting bigger
On Mon, Nov 25, 2013 at 03:00:56PM +0530, Pankaj Mishra wrote: > We do this to our customer on regular basis when > pg_xlog exceed more then 15 gb Wow. Your poor customers. But I'm glad you phrased it "do this to our customer" rather than "do this _for_ our customer", since your customer certainly is having something done to them. -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Postgres as In-Memory Database?
On Tue, Nov 19, 2013 at 02:39:17AM +0100, Stefan Keller wrote: > Referring to the application is something you can always say - but > shouldn't prevent on enhancing Postgres. With respect, that sounds like a sideways version of, "You should optimise for $usecase". You could be right, but I think the judgement of the Postgres developers has generally been that special cases are not the mainline case. And indeed, given the specifics of the use case you're outlining, it's as much a demonstration of that evaluation as a repudiation of it. I don't think there's any evidence that the Postgres developers ignore useful optimisations. What you're arguing is that the optimisation you have in mind isn't covered. What you need is an argument that it is generally useful. Otherwise, the right thing to do is get a specialised tool (which might be a special optimisation of the Postgres code). Best, A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Sharing data directories between machines
On Tue, Aug 06, 2013 at 12:08:57PM -0700, Steve Atkins wrote: > > You can't do it by sharing the disk files, at all. The two instances will > trash each others data. > Right. This is why products that do this sort of hardware fail-over have something akin to the "STONITH" (Shoot The Other Node In The Head) program, to make sure the failed box is actually turned off before the failover box starts up.[1] Given Postgres's hot standby features today, you don't need this sort of solution, I'll wager. Best, A [1] Or anyway, they have those kinds of program unless they are designed and sold by IBM, whose product once made my afternoon more amusing than I wanted by blowing up the data area on fail over not once, or even twice, but three times. (This was attributed to operator error, because the operator had dared to run the failover sequence.) -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general