Re: [GENERAL] delete is getting hung when there is a huge data in table
This delete runs in a single transaction. That means the entire transaction has to complete before you will see anything deleted. Interrupting the transaction simply rolls it back, so nothing is deleted.

Tom already pointed out the potential foreign key slowdown; another slowdown may simply be drive speed.

My recommendation: cut the delete into chunks. For example, delete the data one week at a time. That way the transaction is smaller, the dataset to delete is smaller, and it will finish quicker.

Uwe

On Sun, May 03, 2015 03:24:25 AM Mitu Verma wrote:
> Hi,
>
> I am facing an issue with the deletion of huge data.
> We have a cron script which is used to delete the data of the last 3 months
> from one of the tables. Data in the table is large (8872597 rows, as you can
> see from the count below) since it is from the last 3 months.
>
> fm_db_Server3=# select count(*) from audittraillogentry ;
>
>   count
> ---------
>  8872597
> (1 row)
>
> Now the issue is that when this script for the deletion of data is launched,
> it runs for more than 7 days and does nothing, i.e. not a single row has
> been deleted.
>
> Then we stopped the script, terminated the database sessions by using SELECT
> pg_terminate_backend(proc pid) and ran the following command:
>
> delete from audittraillogentry where intime <= to_timestamp('2015-01-30
> 23:59:59.999', '/MM/DD-HH24:MI:SS.FF3') OR outtime <=
> to_timestamp('2015-01-30 23:59:59.999', '/MM/DD-HH24:MI:SS.FF3');
>
> Still this delete operation is not working, and not a single row has been
> deleted from the table.
>
> Now I have the following questions -
>
> 1. Does PostgreSQL have some limitations for the deletion of large data?
> 2. Should I run VACUUM after stopping the cron script, probably to get
>    the "smaller" table?
> 3. Would dropping the indexes help here? Not sure.
> 4. Should I think about partitioning? Is there any limitation when dealing
>    with large data in PostgreSQL?
> regards
> Mitu

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
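A sketch of the chunked approach Uwe suggests, using the table and column names from the original post. The week boundaries are illustrative; you would loop over them until the cutoff date is reached:

```
-- Delete one week at a time so each transaction stays small.
DELETE FROM audittraillogentry
 WHERE intime >= '2014-11-01' AND intime < '2014-11-08';

-- Alternatively, batch by row count when no convenient range column
-- exists; repeat until zero rows are affected:
DELETE FROM audittraillogentry
 WHERE ctid IN (SELECT ctid FROM audittraillogentry
                 WHERE intime <= '2015-01-30'
                 LIMIT 10000);
```

Each statement commits on its own, so progress is visible and an interruption only loses the current batch.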
Re: [GENERAL] earthdistance
How accurate do you need it? My website has a lot of "local" listing stuff based on distance from the viewer, and I use the earthdistance module in contrib to do it. Granted, it's not accurate enough to calculate a surgical missile strike, but for "within 20 miles" type of things it's good enough - give or take a mile.

The problem I have is determining the location of the viewer/poster. Aside from asking for precise lat/long coordinates (which no user will have any clue about), the next best thing is to rely on smartphones and their GPS - but what about regular browser/computer users? When I let Google map my location, it shows up as San Francisco - which is about 56 miles off. So network location makes no sense for this.

I'm using a zipcode-based geographical mapping, which already has flaws since a zipcode is an area, not a point. The commonly available zipcode databases give you the geographical center of the zipcode - which certainly will be some distance off from the real location. I found that the inaccuracies work for my application - nobody cares about a few miles more or less when looking for something. The advantage is that it also protects the privacy of the poster to some degree - nobody really needs to know exactly where the post originated...

If "openbarter" is what I think it is (kind of a craigslist, just with bartering), I think a few miles off won't make a difference. Chances are your members won't divulge their true location anyway. We have members from South Africa using a US zipcode - which is what ... 5000 miles off?

Earthdistance is definitely easy to deal with - just give it a latitude/longitude and off you go.

Uwe

On Fri, 08/09/2013 09:29:49 PM Olivier Chaussavoine wrote:
> I develop a project openbarter that needs to match objects based on a
> maximum distance between their positions on earth. I saw that the
> documentation of the extension earthdistance was interesting, but the
> promise was not in the code.
> It would be nice to have these functions available independently of
> sophisticated geographic systems. There is a circle object for flat
> two-dimensional space, but earth deals with spherical caps. It would not
> be exact, but it is enough to suppose that earth is a sphere and that all
> dimensions - latitude, longitude and distance - are in radians. What
> would need to be done to adapt the circle type to a new type 'spherical
> cap' that would allow simple geographic indexing?
>
> --
> Olivier Chaussavoine
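For reference, a minimal sketch of the earthdistance usage Uwe describes. The `places` table and its `lat`/`lon` columns are hypothetical; `ll_to_earth`, `earth_box` and `earth_distance` are the contrib module's functions:

```
-- Requires the cube and earthdistance contrib extensions.
CREATE EXTENSION IF NOT EXISTS cube;
CREATE EXTENSION IF NOT EXISTS earthdistance;

-- Everything within roughly 20 miles (~32187 meters) of a point.
-- earth_box gives a fast, indexable bounding-box pre-filter;
-- earth_distance does the exact great-circle check.
SELECT name
  FROM places
 WHERE earth_box(ll_to_earth(37.77, -122.42), 32187)
       @> ll_to_earth(lat, lon)
   AND earth_distance(ll_to_earth(37.77, -122.42),
                      ll_to_earth(lat, lon)) < 32187;
```

A GiST index on `ll_to_earth(lat, lon)` makes the bounding-box filter cheap even on large tables.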
Re: [GENERAL] Views versus user-defined functions: formatting, comments, performance, etc.
> Hi:
>
> I notice when I save a view, I lose all the formatting and comments.
> As I was writing a complicated view, wanting to retain the format and
> comments, I thought I could just save it as a function that returns a
> table value. A function would evaluate to the same value as a view,
> but changing it later might be less confusing.
>
> However, I'm guessing (since I don't know anything about the
> internals) that the loss of formatting and comments is a result of the
> view being processed and stored in a more computer-friendly format,
> while functions are simply stored as the text that I type. That gives
> me reason to suspect there may be performance or other differences
> between the same SQL statement stored either as a view or a
> user-defined function.
>
> So that's my question: as someone who doesn't have a problem with
> putting a pair of empty parentheses at the end of a table variable
> name, what factors should I be thinking of while deciding whether to
> store my self-composed, multi-hundred-line long SQL statement as a
> view or a function?

To put it simply: views are basically stored SQL queries. Functions are just that - functions. The latter require parsing most of the time; the former are simply executed like any other query. (I think views are actually kind of prepared, so there is less overhead than executing the same query several times, but I may be wrong, and there are far more qualified people on this list to answer that.)

I run a website with a few million pages a month, and every page is assembled out of database records (around 200 tables). I use mostly views for pretty much everything, since a normalized database almost always requires joins over several tables to get the result you want. The framework I use relies on SQLAlchemy (a Python object-oriented SQL mapper) - and I'm just not the person to code SQL in Python when I have a perfectly good database much better at that task.
So I use views and only use the mapper to map the result from SQL to Python. I learned by experience that functions/procedures are slower and, in my eyes, more cumbersome to maintain and debug. I maintain all the database source in an SCM and I don't use any graphical tools for the SQL - just a good old emacs does it for me nicely. I also have SQL scripts that allow me to update views. My views often depend on each other, so replacing one usually cascades to others, breaking the whole scheme. A script applying the views in the correct order helps a lot on that one.

So basically: use views for performance, maintenance and sometimes programming reasons (views behave like tables, and often your application layer can't tell the difference, which helps), and use functions where you need the extra functionality which a view simply can't provide (i.e. you need to update a record when someone views a different record).

Also think about triggers; they can be quite useful for, e.g., my example about needing to update a record. Triggers don't require you to explicitly call the function - the database will do that for you (which kind of obscures that there is something happening...)

Uwe
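A minimal illustration of the two options being compared. The table and column names are made up; the point is where the source text ends up:

```
-- As a view: the query is stored in parsed form, so psql's \d+ shows
-- it reformatted, without your original comments.
CREATE VIEW active_users AS
SELECT u.id, u.name
  FROM users u
 WHERE u.active;

-- As a set-returning SQL function: the body you type is kept verbatim
-- in pg_proc, comments included.
CREATE FUNCTION active_users_fn()
RETURNS TABLE (id integer, name text) AS $$
    -- this comment survives a dump/restore
    SELECT u.id, u.name
      FROM users u
     WHERE u.active;
$$ LANGUAGE sql STABLE;

SELECT * FROM active_users_fn();
```

Note the trade-off Uwe describes: the view's query can be flattened into a calling query by the planner, while the function is planned as an opaque unit.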
Re: [GENERAL] Hint for a query
> I have these tables:
>
> Table: Contact
>   IdContact
>   First Name
>   Second Name
>   ... other columns
>
> Table: Employee
>   IdEmployee
>   IdContact, related to Contact table
>   ... other columns
>
> Table: Salesman
>   IdSaleman
>   IdEmployee, if salesman is an employee, related to Employee table
>   IdContact, if salesman is not an employee, related to Contact table
>
> I need a query
>
> Id Salesman - Second name - First name
>
> But I can't figure out how to do it. Can someone give advice?
>
> Thanks

Needless to say, this is a bit of an odd table layout. You always end up at the contact table, but the layout makes it harder to query. Personally I'd have a foreign key from the contact table to the employee table, rather than the employee table to the contact table - that would also eliminate the employee foreign key in the salesman table. It would also allow you to just join the salesman table to the contact table and then figure out if the contact is an employee.

Well, that said, here's a quick one without a lot of deep thought...

select a.idsalesman, b.firstname, b.secondname
  from salesman a
  join contact b on b.idcontact = a.idcontact
union
select c.idsalesman, d.firstname, d.secondname
  from salesman c
  join employee e on e.idemployee = c.idemployee
  join contact d on d.idcontact = e.idcontact

No guarantees though. It's midnight here and I had a long day...
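An alternative sketch that avoids the UNION by resolving the contact through COALESCE: follow IdContact directly when it is set, otherwise go through the Employee row. Same hypothetical schema, equally untested:

```
SELECT s.idsalesman, c.secondname, c.firstname
  FROM salesman s
  LEFT JOIN employee e ON e.idemployee = s.idemployee
  JOIN contact c
    ON c.idcontact = COALESCE(s.idcontact, e.idcontact);
```

This scans the salesman table once instead of twice, at the cost of being slightly less obvious about the two cases.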
Re: [GENERAL] Searching for "bare" letters
> Hi, everyone. Uwe wrote:
> > What kind of "client" are the users using? I assume you will have some
> > kind of user interface. For me this is a typical job for a user
> > interface. The number of letters with "equivalents" in different
> > languages is extremely limited, so a simple matching routine in the
> > user interface should give you a way to issue the proper query.
>
> The user interface will be via a Web application. But we need to store
> the data with the European characters, such as ñ, so that we can display
> them appropriately. So much as I like your suggestion, we need to do
> the opposite of what you're saying -- namely, take a bare letter, and
> then search for letters with accents and such on them.
>
> I am beginning to think that storing two versions of each name, one bare
> and the other not, might be the easiest way to go. But hey, I'm open
> to more suggestions.
>
> Reuven

That still doesn't prevent you from using a matching algorithm. Here's a simple example (to my understanding of the problem):

You have texts stored in the db containing both an n and a ñ. Now a client enters "n" on the website. What you want to do is look for both variations, so "n" translates into "n" or "ñ". There you have it. In the routine that receives the request, you have a matching method that matches on "n" (or any of the few other characters with equivalents), and the routine issues a query with xx LIKE '%n%' OR xx LIKE '%ñ%' (personally I would use ILIKE, since that eliminates the case problem).

Since you're referring to a "name", I sure don't know the specifics of the problem or data layout, but from what I know I think you can tackle this with a rather primitive "match -> translate to" kind of algorithm.

One thing I'd not do: store duplicate versions. There's always a way to deal with data the way it is. In my opinion, storing different versions of the same data just bloats a database in favor of a smarter way to deal with the initial data.
Uwe
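A sketch of the query-expansion idea in SQL. The table `people` and column `name` are hypothetical; the application-side routine would map each bare letter to its accented variants and build the ORed predicate:

```
-- A search for the bare letter "n", expanded to both variants.
-- ILIKE handles upper/lower case; LIKE alone would not match ñ for n.
SELECT name
  FROM people
 WHERE name ILIKE '%n%'
    OR name ILIKE '%ñ%';
```

For more than a handful of equivalents, the application can generate the OR list from a small lookup table of letter-to-variants mappings.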
Re: [GENERAL] Searching for "bare" letters
> Hi, everyone. I'm working on a project on PostgreSQL 9.0 (soon to be
> upgraded to 9.1, given that we haven't yet launched). The project will
> involve numerous text fields containing English, Spanish, and Portuguese.
> Some of those text fields will be searchable by the user. That's easy
> enough to do; for our purposes, I was planning to use some combination of
> LIKE searches; the database is small enough that this doesn't take very
> much time, and we don't expect the number of searchable records (or
> columns within those records) to be all that large.
>
> The thing is, the people running the site want searches to work on what
> I'm calling (for lack of a better term) "bare" letters. That is, if the
> user searches for "n", then the search should also match Spanish words
> containing "ñ". I'm told by Spanish-speaking members of the team that
> this is how they would expect searches to work. However, when I just did
> a quick test using a UTF-8 encoded 9.0 database, I found that PostgreSQL
> didn't see the two characters as identical. (I must say, this is the
> behavior that I would have expected, had the Spanish-speaking team member
> not said anything on the subject.)
>
> So my question is whether I can somehow wrangle PostgreSQL into thinking
> that "n" and "ñ" are the same character for search purposes, or if I need
> to do something else -- use regexps, keep a "naked," searchable version
> of each column alongside the native one, or something else entirely -- to
> get this to work. Any ideas?
>
> Thanks,
> Reuven

What kind of "client" are the users using? I assume you will have some kind of user interface. For me this is a typical job for a user interface. The number of letters with "equivalents" in different languages is extremely limited, so a simple matching routine in the user interface should give you a way to issue the proper query.
Uwe
Re: [GENERAL] (another ;-)) PostgreSQL-derived project ...
> First, I wonder what kind of technical person would say there are
> "de-facto truth(s)". I thought only politicians would talk like that.

Well, politicians and Microsoft, Oracle etc. :-)

> Now, in a sense you are right, I am talking from the background of my
> own experiences (and so are you). When I have developed relatively
> complicated and heavily accessed websites, I only use a DBMS when I need
> to actually persist any data. For example, there are options in Tomcat
> (the java-based web server) to offload session handling to a DBMS
> (which is great when you need to statistically trace and infer users'
> navigation) and establish transactions offloading a timed-out session
> to an actual database hitting the hard drive (some user got
> distracted ...), but anything dealing with I/O will invariably slow
> your apps, so what I do is use an in-mem (lite) DBMS such as Hypersonic
> SQL (hsqldb.org) to keep sessions off the I/O subsystem, and the speed
> increase is --very-- noticeable
> ~
> Since any piece of code engaging the I/O, such as database access code,
> should be as fast and simple as possible; yes, I would design, say,
> java code wrappers doing anything that is not strictly INSERT and
> SELECT raw data ... let me deal with the "semantics" and "business
> intelligence" of the data myself

So you're keeping a lot in memory, which to me suggests plenty of hardware is available. One of my current apps chews up 8GB of memory just for the app, and I can't afford to get a 64GB or bigger server. If I wanted to keep permanently accessed data in memory, I'd need somewhere around half a terabyte of memory - so obviously not an option (or maybe really bad database design :-) )

That said, just considering the cost/effort it takes to strip PostgreSQL down, why don't you go with a server that has 1TB of solid state disks? That reduces the I/O bottleneck considerably without any effort.
> > Data types aren't stored in the database as character strings (unless
> > you define your columns as text, of course).
> ~
> I was talking about text and any formatting option in NUMERIC or DATE data

In my experience, "data formatting" goes both ways, in and out. Out is obviously not a major issue, because errors don't cause data corruption. In, however, is a different issue: errors in "inwards" conversion will cause data corruption. So unless you have an application that does very little "in" and a lot of "out", you still have to code a lot of data conversion which otherwise someone else (the PostgreSQL developers) has already done for you.

> > Take dates for example: you'd have to code very carefully to catch all
> > the different ways dates are represented on this planet.
> ~
> Yeah! And java does an exceptionally good job at that (including
> internationalization)

Maybe it does. I never coded Java because I don't like to use technology where Oracle can come sue me :-) I do know, however, that a lot of languages have quirks with dates and internationalization (Python, which you mentioned earlier, being one of them).

> http://download.oracle.com/javase/7/docs/api/java/ {util/Date.html,
> text/DateFormat.html}
> ~
> So, you would ultimately just have to store a long value into the DBMS

Yes, a long value - which can represent pretty much any valid and invalid date ever devised, so again you don't really know what's in the database when you leave the validation to the application.

> This is something I would do with wrapping code using input- and
> output-bound command objects which are indexed after the same column
> index the DBMS uses

Which still depends on your use case. Your assumption is that every piece of code is written in Java - which is fine if that's what your application calls for. It's going to be a major hassle when you ever have to re-code in a different language, though.

> > To me, that's moving data integrity into the application.
> ~
> Not exactly!
> Integrity is still a matter of the DBMS, which can now
> handle it in an easier way: if someone writes a date in Bangla and
> somebody else in Ukrainian, this is still the same date/time value,
> ultimately determined by the rotation of our planet around the sun ...
> and all we need for that is a long value. Now, aren't we easing
> things for the DBMS?

I agree to disagree on this one. The date value the database stores in this case is a long. Any "long" can be converted into a valid date - but is it really the date that was entered in the first place? If I give a date representation, i.e. 12/31/2010, to the database, I personally don't really care how the database stores the date underneath. All that interests me is that the next time I ask for that field I get 12/31/2010 back. There is no error that can be made, other than user error, if you ask the database to store a specific date representation. There are errors you can make in your own conversion code which can lead to a different "long" stored than the date that was actually entered.
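The difference can be seen directly in SQL: a `date` column rejects bad input at insert time, while a `bigint` column happily stores any value. Table names here are illustrative:

```
CREATE TABLE events_typed (happened_on date);
CREATE TABLE events_long  (happened_on bigint);

-- The database catches the impossible date at the door:
INSERT INTO events_typed (happened_on) VALUES ('2010-02-31');
-- ERROR:  date/time field value out of range: "2010-02-31"

-- The same class of mistake, encoded as a long, goes in unnoticed;
-- only the application's own conversion code could have caught it.
INSERT INTO events_long (happened_on) VALUES (999999999999999);
```

With the typed column, a round trip always returns a date that was valid when entered; with the long, "valid" depends entirely on every client getting its conversion right.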
Re: [GENERAL] (another ;-)) PostgreSQL-derived project ...
> > ... you're looking for a non-sql compliant SQL database where a lot of
> > the data integrity is actually coded in the application :-)
> ~
> The first part of your statement I acknowledge, but how exactly is it
> that "a lot of the data integrity is actually coded in the application"?
> ~

Take dates for example: you'd have to code very carefully to catch all the different ways dates are represented on this planet. Your application has to handle this, since all the database knows at this point is an absolute time (i.e. seconds since epoch or something the like), and your app has to convert every occurrence of a date to this format, or the database won't find anything - or, even worse, will store something wrong. Same goes for numbers: if everything is stored as a byte sequence, how does the database know whether you are trying to compare the number 1 with the character "1"? Again, your app would have to handle that. To me, that's moving data integrity into the application.

> > That approach strips down on application complexity. My apps don't
> > have to do any post-processing of the data - I query the records I
> > need and the app merely displays them.
> ~
> Again, have you actually tested those assumptions?

Yes, I have, as have many others. Simple example: program a website like, say, Facebook. So you have thousands of users from all over the world. Your website code handles all the data conversions. Now Apple comes along and sells an iPhone which, silly enough, a lot of people like and try to use to access your website. You now face the problem that you need a second website doing the same thing as the first, except made solely for touch-screen devices. You will be forced to rewrite a lot of your code, because all the data conversion is in the code. Even worse, if you had to make an iPhone or Android app in lieu of the second website, you'd have to recode everything you did in a different language - i.e. Objective C.
If you leave these things to the database, you "simply" write a second client for a different platform, and you don't have to futz around getting the conversions correct, because the application receives the data already converted.

Sure, this all depends on what application you need this specialized database engine for. If it's an application for a very defined environment, you can dictate how data is to be input and train users. If it's an application for the big wild world, you will have problems with users doing stupid things beyond your control, like writing "P.O. Box 1" into the zipcode field where you expected a 10 digit number. I'd rather have the database catch those cases and reject storing the bad input. That saves me a lot of validation code in my app.
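The database-side rejection described here can be sketched with a CHECK constraint. The table is hypothetical, and the pattern assumes US ZIP or ZIP+4 codes:

```
CREATE TABLE addresses (
    street  text NOT NULL,
    -- reject anything that isn't a 5-digit ZIP or ZIP+4:
    zipcode text NOT NULL CHECK (zipcode ~ '^[0-9]{5}(-[0-9]{4})?$')
);

-- Fails in the database, with no application-side validation needed:
INSERT INTO addresses (street, zipcode)
VALUES ('Main St 1', 'P.O. Box 1');
-- ERROR:  new row for relation "addresses" violates check constraint
```

Every client - website, phone app, bulk import - then gets the same validation for free.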
Re: [GENERAL] (another ;-)) PostgreSQL-derived project ...
> ~
> I have been searching for a PostgreSQL-derived project with a
> "less-is-best" philosophy. Even though I have read about quite a few
> PG forks out there, what I have in mind is more like a baseline
> than a fork.
> ~
> My intention is not wrapping the same thing in a different package or
> coding add-ons/value-added features on top of PG, but ridding PG of
> quite a bit of its internal capabilities and just using its very
> baseline.
> ~
> All I would need PG for is raw data warehousing, memory,
> I/O-subsystem management, MVCC/transaction management ... No fanciness
> whatsoever. What do you need to, say, format dates in the database for,
> if formatting/pretty-printing and internationalization can be taken care
> of more appropriately in the calling environment, say Python or Java?
> All that is needed is to store a long representing the date. Why are
> arrays needed in the DB proper when serialization and marshaling/casting
> can be taken care of in the calling environment? If you are using, say,
> Java, all you need PG to do is to faithfully store a sequence of bytes,
> and you would do the (de)serialization very naturally indeed.

Maybe you can let us in a little more on what you're trying to accomplish. What it looks like to me right now is that you're looking for a non-SQL-compliant SQL database where a lot of the data integrity is actually coded in the application :-) I bet there are database systems out there that do exactly or nearly what you want. Maybe an object-oriented one would suit you better than a relational system?

For me, I sure don't use all that PostgreSQL has to offer, but I like that it does a lot of things for me, and I code most of what my application does inside the database using views, stored procedures and triggers. That approach cuts down on application complexity. My apps don't have to do any post-processing of the data - I query the records I need and the app merely displays them.
Yes, sometimes I wish PostgreSQL was more "high performance" - but then, I drive an old, paid-for, practical car and not a Formula One racer without a boot or spare tire. My point being: PostgreSQL does what it does very reliably, and although it is not the best performer on the market, it is a database I would trust my payroll with - and there are few about which I'd make that statement. Never had any data loss ever, never had it crash on me. Give good hardware to PostgreSQL and you will get good performance with exceptional stability and integrity.

Uwe
Re: [GENERAL] Oracle / PostgreSQL comparison...
> On 06/24/2011 09:14 AM, Rodrigo E. De León Plicet wrote:
> I'm quite surprised the article didn't mention the importance of having
> "somebody to sue if it goes wrong", which is a comment I often see made
> re Oracle. It's lucky they didn't try to stress that point, because
> evidence of successful suits by customers against Oracle is ... hard to
> come by ... and Oracle's army of lawyers makes it unlikely that suits
> would succeed. Despite that, and the fact that a suit is unlikely to
> award damages sufficient to make up for your losses unless you can show
> negligence or willful misbehavior, people keep on talking about needing
> to have somebody to sue.

The article also forgot to mention what I call the "chair cutting problem". Most decisions in IT departments, particularly in larger companies, are based on the probability of taking the brunt when something goes wrong. Therefore, no IT manager who wants to keep his job will buy anything other than what the CEO thinks is the market leader. If you buy the market leader's product, you take no risk. Clearly you made a good choice, so only the manufacturer is to blame if something goes wrong (and we all know, sooner or later something will go wrong). I've seen this over and over in 20+ years of IT consulting. Let's buy Oracle, Microsoft, IBM - clearly those are the good choices. Well, not always, but it sure is a choice that ensures the safety of the CTO.

I've had one - ONE - client who accepted my recommendation to use PostgreSQL for their not-all-that-large custom system. Ever since we implemented the software and took it to production, the server and software have functioned flawlessly, and they don't even have anyone to check on the server (no DBA or sysadmin there). Sure, they're relatively small with 50+ employees, but for the little money it cost they got something that has been running fine since 2003. Last time I checked, the server had been up close to 500 days.
They didn't take my recommendation for mostly Linux servers (they did well before the housing slump, but have been short on money ever since). Instead they hired a "sysadmin" company - which happens to be an OEM for HP. By the end of the day they spent close to half a million to replace 2 Linux servers with 8 Windows servers - which don't even work as well as the old servers did (although that may be an issue with the "sysadmin company" - they want to sell more and are probably not interested in, or incapable of, proper system administration).

Oh well, all I can say is I'd trust PostgreSQL with my payroll any day of the week (and actually do, as these days my main business - not IT consulting anymore - is running on PostgreSQL, and again has been pretty much flawless for 3+ years now - with over 100,000 transactions a day).

Uwe
Re: [GENERAL] Oracle / PostgreSQL comparison...
> > http://cglendenningoracle.blogspot.com/2011/06/oracle-vs-postgres-postgre
> > sql.html
> >
> > Any comments?
>
> Amusing.
>
> "
> What kind of support is available if we have a problem? What is their
> patching schedule to address exploits and known security defects? If
> there is a bug, how responsive is the organization to resolving the
> issue? These are questions that imply a need to have an organization
> behind your technology. In this case, clearly PostgreSQL is not an
> appropriate choice, because you won't get acceptable answers to these
> questions for any open source software.
> "
>
> Um. What? Let's look at this:
>
> "What kind of support is available if we have a problem?" With
> PostgreSQL, I can correspond with *the people that wrote the
> code*. They're friendly, responsive, and are very reasonable
> about looking at possible bugs. Every time I have emailed the
> community lists for help, I have received reasonable answers
> within a few hours (usually *under* an hour). I've never found
> a bug, but I've seen plenty of them squashed, as the developers
> themselves admit a mistake and announce a fix.

I've gotten support, and even a permanent fix in HEAD, for a problem I reported a while ago - in under 24 hours. Oracle would never ever do that, particularly not for free (as in beer - which reminds me, we should collect a couple of bucks on this list and send a beer and pizza truck over to Tom :-) )

Back in the day I worked for Sun and IBM in tech support. Both companies concentrated on large customers (i.e. BMW would get their own highly skilled technician stationed on site). All I can say is: "it's amazing what kind of support you can get for 3.3 million a year" :-)

Uwe
Re: [GENERAL] Why is 8.4 and 9.0 so much slower on some queries?
> Uwe Schroeder writes:
> > I have a 8.3 database and decided for various reasons to upgrade to 8.4.
> > I also tried 9.0 - same results. On the exactly same hardware with the
> > exactly same configuration, some queries perform a factor of 10
> > slower on 8.4 and 9.0 than on 8.3
>
> Hmm ... the problem here seems to come from the much poorer size
> estimate from the t/ugt antijoin:
>
>   -> Hash Left Join (cost=71.37..2445.93 rows=52 width=9)
>      (actual time=0.791..2.017 rows=104 loops=1)
>        Hash Cond: ((subplan) = ugt.user_id)
>        Filter: (ugt.group_id IS NULL)
>
>   -> Hash Anti Join (cost=62.53..2432.89 rows=1 width=9)
>      (actual time=0.639..2.976 rows=105 loops=1)
>        Hash Cond: ((SubPlan 2) = ugt.user_id)
>
> Now, 8.3 has no idea what it's doing here, and it's just luck that it
> produces an estimate that's only a factor of 2 off from reality. 8.4
> actually understands that it's dealing with an antijoin, and with that
> greater knowledge it produces ... a much worse estimate :-(. And that
> sends it into the weeds about how to perform the higher-level join.
>
> While I don't have your specific example to try, I did some
> experimenting with queries of this form, and I noticed that 8.4's
> heuristic in eqjoinsel_semi() was going completely nuts and estimating
> that all rows in the lefthand side have join partners (thus, no rows out
> of the antijoin). This is because it has stats for one side of the
> comparison operator but not the other side (the one with the
> sub-select). But it's taking the totally-made-up ndistinct estimate for
> the sub-select at face value. It needs to be a bit warier I think.
>
> In general you might want to think about whether you can recast this
> query to avoid the sub-selects in the join conditions. The planner
> isn't terribly bright about sub-selects in the first place, and you're
> putting them in places where quality estimates are really critical to
> getting a good plan.

Ahhh, that actually makes sense.
Since this example isn't the only spot where I have this kind of query (I tweaked those for weeks to run fast, but always on an 8.3 server), I'll just downgrade to 8.3 for production. If you're interested, I can provide the database and data for you to toy with, as there's nothing particularly secretive in that data except email addresses, which can be erased as they're not needed for anything.

By the end of the day I'll set up another machine with a 9.0 and start changing the queries - simply because 9.0 is a heck of a lot faster with most things - except these kinds of left outer joins and subselects.

Thanks for your response.

Uwe
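One way to recast the pattern Tom points at is to move the sub-select out of the join condition: join the profile table explicitly and express the anti-join as NOT EXISTS. This is a hedged sketch against the schema from the original post, not a tested drop-in replacement for the view:

```
-- Instead of:
--   LEFT JOIN user_group ugt
--     ON ugt.user_id = (SELECT uid FROM bprofile WHERE id = t.posted_by)
--    AND ugt.group_id = 16
--   WHERE ugt.group_id IS NULL
-- join bprofile directly and let NOT EXISTS express the anti-join,
-- so the planner sees plain column-to-column comparisons:
SELECT t.id, t.group_id
  FROM groups_threads t
  JOIN bprofile bp ON bp.id = t.posted_by
 WHERE t.deleted = false
   AND NOT EXISTS (SELECT 1
                     FROM user_group ugt
                    WHERE ugt.user_id = bp.uid
                      AND ugt.group_id = 16);
```

With real columns on both sides of the comparison, the planner has statistics for both inputs and can estimate the anti-join sensibly.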
[GENERAL] Why is 8.4 and 9.0 so much slower on some queries?
Hello everyone, this is very odd, particularly because there's obviously nobody having these issues when I google for it. I have an 8.3 database and decided for various reasons to upgrade to 8.4. I also tried 9.0 - same results. On exactly the same hardware with exactly the same configuration, some queries perform a factor of 10 slower on 8.4 and 9.0 than on 8.3. A factor that large leads me to believe that I'm missing something, or that the queries in question were flawed to begin with (which may very well be). Here's an example: I have a forum with a table for the threads and a table for the posts. There's also a table which keeps user-specific things (email, user_id) and a table that lets me assign roles (not database roles) to a user (i.e. is an admin, is a normal user etc). The posts table references the threads table (obviously, a post belongs to a certain thread). I can post table specs, but I think it's rather obvious, so I'll go to the explain outputs. There is a view which contains the base query:

create or replace view v_grouptopics_helper_club16 as
select t.id, t.group_id, p.id as postid
from groups_threads t
left outer join groups_posts p
  on p.thread_id=t.id
  AND t.deleted=false
  AND p.id = (SELECT p.id
              FROM groups_posts p
              LEFT OUTER JOIN user_group ugi
                on ugi.user_id=p.posted_by_uid AND ugi.group_id=16
              WHERE ugi.group_id IS NULL AND p.thread_id=t.id
              ORDER BY date_posted desc limit 1)
left outer join user_group ugt
  on ugt.user_id=(select uid from bprofile where id=t.posted_by)
  AND ugt.group_id=16
WHERE ugt.group_id IS NULL AND t.deleted=false
ORDER BY p.date_posted DESC;

So basically what I want is a list of the latest forum topics with the latest post, where the user who posted is not in role number 16.
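Tom's advice later in the thread - recast the query to avoid sub-selects in the join conditions - could be applied to the ugt anti-join roughly as below. This is an untested sketch: table and column names are taken from the view above, but I haven't verified it returns identical rows.

```sql
-- Hypothetical recast: express "poster is not in group 16" as NOT EXISTS
-- instead of LEFT OUTER JOIN ... WHERE ugt.group_id IS NULL with a
-- sub-select inside the join condition.
SELECT t.id, t.group_id
FROM groups_threads t
WHERE t.deleted = false
  AND NOT EXISTS (
        SELECT 1
        FROM user_group ugt
        JOIN bprofile b ON b.uid = ugt.user_id
        WHERE b.id = t.posted_by
          AND ugt.group_id = 16
  );
```

The planner treats NOT EXISTS as an anti-join too, but with plain column comparisons in the join condition it can use statistics on both sides, instead of guessing the ndistinct of a sub-select.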
When I do an explain analyze select * from v_grouptopics_helper_club16 where group_id=199; on the 8.3 server, I get this result:

Subquery Scan v_grouptopics_helper_club16 (cost=4117.52..4118.17 rows=52 width=12) (actual time=5.959..6.173 rows=104 loops=1)
  -> Sort (cost=4117.52..4117.65 rows=52 width=20) (actual time=5.957..6.029 rows=104 loops=1)
     Sort Key: p.date_posted
     Sort Method: quicksort Memory: 21kB
     -> Nested Loop Left Join (cost=84.63..4116.04 rows=52 width=20) (actual time=0.842..5.775 rows=104 loops=1)
        Join Filter: ((NOT t.deleted) AND (p.thread_id = t.id))
        -> Hash Left Join (cost=71.37..2445.93 rows=52 width=9) (actual time=0.791..2.017 rows=104 loops=1)
           Hash Cond: ((subplan) = ugt.user_id)
           Filter: (ugt.group_id IS NULL)
           -> Index Scan using groups_threads_group_idx on groups_threads t (cost=0.00..237.18 rows=104 width=13) (actual time=0.021..0.257 rows=104 loops=1)
              Index Cond: (group_id = 199)
              Filter: (NOT deleted)
           -> Hash (cost=67.41..67.41 rows=317 width=8) (actual time=0.741..0.741 rows=318 loops=1)
              -> Index Scan using user_group_club16_idx on user_group ugt (cost=0.00..67.41 rows=317 width=8) (actual time=0.024..0.430 rows=318 loops=1)
                 Index Cond: (group_id = 16)
           SubPlan
             -> Index Scan using bprofile_pkey on bprofile (cost=0.00..5.87 rows=1 width=4) (actual time=0.005..0.006 rows=1 loops=104)
                Index Cond: (id = $2)
        -> Index Scan using groups_posts_pk on groups_posts p (cost=13.26..18.84 rows=1 width=16) (actual time=0.005..0.006 rows=1 loops=104)
           Index Cond: (p.id = (subplan))
           SubPlan
             -> Limit (cost=0.00..13.26 rows=1 width=12) (actual time=0.024..0.024 rows=1 loops=104)
                -> Nested Loop Left Join (cost=0.00..238.68 rows=18 width=12) (actual time=0.022..0.022 rows=1 loops=104)
                   Filter: (ugi.group_id IS NULL)
                   -> Index Scan Backward using groups_posts_thread_date_idx on groups_posts p (cost=0.00..85.96 rows=35 width=16) (actual time=0.012..0.012 rows=1 loops=104)
                      Index Cond: (thread_id = $0)
                   -> Index Scan using user_group_uid_idx on user_group ugi (cost=0.00..4.35 rows=1 width=8) (actual time=0.007..0.007 rows=0 loops=104)
                      Index Cond: (ugi.user_id = p.posted_by_uid)
                      Filter: (ugi.group_id = 16)
             -> Limit (cost=0.00..13.26 rows=1 width=12) (actual time=0.024..0.024 rows=1 loops=104)
                -> Nested Loop Left Join (cost=0.00..238.68 rows=18 width=12) (actual time=0.022..0.022 rows=1 loops=104)
                   Filter: (ugi.group_id IS NULL)
                   -> Index Sca
Re: [GENERAL] Web Hosting
Godaddy virtual hosting does in fact support postgresql. You have a root account on the virtual server and you can install whatever you want. I run several servers with them and all have postgresql, some virtual, some dedicated servers. Haven't tried their shared servers though, so I can't say anything about those. Hope that helps. PS: for a company that size their customer support isn't too shabby either. Uwe > Gentlemen- > > Go-daddy *claims* to support postgres > http://help.godaddy.com/article/2330 > > YMMV > Martin > > > Date: Sat, 5 Mar 2011 16:40:57 -0800 > > Subject: Re: [GENERAL] Web Hosting > > From: m...@kitchenpc.com > > To: urlu...@gmail.com > > CC: pgsql-general@postgresql.org > > > > On Sat, Mar 5, 2011 at 1:08 PM, matty jones wrote: > > > I already have a domain name but I am looking for a hosting company > > > that I can use PG with.
The few I have contacted have said that they > > > support MySQL only and won't give me access to install what I need or > > > they want way to much. I don't need a dedicated host which so far > > > seems the only way this will work, all the companies I have researched > > > so far that offer shared hosting or virtual hosting only use MySQL. I > > > will take care of the setup and everything myself but I have already > > > written my code using PG/PHP and I have no intention of switching. > > > Thanks. > > > > Well there's this list: > > > > http://www.postgresql.org/support/professional_hosting > > > > Also, maybe something like Amazon EC2 if you want your own "box"? I > > think the small instances are even free.. > > > > Mike -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PG9.0 planner difference to 8.3 -> majorly bad performance
> The duration suddenly goes from 270 milliseconds to 173 seconds! The index > scan on bprofile_comments_status_idx suddenly shows 15288 loops, where it > should be 1 loop just like before. So shomehow the 9.0 planner gets it all > wrong. > > I also noticed that normally I get an iowait with a few percent during such > operations (on 8.3), where with pg9 I get 0 iowait and 100% CPU. PG9 has a > much smaller memory footprint than 8.3 in the same configuration - so this > all makes very little sense to me. Maybe someone here has an idea. Usually someone here has a lot more knowledge than me and comes up with a viable hint rather quickly. Not so this time, I'm afraid. That tells me that something deep down changed. Here are a few observations I made: in PG9, NOT IN queries with a subselect (aka "select x from y where x.id not in (select id from some_other_table)") perform a heck of a lot better than in 8.x. On the same note, when you rewrite the query to use a left outer join with an "IS NULL" where clause, PG9 performs horribly. A query like "select x from y left outer join z on z.id=y.id where z.id is null" performs like a 1000 times slower than in 8.x. I'm not an expert at reading explain output, but every time I try the "left outer" solution to something that's basically a "not in", I get lousy performance. Looking at the explain, now a "left outer join" always implies as many loops over a nested block as there are rows in the referenced table. 8.x actually returns the rows in one loop. This seems to be an issue of what is done in what order. 8.x puts the "left outer" scan pretty early in the query, so the remaining joins already use a limited dataset. 9.0 puts the left outer towards the end of the nested blocks, which makes it work on the full set. Maybe this makes sense to someone with more in-depth knowledge of the changes. So far, to me it looks like: if you use 9.x, avoid left outer joins and use "NOT IN".
On pg versions prior to 9.x, avoid NOT IN and use left outer joins. Odd :-) Uwe -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
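For reference, the anti-join spellings discussed above, side by side (generic table names, not from the thread):

```sql
-- Form 1: NOT IN. Beware: if the subquery returns any NULL id,
-- NOT IN yields no rows at all (SQL three-valued logic).
SELECT y.* FROM y
WHERE y.id NOT IN (SELECT z.id FROM z);

-- Form 2: LEFT JOIN ... IS NULL, the classic manual anti-join.
SELECT y.* FROM y
LEFT OUTER JOIN z ON z.id = y.id
WHERE z.id IS NULL;

-- Form 3: NOT EXISTS - NULL-safe and, from 8.4 on, planned as a
-- true anti-join, so it is usually the safest of the three.
SELECT y.* FROM y
WHERE NOT EXISTS (SELECT 1 FROM z WHERE z.id = y.id);
```

Which form the planner handles best varies by version, as this thread shows; NOT EXISTS sidesteps the NULL pitfall of NOT IN either way.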
Re: [GENERAL] PG9.0 planner difference to 8.3 -> majorly bad performance
> * Uwe Schroeder (u...@oss4u.com) wrote: > > Now I turn off the 8.3 instance and start the 9.0 instance. Remember, everything is identical. Here the same query again: > Everything isn't identical if you just started PG 9.0 though- presumably > the 8.3 instance had everything cache'd already. What happens if you > run this query again under 9.0..? The 8.3 instance is also just started. I run both on the same system (for testing) so I turn one off to have the memory available. But yes, I did run the queries multiple times in a row with no major improvement. > > > The duration suddenly goes from 270 milliseconds to 173 seconds! The > > index scan on bprofile_comments_status_idx suddenly shows 15288 loops, > > where it should be 1 loop just like before. So shomehow the 9.0 planner > > gets it all wrong. > > You do have a different plan, but both of them have a Nested Loop, with > a Hash table built inside it. The 9.0 does also do the index scan > inside the loop, but if you look at the actual time, that's not really > causing a huge difference. One thing I'm wondering about is if 9.0 is > getting a more accurate view of the amount of data and is realizing that > it might go over work_mem with the big Hash Left Join, and so decides > against it. What does your work_mem setting look like on each system? > Have you tried increasing it? 
What has me bummed is this index scan:

Index Scan using bprofile_comments_status_idx on bprofile_comments c (cost=0.00..2558.77 rows=1531 width=12) (actual time=0.140..21.559 rows=1660 loops=1)

vs.

Index Scan using bprofile_comments_status_idx on bprofile_comments c (cost=0.00..4328.64 rows=1751 width=12) (actual time=0.033..8.097 rows=1872 loops=15288)

Unless I read this wrong, the upper (8.3) index scan fetches 1660 rows in up to 21 ms. The 9.0 plan comes up with an index scan on the same data which fetches 1872 rows in 8 ms but loops 15288 times (that's actually the number of records in the referenced table), which in my book makes this scan take up to 8 x 15288 = 122304 ms, or 122 seconds. work_mem is set to 50MB and increasing it to 80MB makes no difference. > > Thanks, > > Stephen > > > I also noticed that normally I get an iowait with a few percent during > > such operations (on 8.3), where with pg9 I get 0 iowait and 100% CPU. > > PG9 has a much smaller memory footprint than 8.3 in the same > > configuration - so this all makes very little sense to me. Maybe someone > > here has an idea. > > > > Thanks > > > > Uwe -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
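Stephen's work_mem suggestion can be tested per session without touching postgresql.conf, along these lines (the query is the one from the thread):

```sql
SHOW work_mem;            -- current value, e.g. 50MB
SET work_mem = '256MB';   -- session-local override, just for testing
EXPLAIN ANALYZE
SELECT count(v_bprofile_comments_club16.id)
FROM v_bprofile_comments_club16
WHERE v_bprofile_comments_club16.profile_id = '5584'
  AND v_bprofile_comments_club16.approved = true;
RESET work_mem;
```

If the plan flips back to the hash join at a larger setting, the planner's memory estimate was the deciding factor.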
Re: [GENERAL] PG9.0 planner difference to 8.3 -> majorly bad performance
> On 29 January 2011 09:11, Uwe Schroeder wrote: > > Maybe someone here can make sense of this. > > I'm trying to upgrade a 8.3 system to a 9.0 system. Usual procedure dump, > > restore, vac full, reindex. > > > > Both - old and new - run on the same hardware and the postgresql.conf > > settings are identical. > > > > You'll probably ask for the table definitions, which I'm happy to > > provide, but I'll omit them here for the sake of a shorter message. > > Basically everything is identical, data, tables, indexes, harware, > > config. > > > > I should mention that the "tables" are really views - maybe something in > > the views changed in 9.0.2 > > > > I run this query on the 8.3 system: > > > > explain analyze SELECT count(v_bprofile_comments_club16.id) FROM > > v_bprofile_comments_club16 WHERE v_bprofile_comments_club16.profile_id = > > '5584' AND v_bprofile_comments_club16.approved = true; QUERY PLAN > > > > - > > - Aggregate > > (cost=6294.37..6294.38 rows=1 width=4) (actual time=269.633..269.635 > > rows=1 loops=1) -> Nested Loop (cost=1889.71..6273.06 rows=8523 > > width=4) (actual time=156.585..266.325 rows=1641 loops=1) -> Nested > > Loop (cost=1889.71..5858.47 rows=779 width=16) (actual > > time=156.565..237.216 rows=1641 loops=1) -> Nested Loop > > (cost=1889.71..4488.01 rows=763 width=12) (actual time=156.453..200.174 > > rows=1641 loops=1) -> Index Scan using bprofile_pkey on bprofile m > > (cost=0.00..4.27 rows=1 width=4) (actual time=0.140..0.145 rows=1 > > loops=1) Index Cond: (id = 5584) > > -> Hash Left Join (cost=1889.71..4476.11 rows=763 > > width=16) (actual time=156.302..194.762 rows=1641 loops=1) Hash Cond: > > (b.uid = ug.user_id) > > -> Hash Join (cost=1821.55..4399.44 rows=763 > > width=20) (actual time=151.372..183.103 rows=1641 loops=1) Hash Cond: > > (c.from_id = b.id) > > -> Index Scan using > > bprofile_comments_status_idx on bprofile_comments c (cost=0.00..2558.77 > > rows=1531 width=12) (actual time=0.140..21.559 rows=1660 loops=1) 
Index > > Cond: ((profile_id = 5584) AND (approved = true)) Filter: approved > > -> Hash (cost=1726.15..1726.15 > > rows=7632 width=8) (actual time=151.131..151.131 rows=14782 loops=1) -> > > Hash Left Join (cost=61.50..1726.15 rows=7632 width=8) (actual > > time=2.622..119.268 rows=14782 loops=1) Hash Cond: (b.uid = ugi.user_id) > > Filter: (gi.group_name IS NULL) -> Seq Scan on bprofile b > > (cost=0.00..1579.44 rows=15265 width=8) (actual time=0.058..64.033 > > rows=15265 loops=1) Filter: (NOT deleted) -> Hash (cost=55.12..55.12 > > rows=510 width=13) (actual time=2.526..2.526 rows=231 loops=1) -> > > Nested Loop (cost=0.00..55.12 rows=510 width=13) (actual > > time=0.136..1.909 rows=231 loops=1) -> Seq Scan on tg_group gi > > (cost=0.00..1.07 rows=1 width=13) (actual time=0.041..0.050 rows=1 > > loops=1) Filter: ((group_name)::text = 'Club16'::text) -> Index Scan > > using user_group_group_idx on user_group ugi (cost=0.00..45.80 rows=660 > > width=8) (actual time=0.084..1.071 rows=231 loops=1) Index Cond: > > (ugi.group_id = gi.group_id) -> Hash (cost=55.35..55.35 rows=1025 > > width=4) (actual time=4.866..4.866 rows=1025 loops=1) -> Index Scan > > using user_group_group_idx on user_group ug (cost=0.00..55.35 rows=1025 > > width=4) (actual time=0.058..2.766 rows=1025 loops=1) Index Cond: > > (group_id = 2) -> Index Scan using bphotos_profile_primary_idx on > > bphotos p (cost=0.00..1.78 rows=1 width=4) (actual time=0.012..0.015 > > rows=1 loops=1641) Index Cond: ((p.profile_id = b.id) AND (p.is_primary > > = true)) -> Index Scan using bphotos_profile_primary_idx on bphotos p > > (cost=0.00..0.52 rows=1 width=4) (actual time=0.008..0.011 rows=1 > > loops=1641) Index Cond: ((p.profile_id = b.id) AND (p.is_primary = > > true)) Total runtime: 270.808 ms > > (33 rows) > > > > > > As you can see, the query performs nicely (for the hardware used). > > > > Now I turn off the 8.3 instance and start the 9.0 instance. Remember, > > everything is identical. 
Here the same query again: > > > > explain analyze SELECT count(v_bprofile_comments_club
[GENERAL] PG9.0 planner difference to 8.3 -> majorly bad performance
Maybe someone here can make sense of this. I'm trying to upgrade an 8.3 system to a 9.0 system. Usual procedure: dump, restore, vacuum full, reindex. Both - old and new - run on the same hardware and the postgresql.conf settings are identical. You'll probably ask for the table definitions, which I'm happy to provide, but I'll omit them here for the sake of a shorter message. Basically everything is identical: data, tables, indexes, hardware, config. I should mention that the "tables" are really views - maybe something in the views changed in 9.0.2. I run this query on the 8.3 system:

explain analyze SELECT count(v_bprofile_comments_club16.id) FROM v_bprofile_comments_club16 WHERE v_bprofile_comments_club16.profile_id = '5584' AND v_bprofile_comments_club16.approved = true;

QUERY PLAN
----------------------------------------------------------------------
Aggregate (cost=6294.37..6294.38 rows=1 width=4) (actual time=269.633..269.635 rows=1 loops=1)
  -> Nested Loop (cost=1889.71..6273.06 rows=8523 width=4) (actual time=156.585..266.325 rows=1641 loops=1)
     -> Nested Loop (cost=1889.71..5858.47 rows=779 width=16) (actual time=156.565..237.216 rows=1641 loops=1)
        -> Nested Loop (cost=1889.71..4488.01 rows=763 width=12) (actual time=156.453..200.174 rows=1641 loops=1)
           -> Index Scan using bprofile_pkey on bprofile m (cost=0.00..4.27 rows=1 width=4) (actual time=0.140..0.145 rows=1 loops=1)
              Index Cond: (id = 5584)
           -> Hash Left Join (cost=1889.71..4476.11 rows=763 width=16) (actual time=156.302..194.762 rows=1641 loops=1)
              Hash Cond: (b.uid = ug.user_id)
              -> Hash Join (cost=1821.55..4399.44 rows=763 width=20) (actual time=151.372..183.103 rows=1641 loops=1)
                 Hash Cond: (c.from_id = b.id)
                 -> Index Scan using bprofile_comments_status_idx on bprofile_comments c (cost=0.00..2558.77 rows=1531 width=12) (actual time=0.140..21.559 rows=1660 loops=1)
                    Index Cond: ((profile_id = 5584) AND (approved = true))
                    Filter: approved
                 -> Hash (cost=1726.15..1726.15 rows=7632 width=8) (actual time=151.131..151.131 rows=14782 loops=1)
                    -> Hash Left Join (cost=61.50..1726.15 rows=7632 width=8) (actual time=2.622..119.268 rows=14782 loops=1)
                       Hash Cond: (b.uid = ugi.user_id)
                       Filter: (gi.group_name IS NULL)
                       -> Seq Scan on bprofile b (cost=0.00..1579.44 rows=15265 width=8) (actual time=0.058..64.033 rows=15265 loops=1)
                          Filter: (NOT deleted)
                       -> Hash (cost=55.12..55.12 rows=510 width=13) (actual time=2.526..2.526 rows=231 loops=1)
                          -> Nested Loop (cost=0.00..55.12 rows=510 width=13) (actual time=0.136..1.909 rows=231 loops=1)
                             -> Seq Scan on tg_group gi (cost=0.00..1.07 rows=1 width=13) (actual time=0.041..0.050 rows=1 loops=1)
                                Filter: ((group_name)::text = 'Club16'::text)
                             -> Index Scan using user_group_group_idx on user_group ugi (cost=0.00..45.80 rows=660 width=8) (actual time=0.084..1.071 rows=231 loops=1)
                                Index Cond: (ugi.group_id = gi.group_id)
              -> Hash (cost=55.35..55.35 rows=1025 width=4) (actual time=4.866..4.866 rows=1025 loops=1)
                 -> Index Scan using user_group_group_idx on user_group ug (cost=0.00..55.35 rows=1025 width=4) (actual time=0.058..2.766 rows=1025 loops=1)
                    Index Cond: (group_id = 2)
        -> Index Scan using bphotos_profile_primary_idx on bphotos p (cost=0.00..1.78 rows=1 width=4) (actual time=0.012..0.015 rows=1 loops=1641)
           Index Cond: ((p.profile_id = b.id) AND (p.is_primary = true))
Re: [GENERAL] Subselect AS and Where clause
> Uwe Schroeder, 26.01.2011 08:34: > > I have a query like this: > > > > SELECT a,b,c, (select problem from other_table where id=a) as problem > > FROM mytable WHERE a=1 > > > > So far so good. Actually "problem" always resolves to one record, so it's > > not the "multiple records returned" problem. > > > > What I try to do is this: > > > > SELECT a,b,c, (select problem from other_table where id=a) as problem > > FROM mytable WHERE a=1 and problem = 3 > > > > see the "problem=3" part in the where clause? The error I get is > > > > SQLError: (ProgrammingError) column "problem" does not exist > > You need to wrap the whole SELECT in order to be able to use the column > alias: > > SELECT * > FROM ( >SELECT a, > b, > c, > (select problem from other_table where id=a) as problem >FROM mytable > ) t > WHERE a=1 >AND problem = 3 > > Regards > Thomas Thanks a lot Thomas! There's the reason why open source like postgresql is far superior to anything commercial - an answer when you need it! Works like a charm now. Uwe -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Subselect AS and Where clause
Google being useless tonight - now that's new :-) What I'm trying to do is the following and I'm certain there is a simple solution which eludes me: I have a query like this: SELECT a,b,c, (select problem from other_table where id=a) as problem FROM mytable WHERE a=1 So far so good. Actually "problem" always resolves to one record, so it's not the "multiple records returned" problem. What I try to do is this: SELECT a,b,c, (select problem from other_table where id=a) as problem FROM mytable WHERE a=1 and problem = 3 see the "problem=3" part in the where clause? The error I get is SQLError: (ProgrammingError) column "problem" does not exist Do I miss something? Shouldn't the "as" assignment make this virtual column available to the where clause? I think this should work and I know it works if I'd make a view out of the query. However, the query is built dynamically, so turning it into a view isn't really an option. Any pointer will be greatly appreciated. Uwe -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
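Background on why this fails: a SELECT-list alias cannot be referenced in the same query level's WHERE clause, because WHERE is evaluated before the SELECT list. Besides wrapping the query in a derived table (as the reply above shows), a CTE works too on PostgreSQL 8.4 and later; this sketch reuses the hypothetical tables from the post:

```sql
WITH t AS (
    SELECT a, b, c,
           (SELECT problem FROM other_table WHERE id = a) AS problem
    FROM mytable
)
SELECT *
FROM t
WHERE a = 1 AND problem = 3;
```

Since the query is built dynamically anyway, either the subquery wrapper or the CTE can be generated around the inner statement without turning it into a view.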
Re: [GENERAL] Moving from SQL Anywhere to PostGres - First Time
> Hi, > > Price and features. There are some changes in V12 which is causing us to > do some steps with our support and resellers. We've always wanted to move > away from Sybase so we can make more of a profit off our software and lower > costs for us and our clients. Sybase is a wonderful product, and when > things stay the same - we stay the same. But, since V12 is requiring us > to make internal changes, step changes, documentation changes, training > changes ... then it's a good time for us to stop and look at alternatives > if we were going to look at them anyway. > > If we can save $100K a year moving to PostGres and give our customers > better pricing, then it only makes sense to move - if PostGres does the > same thing that Sybase does. This is quite a jump. I've worked with Sybase on military projects - pretty much all the application logic coded in Sybase stored procedures. This would be a scenario I'm not happy to convert to any other database - except maybe MS SQL, because that's historically an offspring of Sybase and rather similar at the base of things (although a decade of MS intervention sure changed things a lot). However, if your app doesn't rely on stored procedures heavily, the switch may well be worth it. I've used postgresql long before it was even called that - even back in the day when it was postgres and there was no sql whatsoever. Still, this database has yet to fail me. I've written software for government, military and the financial industry, and most of the projects I worked on where I recommended postgres(sql) as the DB backend are still around, maintained, and run like a charm. One of my insurance clients had a postgresql 8.2 database running for well over a year with nobody checking in on it (see -> recession - they fired all their IT staff). The old server is still running just fine - just checked in on it the other day and I was the first to log on in over a year - runs, no questions asked.
So yes, if you want a database that just does the job I'd really recommend postgresql. Sure, you can do the same with mucho commercial products, but personally I'd rather shoot a message to the mailing list here and get a very professional/knowledgeable/qualified answer within hours - that's much better than any commercial big business support I've experienced (usually you call the support for your vendor and you get someone who basically has no clue what you're talking about - been there, done that - worked for Sun Microsystems 3rd level operating system support for years and was always amazed that few other people could answer the questions of customers who paid 7+ digit support contracts - i.e. BMW, Audi etc.) > > They are a premium software vendor and have a premium product, but charge a > premium price for it. They've made huge changes that affect us, and won't > make any exceptions or changes for us, so now is the time to start looking > at our options. Have to say: here you get a premium product at cost of zip, nada, nothing, and when you shout out to the mailing list - and you provide useable information - chances are you'll get premium support from people like Tom Lane. No company I know of can beat that kind of "easy access to those who know what they're talking about". Don't get me wrong: Sybase is a good product and has been around for years. But if you're on your road to change, going open source is definitely an option to look into. To sum it up: I'd put Postgresql on par with Oracle or DB2 any day of the week - except it's not subject to exorbitant license fees and the support is much better. It's not the fastest database on the market, but if you want something that just works reliably, PostgreSQL will serve you well. > > -Robert > > > "Michael Gould" wrote in message > news:393bf55cebd45c71fec623552acbd...@intermodalsoftwaresolutions.net... > > Robert, > > We used Data Manager from JP to do this. Worked well.
He recently added > the ability to read OEM versions of ASA. How's come your moving away from > SQL Anywhere? > > Best Regards > > Mike Gould > > "Robert Paresi" wrote: > > Hello, > > > > We have 700 user install base using Sybase SQL Anywhere 9.02 > > > > We are looking at migrating these installations over to PostGres > > > > 1. Very Very Short Answer Please - why should we? > > > > 2. Does anyone have a utility or migration application to read SQL > > Anywhere > > > to go to PostGres > > > > 3. Does PostGres handle column descriptions (ie: you can give each > > column > > a > > > 50 character description) and then access it via SQL Result Set (like I > > can > > > do in Sybase) > > > > 4. Is there any Date/TimeStamp issues and conversions I need to know > > about. > > > I use simply a DATE field and a TIME field - but do not use DATE/TIME > > stamp > > > fields together. > > > > 5. What UI/Developer tools (GUI) are available to manage the database as > > well as add/c
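Regarding question 3 in the quoted message (column descriptions): PostgreSQL stores these via COMMENT ON, and they can be read back in an ordinary result set. A minimal sketch with made-up table and column names:

```sql
COMMENT ON COLUMN invoices.ship_date IS 'Date the goods left the warehouse';

-- Read descriptions back; col_description(table_oid, column_number)
-- returns the comment for one column.
SELECT a.attname,
       col_description(a.attrelid, a.attnum) AS description
FROM pg_attribute a
WHERE a.attrelid = 'invoices'::regclass
  AND a.attnum > 0
  AND NOT a.attisdropped;
```

Comments can be any length, so a 50-character Sybase description migrates without truncation.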
Re: [GENERAL] What is better method to backup postgresql DB.
> In response to Steeles : > > new to postgresql. need to backup postgresql DB, which way is better to > > backup DB. > > > > from training, I learned that we can backup the whole PGdata and other > > directories to achieve backup goal, originally I was planned to schedule > > jobs to use pgdump to backup all DBs. > > > > so which way is better for backup or there is other way to backup PGDB. > > Use pg_dump or pg_dumpall. If you're at the level that you have to ask > this question, then you'll have nothing but trouble getting reliable > backups by backing up directories. > > Also, since you're new to PostgreSQL, I _HIGHLY_ recommend that you don't > assume that you're getting backups until you can demonstrate that you can > restore them. Good advice! For all my production systems I do a live backup/replication using Slony plus a nightly pg_dump which is then copied to a remote server. That way, no matter what happens, I have a backup that's at most 1 day old (if I have to use the pg_dump) or a few seconds behind (if I use the replicated database - which of course is also remote and not in the same physical location). Whatever you use, always make sure you actually can restore the backup. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
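A nightly pg_dump job of the kind described above might look like this; paths, database name, and schedule are placeholders, not from the thread:

```shell
# Custom-format dump: compressed, and restorable table-by-table via pg_restore
pg_dump -Fc mydb -f /backup/mydb-$(date +%F).dump

# Roles and tablespaces are not included in pg_dump output; grab them too
pg_dumpall --globals-only > /backup/globals-$(date +%F).sql

# Periodically PROVE the backup restores: load it into a scratch database
createdb mydb_restore_test
pg_restore -d mydb_restore_test /backup/mydb-$(date +%F).dump
```

The restore-test step is the point of the advice quoted above: a dump that has never been restored is not yet a backup.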
Re: [GENERAL] Trade Study on Oracle vs. PostgreSQL
> On 25/09/2010 4:55 AM, Leif Biberg Kristensen wrote: > > On Friday 24. September 2010 20.04.26 Lincoln Yeoh wrote: > >> Then again, does Tom ever sleep? Maybe he sleeps in the afternoon? Or > >> is that when he does intensive coding? > > > > Once there was a suggestion on this list to give Tom every other week off > > to level the competition with the other RDBMSes. > > > > That being said, the PostgreSQL mail lists is an excellent and totally > > free support channel. Speaking for myself, I've never needed any support > > outside the mail lists. And Tom is of course a great big part of that. > > I couldn't agree more. I really appreciate the time and effort people > here have spent helping out. Again, Tom Lane particularly so, for > amazing patience, psychic debugging skills, and endless willingness to > help anybody and everybody. Don't omit other major contributors like Joshua or Scott... (and others I didn't mention by name). Certainly Tom doesn't seem to need sleep, but there's quite a few more people willing and able to contribute and answer questions. They've helped me out on more than one occasion, and I've been using postgresql long before it was named postgresql (even back in the days when the linux kernel had a 0.9 version number). Sure, many of the current contributors weren't around back then, but they picked up where others left off and made postgresql what it is today: a quite viable alternative to commercial "pay lots of dollars" databases. Suffice it to say, I have yet to see a showstopper (provided that you run postgresql on decent hardware). Since they just opened a beer tasting room around the corner of my home, here's to all the active contributors - cheers.
(and if any of the core developers/contributors ever come to Santa Rosa, CA - drop me a note and I'll have you over for a beer or wine and a nice BBQ) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Transposing rows and columns
> I'm working with some people who live and breath Excel. I need to be able > to move data back and forth between formats which make sense for Excel and > for PostgreSQL. In some cases, this is just to accommodate what people are > used to. In other cases, like statistical clustering, it's something that > really has to be done. > > Here is a simplified example: > > I'm given data in Excel with one sheet each for a bunch of experiments. In > each sheet, there are rows with different drugs at different doses and > columns for each subject. The cells contain the response data. I wrote a > Perl script which automates the process of extracting that data into a csv > file which can be imported into a table like the following: > > create table results( > expt_no int references experiments(id), > subject int references subjects(id), > drug text references drugs(name), > dose numeric, > response numeric > ) > > Now, suppose I do some computation on the results in the database and want > to export it back out to the same kind of format that I received it (drugs > and doses in rows and subjects in columns.) One method would be to use > Perl. I could use DBD::Pg and loop through a bunch of queries to build a > two dimensional array and then spit that back out but is there a good way > to do this just in SQL? Is there a better way than creating a temporary > table for each subject and then joining all the temp tables? You may want to look into the tablefunc contrib module. It contains a crosstab which will transpose rows and columns in the result. This may be slow though. HTH -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
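The tablefunc crosstab mentioned above could be applied to the results table from the question roughly as follows. This is a sketch: the output column list must be spelled out by hand, and I'm assuming three subjects; the two-argument form of crosstab is generally safer because it keeps values aligned to the right column when some subjects are missing a measurement.

```sql
-- On 9.1+: CREATE EXTENSION tablefunc;
-- On older versions: load contrib/tablefunc.sql into the database first.
SELECT *
FROM crosstab(
       $$SELECT drug || ' @ ' || dose, subject, response
         FROM results
         WHERE expt_no = 1
         ORDER BY 1, 2$$,
       $$SELECT DISTINCT subject FROM results WHERE expt_no = 1 ORDER BY 1$$
     ) AS ct(drug_dose text, subj_1 numeric, subj_2 numeric, subj_3 numeric);
```

The fixed column list is the main limitation: if the set of subjects varies per experiment, the SQL has to be generated dynamically (e.g. from the Perl side), which is why a script-side pivot is sometimes simpler after all.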
Re: [GENERAL] calculating distance between longitude and latitude
> Does postgresql have functions to calculate the distance between two
> sets of longitude and latitude?

You're looking for the earthdistance contrib module. With most Linux
distros it's installed under /usr/share/postgresql/8.xx/contrib. You may
have to install a "postgresql-contrib" package depending on your distro.
Typing "locate earthdistance.sql" should reveal the location if it's
available already.

To activate you'd just do a "psql [database] < /wherever/earthdistance.sql"

HTH

Uwe
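Once the module is loaded, usage looks roughly like this (the coordinates below are made-up examples; note that earthdistance also depends on the cube module for the ll_to_earth() variant):

```sql
-- cube-based API: ll_to_earth(latitude, longitude); distance in meters.
SELECT earth_distance(
    ll_to_earth(38.44, -122.71),
    ll_to_earth(37.77, -122.42)
) AS meters;

-- simpler point-based operator: point(longitude, latitude) <@> point(...);
-- distance in statute miles.
SELECT point(-122.71, 38.44) <@> point(-122.42, 37.77) AS miles;
```

Mind the argument order: ll_to_earth() takes latitude first, while point() takes longitude first.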
Re: [GENERAL] virtualidx exclusive lock
On Sunday 08 November 2009 11:38:28 pm Uwe Schroeder wrote:
> I've googled, but there's 0 hits.
>
> I have an issue with a ton of "idle in transaction" backends. What I
> noticed is when I look at pg_locks, pretty much all of the processes
> being idle in transaction have an exclusive lock of locktype "virtualidx".
>
> Well, that doesn't make sense to me, but maybe someone here can tell me
> where a "virtualidx" locktype would come from. I'm sure it has to be some
> type of query. There is no info about table or anything, all the records
> look like:
>
> Is there a way to find out what query, or in lack of that at least what
> table is involved?

Thanks everyone. No wonder I didn't find anything on google :-)

Turns out the issue was related to the ORM my app is using. That darn thing
keeps a cursor open for every select - which certainly keeps the transaction
alive (why it uses a transaction for a simple select is the other thing).
Anyway, I got it fixed.

Thanks

Uwe
[GENERAL] virtualidx exclusive lock
I've googled, but there's 0 hits.

I have an issue with a ton of "idle in transaction" backends. What I noticed
is when I look at pg_locks, pretty much all of the processes being idle in
transaction have an exclusive lock of locktype "virtualidx".

Well, that doesn't make sense to me, but maybe someone here can tell me where
a "virtualidx" locktype would come from. I'm sure it has to be some type of
query. There is no info about table or anything, all the records look like:

  locktype  | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |     mode      | granted
------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+---------------+---------
 virtualxid |          |          |      |       | 63/10150   |               |         |       |          | 63/10150           | 31932 | ExclusiveLock | t
 virtualxid |          |          |      |       | 48/48530   |               |         |       |          | 48/48530           | 31323 | ExclusiveLock | t
 virtualxid |          |          |      |       | 47/52387   |               |         |       |          | 47/52387           | 31321 | ExclusiveLock | t
 virtualxid |          |          |      |       | 76/4086    |               |         |       |          | 76/4086            | 32074 | ExclusiveLock | t
 virtualxid |          |          |      |       | 15/6007096 |               |         |       |          | 15/6007096         | 31169 | ExclusiveLock | t
 virtualxid |          |          |      |       | 10/5689919 |               |         |       |          | 10/5689919         | 31595 | ExclusiveLock | t
 virtualxid |          |          |      |       | 32/603998  |               |         |       |          | 32/603998          | 31213 | ExclusiveLock | t
 virtualxid |          |          |      |       | 42/117511  |               |         |       |          | 42/117511          | 31270 | ExclusiveLock | t
 virtualxid |          |          |      |       | 39/279415  |               |         |       |          | 39/279415          | 31267 | ExclusiveLock | t

Is there a way to find out what query, or in lack of that at least what table
is involved?

Thanks

Uwe
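One way to chase the question at the end of the message is to join pg_locks against pg_stat_activity. A sketch, using the column names of the 8.x era this thread is from (procpid/current_query; on 9.2 and later they became pid/query) - for an "idle in transaction" backend current_query only shows "<IDLE> in transaction", so you mainly learn which session and user holds the lock:

```sql
SELECT l.locktype, l.virtualtransaction, l.mode,
       a.procpid, a.usename, a.current_query
FROM pg_locks l
JOIN pg_stat_activity a ON a.procpid = l.pid
WHERE l.locktype = 'virtualxid'
  AND l.mode = 'ExclusiveLock';
```

For relation-level locks the pg_locks.relation column can additionally be joined to pg_class.oid to get the table name; virtualxid locks carry no relation, which is why those columns are empty above.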
Re: [GENERAL] Backups
Personally I prefer multiple stages of backups off-site. So for my production
webserver database, I have slony replicating the database to a different
location. In addition I run full dumps every 12 hours, which in turn I
replicate using rdist to a remote system. That way, whatever happens, the max
data loss in the worst case scenario is about 12 hours - which for my
application is acceptable.

Oh yes, forgot to mention: the replicated database also gets dumped and goes
onto a tape every day.

On Saturday 07 November 2009 11:41:55 pm Bret wrote:
> I need to back up a production database every night
> on FreeBSD 7.2, running Postgresql 8.3.
>
> Any good backup tips I should be aware of?
>
> Typically, I make a backup based on the current day,
> and rotate the seven days in the backup file name
> (eg; sat_backup, sun_backup etc).
>
> Thanks for all the chatter.
>
> Bret Stern
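The day-of-week rotation Bret describes can be cron'd with a few lines of shell. A sketch - the database name, user, and paths below are placeholders, not from the thread; the script prints the command so it can be inspected anywhere, whereas in cron you would execute it instead:

```shell
#!/bin/sh
# Day-of-week pg_dump rotation sketch. "mydb", "postgres" and the output
# directory are example values -- adjust for your setup.
# -Fc writes PostgreSQL's compressed custom format, restorable via pg_restore.
DAY=$(date +%a | tr '[:upper:]' '[:lower:]')     # e.g. "sat"
OUT="/var/backups/pgsql/${DAY}_backup.dump"
CMD="pg_dump -Fc -U postgres -f ${OUT} mydb"
# Print the command here so the sketch runs without a live server; in the
# real cron job, run $CMD and then copy $OUT off-site (rdist, rsync, scp).
echo "$CMD"
```

Because the filename cycles through seven weekday names, each nightly run overwrites the dump from a week ago, giving a rolling seven-day window without any extra cleanup logic.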