Re: [GENERAL] delete is getting hung when there is a huge data in table

2015-05-02 Thread Uwe Schroeder

This delete runs in a single transaction. That means the entire transaction 
has to complete before you will see anything deleted. Interrupting the 
transaction simply rolls it back, so nothing is deleted.
Tom already pointed out the potential foreign key slowdown; another cause may 
simply be drive speed.

My recommendation: cut the delete into chunks, for example by deleting the data 
one week at a time. That way each transaction is smaller, the dataset to delete 
is smaller, and each chunk finishes quicker.
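
One chunk could look like this (a sketch only - table and column names taken 
from the query quoted below; adjust the range and repeat until you reach the 
cutoff):

delete from audittraillogentry
 where intime >= timestamp '2014-11-01'
   and intime <  timestamp '2014-11-08';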

Uwe



On Sun, May 03, 2015 03:24:25 AM Mitu Verma wrote:
> Hi,
> 
> I am facing an issue with the deletion of a huge amount of data.
> We have a cron script which is used to delete the last 3 months of data from
> one of the tables. The table is large (8872597 rows, as you can see from the
> count below) since it holds the last 3 months of data.
> 
> fm_db_Server3=# select count(*) from audittraillogentry ;
> 
>   count
> ---------
>  8872597
> (1 row)
> 
> Now the issue is that when this script for the deletion of data is launched,
> it runs for more than 7 days and does nothing, i.e. not a single row has been
> deleted.
> 
> Then we stopped the script, terminated the database sessions using SELECT
> pg_terminate_backend(pid), and ran the following command:
> 
> delete from audittraillogentry where intime <= to_timestamp('2015-01-30
> 23:59:59.999', 'YYYY/MM/DD-HH24:MI:SS.FF3') OR outtime <=
> to_timestamp('2015-01-30 23:59:59.999', 'YYYY/MM/DD-HH24:MI:SS.FF3');
> 
> Still this delete operation is not working, and not a single row has been
> deleted from the table.
> 
> Now I have the following questions:
> 
> 1. Does PostgreSQL have some limitation for the deletion of large data?
> 2. Should I run vacuum after stopping the cron script, to get a "smaller"
> table?
> 3. Would dropping the indexes help here? Not sure.
> 4. Should I think about partitioning? Is there any limitation when dealing
> with large data in PostgreSQL?
> 
> regards
> Mitu
> 



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] earthdistance

2013-08-09 Thread Uwe Schroeder

How accurate do you need it?  My website has a lot of "local" listing stuff 
based on a distance from the viewer and I use the earthdistance module in 
contrib to do it. 
Granted, it's not accurate enough to calculate a surgical missile strike, but 
for "within 20 miles" type of things it's good enough - give or take a mile. 

The problem I have is determining the location of the viewer/poster. Aside 
from asking for precise lat/long coordinates (which no user will have any clue 
about), the next best thing is to rely on smartphones and their GPS - but what 
about regular browser/computer users? When I let Google map my location it shows 
as San Francisco - which is about 56 miles off. So network location makes no 
sense for this. 
I'm using a zipcode-based geographical mapping, which already has flaws since a 
zipcode is an area, not a point. The commonly available zipcode databases give 
you the geographical center of the zipcode - which certainly will be some 
distance off from the real location.

I found that the inaccuracies work for my application - nobody cares about a 
few more or less miles when looking for something. The advantage is that it 
also protects the privacy of the poster to some degree - nobody really needs 
to know exactly where the post originated...

If "openbarter" is what I think it is (kinda craigslist just with bartering), 
I think a few miles off won't make a difference. Chances are, your members 
won't 
divulge their true location anyways. We have members from South Africa using a 
US zipcode - which is what ... 5000 miles off?
Earthdistance is definitely easy to deal with - just give it a latitude/longitude 
and off you go.
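
Something along these lines (a sketch - the "places" table and its lat/lon 
columns are made up; earthdistance needs the cube module installed first):

CREATE EXTENSION cube;
CREATE EXTENSION earthdistance;

-- distance in meters between two latitude/longitude pairs
SELECT earth_distance(ll_to_earth(37.7749, -122.4194),
                      ll_to_earth(37.3382, -121.8863));

-- "within 20 miles" of the viewer, index-assisted via earth_box
CREATE INDEX places_earth_idx ON places USING gist (ll_to_earth(lat, lon));

SELECT p.*
  FROM places p
 WHERE earth_box(ll_to_earth(37.7749, -122.4194), 20 * 1609.344)
       @> ll_to_earth(p.lat, p.lon);
-- note: earth_box is a bounding box, so for exact results recheck with
--       earth_distance(...) <= 20 * 1609.344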

Uwe




On Fri, 08/09/2013 09:29:49 PM Olivier Chaussavoine wrote:

I develop a project, openbarter, that needs to match objects based on a maximum 
distance between their positions on earth. I saw that the documentation of the 
extension earthdistance was interesting, but the promise was not fulfilled in the 
code. It would be nice to have these functions available independently of 
sophisticated geographic systems. There is a circle object for flat 
two-dimensional space, but the earth deals with spherical caps. It would not be 
exact, but it would be enough to assume that the earth is a sphere and that all 
dimensions (latitude, longitude and distance) are in radians.
What would need to be done to adapt the circle type to a new type 'spherical 
cap' that would allow simple geographic indexing? 

-- 
Olivier Chaussavoine





Re: [GENERAL] Views versus user-defined functions: formatting, comments, performance, etc.

2012-08-17 Thread Uwe Schroeder


> Hi:
> 
> I notice when I save a view, I lose all the formatting and comments.
> As I was writing a complicated view, wanting to retain the format and
> comments, I thought I could just save it as a function that returns a
> table value.  A function would evaluate to the same value as a view,
> but changing it later might be less confusing.
> 
> However, I'm guessing (since I don't know anything about the
> internals) that the loss of formatting and comments is a result of the
> view being processed and stored in a more computer-friendly format,
> while functions are simply stored as the text that I type.  That gives
> me reason to suspect there may be performance or other differences
> between the same SQL statement stored either as a view or a
> user-defined function.
> 
> So that's my question: as someone who doesn't have a problem with
> putting a pair of empty parentheses at the end of a table variable
> name, what factors should I be thinking of while deciding whether to
> store my self-composed, multi-hundred-line long SQL statement as a
> view or a function?

To put it simply: views are basically stored SQL queries. Functions are just 
that - functions. The latter require parsing most of the time; the former are 
simply executed like any other query. (I think views are actually kind of 
prepared, so there's less overhead than executing the same query several times, 
but I may be wrong, and there are far more qualified people on this list to 
answer that.)
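
For illustration, the same (made-up) query stored both ways:

create view active_orders as
    select o.id, c.name, o.total
      from orders o
      join customers c on c.id = o.customer_id
     where not o.cancelled;

create function active_orders_fn()
returns table (id integer, name text, total numeric) as $$
    select o.id, c.name, o.total
      from orders o
      join customers c on c.id = o.customer_id
     where not o.cancelled;
$$ language sql stable;

-- both are queried the same way:
--   select * from active_orders;
--   select * from active_orders_fn();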

I run a website with a few million pages a month, and every page is assembled 
out of database records (around 200 tables). I mostly use views for pretty 
much everything, since a normalized database almost always requires joins over 
several tables to get the result you want. The framework I use relies on 
SQLAlchemy (an object-oriented SQL mapper for Python) - and I'm just not the 
person to code SQL in Python when I have a perfectly good database that is much 
better at that task. So I use views and only use the mapper to map the result 
from SQL to Python. 
I learned by experience that functions/procedures are slower and, in my eyes, 
more cumbersome to maintain and debug. I maintain all the database source in 
an SCM and I don't use any graphical tools for the SQL - a good old emacs does 
it for me nicely. I also have SQL scripts that allow me to update views. 
My views often depend on each other, so replacing one usually cascades to 
others, breaking the whole scheme. A script applying the views in the correct 
order helps a lot there.
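
A sketch of what such a script can look like (view names made up; dependents 
are dropped first and recreated in order):

begin;
drop view if exists v_report;   -- depends on v_base, so it goes first
drop view if exists v_base;
create view v_base as
    select id, name from customers;
create view v_report as
    select id, upper(name) as display_name from v_base;
commit;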

So basically, use views for performance, maintenance and sometimes programming 
reasons (views behave like tables, and often your application layer can't tell 
the difference, which helps), and use functions where you need the extra 
functionality a view simply can't provide (e.g. you need to update a record 
when someone views a different record). Also think about triggers; they can be 
quite useful for things like my example of needing to update a record. 
Triggers don't require you to explicitly call the function - the database will 
do that for you (which kind of obscures that there is something happening...).
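
A minimal trigger sketch for that example (tables and columns made up):

create function bump_view_count() returns trigger as $$
begin
    -- update one record because another record was viewed
    update profiles set view_count = view_count + 1
     where id = new.viewed_id;
    return new;
end;
$$ language plpgsql;

create trigger views_log_ins
    after insert on views_log
    for each row execute procedure bump_view_count();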

Uwe




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Hint for a query

2011-11-04 Thread Uwe Schroeder


> I have this tables
> 
> 
> Table: Contact
> IdContact
> First Name
> Second Name
> … other columns
> 
> Table: Employee
> IdEmployee
> IdContact, related to Contact table
> … other columns
> 
> Table: Salesman
> IdSaleman
> IdEmployee, if salesman is employee, related to Employee table
> IdContact, if salesman is not an employee, related to Contact table
> 
> 
> I need a query
> 
> Id Salesman - Second name - First name
> 
> But I can't figure how to do it, can someone can give advise?
> 
> Thanks


Needless to say, this is a bit of an odd table layout. You always end up at the 
contact table, but the layout makes it harder to query. Personally I'd have a 
foreign key from the contact table to the employee table, rather than from the 
employee table to the contact table - that would also eliminate the employee 
foreign key in the salesman table. It would also allow you to just join the 
salesman table to the contact table and then figure out whether the contact is 
an employee.

Well, that said, here's a quick one without a lot of deep thought:

select a.idsalesman, b.firstname, b.secondname
  from salesman a
  join contact b on b.idcontact = a.idcontact
union
select c.idsalesman, d.firstname, d.secondname
  from salesman c
  join employee e on e.idemployee = c.idemployee
  join contact d on d.idcontact = e.idcontact;

No guarantees though. It's midnight here and I had a long day...
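
An alternative single-pass version (equally untested) that falls back to the 
salesman's own contact when there is no employee record:

select s.idsalesman, c.firstname, c.secondname
  from salesman s
  left join employee e on e.idemployee = s.idemployee
  join contact c on c.idcontact = coalesce(e.idcontact, s.idcontact);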

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Searching for "bare" letters

2011-10-02 Thread Uwe Schroeder
> Hi, everyone.  Uwe wrote:
> > What kind of "client" are the users using?  I assume you will have some
> > kind of user interface. For me this is a typical job for a user
> > interface. The number of letters with "equivalents" in different
> > languages are extremely limited, so a simple matching routine in the
> > user interface should give you a way to issue the proper query.
> 
> The user interface will be via a Web application.  But we need to store
> the data with the European characters, such as ñ, so that we can display
> them appropriately.  So much as I like your suggestion, we need to do
> the opposite of what you're saying -- namely, take a bare letter, and
> then search for letters with accents and such on them.
> 
> I am beginning to think that storing two versions of each name, one bare
> and the other not, might be the easiest way to go.   But hey, I'm open
> to more suggestions.
> 
> Reuven


That still doesn't prevent you from using a matching algorithm. Here's a simple 
example (to my understanding of the problem):
You have texts stored in the db containing both an n and a ñ. Now a client 
enters "n" on the website. What you want to do is look for both variations, so 
"n" translates into "n" or "ñ".
There you have it. In the routine that receives the request you have a 
matching method that matches on "n" (or any of the few other characters with 
equivalents), and the routine issues a query with xx LIKE '%n%' OR xx 
LIKE '%ñ%' (personally I would use ILIKE, since that eliminates the case 
problem).

Since you're referring to a "name", I sure don't know the specifics of the 
problem or the data layout, but from what I know I think you can tackle this 
with a rather primitive "match -> translate to" kind of algorithm.
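
In SQL, the expanded query would simply look like this (table and column names 
made up):

select *
  from people
 where name ilike '%n%'
    or name ilike '%ñ%';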

One thing I'd not do: store duplicate versions. There's always a way to deal 
with data the way it is. In my opinion storing different versions of the same 
data just bloats a database in favor of a smarter way to deal with the initial 
data.

Uwe



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Searching for "bare" letters

2011-10-01 Thread Uwe Schroeder


> Hi, everyone.  I'm working on a project on PostgreSQL 9.0 (soon to be
> upgraded to 9.1, given that we haven't yet launched).  The project will
> involve numerous text fields containing English, Spanish, and Portuguese. 
> Some of those text fields will be searchable by the user.  That's easy
> enough to do; for our purposes, I was planning to use some combination of
> LIKE searches; the database is small enough that this doesn't take very
> much time, and we don't expect the number of searchable records (or
> columns within those records) to be all that large. The thing is, the
> people running the site want searches to work on what I'm calling (for
> lack of a better term) "bare" letters.  That is, if the user searches for
> "n", then the search should also match Spanish words containing "ñ".  I'm
> told by Spanish-speaking members of the team that this is how they would
> expect searches to work.  However, when I just did a quick test using a
> UTF-8 encoded 9.0 database, I found that PostgreSQL didn't see the two
> characters as identical.  (I must say, this is the behavior that I would
> have expected, had the Spanish-speaking team member not said anything on
> the subject.) So my question is whether I can somehow wrangle PostgreSQL
> into thinking that "n" and "ñ" are the same character for search purposes,
> or if I need to do something else -- use regexps, keep a "naked,"
> searchable version of each column alongside the native one, or something
> else entirely -- to get this to work. Any ideas?
> Thanks,
> Reuven



What kind of "client" are the users using?  I assume you will have some kind 
of user interface. For me this is a typical job for the user interface. The 
number of letters with "equivalents" in different languages is extremely 
limited, so a simple matching routine in the user interface should give you a 
way to issue the proper query.

Uwe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] (another ;-)) PostgreSQL-derived project ...

2011-09-25 Thread Uwe Schroeder

>  First, I wonder what kind of technical person would say there are
> "de-facto truth(s)". I thought only politicians would talk like that.

Well, politicians and Microsoft, Oracle etc.  :-)

> Now, in a sense you are right, I am talking from the background of my
> own experiences (and so are you). When I have developed relatively
> complicated and heavily accessed websites I only use a DBMS when I need
> to actually persist data. For example there are options in Tomcat
> (the Java-based web server) to offload session handling to a DBMS
> (which is great when you need to statistically trace and infer users'
> navigation) and establish transactions offloading a timed-out session
> to an actual database hitting the hard drive (some user got
> distracted ...), and that sounds great, but anything dealing with
> I/O would invariably slow your apps, so what I do is use an in-mem (lite)
> DBMS such as Hypersonic SQL (hsqldb.org) to keep sessions off the I/O
> subsystem, and the speed increase is --very-- noticeable.
> ~
>  Since any piece of code engaging the I/O, such as database access code,
> should be as fast and simple as possible, yes, I would design, say,
> Java code wrappers doing anything that is not strictly INSERT and
> SELECT raw data ... let me deal with the "semantics" and "business
> intelligence" of the data myself.

So you're keeping a lot in memory, which to me suggests plenty of hardware is 
available. One of my current apps chews up 8GB of memory just for the app, and 
I can't afford a 64GB or bigger server. If I wanted to keep permanently 
accessed data in memory, I'd need somewhere around half a terabyte of memory - 
so obviously not an option (or maybe really bad database design :-) )

That said, just considering the cost/effort it takes to strip Postgresql down, 
why don't you go with a server that has 1TB of solid state discs? That strips 
down the I/O bottleneck considerably without any effort.


> > Data types aren't stored in the database as character strings (unless you
> > define your columns as text, of course).
> 
> ~
>  I was talking about text and any formatting option in NUMERIC or DATE data

In my experience "data formatting" goes both ways, in and out. Out is 
obviously not a major issue, because errors there don't cause data corruption. 
In, however, is a different story: errors in "inward" conversion will cause 
data corruption. So unless you have an application that does very little "in" 
and a lot of "out", you still have to code a lot of data conversion that 
someone else (the PostgreSQL developers) has already done for you.


> > Take dates for example: you'd have to code very carefully to catch all
> > the different ways dates are represented on this planet.
> 
> ~
>  Yeah! And java does an exceptionally good job at that (including
> internationalization)

Maybe it does. I never coded Java because I don't like to use technology where 
Oracle can come sue me :-) I do know, however, that a lot of languages have 
quirks with dates and internationalization (Python, which you mentioned 
earlier, being one of them).

>  http://download.oracle.com/javase/7/docs/api/java/ {util/Date.html,
> text/DateFormat.html}
> ~
>  So, you would ultimately just have to store a long value into the DBMS

Yes, a long value - which can represent pretty much any valid and invalid date 
ever devised, so again you don't really know what's in the database when you 
leave the validation to the application.


>  This is something I would do with wrapping code using input and
> output bound command objects which are indexed after the same column
> index the DBMS uses

Which still depends on your use case. Your assumption is that every piece of 
code is written in Java - which is fine if that's what your application calls 
for. It's going to be a major hassle if you ever have to re-code in a 
different language though.

> > To me, that's moving data integrity into the application.
> 
> ~
>  Not exactly! Integrity is still a matter of the DBMS, which can now
> handle it in an easier way: if someone writes a date in Bangla and
> somebody else in Ukrainian, it is still the same date/time value,
> ultimately determined by the rotation of our planet around the sun ...
> and all we need for that is a long value. Now, aren't we easing
> things for the DBMS?

I agree to disagree on this one. The date value the database stores in this 
case is a long. Any "long" can be converted into a valid date - but is it 
really the date that was entered in the first place? If I give a date 
representation, e.g. 12/31/2010, to the database, I personally don't really 
care how the database stores the date underneath. All that interests me is 
that the next time I ask for that field I get 12/31/2010 back. There is no 
error that can be made, other than user error, if you ask the database to store 
a specific date representation. There are errors you can make in your own 
conversion code which can lead to a different "long" being stored than the date 
that was actually entered.
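
For what it's worth, the round trip is easy to verify (a sketch):

create temp table t (d date);
insert into t values (to_date('12/31/2010', 'MM/DD/YYYY'));
select to_char(d, 'MM/DD/YYYY') from t;   -- returns 12/31/2010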

Re: [GENERAL] (another ;-)) PostgreSQL-derived project ...

2011-09-25 Thread Uwe Schroeder

> > ... you're looking for a non-sql compliant SQL database where a lot of
> > the data integrity is actually coded in the application :-)
> 
> ~
>  First past of your statement I acknowledged, but how is it exactly
> that "lot of the data integrity is actually coded in the application"
> ~

Take dates for example: you'd have to code very carefully to catch all the 
different ways dates are represented on this planet. Your application has to 
handle this, since all the database knows at this point is an absolute time 
(i.e. seconds since the epoch or something like that), and your app has to 
convert every occurrence of a date to this format, or the database won't find 
anything - or, even worse, will store something wrong. 
Same goes for numbers: if everything is stored in a byte sequence, how does 
the database know you try to compare the number 1 with the character "1"? 
Again, your app would have to handle that.
To me, that's moving data integrity into the application.
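
A quick sketch of the input handling the database already does for you 
(datestyle controls how ambiguous input is read):

select date '2010-12-31';                     -- ISO input always works
select to_date('12/31/2010', 'MM/DD/YYYY');   -- explicit format mask
set datestyle = 'German, DMY';
select date '31.12.2010';                     -- parsed per the session datestyle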

> 
> > That approach strips down on application complexity. My apps don't have
> > to do any post-processing
> 
> of the data - I query the records I need and the app merely displays them.
> ~
>  Again have you actually tested those assumptions?

Yes, I have, as have many others. Simple example: program a website like, say, 
Facebook. You have thousands of users from all over the world, and your website 
code handles all the data conversions. Now Apple comes along and sells an 
iPhone, which, silly enough, a lot of people like and try to use to access your 
website. You now face the problem that you need a second website doing the 
same thing as the first, except made solely for touch-screen devices. 
You will be forced to rewrite a lot of your code, because all the data 
conversion is in the code. Even worse, if you had to make an iPhone or 
Android app in lieu of the second website, you'd have to recode everything you 
did in a different language - e.g. Objective-C.

If you leave these things to the database, you "simply" write a second client 
for a different platform, and you don't have to futz around getting the 
conversions correct, because the application receives the data already 
converted.

Sure, this all depends on what application you need this specialized database 
engine for. If it's an application for a very defined environment, you can 
dictate how data is to be input and train users. If it's an application for 
the big wild world, you will have problems with users doing stupid things 
beyond your control, like writing "P.O. Box 1" into the zipcode field where you 
expected a 10-digit number. I'd rather have the database catch those cases and 
reject the bad input. That saves me a lot of validation code in my app.
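
For example, a simple check constraint catches exactly that (a sketch - the 
table is made up):

create table customer (
    id      serial primary key,
    name    text not null,
    -- US ZIP or ZIP+4; anything else is rejected on insert
    zipcode text not null check (zipcode ~ '^[0-9]{5}(-[0-9]{4})?$')
);

insert into customer (name, zipcode) values ('Jane', '90210');      -- ok
insert into customer (name, zipcode) values ('Joe', 'P.O. Box 1');  -- rejected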


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] (another ;-)) PostgreSQL-derived project ...

2011-09-24 Thread Uwe Schroeder


> ~
>  I have been searching for a PostgreSQL-derived project with a
> "less-is-best" philosophy. Even though I have read about quite a few
> PG forks out there, what I have in mind is more like a baseline
> than a fork.
> ~
>  My intention is not wrapping the same thing in a different package or
> coding add-ons/value-added features on top of PG, but ridding PG of
> quite a bit of its internal capabilities and just using its very
> baseline.
> ~
>  All I would need PG for is raw data warehousing, memory,
> I/O-subsystem management, MVCC/transaction management ... no fanciness
> whatsoever. What do you need to, say, format dates in the database for, if
> formatting/pretty-printing and internationalization can be taken care of more
> appropriately in the calling environment, say Python or Java? All that is
> needed is to store a long representing the date. Why are arrays needed
> in the DB proper when serialization and marshaling/casting can be
> taken care of in the calling environment? If you are using, say, Java,
> all you need PG to do is to faithfully store a sequence of bytes and
> you would do the (de)serialization very naturally indeed.

Maybe you can let us in a little more on what you're trying to accomplish. What 
it looks like to me right now is that you're looking for a non-SQL-compliant SQL 
database where a lot of the data integrity is actually coded in the 
application :-)
I bet there are database systems out there that do exactly or nearly what you 
want. Maybe an object-oriented one would suit you better than a relational 
system?

For me, I sure don't use all that postgresql has to offer, but I like that it 
does a lot of things for me and I code most of what my application does inside 
the database using views, stored procedures and triggers. That approach strips 
down on application complexity. My apps don't have to do any post-processing 
of the data - I query the records I need and the app merely displays them.  
Yes, sometimes I wish postgresql was more "high performance" - but then, I 
drive an old, paid for, practical car and not a formula one racer without a 
boot or spare tire. My point being: postgresql does what it does very reliably 
and although not the best performer on the market, it is a database I would 
trust my payroll with - and there are few where I'd make that statement.
Never had any data loss ever, never had it crash on me. Give good hardware to 
postgresql and you will get good performance with exceptional stability and 
integrity.

Uwe
 

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Oracle / PostgreSQL comparison...

2011-06-23 Thread Uwe Schroeder


> On 06/24/2011 09:14 AM, Rodrigo E. De León Plicet wrote:
> I'm quite surprised the article didn't mention the importance of having
> "somebody to sue if it goes wrong", which is a comment I often see made
> re Oracle. It's lucky they didn't try to stress that point, because
> evidence of successful suits by customers against Oracle is ... hard to
> come by ... and Oracle's army of lawyers makes it unlikely that suits
> would succeed. Despite that, and the fact that a suit is unlikely to
> award damages sufficient to make up for your losses unless you can show
> negligence or willful misbehavior, people keep on talking about needing
> to have somebody to sue.

The article also forgot to mention what I call the "chair cutting problem". 
Most decisions in IT departments, particularly in larger companies, are based 
on the probability of taking the brunt when something goes wrong. Therefore, no 
IT manager who wants to keep his job will buy anything other than what the CEO 
thinks is the market leader. If you buy the market leader's product, you take 
no risk: clearly you made a good choice, so only the manufacturer is to blame 
if something goes wrong (and we all know, sooner or later something will go 
wrong). 
I've seen this over and over in 20+ years of IT consulting. Let's buy Oracle, 
Microsoft, IBM - clearly those are the good choices. 

Well, not always, but it sure is a choice that ensures the safety of the CTO. 
I've had one - ONE - client who accepted my recommendation to use PostgreSQL 
for their not-all-that-large custom system. Ever since we implemented the 
software and took it to production, the server and software have functioned 
flawlessly, and they don't even have anyone to check on the server (no DBA or 
sysadmin there). Sure, they're relatively small with 50+ employees, but for the 
little money it cost they got something that has been running fine since 2003. 
Last time I checked, the server was up close to 500 days.
They didn't follow my recommendation for mostly Linux servers (they did well 
before the housing slump, but have been short on money ever since). Instead they 
hired a "sysadmin" company - which happens to be an OEM for HP. At the end of 
the day they spent close to half a million to replace 2 Linux servers with 8 
Windows servers - which don't even work as well as the old servers did 
(although that may be an issue with the "sysadmin company" - they want to 
sell more and are probably not interested in, or incapable of, proper system 
administration).

Oh, well, all I can say is I'd trust PostgreSQL with my payroll any day of the 
week (and actually do, as these days my main business - not IT consulting 
anymore - is running on PostgreSQL and again has been pretty much flawless for 
3+ years now - with over 100.000 transactions a day)

Uwe




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Oracle / PostgreSQL comparison...

2011-06-23 Thread Uwe Schroeder


> > http://cglendenningoracle.blogspot.com/2011/06/oracle-vs-postgres-postgresql.html
> > 
> > Any comments?
> 
> Amusing.
> 
> "
> What kind of support is available if we have a problem? What is their
> patching schedule to address exploits and known security defects? If
> there is a bug, how responsive is the organization to resolving the
> issue? These are questions that imply a need to have an organization
> behind your technology. In this case, clearly PostgreSQL is not an
> appropriate choice, because you won't get acceptable answers to these
> questions for any open source software.
> "
> 
> Um.  What?  Let's look at this:
> 
>"What kind of support is available if we have a problem?"  With
>PostgreSQL, I can correspond with *the people that wrote the
>code*.  They're friendly, responsive, and are very reasonable
>about looking at possible bugs.  Every time I have emailed the
>community lists for help, I have received reasonable answers
>within a few hours (usually *under* an hour).  I've never found
>a bug, but I've seen plenty of them squashed, as the developers
>themselves admit a mistake and announce a fix.
> 

I've got support and even a permanent fix in HEAD for a problem I reported a 
while ago in under 24 hours. Oracle would never ever do that, particularly not 
for free (as in beer - which reminds me, we should collect a couple bucks on 
this list and send a beer and pizza truck over to Tom :-)) )

Back in the day I worked for Sun and IBM in tech support. Both companies 
focused on large customers (e.g. BMW would get their own highly skilled 
technician stationed on site). All I can say is: "it's amazing what kind of 
support you can get for 3.3 million a year" :-)

Uwe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why is 8.4 and 9.0 so much slower on some queries?

2011-04-11 Thread Uwe Schroeder


> Uwe Schroeder  writes:
> > I have a 8.3 database and decided for various reasons to upgrade to 8.4.
> > I also tried 9.0 - same results. On the exactly same hardware with the
> > exactly same configuration, some queries perform a factor of 10
> > slower on 8.4 and 9.0 than on 8.3
> 
> Hmm ... the problem here seems to come from the much poorer size
> estimate from the t/ugt antijoin:
> >->  Hash Left Join  (cost=71.37..2445.93 rows=52 width=9)
> >(actual time=0.791..2.017 rows=104 loops=1)
> >
> >  Hash Cond: ((subplan) = ugt.user_id)
> >  Filter: (ugt.group_id IS NULL)
> >
> >->  Hash Anti Join  (cost=62.53..2432.89 rows=1 width=9)
> >(actual time=0.639..2.976 rows=105 loops=1)
> >
> >  Hash Cond: ((SubPlan 2) = ugt.user_id)
> 
> Now, 8.3 has no idea what it's doing here, and it's just luck that it
> produces an estimate that's only a factor of 2 off from reality.  8.4
> actually understands that it's dealing with an antijoin, and with that
> greater knowledge it produces ... a much worse estimate :-(.  And that
> sends it into the weeds about how to perform the higher-level join.
> 
> While I don't have your specific example to try, I did some
> experimenting with queries of this form, and I noticed that 8.4's
> heuristic in eqjoinsel_semi() was going completely nuts and estimating
> that all rows in the lefthand side have join partners (thus, no rows out
> of the antijoin).  This is because it has stats for one side of the
> comparison operator but not the other side (the one with the
> sub-select).  But it's taking the totally-made-up ndistinct estimate for
> the sub-select at face value.  It needs to be a bit warier I think.
> 
> In general you might want to think about whether you can recast this
> query to avoid the sub-selects in the join conditions.  The planner
> isn't terribly bright about sub-selects in the first place, and you're
> putting them in places where quality estimates are really critical to
> getting a good plan.


Ahhh, that actually makes sense. Since this example isn't the only spot where 
I have this kind of query (I tweaked those for weeks to run fast, but always 
on an 8.3 server), I'll just downgrade to 8.3 for production. 

If you're interested, I can provide database and data for you to toy with as 
there's nothing particularly secretive in that data except email addresses 
which can be erased as they're not needed for anything.

By the end of the day I'll set up another machine with a 9.0 and start 
changing the queries - simply because 9.0 is a heck of a lot faster with most 
things, except these kinds of left outer joins and subselects.
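
For example, the sub-select in one of the view's join conditions could be 
recast to join through bprofile directly (a sketch, untested - names taken 
from the view in my original post):

-- instead of:
--   left outer join user_group ugt
--     on ugt.user_id = (select uid from bprofile where id = t.posted_by)
--    and ugt.group_id = 16
select t.id, t.group_id
  from groups_threads t
  join bprofile bp on bp.id = t.posted_by
  left outer join user_group ugt
         on ugt.user_id = bp.uid and ugt.group_id = 16
 where ugt.group_id is null
   and t.deleted = false;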

Thanks for your response.

Uwe




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Why is 8.4 and 9.0 so much slower on some queries?

2011-04-11 Thread Uwe Schroeder

Hello everyone,

this is very odd, particularly because nobody else seems to be having these 
issues when I google for it.

I have an 8.3 database and decided for various reasons to upgrade to 8.4. I also 
tried 9.0 - same results.

On exactly the same hardware with exactly the same configuration, some queries 
perform a factor of 10 slower on 8.4 and 9.0 than on 8.3.
A factor that large leads me to believe that I'm missing something, or that the 
queries in question were flawed to begin with (which may very well be).

Here's an example: 
I have a forum with a table for the threads and a table for the posts. There's 
also a table which keeps user-specific things (email, user_id) and a table that 
lets me assign roles (not database roles) to a user (i.e. is an admin, is a 
normal user, etc.). 

The posts table references the threads table (obviously, a post belongs to a 
certain thread). I can post table specs, but I think it's rather obvious, so 
I'll go straight to the explain outputs.

There is a view which contains the base query:

create or replace view v_grouptopics_helper_club16 as
select t.id, t.group_id, p.id as postid
  from groups_threads t
  left outer join groups_posts p
         on p.thread_id = t.id
        and t.deleted = false
        and p.id = (select p.id
                      from groups_posts p
                      left outer join user_group ugi
                             on ugi.user_id = p.posted_by_uid
                            and ugi.group_id = 16
                     where ugi.group_id is null
                       and p.thread_id = t.id
                     order by date_posted desc limit 1)
  left outer join user_group ugt
         on ugt.user_id = (select uid from bprofile where id = t.posted_by)
        and ugt.group_id = 16
 where ugt.group_id is null
   and t.deleted = false
 order by p.date_posted desc;

So basically what I want is a list of the latest forum topics with the latest 
post, where the user who posted is not in the role with number 16.

When I do a 
explain analyze select * from v_grouptopics_helper_club16 where group_id=199;

on the 8.3 server I get this result:

Subquery Scan v_grouptopics_helper_club16  (cost=4117.52..4118.17 rows=52 
width=12) (actual time=5.959..6.173 rows=104 loops=1)
   ->  Sort  (cost=4117.52..4117.65 rows=52 width=20) (actual time=5.957..6.029 
rows=104 loops=1)
 Sort Key: p.date_posted
 Sort Method:  quicksort  Memory: 21kB
 ->  Nested Loop Left Join  (cost=84.63..4116.04 rows=52 width=20) 
(actual time=0.842..5.775 rows=104 loops=1)
   Join Filter: ((NOT t.deleted) AND (p.thread_id = t.id))
   ->  Hash Left Join  (cost=71.37..2445.93 rows=52 width=9) 
(actual time=0.791..2.017 rows=104 loops=1)
 Hash Cond: ((subplan) = ugt.user_id)
 Filter: (ugt.group_id IS NULL)
 ->  Index Scan using groups_threads_group_idx on 
groups_threads t  (cost=0.00..237.18 rows=104 width=13) (actual 
time=0.021..0.257 rows=104 loops=1)
   Index Cond: (group_id = 199)
   Filter: (NOT deleted)
 ->  Hash  (cost=67.41..67.41 rows=317 width=8) (actual 
time=0.741..0.741 rows=318 loops=1)
   ->  Index Scan using user_group_club16_idx on 
user_group ugt  (cost=0.00..67.41 rows=317 width=8) (actual time=0.024..0.430 
rows=318 loops=1)
 Index Cond: (group_id = 16)
 SubPlan
   ->  Index Scan using bprofile_pkey on bprofile  
(cost=0.00..5.87 rows=1 width=4) (actual time=0.005..0.006 rows=1 loops=104)
 Index Cond: (id = $2)
   ->  Index Scan using groups_posts_pk on groups_posts p  
(cost=13.26..18.84 rows=1 width=16) (actual time=0.005..0.006 rows=1 loops=104)
 Index Cond: (p.id = (subplan))
 SubPlan
   ->  Limit  (cost=0.00..13.26 rows=1 width=12) (actual 
time=0.024..0.024 rows=1 loops=104)
 ->  Nested Loop Left Join  (cost=0.00..238.68 
rows=18 width=12) (actual time=0.022..0.022 rows=1 loops=104)
   Filter: (ugi.group_id IS NULL)
   ->  Index Scan Backward using 
groups_posts_thread_date_idx on groups_posts p  (cost=0.00..85.96 rows=35 
width=16) (actual time=0.012..0.012 rows=1 loops=104)
 Index Cond: (thread_id = $0)
   ->  Index Scan using user_group_uid_idx on 
user_group ugi  (cost=0.00..4.35 rows=1 width=8) (actual time=0.007..0.007 
rows=0 loops=104)
 Index Cond: (ugi.user_id = 
p.posted_by_uid)
 Filter: (ugi.group_id = 16)
   ->  Limit  (cost=0.00..13.26 rows=1 width=12) (actual 
time=0.024..0.024 rows=1 loops=104)
 ->  Nested Loop Left Join  (cost=0.00..238.68 
rows=18 width=12) (actual time=0.022..0.022 rows=1 loops=104)
   Filter: (ugi.group_id IS NULL)
   ->  Index Sca

Re: [GENERAL] Web Hosting

2011-03-05 Thread Uwe Schroeder
Godaddy virtual hosting does in fact support postgresql. You have a root 
account on the virtual server and you can install whatever you want.

I run several servers with them and all have postgresql, some virtual, some 
dedicated servers.

Haven't tried their shared servers though, so I can't say anything about 
those.

Hope that helps.

PS: for a company that size their customer support isn't too shabby either.

Uwe


> Gentlemen-
> 
> Go-daddy *claims* to support postgres
> http://help.godaddy.com/article/2330
> 
> YMMV
> Martin--
> 
> > Date: Sat, 5 Mar 2011 16:40:57 -0800
> > Subject: Re: [GENERAL] Web Hosting
> > From: m...@kitchenpc.com
> > To: urlu...@gmail.com
> > CC: pgsql-general@postgresql.org
> > 
> > On Sat, Mar 5, 2011 at 1:08 PM, matty jones  wrote:
> > > I already have a domain name but I am looking for a hosting company
> > > that I can use PG with.  The few I have contacted have said that they
> > > support MySQL only and won't give me access to install what I need or
> > > they want way to much.  I don't need a dedicated host which so far
> > > seems the only way this will work, all the companies I have researched
> > > so far that offer shared hosting or virtual hosting only use MySQL.  I
> > > will take care of the setup and everything myself but I have already
> > > written my code using PG/PHP and I have no intention of switching.
> > > Thanks.
> > 
> > Well there's this list:
> > 
> > http://www.postgresql.org/support/professional_hosting
> > 
> > Also, maybe something like Amazon EC2 if you want your own "box"?  I
> > think the small instances are even free..
> > 
> > Mike




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PG9.0 planner difference to 8.3 -> majorly bad performance

2011-01-29 Thread Uwe Schroeder



> The duration suddenly goes from 270 milliseconds to 173 seconds! The index
> scan on bprofile_comments_status_idx suddenly shows 15288 loops, where it
> should be 1 loop just like before. So somehow the 9.0 planner gets it all
> wrong.
> 
> I also noticed that normally I get an iowait with a few percent during such
> operations (on 8.3), where with pg9 I get 0 iowait and 100% CPU. PG9 has a
> much smaller memory footprint than 8.3 in the same configuration - so this
> all makes very little sense to me. Maybe someone here has an idea.

Usually someone here has a lot more knowledge than me and comes up with a 
viable hint rather quickly. Not so this time I'm afraid.
That tells me that something deep down changed.

Here are a few observations I made:

In PG9, NOT IN queries with a subselect (aka "select x from y where x.id not in 
(select id from some_other_table)") perform a heck of a lot better than in 
8.x. On the same note, when you rewrite the query to use a left outer join 
with an "IS NULL" where clause, PG9 performs horribly slowly. A query like 
"select x from y left outer join z on z.id=y.id where z.id is null" performs 
like 1000 times slower than in 8.x.

I'm not an expert at reading explain output, but every time I try the "left 
outer" solution to something that's basically a "not in", I get lousy 
performance. Looking at the explain, a "left outer join" now always implies as 
many loops over a nested block as there are rows in the referenced table; 8.x 
actually returns the rows in one loop. This seems to be an issue of what is 
done in what order: 8.x puts the "left outer" scan pretty early in the query, 
so the remaining joins already use a limited dataset, while 9.0 puts the left 
outer towards the end of the nested blocks, which makes it work on the full set.

Maybe this makes sense to someone with more in-depth knowledge of the changes.

So far, to me it looks like: if you use 9.x, avoid left outer joins and use 
NOT IN; on PG versions prior to 9.x, avoid NOT IN and use left outer joins.

odd :-)

Uwe



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PG9.0 planner difference to 8.3 -> majorly bad performance

2011-01-29 Thread Uwe Schroeder


> * Uwe Schroeder (u...@oss4u.com) wrote:
> > Now I turn off the 8.3 instance and start the 9.0 instance. Remember, 
everything is identical. Here the same query again:
> Everything isn't identical if you just started PG 9.0 though- presumably
> the 8.3 instance had everything cache'd already.  What happens if you
> run this query again under 9.0..?

The 8.3 instance is also just started. I run both on the same system (for 
testing) so I turn one off to have the memory available. 
But yes, I did run the queries multiple times in a row with no major 
improvement.


> 
> > The duration suddenly goes from 270 milliseconds to 173 seconds! The
> > index scan on bprofile_comments_status_idx suddenly shows 15288 loops,
> > where it should be 1 loop just like before. So somehow the 9.0 planner
> > gets it all wrong.
> 
> You do have a different plan, but both of them have a Nested Loop, with
> a Hash table built inside it.  The 9.0 does also do the index scan
> inside the loop, but if you look at the actual time, that's not really
> causing a huge difference.  One thing I'm wondering about is if 9.0 is
> getting a more accurate view of the amount of data and is realizing that
> it might go over work_mem with the big Hash Left Join, and so decides
> against it.  What does your work_mem setting look like on each system?
> Have you tried increasing it?

What has me bummed is the index scan on 

Index Scan using bprofile_comments_status_idx on bprofile_comments c  
(cost=0.00..2558.77 rows=1531 width=12) (actual time=0.140..21.559 rows=1660 
loops=1)

vs

Index Scan using bprofile_comments_status_idx on bprofile_comments c  
(cost=0.00..4328.64 rows=1751 width=12) (actual time=0.033..8.097 rows=1872 
loops=15288)

Unless I read this wrong, the upper (8.3) index scan fetches 1660 rows in up 
to 21 ms. The 9.0 plan comes up with an index scan on the same data which 
fetches 1872 rows in 8 ms but loops 15288 times (that's actually the number of 
records in the referenced table), which in my book makes this scan take up to 
8 x 15288 = 122304 ms, or 122 seconds.

work_mem is set to 50MB, and increasing it to 80MB makes no difference.



> 
>   Thanks,
> 
>   Stephen
> 
> > I also noticed that normally I get an iowait with a few percent during
> > such operations (on 8.3), where with pg9 I get 0 iowait and 100% CPU.
> > PG9 has a much smaller memory footprint than 8.3 in the same
> > configuration - so this all makes very little sense to me. Maybe someone
> > here has an idea.
> > 
> > Thanks
> > 
> > Uwe




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PG9.0 planner difference to 8.3 -> majorly bad performance

2011-01-29 Thread Uwe Schroeder


> On 29 January 2011 09:11, Uwe Schroeder  wrote:
> > Maybe someone here can make sense of this.
> > I'm trying to upgrade a 8.3 system to a 9.0 system. Usual procedure dump,
> > restore, vac full, reindex.
> > 
> > Both - old and new - run on the same hardware and the postgresql.conf
> > settings are identical.
> > 
> > You'll probably ask for the table definitions, which I'm happy to
> > provide, but I'll omit them here for the sake of a shorter message.
> > Basically everything is identical, data, tables, indexes, harware,
> > config.
> > 
> > I should mention that the "tables" are really views - maybe something in
> > the views changed in 9.0.2
> > 
> > I run this query on the 8.3 system:
> > 
> > explain analyze SELECT count(v_bprofile_comments_club16.id)  FROM
> > v_bprofile_comments_club16 WHERE v_bprofile_comments_club16.profile_id =
> > '5584' AND v_bprofile_comments_club16.approved = true; QUERY PLAN
> > 
> > -
> > - Aggregate
> >  (cost=6294.37..6294.38 rows=1 width=4) (actual time=269.633..269.635
> > rows=1 loops=1) ->  Nested Loop  (cost=1889.71..6273.06 rows=8523
> > width=4) (actual time=156.585..266.325 rows=1641 loops=1) ->  Nested
> > Loop  (cost=1889.71..5858.47 rows=779 width=16) (actual
> > time=156.565..237.216 rows=1641 loops=1) ->  Nested Loop
> >  (cost=1889.71..4488.01 rows=763 width=12) (actual time=156.453..200.174
> > rows=1641 loops=1) ->  Index Scan using bprofile_pkey on bprofile m
> >  (cost=0.00..4.27 rows=1 width=4) (actual time=0.140..0.145 rows=1
> > loops=1) Index Cond: (id = 5584)
> > ->  Hash Left Join  (cost=1889.71..4476.11 rows=763
> > width=16) (actual time=156.302..194.762 rows=1641 loops=1) Hash Cond:
> > (b.uid = ug.user_id)
> >   ->  Hash Join  (cost=1821.55..4399.44 rows=763
> > width=20) (actual time=151.372..183.103 rows=1641 loops=1) Hash Cond:
> > (c.from_id = b.id)
> > ->  Index Scan using
> > bprofile_comments_status_idx on bprofile_comments c  (cost=0.00..2558.77
> > rows=1531 width=12) (actual time=0.140..21.559 rows=1660 loops=1) Index
> > Cond: ((profile_id = 5584) AND (approved = true)) Filter: approved
> > ->  Hash  (cost=1726.15..1726.15
> > rows=7632 width=8) (actual time=151.131..151.131 rows=14782 loops=1) ->
> >  Hash Left Join  (cost=61.50..1726.15 rows=7632 width=8) (actual
> > time=2.622..119.268 rows=14782 loops=1) Hash Cond: (b.uid = ugi.user_id)
> > Filter: (gi.group_name IS NULL) ->  Seq Scan on bprofile b
> >  (cost=0.00..1579.44 rows=15265 width=8) (actual time=0.058..64.033
> > rows=15265 loops=1) Filter: (NOT deleted) ->  Hash  (cost=55.12..55.12
> > rows=510 width=13) (actual time=2.526..2.526 rows=231 loops=1) ->
> >  Nested Loop  (cost=0.00..55.12 rows=510 width=13) (actual
> > time=0.136..1.909 rows=231 loops=1) ->  Seq Scan on tg_group gi
> >  (cost=0.00..1.07 rows=1 width=13) (actual time=0.041..0.050 rows=1
> > loops=1) Filter: ((group_name)::text = 'Club16'::text) ->  Index Scan
> > using user_group_group_idx on user_group ugi  (cost=0.00..45.80 rows=660
> > width=8) (actual time=0.084..1.071 rows=231 loops=1) Index Cond:
> > (ugi.group_id = gi.group_id) ->  Hash  (cost=55.35..55.35 rows=1025
> > width=4) (actual time=4.866..4.866 rows=1025 loops=1) ->  Index Scan
> > using user_group_group_idx on user_group ug  (cost=0.00..55.35 rows=1025
> > width=4) (actual time=0.058..2.766 rows=1025 loops=1) Index Cond:
> > (group_id = 2) ->  Index Scan using bphotos_profile_primary_idx on
> > bphotos p  (cost=0.00..1.78 rows=1 width=4) (actual time=0.012..0.015
> > rows=1 loops=1641) Index Cond: ((p.profile_id = b.id) AND (p.is_primary
> > = true)) ->  Index Scan using bphotos_profile_primary_idx on bphotos p
> >  (cost=0.00..0.52 rows=1 width=4) (actual time=0.008..0.011 rows=1
> > loops=1641) Index Cond: ((p.profile_id = b.id) AND (p.is_primary =
> > true)) Total runtime: 270.808 ms
> > (33 rows)
> > 
> > 
> > As you can see, the query performs nicely (for the hardware used).
> > 
> > Now I turn off the 8.3 instance and start the 9.0 instance. Remember,
> > everything is identical. Here the same query again:
> > 
> > explain analyze SELECT count(v_bprofile_comments_club

[GENERAL] PG9.0 planner difference to 8.3 -> majorly bad performance

2011-01-29 Thread Uwe Schroeder

Maybe someone here can make sense of this. 
I'm trying to upgrade an 8.3 system to a 9.0 system. Usual procedure: dump, 
restore, vacuum full, reindex.

Both - old and new - run on the same hardware, and the postgresql.conf settings 
are identical.

You'll probably ask for the table definitions, which I'm happy to provide, but 
I'll omit them here for the sake of a shorter message. Basically everything is 
identical: data, tables, indexes, hardware, config.

I should mention that the "tables" are really views - maybe something in the 
views changed in 9.0.2

I run this query on the 8.3 system:

explain analyze SELECT count(v_bprofile_comments_club16.id)
  FROM v_bprofile_comments_club16
 WHERE v_bprofile_comments_club16.profile_id = '5584'
   AND v_bprofile_comments_club16.approved = true;

                                  QUERY PLAN
--------------------------------------------------------------------------------
 Aggregate  (cost=6294.37..6294.38 rows=1 width=4) (actual 
time=269.633..269.635 rows=1 loops=1)
   ->  Nested Loop  (cost=1889.71..6273.06 rows=8523 width=4) (actual 
time=156.585..266.325 rows=1641 loops=1)
 ->  Nested Loop  (cost=1889.71..5858.47 rows=779 width=16) (actual 
time=156.565..237.216 rows=1641 loops=1)
   ->  Nested Loop  (cost=1889.71..4488.01 rows=763 width=12) 
(actual time=156.453..200.174 rows=1641 loops=1)
 ->  Index Scan using bprofile_pkey on bprofile m  
(cost=0.00..4.27 rows=1 width=4) (actual time=0.140..0.145 rows=1 loops=1)
   Index Cond: (id = 5584)
 ->  Hash Left Join  (cost=1889.71..4476.11 rows=763 
width=16) (actual time=156.302..194.762 rows=1641 loops=1)
   Hash Cond: (b.uid = ug.user_id)
   ->  Hash Join  (cost=1821.55..4399.44 rows=763 
width=20) (actual time=151.372..183.103 rows=1641 loops=1)
 Hash Cond: (c.from_id = b.id)
 ->  Index Scan using 
bprofile_comments_status_idx on bprofile_comments c  (cost=0.00..2558.77 
rows=1531 width=12) (actual time=0.140..21.559 rows=1660 loops=1)
   Index Cond: ((profile_id = 5584) AND 
(approved = true))
   Filter: approved
 ->  Hash  (cost=1726.15..1726.15 rows=7632 
width=8) (actual time=151.131..151.131 rows=14782 loops=1)
   ->  Hash Left Join  (cost=61.50..1726.15 
rows=7632 width=8) (actual time=2.622..119.268 rows=14782 loops=1)
 Hash Cond: (b.uid = ugi.user_id)
 Filter: (gi.group_name IS NULL)
 ->  Seq Scan on bprofile b  
(cost=0.00..1579.44 rows=15265 width=8) (actual time=0.058..64.033 rows=15265 
loops=1)
   Filter: (NOT deleted)
 ->  Hash  (cost=55.12..55.12 
rows=510 width=13) (actual time=2.526..2.526 rows=231 loops=1)
   ->  Nested Loop  
(cost=0.00..55.12 rows=510 width=13) (actual time=0.136..1.909 rows=231 loops=1)
 ->  Seq Scan on 
tg_group gi  (cost=0.00..1.07 rows=1 width=13) (actual time=0.041..0.050 rows=1 
loops=1)
   Filter: 
((group_name)::text = 'Club16'::text)
 ->  Index Scan using 
user_group_group_idx on user_group ugi  (cost=0.00..45.80 rows=660 width=8) 
(actual time=0.084..1.071 rows=231 loops=1)
   Index Cond: 
(ugi.group_id = gi.group_id)
   ->  Hash  (cost=55.35..55.35 rows=1025 width=4) 
(actual time=4.866..4.866 rows=1025 loops=1)
 ->  Index Scan using user_group_group_idx on 
user_group ug  (cost=0.00..55.35 rows=1025 width=4) (actual time=0.058..2.766 
rows=1025 loops=1)
   Index Cond: (group_id = 2)
   ->  Index Scan using bphotos_profile_primary_idx on bphotos p  
(cost=0.00..1.78 rows=1 width=4) (actual time=0.012..0.015 rows=1 loops=1641)   
 
 Index Cond: ((p.profile_id = b.id) AND (p.is_primary = 
true)) 

Re: [GENERAL] Subselect AS and Where clause

2011-01-26 Thread Uwe Schroeder


> Uwe Schroeder, 26.01.2011 08:34:
> > I have a query like this:
> > 
> > SELECT a,b,c, (select problem from other_table where id=a) as problem
> > FROM mytable WHERE a=1
> > 
> > So far so good. Actually "problem" always resolves to one record, so it's
> > not the "multiple records returned" problem.
> > 
> > What I try to do is this:
> > 
> > SELECT a,b,c, (select problem from other_table where id=a) as problem
> > FROM mytable WHERE a=1 and problem = 3
> > 
> > see the "problem=3" part in the where clause? The error I get is
> > 
> >   SQLError: (ProgrammingError) column "problem" does not exist
> 
> You need to wrap the whole SELECT in order to be able to use the column
> alias:
> 
> SELECT *
> FROM (
>SELECT a,
>   b,
>   c,
>   (select problem from other_table where id=a) as problem
>FROM mytable
> ) t
> WHERE a=1
>AND problem = 3
> 
> Regards
> Thomas


Thanks a lot Thomas!

There's the reason why open source like PostgreSQL is far superior to 
anything commercial - an answer when you need it!

Works like a charm now.

Uwe




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Subselect AS and Where clause

2011-01-25 Thread Uwe Schroeder

Google being useless tonight - now that's new :-)

What I'm trying to do is the following and I'm certain there is a simple 
solution which eludes me:

I have a query like this:

SELECT a,b,c, (select problem from other_table where id=a) as problem FROM 
mytable WHERE a=1

So far so good. Actually "problem" always resolves to one record, so it's not 
the "multiple records returned" problem.

What I try to do is this:

SELECT a,b,c, (select problem from other_table where id=a) as problem FROM 
mytable WHERE a=1 and problem = 3

see the "problem=3" part in the where clause? The error I get is 

 SQLError: (ProgrammingError) column "problem" does not exist

Am I missing something? Shouldn't the "as" assignment make this virtual column 
available to the where clause? I think this should work, and I know it works if 
I make a view out of the query. However, the query is built dynamically, so 
turning it into a view isn't really an option.

Any pointer will be greatly appreciated.

Uwe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Moving from SQL Anywhere to PostGres - First Time

2011-01-21 Thread Uwe Schroeder


> Hi,
> 
> Price and features.   There are some changes in V12 which is causing us to
> do some steps with our support and resellers.  We've always wanted to move
> away from Sybase so we can make more of a profit off our software and lower
> costs for us and our clients.  Sybase is a wonderful product, and when
> things stay the same - we stay the same.   But, since V12 is requiring us
> to make internal changes, step changes, documentation changes, training
> changes ... then it's a good time for us to stop and look at alternatives
> if we were going to look at them anyway.
> 
> If we can save $100K a year moving to PostGres and give our customers
> better pricing, then it only makes sense to move - if PostGres does the
> same thing that Sybase does.

This is quite a jump. I've worked with Sybase on military projects - pretty 
much all the application logic coded in Sybase stored procedures. That would 
be a scenario I'd not be happy to convert to any other database - except maybe 
MS SQL, because that's historically an offspring of Sybase and rather similar 
at the base of things (although a decade of MS intervention has surely changed 
things a lot).

However, if your app doesn't rely heavily on stored procedures, the switch may 
well be worth it. I've used PostgreSQL since long before it was even called 
that - back in the day when it was Postgres and there was no SQL whatsoever. 
Still, this database has yet to fail me. I've written software for government, 
military and the financial industry, and most of the projects where I 
recommended Postgres(SQL) as the DB backend are still around, maintained, and 
run like a charm. One of my insurance clients had a PostgreSQL 8.2 database 
running for well over a year with nobody checking in on it (see -> recession - 
they fired all their IT staff). The old server is still running just fine - I 
checked in on it the other day and I was the first to log on in over a year - 
it runs, no questions asked.

So yes, if you want a database that just does the job, I'd really recommend 
PostgreSQL. Sure, you can do the same with many pricey commercial products, 
but personally I'd rather shoot a message to the mailing list here and get a 
professional, knowledgeable, qualified answer within hours - much better than 
any commercial big-business support I've experienced. (Usually you call your 
vendor's support line and get someone who basically has no clue what you're 
talking about - been there, done that. I worked in Sun Microsystems' 3rd-level 
operating system support for years and was always amazed at how few people 
could answer the questions of customers paying 7+ digit support contracts - 
e.g. BMW, Audi etc.)

> 
> They are a premium software vendor and have a premium product, but charge a
> premium price for it.   They've made huge changes that affect us, and won't
> make any exceptions or changes for us, so now is the time to start looking
> at our options.

Have to say: here you get a premium product at a cost of zip, nada, nothing - 
and when you shout out to the mailing list and provide usable information, 
chances are you'll get premium support from people like Tom Lane. No company I 
know of can beat that kind of easy access to the people who know what they're 
talking about.

Don't get me wrong: Sybase is a good product and has been around for years. 
But if you're on the road to change anyway, going open source is definitely an 
option to look into.

To sum it up: I'd put PostgreSQL on par with Oracle or DB2 any day of the week 
- except it's not subject to exorbitant license fees and the support is much 
better. It's not the fastest database on the market, but if you want something 
that just works reliably, PostgreSQL will serve you well.


> 
> -Robert
> 
> 
> "Michael Gould"  wrote in message
> news:393bf55cebd45c71fec623552acbd...@intermodalsoftwaresolutions.net...
> 
> Robert,
> 
> We used Data Manager from JP to do this.  Worked well.  He recently added
> the ability to read OEM versions of ASA.  How come you're moving away from
> SQL Anywhere?
> 
> Best Regards
> 
> Mike Gould
> 
> "Robert Paresi"  wrote:
> > Hello,
> > 
> > We have 700 user install base using Sybase SQL Anywhere 9.02
> > 
> > We are looking at migrating these installations over to PostGres
> > 
> > 1.  Very Very Short Answer Please - why should we?
> > 
> > 2.  Does anyone have a utility or migration application to read SQL Anywhere
> > to go to PostGres
> > 
> > 3.  Does PostGres handle column descriptions (ie: you can give each column
> > a 50 character description) and then access it via SQL Result Set (like I
> > can do in Sybase)
> > 
> > 4.  Are there any Date/TimeStamp issues and conversions I need to know about.
> > I use simply a DATE field and a TIME field - but do not use DATE/TIME stamp
> > fields together.
> > 
> > 5.  What UI/Developer tools (GUI) are available to manage the database as
> > well as add/c
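
Regarding question 3 above: yes. PostgreSQL stores per-column descriptions
with COMMENT ON COLUMN (and they aren't limited to 50 characters); you can
read them back in a result set via col_description(). A rough sketch with a
made-up table name:

    COMMENT ON COLUMN mytable.a IS 'primary widget identifier';

    SELECT a.attname,
           col_description(a.attrelid, a.attnum) AS description
    FROM pg_attribute a
    WHERE a.attrelid = 'mytable'::regclass
      AND a.attnum > 0              -- skip system columns
      AND NOT a.attisdropped;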

Re: [GENERAL] What is better method to backup postgresql DB.

2010-10-26 Thread Uwe Schroeder


> In response to Steeles :
> > new to postgresql. need to backup postgresql DB, which way is better to
> > backup DB.
> >
> > from training, I learned that we can backup the whole PGdata and other
> > directories to achieve backup goal, originally I was planned to schedule
> > jobs to use pgdump to backup all DBs.
> >
> > so which way is better for backup or there is other way to backup PGDB.
>
> Use pg_dump or pg_dumpall.  If you're at the level that you have to ask
> this question, then you'll have nothing but trouble getting reliable
> backups by backing up directories.
>
> Also, since you're new to PostgreSQL, I _HIGHLY_ recommend that you don't
> assume that you're getting backups until you can demonstrate that you can
> restore them.

Good advice! For all my production systems I do a live backup/replication 
using Slony, plus a nightly pg_dump which is then copied to a remote server. 
That way, no matter what happens, I have a backup that's at most one day old 
(if I have to fall back on the pg_dump) or only a few seconds behind (if the 
replica survives - and it, too, sits remote, not in the same physical 
location).

Whatever you use, always make sure you actually can restore the backup.
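
For the pg_dump route, a minimal sketch of such a nightly job (database name,
paths and remote host are placeholders):

    # custom format (-Fc) is compressed and restores via pg_restore
    pg_dump -Fc -f /backups/mydb.dump mydb
    # get it off the machine - a dump on the same disk is only half a backup
    scp /backups/mydb.dump backup@remotehost:/backups/

And to actually prove the restore works, something like:

    createdb restore_test
    pg_restore -d restore_test /backups/mydb.dump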






Re: [GENERAL] Trade Study on Oracle vs. PostgreSQL

2010-09-24 Thread Uwe Schroeder


> On 25/09/2010 4:55 AM, Leif Biberg Kristensen wrote:
> > On Friday 24. September 2010 20.04.26 Lincoln Yeoh wrote:
> >> Then again, does Tom ever sleep? Maybe he sleeps in the afternoon? Or
> >> is that when he does intensive coding?
> >
> > Once there was a suggestion on this list to give Tom every other week off
> > to level the competition with the other RDBMSes.
> >
> > That being said, the PostgreSQL mailing lists are an excellent and totally
> > free support channel. Speaking for myself, I've never needed any support
> > outside the mail lists. And Tom is of course a great big part of that.
>
> I couldn't agree more. I really appreciate the time and effort people
> here have spent helping out. Again, Tom Lane particularly so, for
> amazing patience, psychic debugging skills, and endless willingness to
> help anybody and everybody.

Don't omit other major contributors like Joshua or Scott... (and others I 
didn't mention by name). Certainly Tom doesn't seem to need sleep, but there 
are quite a few more people willing and able to contribute and answer 
questions. They've helped me out on more than one occasion, and I've been 
using PostgreSQL since long before it was named that (back in the days when 
the Linux kernel had a 0.9 version number). Sure, many of the current 
contributors weren't around back then, but they picked up where others left 
off and made PostgreSQL what it is today: a quite viable alternative to the 
commercial "pay lots of dollars" databases. Suffice it to say, I have yet to 
see a showstopper (provided you run PostgreSQL on decent hardware).

Since they just opened a beer tasting room around the corner from my home, 
here's to all the active contributors - cheers. (and if any of the core 
developers/contributors ever come to Santa Rosa, CA - drop me a note and I'll 
have you over for a beer or wine and a nice BBQ)




Re: [GENERAL] Transposing rows and columns

2010-09-16 Thread Uwe Schroeder


> I'm working with some people who live and breathe Excel.  I need to be able
> to move data back and forth between formats which make sense for Excel and
> for PostgreSQL.  In some cases, this is just to accommodate what people are
> used to.  In other cases, like statistical clustering, it's something that
> really has to be done.
>
> Here is a simplified example:
>
> I'm given data in Excel with one sheet each for a bunch of experiments.  In
> each sheet, there are rows with different drugs at different doses and
> columns for each subject.  The cells contain the response data.  I wrote a
> Perl script which automates the process of extracting that data into a csv
> file which can be imported into a table like the following:
>
> create table results(
> expt_no int references experiments(id),
> subject int references subjects(id),
> drug text references drugs(name),
> dose numeric,
> response numeric
> )
>
> Now, suppose I do some computation on the results in the database and want
> to export it back out to the same kind of format that I received it (drugs
> and doses in rows and subjects in columns.)   One method would be to use
> Perl.  I could use DBD::Pg and loop through a bunch of queries to build a
> two dimensional array and then spit that back out but is there a good way
> to do this just in SQL?  Is there a better way than creating a temporary
> table for each subject and then joining all the temp tables?

You may want to look into the tablefunc contrib module. It contains a crosstab 
function that will pivot rows into columns in the result. It may be slow, 
though.
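
For the results table above, roughly like this - syntax from memory, so check
the tablefunc docs; it also assumes exactly three subjects, since crosstab
makes you spell out the output column list:

    SELECT *
    FROM crosstab(
        $$SELECT drug || ' @ ' || dose, subject, response
          FROM results WHERE expt_no = 1 ORDER BY 1, 2$$,
        $$SELECT DISTINCT subject FROM results WHERE expt_no = 1 ORDER BY 1$$
    ) AS ct(drug_dose text, subj_1 numeric, subj_2 numeric, subj_3 numeric);

The second query supplies the category list (one output column per subject);
combinations with no data come back as NULL.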

HTH








Re: [GENERAL] calculating distance between longitude and latitude

2010-06-09 Thread Uwe Schroeder


> Does postgresql have functions to calculate the distance between two
> sets of longitude and latitude.
>

You're looking for the earthdistance contrib module. On most Linux distros 
it's installed under /usr/share/postgresql/8.xx/contrib; you may have to 
install a "postgresql-contrib" package depending on your distro. Typing 
"locate earthdistance.sql" should reveal the location if it's already 
available. To activate it you'd just do a "psql [database] < 
/wherever/earthdistance.sql".
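
Once loaded (note that earthdistance needs the cube module loaded first),
usage looks roughly like this - coordinates and table are made up:

    -- distance in meters between two lat/long pairs
    SELECT earth_distance(ll_to_earth(38.44, -122.71),
                          ll_to_earth(37.77, -122.42));

    -- "within 20 miles" (about 32187 meters), given lat/lon columns
    SELECT name FROM places
    WHERE earth_distance(ll_to_earth(lat, lon),
                         ll_to_earth(38.44, -122.71)) < 32187;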

HTH
  Uwe




Re: [GENERAL] virtualidx exclusive lock

2009-11-09 Thread Uwe Schroeder

On Sunday 08 November 2009 11:38:28 pm Uwe Schroeder wrote:
> I've googled, but there's 0 hits.
>
> I have an issue with a ton of "idle in transaction" backends.
> What I noticed is when I look at pg_locks, pretty much all of the processes
> being idle in transaction have an exclusive lock of locktype "virtualidx".
>
> Well, that doesn't make sense to me, but maybe someone here can tell me
> where a "virtualidx" locktype would come from. I'm sure it has to be some
> type of query. There is no info about table or anything, all the records
> look like:
> Is there a way to find out what query, or in lack of that at least what
> table is involved?

Thanks everyone. No wonder I didn't find anything on google :-)

Turns out the issue was related to the ORM my app is using. That darn thing 
keeps a cursor open for every select - which of course keeps the transaction 
alive (why it uses a transaction for a simple select is another question). 
Anyways, I got it fixed.
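
In case someone else runs into this: the offenders are easy to spot in
pg_stat_activity. On 8.x the query column literally reads '<IDLE> in
transaction' (9.2 and later renamed procpid/current_query to pid/query and
added a separate state column):

    SELECT procpid, usename, xact_start, query_start
    FROM pg_stat_activity
    WHERE current_query = '<IDLE> in transaction';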

Thanks

Uwe




[GENERAL] virtualidx exclusive lock

2009-11-08 Thread Uwe Schroeder

I've googled, but there's 0 hits.

I have an issue with a ton of "idle in transaction" backends.
What I noticed is when I look at pg_locks, pretty much all of the processes 
being idle in transaction have an exclusive lock of locktype "virtualidx".

Well, that doesn't make sense to me, but maybe someone here can tell me where a 
"virtualidx" locktype would come from. I'm sure it has to be some type of query.
There is no info about table or anything, all the records look like:

   locktype   | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |     mode      | granted
--------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+---------------+---------
 virtualxid   |          |          |      |       | 63/10150   |               |         |       |          | 63/10150           | 31932 | ExclusiveLock | t
 virtualxid   |          |          |      |       | 48/48530   |               |         |       |          | 48/48530           | 31323 | ExclusiveLock | t
 virtualxid   |          |          |      |       | 47/52387   |               |         |       |          | 47/52387           | 31321 | ExclusiveLock | t
 virtualxid   |          |          |      |       | 76/4086    |               |         |       |          | 76/4086            | 32074 | ExclusiveLock | t
 virtualxid   |          |          |      |       | 15/6007096 |               |         |       |          | 15/6007096         | 31169 | ExclusiveLock | t
 virtualxid   |          |          |      |       | 10/5689919 |               |         |       |          | 10/5689919         | 31595 | ExclusiveLock | t
 virtualxid   |          |          |      |       | 32/603998  |               |         |       |          | 32/603998          | 31213 | ExclusiveLock | t
 virtualxid   |          |          |      |       | 42/117511  |               |         |       |          | 42/117511          | 31270 | ExclusiveLock | t
 virtualxid   |          |          |      |       | 39/279415  |               |         |       |          | 39/279415          | 31267 | ExclusiveLock | t


Is there a way to find out what query, or in lack of that at least what table 
is involved?
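
For the archives: joining pg_locks to pg_stat_activity on the backend pid at
least shows who holds each lock and what that session is (or last was)
running - column names per 8.x, where pg_stat_activity still uses procpid:

    SELECT l.locktype,
           l.relation::regclass AS rel,    -- NULL for virtualxid locks
           l.virtualtransaction, l.mode, l.granted,
           a.procpid, a.usename, a.current_query
    FROM pg_locks l
    JOIN pg_stat_activity a ON a.procpid = l.pid;

The virtualxid rows themselves are harmless: every transaction holds an
ExclusiveLock on its own virtual transaction ID for as long as it runs, so
there is no table behind them - the real question is why the transactions
stay open.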

Thanks

Uwe




Re: [GENERAL] Backups

2009-11-08 Thread Uwe Schroeder
Personally I prefer multiple stages of backups off-site.
So for my production webserver database, I have slony replicating the database 
to a different location. In addition I run full dumps every 12 hours which in 
turn I replicate using rdist to a remote system. That way, whatever happens, 
the max data loss in the worst case scenario is about 12 hours - which for my 
application is acceptable.
Oh, yes, forgot to mention: the replicated database also gets dumped and goes 
onto a tape every day.



On Saturday 07 November 2009 11:41:55 pm Bret wrote:
> I need to back up a production database every night
> on FreeBSD 7.2, running Postgresql 8.3.
>
> Any good backup tips I should be aware of.
>
> Typically, I make a backup based on the current day,
> and rotate the seven days in the backup file name
>  (eg; sat_backup, sun_backup etc).
>
> Thanks for all the chatter.
>
> Bret Stern
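
For the day-name rotation, something along these lines in cron would do -
paths and database name are placeholders, and note that the % has to be
escaped inside a crontab entry:

    # run every night at 01:30; each file gets overwritten a week later
    30 1 * * * pg_dump -Fc -f /backups/$(date +\%a)_backup.dump mydb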
