Re: [SQL] help with pagila
Hi, Tom, Tom Lane wrote: If film_id is a primary key for film, then it's actually legal per SQL99 (though not in earlier SQL specs) to just GROUP BY film_id and then reference the other columns of film without explicit grouping, because clearly there can be only one value of them per film_id value. However the quoted query includes ungrouped references to other tables as well, and it's not immediately obvious that those references must have unique values for any one value of film_id. In situations like this, I often missed (and sometimes implemented) a simple first() aggregate to put around those other columns. Some of those cases could be fixed by creative use of DISTINCT ON, but sometimes lead to suboptimal query plans (due to the useless sorting). Markus -- Markus Schaber | Logical TrackingTracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [SQL] help with pagila
On Fri, Sep 01, 2006 at 10:31:48AM -0300, Walter Cruz wrote: ERROR: column film.description must appear in the GROUP BY clause or be used in an aggregate function If I put that column on GROUP BY everything works ok. But I want understant why do I need to do that. Can someone teach me, please? You need to because everything else is being grouped or aggregated. Otherwise, you should get one row for every match of film.description, and that's not what you want. (More precisely and yet still completely imprecise, that's not even something you can have, because of the way sets work.) A -- Andrew Sullivan | [EMAIL PROTECTED] In the future this spectacle of the middle classes shocking the avant- garde will probably become the textbook definition of Postmodernism. --Brad Holland ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [SQL] help with pagila
So I can assume that the MySQL implementation is strange? (It accepts that kind of query)[]'s- WalterOn 9/1/06, Andrew Sullivan [EMAIL PROTECTED] wrote:On Fri, Sep 01, 2006 at 10:31:48AM -0300, Walter Cruz wrote: ERROR:column film.description must appear in the GROUP BY clause or be used in an aggregate function If I put that column on GROUP BY everything works ok. But I want understant why do I need to do that. Can someone teach me, please?You need to because everything else is being grouped or aggregated.Otherwise, you should get one row for every match offilm.description, and that's not what you want.(More precisely and yet still completely imprecise, that's not even something you canhave, because of the way sets work.)A--Andrew Sullivan| [EMAIL PROTECTED]In the future this spectacle of the middle classes shocking the avant- garde will probably become the textbook definition of Postmodernism.--Brad Holland---(end of broadcast)---TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [SQL] help with pagila
On Fri, Sep 01, 2006 at 02:26:39PM -0300, Walter Cruz wrote: So I can assume that the MySQL implementation is strange? (It accepts that kind of query) In my experience, it is almost never safe to assume that the MySQL approach to SQL bears anything but a passing resemblance to SQL proper. This is considerably better under recent releases, however, and I think you'd find, if you used the strict mode in the most recent release, that MySQL would choke on a query like you posted as well. A -- Andrew Sullivan | [EMAIL PROTECTED] The fact that technology doesn't work is no bar to success in the marketplace. --Philip Greenspun ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [SQL] help with pagila
But, when I add another column on select, like, film_description, I get the following error: ERROR: column film.description must appear in the GROUP BY clause or be used in an aggregate function If I put that column on GROUP BY everything works ok. But I want understant why do I need to do that. Can someone teach me, please? The reason is pretty simple - GROUPing actually means sorting into boxes by values in the columns after the GROUP BY keyword (and then applying some aggregate functions to these boxes, as for example COUNT, AVG etc.) Besides these aggregates, you can select a column that 'constant' for each of the boxes, that is all the rows in that box have the same value in this column. That's the case of the first SQL query you've posted - you're grouping by 'film_id', thus all the rows in a box have the same value in this column. And thanks to this you can select the value in the SELECT. But in the second query, you'd like to select another column (directly, not through an aggregate function) - title. But there could be different values for each row in the box (PostgreSQL doesn't know that the ID uniquely identifies the title). For example imagine you would group by 'published_year' instead of the 'film_id' - in this case there would be many different movies in the same box, in which case it's impossible to select 'title' for all of them. There are two ways to solve this: 1) add the 'title' to the GROUP BY clause, thus all the rows in a box have the same value of 'title' (and postgresql knows about that) 2) use a subselect film_id, (SELECT title FROM film WHERE film_id = film_outer.film_id) AS title FROM film AS film_outer ... GROUP BY film_id; Tomas ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [SQL] help with pagila
On Fri, 2006-09-01 at 12:26, Walter Cruz wrote: So I can assume that the MySQL implementation is strange? (It accepts that kind of query) Yes, according to the SQL spec, you should generally get an error when you run a query like this: select field1, field2 from table group by field1 since you could theoretically get a different value for field2 each time you run the query. If the data looked like this: field1 | field2 ---+ 1 | 1 1 | 3 2 | 5 2 | 3 The possible answers to that query would be (1,1)(2,5), (1,1)(2,3), (1,3)(2,5), (1,3)(2,3) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [SQL] help with pagila
Andrew Sullivan [EMAIL PROTECTED] writes: On Fri, Sep 01, 2006 at 02:26:39PM -0300, Walter Cruz wrote: So I can assume that the MySQL implementation is strange? (It accepts that kind of query) In my experience, it is almost never safe to assume that the MySQL approach to SQL bears anything but a passing resemblance to SQL proper. This is considerably better under recent releases, however, and I think you'd find, if you used the strict mode in the most recent release, that MySQL would choke on a query like you posted as well. If film_id is a primary key for film, then it's actually legal per SQL99 (though not in earlier SQL specs) to just GROUP BY film_id and then reference the other columns of film without explicit grouping, because clearly there can be only one value of them per film_id value. However the quoted query includes ungrouped references to other tables as well, and it's not immediately obvious that those references must have unique values for any one value of film_id. It's possible that MySQL is taking the trouble to validate that this query is legal per SQL99 rules, but I'd find it quite surprising given their project philosophy --- fine points like whether a query has a single right answer tend not to matter to them. Postgres currently implements only the older SQL92 rules, under which you gotta explicitly GROUP BY all the columns you want to reference outside aggregate functions. We'll probably implement some parts of the looser SQL99 rules in the future, but that's where it stands today. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [SQL] help with pagila
So I can assume that the MySQL implementation is strange? (It accepts that kind of query) Yes, MySQL behaves strangely in this case (as well as in several other cases). I wouldn't rely on this as it probably can choose different values each time (although as far as I remember I haven't seen this). t.v. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [SQL] help with pagila
Thank you all.I found myself with the same trouble in last week, when I tried to port mambo CMS to PostgreSQL.After some work, In fall i a query like the one that was quoted by Andrew and decided to stop. In that time, my guess wae that something was wrong with MySQL.(I don't know.. I have seen queries like that on a lot of mysql free projects. Maybe we can put that on a FAQ?)[]'s- Walter On 9/1/06, Tom Lane [EMAIL PROTECTED] wrote: Andrew Sullivan [EMAIL PROTECTED] writes: On Fri, Sep 01, 2006 at 02:26:39PM -0300, Walter Cruz wrote: So I can assume that the MySQL implementation is strange? (It accepts that kind of query) In my experience, it is almost never safe to assume that the MySQL approach to SQL bears anything but a passing resemblance to SQL proper.This is considerably better under recent releases, however, and I think you'd find, if you used the strict mode in the most recent release, that MySQL would choke on a query like you posted as well.If film_id is a primary key for film, then it's actually legal per SQL99 (though not in earlier SQL specs) to just GROUP BY film_id and thenreference the other columns of film without explicit grouping, becauseclearly there can be only one value of them per film_id value.However the quoted query includes ungrouped references to other tables as well,and it's not immediately obvious that those references must have uniquevalues for any one value of film_id.It's possible that MySQL is taking the trouble to validate that this query is legal per SQL99 rules, but I'd find it quite surprising giventheir project philosophy --- fine points like whether a query has asingle right answer tend not to matter to them.Postgres currently implements only the older SQL92 rules, under which you gotta explicitly GROUP BY all the columns you want to referenceoutside aggregate functions.We'll probably implement some parts of thelooser SQL99 rules in the future, but that's where it stands today. regards, tom lane
Re: [SQL] help with pagila
On Fri, 1 Sep 2006, Walter Cruz wrote: Hi all. I'm with a little doubt. I'm testing the pagila (the postgres port of mysql sakila sample). Well, I was trying to translate the query: select film.film_id AS FID, film.title AS title, film.description AS description, category.name AS category, film.rental_rate AS price, film.length AS length, film.rating AS rating, group_concat(concat(actor.first_name,_utf8' ',actor.last_name) separator ',') AS actors from category inner join film on(category.category_id = film.category_id) inner join film_actor on(film.film_id = film_actor.film_id) inner join actor on(film_actor.actor_id = actor.actor_id) group by film.film_id; Assuming that film_id is the primary key on film and category_id is the primary key on category, I think you'd be allowed to have the other column references in SQL03 (and 99?) but not in SQL92 (which is the version that PostgreSQL currently implements). IIRC, the later specs allow you to not mention columns in group by that are functionally dependant on other columns that are mentioned. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster