Re: [SQL] help with pagila

2006-09-04 Thread Markus Schaber
Hi, Tom,

Tom Lane wrote:

 If film_id is a primary key for film, then it's actually legal per SQL99
 (though not in earlier SQL specs) to just GROUP BY film_id and then
 reference the other columns of film without explicit grouping, because
 clearly there can be only one value of them per film_id value.  However
 the quoted query includes ungrouped references to other tables as well,
 and it's not immediately obvious that those references must have unique
 values for any one value of film_id.

In situations like this, I often missed (and sometimes implemented) a
simple first() aggregate to put around those other columns.

Some of those cases could be fixed by creative use of DISTINCT ON, but
sometimes lead to suboptimal query plans (due to the useless sorting).

Markus

-- 
Markus Schaber | Logical TrackingTracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [SQL] help with pagila

2006-09-01 Thread Andrew Sullivan
On Fri, Sep 01, 2006 at 10:31:48AM -0300, Walter Cruz wrote:
 ERROR:  column film.description must appear in the GROUP BY clause or be
 used in an aggregate function
 
 If I put that column on GROUP BY everything works ok. But I want understant
 why do I need to do that. Can someone teach me, please?

You need to because everything else is being grouped or aggregated.
Otherwise, you should get one row for every match of
film.description, and that's not what you want.  (More precisely and
yet still completely imprecise, that's not even something you can
have, because of the way sets work.)

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
In the future this spectacle of the middle classes shocking the avant-
garde will probably become the textbook definition of Postmodernism. 
--Brad Holland

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [SQL] help with pagila

2006-09-01 Thread Walter Cruz
So I can assume that the MySQL implementation is strange? (It accepts that kind of query)[]'s- WalterOn 9/1/06, Andrew Sullivan 
[EMAIL PROTECTED] wrote:On Fri, Sep 01, 2006 at 10:31:48AM -0300, Walter Cruz wrote:
 ERROR:column film.description must appear in the GROUP BY clause or be used in an aggregate function If I put that column on GROUP BY everything works ok. But I want understant
 why do I need to do that. Can someone teach me, please?You need to because everything else is being grouped or aggregated.Otherwise, you should get one row for every match offilm.description, and that's not what you want.(More precisely and
yet still completely imprecise, that's not even something you canhave, because of the way sets work.)A--Andrew Sullivan| [EMAIL PROTECTED]In the future this spectacle of the middle classes shocking the avant-
garde will probably become the textbook definition of Postmodernism.--Brad Holland---(end of broadcast)---TIP 9: In versions below 8.0, the planner will ignore your desire to
 choose an index scan if your joining column's datatypes do not match


Re: [SQL] help with pagila

2006-09-01 Thread Andrew Sullivan
On Fri, Sep 01, 2006 at 02:26:39PM -0300, Walter Cruz wrote:
 So I can assume that the MySQL implementation is strange? (It accepts that
 kind of query)

In my experience, it is almost never safe to assume that the MySQL
approach to SQL bears anything but a passing resemblance to SQL
proper.  This is considerably better under recent releases, however,
and I think you'd find, if you used the strict mode in the most
recent release, that MySQL would choke on a query like you posted as
well.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The fact that technology doesn't work is no bar to success in the marketplace.
--Philip Greenspun

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [SQL] help with pagila

2006-09-01 Thread Tomas Vondra
 But, when I add another column on select, like, film_description, I get
 the following error:
 
 ERROR:  column film.description must appear in the GROUP BY clause or
 be used in an aggregate function
 
 If I put that column on GROUP BY everything works ok. But I want
 understant why do I need to do that. Can someone teach me, please?

The reason is pretty simple - GROUPing actually means sorting into
boxes by values in the columns after the GROUP BY keyword (and then
applying some aggregate functions to these boxes, as for example COUNT,
AVG etc.) Besides these aggregates, you can select a column that
'constant' for each of the boxes, that is all the rows in that box have
the same value in this column.

That's the case of the first SQL query you've posted - you're grouping
by 'film_id', thus all the rows in a box have the same value in this
column. And thanks to this you can select the value in the SELECT.

But in the second query, you'd like to select another column (directly,
not through an aggregate function) - title. But there could be different
values for each row in the box (PostgreSQL doesn't know that the ID
uniquely identifies the title). For example imagine you would group by
'published_year' instead of the 'film_id' - in this case there would be
many different movies in the same box, in which case it's impossible to
select 'title' for all of them.

There are two ways to solve this:

1) add the 'title' to the GROUP BY clause, thus all the rows in a box
   have the same value of 'title' (and postgresql knows about that)

2) use a subselect

   film_id,
   (SELECT title FROM film WHERE film_id = film_outer.film_id) AS title
   FROM film AS film_outer
   ...
   GROUP BY film_id;

Tomas


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [SQL] help with pagila

2006-09-01 Thread Scott Marlowe
On Fri, 2006-09-01 at 12:26, Walter Cruz wrote:
 So I can assume that the MySQL implementation is strange? (It accepts
 that kind of query)

Yes, according to the SQL spec, you should generally get an error when
you run a query like this:

select field1, field2 from table group by field1

since you could theoretically get a different value for field2 each time
you run the query.  If the data looked like this:

field1 | field2
---+
 1 | 1
 1 | 3
 2 | 5
 2 | 3

The possible answers to that query would be (1,1)(2,5), (1,1)(2,3),
(1,3)(2,5), (1,3)(2,3)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [SQL] help with pagila

2006-09-01 Thread Tom Lane
Andrew Sullivan [EMAIL PROTECTED] writes:
 On Fri, Sep 01, 2006 at 02:26:39PM -0300, Walter Cruz wrote:
 So I can assume that the MySQL implementation is strange? (It accepts that
 kind of query)

 In my experience, it is almost never safe to assume that the MySQL
 approach to SQL bears anything but a passing resemblance to SQL
 proper.  This is considerably better under recent releases, however,
 and I think you'd find, if you used the strict mode in the most
 recent release, that MySQL would choke on a query like you posted as
 well.

If film_id is a primary key for film, then it's actually legal per SQL99
(though not in earlier SQL specs) to just GROUP BY film_id and then
reference the other columns of film without explicit grouping, because
clearly there can be only one value of them per film_id value.  However
the quoted query includes ungrouped references to other tables as well,
and it's not immediately obvious that those references must have unique
values for any one value of film_id.

It's possible that MySQL is taking the trouble to validate that this
query is legal per SQL99 rules, but I'd find it quite surprising given
their project philosophy --- fine points like whether a query has a
single right answer tend not to matter to them.

Postgres currently implements only the older SQL92 rules, under which
you gotta explicitly GROUP BY all the columns you want to reference
outside aggregate functions.  We'll probably implement some parts of the
looser SQL99 rules in the future, but that's where it stands today.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [SQL] help with pagila

2006-09-01 Thread Tomas Vondra
 So I can assume that the MySQL implementation is strange? (It accepts
 that kind of query)

Yes, MySQL behaves strangely in this case (as well as in several other
cases). I wouldn't rely on this as it probably can choose different
values each time (although as far as I remember I haven't seen this).

t.v.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [SQL] help with pagila

2006-09-01 Thread Walter Cruz
Thank you all.I found myself with the same trouble in last week, when I tried to port mambo CMS to PostgreSQL.After some work, In fall i a query like the one that was quoted by Andrew and decided to stop.
In that time, my guess wae that something was wrong with MySQL.(I don't know.. I have seen queries like that on a lot of mysql free projects. Maybe we can put that on a FAQ?)[]'s- Walter
On 9/1/06, Tom Lane [EMAIL PROTECTED] wrote:
Andrew Sullivan [EMAIL PROTECTED] writes: On Fri, Sep 01, 2006 at 02:26:39PM -0300, Walter Cruz wrote: So I can assume that the MySQL implementation is strange? (It accepts that
 kind of query) In my experience, it is almost never safe to assume that the MySQL approach to SQL bears anything but a passing resemblance to SQL proper.This is considerably better under recent releases, however,
 and I think you'd find, if you used the strict mode in the most recent release, that MySQL would choke on a query like you posted as well.If film_id is a primary key for film, then it's actually legal per SQL99
(though not in earlier SQL specs) to just GROUP BY film_id and thenreference the other columns of film without explicit grouping, becauseclearly there can be only one value of them per film_id value.However
the quoted query includes ungrouped references to other tables as well,and it's not immediately obvious that those references must have uniquevalues for any one value of film_id.It's possible that MySQL is taking the trouble to validate that this
query is legal per SQL99 rules, but I'd find it quite surprising giventheir project philosophy --- fine points like whether a query has asingle right answer tend not to matter to them.Postgres currently implements only the older SQL92 rules, under which
you gotta explicitly GROUP BY all the columns you want to referenceoutside aggregate functions.We'll probably implement some parts of thelooser SQL99 rules in the future, but that's where it stands today.
regards, tom lane


Re: [SQL] help with pagila

2006-09-01 Thread Stephan Szabo

On Fri, 1 Sep 2006, Walter Cruz wrote:

 Hi all. I'm with a little doubt.

 I'm testing the pagila (the postgres port of mysql sakila sample).

 Well, I was trying to translate the query:

 select
  film.film_id AS FID,
  film.title AS title,
  film.description AS description,
  category.name AS category,
  film.rental_rate AS price,
  film.length AS length,
  film.rating AS rating,
  group_concat(concat(actor.first_name,_utf8' ',actor.last_name)
 separator ',') AS actors
 from
  category
  inner join film on(category.category_id = film.category_id)
  inner join film_actor on(film.film_id = film_actor.film_id)
  inner join actor on(film_actor.actor_id = actor.actor_id)
 group by
 film.film_id;

Assuming that film_id is the primary key on film and category_id is the
primary key on category, I think you'd be allowed to have the other column
references in SQL03 (and 99?) but not in SQL92 (which is the version that
PostgreSQL currently implements).

IIRC, the later specs allow you to not mention columns in group by that
are functionally dependant on other columns that are mentioned.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster