RE: 1 to many relationship

2006-08-17 Thread Peter Lauri
This was very good reading. And we all learn something new everyday. What
you are writing makes so much sense.

This also comes from me relying on that all things said on this list is
true, and that all members do know what they write before they write it. I
try to never post a reply to anyone unless I'm convinced that I am right.

So I learned a lesson. The big lesson learned was not maybe the one about
how a database optimizes a question, but rather something else.

/Peter





-Original Message-
From: Michael Stassen [mailto:[EMAIL PROTECTED] 
Sent: Friday, August 18, 2006 1:14 AM
To: mysql@lists.mysql.com
Cc: Chris; Peter Lauri
Subject: Re: 1 to many relationship

Peter Lauri wrote:
 > Is there not a better way to do that? What will happen there is that a
large
 > result set will be created because when you just do "select * from
 > customers c, issues i, customer_issues ci" it will be like the inner
product
 > from all these tables, and then just choosing the right ones.
 >
 > If the table C have 1000 records, issues 5000 and customer_issues 15000
you
 > would end up with a 75,000,000,000 rows large results set, that would not
be
 > so sweet, would it?

Peter Lauri wrote:
 > Yes, it cuts it down to that number of records in the end, so the final
 > result set will just be a few rows that match the 'WHERE'. But the
internal
 > process of MySQL do merge all tables and then chooses the records that
 > matches the 'WHERE' clause.

No database would work very well if that were the case.  You are essentially

asserting that mysql has no optimizer.  That just isn't so.  The job of the 
optimizer is to devise a plan to execute the query in a way that will
examine 
the fewest possible rows.  Eliminating rows before looking at them is always

preferable to eliminating them afterwards.

For example, given the query

   SELECT *
   FROM customers c
   JOIN customer_issues ci ON c.customerid = ci.customerid
   JOIN issues i on ci.issueid = i.issueid
   WHERE c.customerid = 13;

mysql will use the index on customers.customerid to select the 1 row with 
customerid = 13, it will then use the index on customer_issues.customerid to

find matching rows in customer_issues, then finally it will use the index on

issues.issueid to find matching rows in issues.  This is easily verified
using 
EXPLAIN:

   EXPLAIN SELECT *
   FROM customers c
   JOIN customer_issues ci ON c.customerid = ci.customerid
   JOIN issues i on ci.issueid = i.issueid
   WHERE c.customerid = 13;

   +-+---++-+-+--+
   | select_type | table | type   | key | ref | rows |
   +-+---++-+-+--+
   | SIMPLE  | c | const  | PRIMARY | const   |1 |
   | SIMPLE  | ci| ref| PRIMARY | const   |4 |
   | SIMPLE  | i | eq_ref | PRIMARY | test.ci.issueid |1 |
   +-+---++-+-+--+
   3 rows in set (0.01 sec)

(Note that I've pared the output of EXPLAIN down to a few relevant columns.)

The "rows" column tells the story.  Mysql plans to use the primary key to
find 
the 1 matching row in customers, then use the primary key to find the 4
matching 
rows in customer_issues for that 1 customer, then use the primary key to
find 
the 1 matching row in issues for each row found in customer_issues.  That
is, it 
expects to produce 1x4x1 = 4 rows, not 'size of c' x 'size of ci' x 'size of
i' 
rows!

You might want to read the optimization section of the manual for more on
the 
subject <http://dev.mysql.com/doc/refman/4.1/en/optimization.html>.

Chris wrote:
 > I don't know enough about mysql internals to debate that so I'll take
 > your word for it.
 >
 > 'Explain' doesn't give enough information about what happens behind the
 > scenes so I'm not sure how to prove/disprove that and I don't know of
 > any tools that would show you that (if there is let me know!).
 >
 > Having said all of that I've never had a problem doing it the way I
 > mentioned.. ;)

EXPLAIN is documented in the manual 
<http://dev.mysql.com/doc/refman/4.1/en/explain.html>.

Michael

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: 1 to many relationship

2006-08-17 Thread Michael Stassen

Chris wrote:

> Hmm. Must be a recent change, I've had problems with that in the past.
>
> Thanks for the pointers.

You're welcome.  I'm not sure which part you think is a recent change. 
Certainly, mysql 5 with its demotion of the precedence of the implicit join is 
recent.  Perhaps it depends on your definition of recent.  According to the 
manual , putting the join 
condition in an ON clause was first supported for INNER JOIN as of 3.23.17, and 
for JOIN starting with 4.0.11.  Is that what you mean?


Chris wrote:


EXPLAIN is documented in the manual 
.


So it is, but it doesn't show me the decisions it makes. It shows me 
whether it will use an index or not, it won't show me how it puts the 
whole thing together.


Well, I think that information is there, unless I'm mistaking what you mean. 
Mysql joins the tables in the order given in the explain output.  It shows 
whether or not an index is being used, how many rows are predicted to be 
examined, and how the rows will be sorted.



Compare to the postgres one:

http://www.designmagick.com/article/23/page/5

(yes I did write that article so I am biased)

which tells me what's going on at each step of the decision process.

Whether you find that useful or not - *shrug* - I however find it very 
useful and wish the mysql explain was as good as that.


I've not used postgresql, so I will make no claim as to the relative merits of 
each system's EXPLAIN.  It appears you're in a better position to make 
comparisons.  I'll just say that it takes some practice/experience to really get 
what mysql's EXPLAIN is telling you, and I gather from your article that the 
same could be said of postgresql's EXPLAIN.  At least, I couldn't tell just from 
your article what useful info postgresql's EXPLAIN gives that mysql's does not. 
 I wonder if it is possible that what you are looking for is hidden in mysql's 
EXPLAIN output in a way that isn't apparent to you.  If you think that might be 
the case, and you told us what you're looking for, someone could say if it's 
there or not.  On the other hand, you may be certain that's not the case.  Then 
I'd be curious to hear about it.


Michael


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: 1 to many relationship

2006-08-17 Thread Chris

Michael Stassen wrote:

Chris wrote:

 > select *
 > from customers c, issues i, customer_issues ci
 > where c.customerid=ci.customerid
 >  AND ci.issueid=i.issueid;

Chris wrote:

 > That query should be the same as this one:
 > select *
 > from customers c
 > inner join customer_issues ci on (c.customerid=ci.customerid)
 > inner join issues i on (ci.issueid=i.issueid);
 >
 > I don't like writing my queries that way because the order of the joins
 > needs to be very specific and as you add more tables, getting the order
 > wrong can make things slow (and hard to track down).

I think you have that precisely backwards.  Putting the join conditions 
in the WHERE clause makes it easy to accidentally forget one.  Many, 
many "Why isn't this query working?" questions on this list have been 
the result of missing join conditions that became obvious when the query 
was rewritten to explicitly give the join conditions in ON clauses.  As 
you add more tables, the likelihood of making that mistake and the 
difficulty of detecting it increase.


Whether you put it in the WHERE or ON clause, the correct join condition 
is the same.  Like you, I used to write all my joins as implicit joins 
with the join conditions in the WHERE clause (largely becuase most of 
the examples in the manual were written that way).  It worked fine for 
the relatively simple queries I was writing at the time, but as my 
queries became more complex, that format became more and more unwieldy.  
With join conditions for 4 or 5 tables and 3 or 4 row restrictions in 
the WHERE clause, it gets to be a mess.  Faced with that, and seeing the 
problems others have had, I've gradually switched to writing all of my 
joins with explicit join conditions in ON clauses.


I also have no idea what you mean by "getting the order wrong can make 
things slow."  The order in which you write inner joins should have no 
effect on how your query is performed.  The optimizer will, in fact, 
consider each of the possible orderings that produce correct results and 
choose the one it calculates to be the most efficient.  That is,


  SELECT *
  FROM customers c
  JOIN customer_issues ci ON c.customerid = ci.customerid
  JOIN issues i ON ci.issueid = i.issueid
  WHERE c.customerid = 13;

is identical to

  SELECT *
  FROM issues i
  JOIN customer_issues ci ON ci.issueid = i.issueid
  JOIN customers c ON c.customerid = ci.customerid
  WHERE c.customerid = 13;

in that mysql will do the same thing to satisfy either one.  This can be 
verified through the use of EXPLAIN.


In short, there is absolutely no speed benefit in putting your join 
conditions in the WHERE clause.  In fact, there is a small inefficiency 
in putting them there, as mysql must then parse the WHERE clause into 
join conditions and row restrictions.  This is very fast, of course, but 
doesn't have to be done if the join conditions are explicitly given in 
ON clauses.


Finally, you will discover that some of your queries which use the 
"comma as implicit join operator" will break when you upgrade to mysql 
5, because starting with mysql 5, implicit joins are given lower 
precedence than explicit joins, as specified by the SQL standard.  See 
the manual for the details:


.


Hmm. Must be a recent change, I've had problems with that in the past.

Thanks for the pointers.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: 1 to many relationship

2006-08-17 Thread Chris


EXPLAIN is documented in the manual 
.


So it is, but it doesn't show me the decisions it makes. It shows me 
whether it will use an index or not, it won't show me how it puts the 
whole thing together.


Compare to the postgres one:

http://www.designmagick.com/article/23/page/5

(yes I did write that article so I am biased)

which tells me what's going on at each step of the decision process.

Whether you find that useful or not - *shrug* - I however find it very 
useful and wish the mysql explain was as good as that.


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: 1 to many relationship

2006-08-17 Thread Michael Stassen

Peter Lauri wrote:
> Is there not a better way to do that? What will happen there is that a large
> result set will be created because when you just do "select * from
> customers c, issues i, customer_issues ci" it will be like the inner product
> from all these tables, and then just choosing the right ones.
>
> If the table C have 1000 records, issues 5000 and customer_issues 15000 you
> would end up with a 75,000,000,000 rows large results set, that would not be
> so sweet, would it?

Peter Lauri wrote:
> Yes, it cuts it down to that number of records in the end, so the final
> result set will just be a few rows that match the 'WHERE'. But the internal
> process of MySQL do merge all tables and then chooses the records that
> matches the 'WHERE' clause.

No database would work very well if that were the case.  You are essentially 
asserting that mysql has no optimizer.  That just isn't so.  The job of the 
optimizer is to devise a plan to execute the query in a way that will examine 
the fewest possible rows.  Eliminating rows before looking at them is always 
preferable to eliminating them afterwards.


For example, given the query

  SELECT *
  FROM customers c
  JOIN customer_issues ci ON c.customerid = ci.customerid
  JOIN issues i on ci.issueid = i.issueid
  WHERE c.customerid = 13;

mysql will use the index on customers.customerid to select the 1 row with 
customerid = 13, it will then use the index on customer_issues.customerid to 
find matching rows in customer_issues, then finally it will use the index on 
issues.issueid to find matching rows in issues.  This is easily verified using 
EXPLAIN:


  EXPLAIN SELECT *
  FROM customers c
  JOIN customer_issues ci ON c.customerid = ci.customerid
  JOIN issues i on ci.issueid = i.issueid
  WHERE c.customerid = 13;

  +-+---++-+-+--+
  | select_type | table | type   | key | ref | rows |
  +-+---++-+-+--+
  | SIMPLE  | c | const  | PRIMARY | const   |1 |
  | SIMPLE  | ci| ref| PRIMARY | const   |4 |
  | SIMPLE  | i | eq_ref | PRIMARY | test.ci.issueid |1 |
  +-+---++-+-+--+
  3 rows in set (0.01 sec)

(Note that I've pared the output of EXPLAIN down to a few relevant columns.)

The "rows" column tells the story.  Mysql plans to use the primary key to find 
the 1 matching row in customers, then use the primary key to find the 4 matching 
rows in customer_issues for that 1 customer, then use the primary key to find 
the 1 matching row in issues for each row found in customer_issues.  That is, it 
expects to produce 1x4x1 = 4 rows, not 'size of c' x 'size of ci' x 'size of i' 
rows!


You might want to read the optimization section of the manual for more on the 
subject .


Chris wrote:
> I don't know enough about mysql internals to debate that so I'll take
> your word for it.
>
> 'Explain' doesn't give enough information about what happens behind the
> scenes so I'm not sure how to prove/disprove that and I don't know of
> any tools that would show you that (if there is let me know!).
>
> Having said all of that I've never had a problem doing it the way I
> mentioned.. ;)

EXPLAIN is documented in the manual 
.


Michael

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: 1 to many relationship

2006-08-17 Thread Michael Stassen

Chris wrote:

> select *
> from customers c, issues i, customer_issues ci
> where c.customerid=ci.customerid
>  AND ci.issueid=i.issueid;

Chris wrote:

> That query should be the same as this one:
> select *
> from customers c
> inner join customer_issues ci on (c.customerid=ci.customerid)
> inner join issues i on (ci.issueid=i.issueid);
>
> I don't like writing my queries that way because the order of the joins
> needs to be very specific and as you add more tables, getting the order
> wrong can make things slow (and hard to track down).

I think you have that precisely backwards.  Putting the join conditions in the 
WHERE clause makes it easy to accidentally forget one.  Many, many "Why isn't 
this query working?" questions on this list have been the result of missing join 
conditions that became obvious when the query was rewritten to explicitly give 
the join conditions in ON clauses.  As you add more tables, the likelihood of 
making that mistake and the difficulty of detecting it increase.


Whether you put it in the WHERE or ON clause, the correct join condition is the 
same.  Like you, I used to write all my joins as implicit joins with the join 
conditions in the WHERE clause (largely becuase most of the examples in the 
manual were written that way).  It worked fine for the relatively simple queries 
I was writing at the time, but as my queries became more complex, that format 
became more and more unwieldy.  With join conditions for 4 or 5 tables and 3 or 
4 row restrictions in the WHERE clause, it gets to be a mess.  Faced with that, 
and seeing the problems others have had, I've gradually switched to writing all 
of my joins with explicit join conditions in ON clauses.


I also have no idea what you mean by "getting the order wrong can make things 
slow."  The order in which you write inner joins should have no effect on how 
your query is performed.  The optimizer will, in fact, consider each of the 
possible orderings that produce correct results and choose the one it calculates 
to be the most efficient.  That is,


  SELECT *
  FROM customers c
  JOIN customer_issues ci ON c.customerid = ci.customerid
  JOIN issues i ON ci.issueid = i.issueid
  WHERE c.customerid = 13;

is identical to

  SELECT *
  FROM issues i
  JOIN customer_issues ci ON ci.issueid = i.issueid
  JOIN customers c ON c.customerid = ci.customerid
  WHERE c.customerid = 13;

in that mysql will do the same thing to satisfy either one.  This can be 
verified through the use of EXPLAIN.


In short, there is absolutely no speed benefit in putting your join conditions 
in the WHERE clause.  In fact, there is a small inefficiency in putting them 
there, as mysql must then parse the WHERE clause into join conditions and row 
restrictions.  This is very fast, of course, but doesn't have to be done if the 
join conditions are explicitly given in ON clauses.


Finally, you will discover that some of your queries which use the "comma as 
implicit join operator" will break when you upgrade to mysql 5, because starting 
with mysql 5, implicit joins are given lower precedence than explicit joins, as 
specified by the SQL standard.  See the manual for the details:


.

Michael

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: 1 to many relationship

2006-08-15 Thread Peter Lauri
There will hopefully be some MySQL guru to confirm if I am right or wrong. I
also did the "inner product" version instead of JOIN's, but moved to JOIN's
that are more logical in the way I work with the tables I have.

/Peter

-Original Message-
From: Chris [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, August 16, 2006 11:00 AM
To: Peter Lauri
Cc: mysql@lists.mysql.com
Subject: Re: 1 to many relationship

Peter Lauri wrote:
> [snip "Chris"]
> The 'where' clause cuts that down to only matching records between the 
> tables. Without the where, you'd end up with lots of rows but with the 
> where it will be fine.
> [/snip]
> 
> Yes, it cuts it down to that number of records in the end, so the final
> result set will just be a few rows that match the 'WHERE'. But the
internal
> process of MySQL do merge all tables and then chooses the records that
> matches the 'WHERE' clause.

I don't know enough about mysql internals to debate that so I'll take 
your word for it.

'Explain' doesn't give enough information about what happens behind the 
scenes so I'm not sure how to prove/disprove that and I don't know of 
any tools that would show you that (if there is let me know!).

Having said all of that I've never had a problem doing it the way I 
mentioned.. ;)

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: 1 to many relationship

2006-08-15 Thread Chris

Peter Lauri wrote:

[snip "Chris"]
The 'where' clause cuts that down to only matching records between the 
tables. Without the where, you'd end up with lots of rows but with the 
where it will be fine.

[/snip]

Yes, it cuts it down to that number of records in the end, so the final
result set will just be a few rows that match the 'WHERE'. But the internal
process of MySQL do merge all tables and then chooses the records that
matches the 'WHERE' clause.


I don't know enough about mysql internals to debate that so I'll take 
your word for it.


'Explain' doesn't give enough information about what happens behind the 
scenes so I'm not sure how to prove/disprove that and I don't know of 
any tools that would show you that (if there is let me know!).


Having said all of that I've never had a problem doing it the way I 
mentioned.. ;)


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: 1 to many relationship

2006-08-15 Thread Peter Lauri
[snip "Chris"]
The 'where' clause cuts that down to only matching records between the 
tables. Without the where, you'd end up with lots of rows but with the 
where it will be fine.
[/snip]

Yes, it cuts it down to that number of records in the end, so the final
result set will just be a few rows that match the 'WHERE'. But the internal
process of MySQL do merge all tables and then chooses the records that
matches the 'WHERE' clause.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: 1 to many relationship

2006-08-15 Thread Chris

Peter Lauri wrote:

[snip "Chris"]

If you want multiple customers to be associated with each issue you need 
3 tables:


create table customers (customerid int auto_increment primary key, 
customername varchar(255));


create table issues (issueid int auto_increment primary key, issuetitle 
varchar(255));


create table customer_issues (issueid int, customerid int);

then you can do:

select * from
customers c, issues i, customer_issues ci
where
c.customerid=ci.customerid AND
ci.issueid=i.issueid;

[/snip]


Is there not a better way to do that? What will happen there is that a large
result set will be created because when you just do "select * from
customers c, issues i, customer_issues ci" it will be like the inner product
from all these tables, and then just choosing the right ones.

If the table C have 1000 records, issues 5000 and customer_issues 15000 you
would end up with a 75,000,000,000 rows large results set, that would not be
so sweet, would it?


The 'where' clause cuts that down to only matching records between the 
tables. Without the where, you'd end up with lots of rows but with the 
where it will be fine.


Of course make sure all parts of the where clause are appropriately 
indexed (one on c.customerid, one on i.issueid, one on ci.customerid and 
ci.issueid)



That query should be the same as this one:
select * from
customers c
inner join customer_issues ci on (c.customerid=ci.customerid)
inner join issues i on (ci.issueid=i.issueid)
;


I don't like writing my queries that way because the order of the joins 
needs to be very specific and as you add more tables, getting the order 
wrong can make things slow (and hard to track down).


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: 1 to many relationship

2006-08-15 Thread Peter Lauri
[snip "Chris"]

If you want multiple customers to be associated with each issue you need 
3 tables:

create table customers (customerid int auto_increment primary key, 
customername varchar(255));

create table issues (issueid int auto_increment primary key, issuetitle 
varchar(255));

create table customer_issues (issueid int, customerid int);

then you can do:

select * from
customers c, issues i, customer_issues ci
where
c.customerid=ci.customerid AND
ci.issueid=i.issueid;

[/snip]


Is there not a better way to do that? What will happen there is that a large
result set will be created because when you just do "select * from
customers c, issues i, customer_issues ci" it will be like the inner product
from all these tables, and then just choosing the right ones.

If the table C have 1000 records, issues 5000 and customer_issues 15000 you
would end up with a 75,000,000,000 rows large results set, that would not be
so sweet, would it?

/Peter


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: 1 to many relationship

2006-08-15 Thread Chris

Brian E Boothe wrote:
how do i do 1 to many relationship?? i have this SQL syntax " i may need 
to add many issues per customer" so this is what i have

  Please help
   SELECT *
FROM
 mcsdata.customers
 INNER JOIN mcsdata.issues ON (mcsdata.customers.id = mcsdata.issues.id)
WHERE
 (mcsdata.customers.id = mcsdata.issues.id)


Firstly - the 'ON' takes care of the join so you don't need it in the 
'WHERE' as well.



Secondly for 1 -> many relationships you need at two tables:

create table customers (customerid int auto_increment primary key, 
customername varchar(255));


create table issues (issueid int auto_increment primary key, customerid 
int, issuetitle varchar(255));


Then you can do:

select * from customers c inner join issues i on 
(c.customerid=i.customerid);



If you want multiple customers to be associated with each issue you need 
3 tables:


create table customers (customerid int auto_increment primary key, 
customername varchar(255));


create table issues (issueid int auto_increment primary key, issuetitle 
varchar(255));


create table customer_issues (issueid int, customerid int);

then you can do:

select * from
customers c, issues i, customer_issues ci
where
c.customerid=ci.customerid AND
ci.issueid=i.issueid;


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



1 to many relationship

2006-08-15 Thread Brian E Boothe
how do i do 1 to many relationship?? i have this SQL syntax " i may need 
to add many issues per customer" so this is what i have

  Please help
   
SELECT *

FROM
 mcsdata.customers
 INNER JOIN mcsdata.issues ON (mcsdata.customers.id = mcsdata.issues.id)
WHERE
 (mcsdata.customers.id = mcsdata.issues.id)

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]