Re: Change JOIN tutorial to focus more on explicit joins

2020-09-04 Thread Jürgen Purtz

On 04.09.20 08:52, Peter Eisentraut wrote:


For the remaining patch I have a couple of concerns:

>  
>   Attempt to determine the semantics of this query when the
> - WHERE clause is omitted.
> + ON clause is omitted.
>  
> 

This no longer works.


Ok, but I don't have any better suggestion than to delete this para.
In general, I agree that some more emphasis on the JOIN syntax is 
okay. But I think the order in which the tutorial has taught it so far 
is okay: First you do it the manual way, then you learn the more 
abstract way.


In this context, I wouldn't use the terms 'manual' and 'abstract', it's 
more about 'implicit' and 'explicit' syntax. The 'explicit' syntax does 
not only emphasis the aspect of 'joining' tables, it also differentiates 
between the usage of following AND/OR/NOT key words as join conditions 
or as additional restrictions (the results are identical but not the 
semantic). Because the purpose of this patch is the preference of the 
explicit syntax, we shall show this syntax first.




> +   
> +    
> + The examples shown here combine rows via city names.
> + This should help to understand the concept. Professional
> + solutions prefer to use numerical IDs and foreign keys
> + to join tables.
> +    
> +   

While there are interesting debates to be had about natural vs. 
surrogate keys, I don't think we should imply that one of them is 
unprofessional and then leave it at that and give no further guidance. 
I think we should leave this out.



Ok, deleted.

--

Jürgen Purtz

diff --git a/doc/src/sgml/query.sgml b/doc/src/sgml/query.sgml
index e73e805ec4..413763691e 100644
--- a/doc/src/sgml/query.sgml
+++ b/doc/src/sgml/query.sgml
@@ -441,11 +441,11 @@ SELECT DISTINCT city

 Thus far, our queries have only accessed one table at a time.
 Queries can access multiple tables at once, or access the same
-table in such a way that multiple rows of the table are being
-processed at the same time.  A query that accesses multiple rows
-of the same or different tables at one time is called a
-join query.  As an example, say you wish to
-list all the weather records together with the location of the
+table several times. Such queries  they are called
+join queries  combine
+rows of one table in some way with rows of the other table
+and return a single row per combination. An example may be a
+list of all the weather records together with the location of the
 associated city.  To do that, we need to compare the city
 column of each row of the weather table with the
 name column of all rows in the cities
@@ -461,10 +461,17 @@ SELECT DISTINCT city
 
 
 SELECT *
-FROM weather, cities
-WHERE city = name;
+FROM   weather
+JOIN   cities ON (city = name);
 
 
+The keyword JOIN connects the two tables.
+Behind the keyword ON follows the
+definition how to compare their rows. In this case, the
+column city of table weather
+must be equal to the column name
+of table cities.
+
 
  city  | temp_lo | temp_hi | prcp |date| name  | location
 ---+-+-+--++---+---
@@ -476,8 +483,14 @@ SELECT *

 

-Observe two things about the result set:
+Observe some things about the result set:
 
+ 
+  
+   The resulting rows contain values from both tables.
+  
+ 
+
  
   
There is no result row for the city of Hayward.  This is
@@ -499,23 +512,14 @@ SELECT *
*:
 
 SELECT city, temp_lo, temp_hi, prcp, date, location
-FROM weather, cities
-WHERE city = name;
+FROM   weather
+JOIN   cities ON (city = name);
 
   
  
 

 
-   
-Exercise:
-
-
- Attempt to determine the semantics of this query when the
- WHERE clause is omitted.
-
-   
-

 Since the columns all had different names, the parser
 automatically found which table they belong to.  If there
@@ -526,8 +530,8 @@ SELECT city, temp_lo, temp_hi, prcp, date, location
 
 SELECT weather.city, weather.temp_lo, weather.temp_hi,
weather.prcp, weather.date, cities.location
-FROM weather, cities
-WHERE cities.name = weather.city;
+FROM   weather
+JOIN   cities ON (cities.name = weather.city);
 
 
 It is widely considered good style to qualify all column names
@@ -537,15 +541,18 @@ SELECT weather.city, weather.temp_lo, weather.temp_hi,
 

 Join queries of the kind seen thus far can also be written in this
-alternative form:
+form:
 
 
 SELECT *
-FROM weather INNER JOIN cities ON (weather.city = cities.name);
+FROM   weather, cities
+WHERE  weather.city = cities.name;
 
 
-This syntax is not as commonly used as the one above, but we show
-it here to help you understand the following topics.
+This syntax is mainly used in legacy applications. It dates back
+to the first days of SQL, 

Re: Change JOIN tutorial to focus more on explicit joins

2020-09-04 Thread Peter Eisentraut

On 2020-05-27 10:29, Jürgen Purtz wrote:

The attached patch

- prefers the explicit join-syntax over the implicit one and explains
the keywords of the explicit syntax

- uses a more accurate definition of 'join'

- separates  and  tags

- shifts  definitions outside of  to get a better
rendering in PDF

- adds a note concerning IDs and foreign keys


I have committed some parts of this patch:

> - separates  and  tags

> - shifts  definitions outside of  to get a better
> rendering in PDF

as well as the change of W1/W2 to w1/w2.  (Note that there is also 
src/tutorial/basics.source that should be adjusted in the same way.)


For the remaining patch I have a couple of concerns:

>  
>   Attempt to determine the semantics of this query when the
> - WHERE clause is omitted.
> + ON clause is omitted.
>  
> 

This no longer works.

In general, I agree that some more emphasis on the JOIN syntax is okay. 
But I think the order in which the tutorial has taught it so far is 
okay: First you do it the manual way, then you learn the more abstract way.


> +   
> +
> + The examples shown here combine rows via city names.
> + This should help to understand the concept. Professional
> + solutions prefer to use numerical IDs and foreign keys
> + to join tables.
> +
> +   

While there are interesting debates to be had about natural vs. 
surrogate keys, I don't think we should imply that one of them is 
unprofessional and then leave it at that and give no further guidance. 
I think we should leave this out.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services