Hi,

Over in [1] it was rediscovered that our documentation assumes the reader
is familiar with NULL.  It seems worthwhile to provide both an introduction
to the topic and an overview of how this special value gets handled
throughout the system.

Attached is a very rough draft attempting this, based on my own thoughts
and those expressed by Tom in [1], which largely align with mine.

I'll flesh this out some more once I get support for the goal, content, and
placement.  On that point, NULL is a fundamental part of the SQL language
and so having it be a section in a Chapter titled "SQL Language" seems to
fit well, even if that falls into our tutorial.  Framing this up as
tutorial content won't be that hard, though I've skipped on examples and
such pending feedback.  It really doesn't fit as a top-level chapter under
part II nor really under any of the other chapters there.  The main issue
with the tutorial is the forward references to concepts not yet discussed
but problem points there can be addressed.

I do plan to remove the entity reference and place the content into
query.sgml directly in the final version.  It is just much easier to write
an entire new section in its own file.

David J.

[1] https://www.postgresql.org/message-id/1859814.1714532025%40sss.pgh.pa.us
From a068247e92e620455a925a0ae746adc225ae1339 Mon Sep 17 00:00:00 2001
From: "David G. Johnston" <david.g.johns...@gmail.com>
Date: Wed, 1 May 2024 07:45:48 -0700
Subject: [PATCH] Document NULL

---
 doc/src/sgml/filelist.sgml |  1 +
 doc/src/sgml/null.sgml     | 79 ++++++++++++++++++++++++++++++++++++++
 doc/src/sgml/query.sgml    |  2 +
 3 files changed, 82 insertions(+)
 create mode 100644 doc/src/sgml/null.sgml

diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index 38ec362d8f..ac4fd52978 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -10,6 +10,7 @@
 <!-- tutorial -->
 <!ENTITY advanced   SYSTEM "advanced.sgml">
 <!ENTITY query      SYSTEM "query.sgml">
+<!ENTITY null       SYSTEM "null.sgml">
 <!ENTITY start      SYSTEM "start.sgml">
 
 <!-- user's guide -->
diff --git a/doc/src/sgml/null.sgml b/doc/src/sgml/null.sgml
new file mode 100644
index 0000000000..5f95b2494e
--- /dev/null
+++ b/doc/src/sgml/null.sgml
@@ -0,0 +1,79 @@
+<sect1 id="tutorial-null">
+ <title>Handling Unkowns (NULL)</title>
+
+ <indexterm zone="tutorial-null">
+  <primary>NULL</primary>
+ </indexterm>
+
+ <para>
+  Looking again at our example weather data you will note that we do not know
+  the amount of precipitation Hayward.  We communicated that implicitly by
+  not including the prcp column in the insert.  Explicitly, we can communicate
+  this fact by writing.  [example using null].  When a column is not specified
+  in an insert the default value for that column is recorded and the default
+  default value is NULL.
+ </para>
+
+ <para>
+  As a value NULL crops up all throughout the database and interacts with many
+  features.  The main portion of this book will detail the interactions specific
+  features have with NULL but it is helpful to have a reference page where one
+  can get an overview.
+ </para>
+
+ <para>
+  First, like all values, NULLs are typed.  But since any value can be unknown
+  NULL is a valid value for all data types.
+ </para>
+
+ <para>
+  Second, when speaking generally NULL is assumed to mean unknown.  However,
+  in practice meaning comes from context and so a model design may state that
+  NULL is to be used to represent "not applicable" - i.e., that a value is not
+  even possible.  SQL has only the single value NULL while there are multiple
+  concepts that people have chosen to apply it to.  In any case the behavior
+  of the system when dealing with NULL is the same regardless of the meaning
+  the given to it in the surrounding context.
+ </para>
+
+ <para>
+  The cardinal rule, NULL is never equal or unequal to any non-null
+  value; and when asked to be combined with a known value in an operation the
+  result of the operation becomes unknown.  e.g., both 1 = NULL and 1 + NULL
+  result in NULL.  Exceptions to this are documented.  See [chapter] for
+  details on how to test for null.  Specifically, note that concept of
+  distinctness is introduced to allow for true/false equality tests.
+ </para>
+
+ <para>
+  Extending from the previous point, function calls are truly a mixed bag.
+  Aggregate functions in particular will usually just ignore NULL inputs
+  instead of forcing the entire aggregate result to NULL.  Function
+  specifications has a "strictness" attribute that, when set to "strict"
+  (a.k.a. "null on null input") will tell the executor to return NULL for any
+  function call having at least one NULL input value, without executing the
+  function.
+ </para>
+
+ <para>
+  A WHERE clause that evaluates to NULL for a given row will exclude that row.
+  This was demonstrated in the tutorial query where cities with prcp > 0 were
+  requested and Hayward was not returned due to this and the cardinal rule.
+ </para>
+
+ <para>
+  While not yet discussed, it is possible to define validation expressions on
+  tables that ensure only values passing those expressions are inserted.  While
+  this seems like it would behave as a WHERE clause on a table, the choice here
+  when an expression evaulates to NULL is allow the row to be inserted.  See
+  [check constraints] in create table for details.
+ </para>
+
+ <para>
+  In the context of both DISTINCT and GROUP BY it is necessary that all inputs
+  resolve to being either equal to or not equal to all other values.  These
+  features use distinctness instead of simple equality in order to handle
+  NULL like a definite value equal to itself and unequal to all other values.
+ </para>
+
+</sect1>
diff --git a/doc/src/sgml/query.sgml b/doc/src/sgml/query.sgml
index 59962d6e85..f9a8686365 100644
--- a/doc/src/sgml/query.sgml
+++ b/doc/src/sgml/query.sgml
@@ -907,4 +907,6 @@ DELETE FROM <replaceable>tablename</replaceable>;
    </para>
   </sect1>
 
+&null;
+
  </chapter>
-- 
2.34.1

Reply via email to