On Thu, Nov 03, 2022 at 09:55:22AM -0400, Tom Lane wrote:
> Peter Eisentraut <peter.eisentr...@enterprisedb.com> writes:
> > On 01.11.22 09:15, Tom Lane wrote:
> >> Agreed that the libpq manual is not the place for this, but I feel
> >> like it will also be clutter in "Data Types". Perhaps we should
> >> invent a new appendix or the like? Somewhere near the wire protocol
> >> docs seems sensible.
>
> > Would that clutter the protocol docs? ;-)
>
> I said "near", not "in". At the time I was thinking "new appendix",
> but I now recall that the wire protocol docs are not an appendix
> but a chapter in the Internals division. So that doesn't seem like
> quite the right place anyway.
>
> Perhaps a new chapter under "IV. Client Interfaces" is the right
> place?
>
> If we wanted to get aggressive, we could move most of the nitpicky details
> about datatype text formatting (e.g., the array quoting rules) there too.
> I'm not set on that, but it'd make datatype.sgml smaller which could
> hardly be a bad thing.
>
> > I suppose figuring out exactly where to put it and how to mark it up,
> > etc., in a repeatable fashion is part of the job here.
>
> Yup.

How does this look? I've simply moved things around into a new "Binary
Format" section with the few parts that I've started for some quick
feedback about whether this is looking like the right landing place.

Regards,
Mark

diff --git a/doc/src/sgml/binary-format.sgml b/doc/src/sgml/binary-format.sgml
index a297ece784..779b606ec9 100644
--- a/doc/src/sgml/binary-format.sgml
+++ b/doc/src/sgml/binary-format.sgml
@@ -6,9 +6,102 @@
  <indexterm zone="binary-format"><primary>pgsql binary format</primary></indexterm>
 
  <para>
-  This chapter describes the binary format used in the wire protocol. There
-  are a number of C examples for the data types used in PostgreSQL. We will
-  try to be as comprehensive as possible with the native data types.
+  This chapter describes the binary representation of the native PostgreSQL
+  data types and gives examples on how to handle each data type's binary format
+  by offering C code examples for each data type.
  </para>
 
+ <para>
+  We will try to cover all of the native data types...
+ </para>
+
+ <sect1 id="binary-format-boolean">
+  <title><type>boolean</type></title>
+
+  <para>
+   A <type>boolean</type> is transmitted as a single byte that, when cast to an
+   <literal>int</literal>, will be <literal>0</literal> for
+   <literal>false</literal> and <literal>1</literal> for
+   <literal>true</literal>.
+  </para>
+<programlisting>
+<![CDATA[
+int value;
+
+ptr = PQgetvalue(res, row_number, column_number);
+value = (int) *ptr;
+printf("%d\n", value);
+]]>
+</programlisting>
+ </sect1>
+
+ <sect1 id="binary-format-real">
+  <title><type>real</type></title>
+
+  <para>
+   A <type>real</type> is composed of 4 bytes and needs to be handled correctly
+   for byte order.
+  </para>
+
+<programlisting>
+<![CDATA[
+union {
+    int i;
+    float f;
+} value;
+
+ptr = PQgetvalue(res, row_number, column_number);
+value.i = ntohl(*((uint32_t *) ptr));
+printf("%f\n", value.f);
+]]>
+</programlisting>
+ </sect1>
+
+ <sect1 id="binary-format-timestamp-without-time-zone">
+  <title><type>timestamp without time zone</type></title>
+
+  <para>
+   A <type>timestamp without time zone</type> is a 64-bit data type
+   representing the number of microseconds since January 1, 2000. It can be
+   converted into a broken-down time representation by converting the time into
+   seconds and saving the microseconds elsewhere.
+  </para>
+
+  <para>
+   Note that in C time is counted from January 1, 1970, so this difference
+   needs to be accounted for in addition to handling the network byte order.
+  </para>
+
+<programlisting>
+<![CDATA[
+#define POSTGRES_EPOCH_JDATE 2451545 /* == date2j(2000, 1, 1) */
+#define UNIX_EPOCH_JDATE 2440588 /* == date2j(1970, 1, 1) */
+#define SECS_PER_DAY 86400
+
+uint64_t value;
+
+struct tm *tm;
+time_t timep;
+uint32_t mantissa;
+
+ptr = PQgetvalue(res, row_number, column_number);
+/* Note ntohll() is not implemented on all platforms. */
+value = ntohll(*((uint64_t *) ptr));
+
+timep = value / (uint64_t) 1000000 +
+        (uint64_t) (POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) *
+        (uint64_t) SECS_PER_DAY;
+mantissa = value - (uint64_t) (timep -
+        (POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) * SECS_PER_DAY) *
+        (uint64_t) 1000000;
+
+/* Assume and print timestamps in GMT for simplicity. */
+tm = gmtime(&timep);
+
+printf("%04d-%02d-%02d %02d:%02d:%02d.%06u\n",
+       tm->tm_year + 1900, tm->tm_mon + 1, tm->tm_mday, tm->tm_hour,
+       tm->tm_min, tm->tm_sec, mantissa);
+]]>
+</programlisting>
+ </sect1>
 </chapter>
diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index 0d6be9a2fa..688f947107 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -51,6 +51,7 @@
 <!ENTITY jit SYSTEM "jit.sgml">
 
 <!-- programmer's guide -->
+<!ENTITY binary-format SYSTEM "binary-format.sgml">
 <!ENTITY bgworker SYSTEM "bgworker.sgml">
 <!ENTITY dfunc SYSTEM "dfunc.sgml">
 <!ENTITY ecpg SYSTEM "ecpg.sgml">
diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml
index 2e271862fc..705b03f4aa 100644
--- a/doc/src/sgml/postgres.sgml
+++ b/doc/src/sgml/postgres.sgml
@@ -196,6 +196,7 @@ break is not needed in a wider output rendering.
   &lobj;
   &ecpg;
   &infoschema;
+  &binary-format;
 
  </part>
 
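
In case it helps with reviewing the arithmetic in the three examples without a server connection, here is a minimal standalone sketch that decodes the same formats from raw byte buffers. It is only an illustration under a few assumptions: the helper names (read_be32, read_be64, decode_bool, decode_float4, decode_timestamp) are made up for this mail and are not libpq functions; the host float is assumed to be IEEE 754 binary32; and the timestamp is treated as a signed 64-bit count of microseconds, so values before 2000-01-01 are handled too. Reading the bytes one at a time avoids both ntohll() and unaligned pointer casts.

```c
#include <stdint.h>
#include <string.h>
#include <time.h>

/* All helper names below are hypothetical; none of them exist in libpq. */

/* Days from 1970-01-01 to 2000-01-01, i.e. 2451545 - 2440588. */
#define EPOCH_DIFF_DAYS 10957
#define SECS_PER_DAY    86400
#define USECS_PER_SEC   1000000

/* Assemble a big-endian (network order) 32-bit value byte by byte. */
static uint32_t
read_be32(const unsigned char *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Assemble a big-endian 64-bit value from two 32-bit halves. */
static uint64_t
read_be64(const unsigned char *p)
{
    return ((uint64_t) read_be32(p) << 32) | (uint64_t) read_be32(p + 4);
}

/* boolean: a single byte, 0 for false and 1 for true. */
static int
decode_bool(const unsigned char *p)
{
    return p[0] != 0;
}

/* real: 4 bytes in network order; assumes float is IEEE 754 binary32. */
static float
decode_float4(const unsigned char *p)
{
    uint32_t    bits = read_be32(p);
    float       f;

    memcpy(&f, &bits, sizeof f);    /* reinterpret the bits without a union */
    return f;
}

/*
 * timestamp without time zone: microseconds since 2000-01-01 00:00:00,
 * treated here as signed.  Outputs Unix seconds plus the microsecond part.
 */
static void
decode_timestamp(const unsigned char *p, time_t *secs, int32_t *usecs)
{
    int64_t     v = (int64_t) read_be64(p);
    int64_t     s = v / USECS_PER_SEC;
    int64_t     us = v % USECS_PER_SEC;

    if (us < 0)                     /* round toward negative infinity */
    {
        us += USECS_PER_SEC;
        s -= 1;
    }
    *secs = (time_t) (s + (int64_t) EPOCH_DIFF_DAYS * SECS_PER_DAY);
    *usecs = (int32_t) us;
}
```

With libpq, the pointer returned by PQgetvalue() for a binary-format result would be cast to const unsigned char * and passed to these helpers; the resulting time_t can then go through gmtime() as in the patch.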