Caught someone using PG_DETOAST_DATUM_PACKED and VARDATA_ANY on a structure
that required alignment, so I guess some more prominent warnings are in order.
I also added a paragraph to the "User-Defined Types" chapter on using these
macros since it seems like they're a hit.
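
To make the hazard concrete, here is a minimal standalone sketch of the problem
and the memcpy workaround. It does not use the real PostgreSQL headers: Point3,
pack_point, packed_data, read_point and is_aligned are all hypothetical names,
and the one-byte "header" is a toy stand-in for a short varlena header, not the
actual on-disk encoding.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical payload type whose members want double alignment. */
typedef struct Point3
{
	double		x;
	double		y;
	double		z;
} Point3;

/*
 * Toy stand-in for a packed datum: one header byte holding the total
 * length, followed immediately by the payload bytes.  (Assumption: this
 * only mimics the shape of a short header, not PostgreSQL's real format.)
 */
size_t
pack_point(unsigned char *buf, const Point3 *p)
{
	buf[0] = (unsigned char) (1 + sizeof(Point3));
	memcpy(buf + 1, p, sizeof(Point3));
	return 1 + sizeof(Point3);
}

/* Toy analogue of VARDATA_ANY for the one-byte-header case. */
const unsigned char *
packed_data(const unsigned char *buf)
{
	return buf + 1;
}

/* Returns 1 if ptr is a multiple of align, 0 otherwise. */
int
is_aligned(const void *ptr, size_t align)
{
	return ((uintptr_t) ptr % align) == 0;
}

/*
 * Safe access: copy the payload into a properly aligned local instead of
 * casting the (possibly unaligned) data pointer to Point3 *.
 */
Point3
read_point(const unsigned char *buf)
{
	Point3		out;

	memcpy(&out, packed_data(buf), sizeof(out));
	return out;
}
```

Casting packed_data(buf) straight to a Point3 * and dereferencing it is the
mistake in question; going through memcpy (or through the regular detoast
path, in real code) avoids it.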

Index: doc/src/sgml/xtypes.sgml
===================================================================
RCS file: /home/stark/src/REPOSITORY/pgsql/doc/src/sgml/xtypes.sgml,v
retrieving revision 1.28
diff -c -r1.28 xtypes.sgml
*** doc/src/sgml/xtypes.sgml	16 Sep 2006 00:30:16 -0000	1.28
--- doc/src/sgml/xtypes.sgml	9 May 2007 17:25:52 -0000
***************
*** 237,256 ****
      <primary>TOAST</primary>
      <secondary>and user-defined types</secondary>
     </indexterm>
!   If the values of your data type might exceed a few hundred bytes in
!   size (in internal form), you should make the data type
!   <acronym>TOAST</>-able (see <xref linkend="storage-toast">).
!   To do this, the internal
!   representation must follow the standard layout for variable-length
!   data: the first four bytes must be an <type>int32</type> containing
!   the total length in bytes of the datum (including itself).  The C
!   functions operating on the data type must be careful to unpack any
    toasted values they are handed, by using <function>PG_DETOAST_DATUM</>.
    (This detail is customarily hidden by defining type-specific
!   <function>GETARG</function> macros.) Then, 
!   when running the <command>CREATE TYPE</command> command, specify the
!   internal length as <literal>variable</> and select the appropriate
!   storage option.
   </para>
  
   <para>
--- 237,274 ----
      <primary>TOAST</primary>
      <secondary>and user-defined types</secondary>
     </indexterm>
!   If the values of your data type vary in size (in internal form), you should
!   make the data type <acronym>TOAST</>-able (see <xref
!   linkend="storage-toast">). You should do this even if the data are always
!   too small to be compressed or stored externally because
!   <productname>PostgreSQL</> can save space on small data using
!   <acronym>TOAST</> as well.
!  </para>
! 
!  <para>
!   To do this, the internal representation must follow the standard layout for
!   variable-length data: the first four bytes must be an <type>int32</type>
!   which is never accessed directly (customarily named <literal>vl_len_</>). You
!   must use <function>SET_VARSIZE()</function> to store the size of the datum
!   in this field and <function>VARSIZE()</function> to retrieve it. The C
!   functions operating on the data type must always be careful to unpack any
    toasted values they are handed, by using <function>PG_DETOAST_DATUM</>.
    (This detail is customarily hidden by defining type-specific
!   <function>GETARG_DATATYPE_P</function> macros.) Then, when running the
!   <command>CREATE TYPE</command> command, specify the internal length as
!   <literal>variable</> and select the appropriate storage option.
!  </para>
! 
!  <para>
!   If the alignment is unimportant (either just for a specific function or
!   because the data type specifies byte alignment anyway) then it's possible
!   to avoid some of the overhead of <function>PG_DETOAST_DATUM</>. You can use
!   <function>PG_DETOAST_DATUM_PACKED</> instead (customarily hidden by
!   defining a <function>GETARG_DATATYPE_PP</> macro) and access the result
!   with the <function>VARSIZE_ANY_EXHDR</> and <function>VARDATA_ANY</> macros.
!   Note that the data returned by these macros is not aligned even if the data
!   type definition specifies an alignment. If the alignment is important you
!   must go through the regular <function>PG_DETOAST_DATUM</> interface.
   </para>
  
   <para>
Index: src/include/fmgr.h
===================================================================
RCS file: /home/stark/src/REPOSITORY/pgsql/src/include/fmgr.h,v
retrieving revision 1.50
diff -c -r1.50 fmgr.h
*** src/include/fmgr.h	6 Apr 2007 04:21:44 -0000	1.50
--- src/include/fmgr.h	9 May 2007 15:34:54 -0000
***************
*** 158,163 ****
--- 158,169 ----
   * The resulting datum can be accessed using VARSIZE_ANY() and VARDATA_ANY()
   * (beware of multiple evaluations in those macros!)
   *
+  * WARNING: It is only safe to use PG_DETOAST_DATUM_PACKED() and
+  * VARDATA_ANY() if you really don't care about the alignment, either because
+  * you're working with something like text where the alignment doesn't matter
+  * or because you're not going to access its constituent parts and will just
+  * use things like memcpy on it anyway.
+  *
   * Note: it'd be nice if these could be macros, but I see no way to do that
   * without evaluating the arguments multiple times, which is NOT acceptable.
   */
***************
*** 174,179 ****
--- 180,186 ----
  #define PG_DETOAST_DATUM_SLICE(datum,f,c) \
  		pg_detoast_datum_slice((struct varlena *) DatumGetPointer(datum), \
  		(int32) f, (int32) c)
+ /* WARNING -- unaligned pointer */
  #define PG_DETOAST_DATUM_PACKED(datum) \
  	pg_detoast_datum_packed((struct varlena *) DatumGetPointer(datum))
  
Index: src/include/postgres.h
===================================================================
RCS file: /home/stark/src/REPOSITORY/pgsql/src/include/postgres.h,v
retrieving revision 1.80
diff -c -r1.80 postgres.h
*** src/include/postgres.h	4 May 2007 02:01:02 -0000	1.80
--- src/include/postgres.h	9 May 2007 15:38:37 -0000
***************
*** 235,240 ****
--- 235,246 ----
   * use VARSIZE_ANY/VARSIZE_ANY_EXHDR/VARDATA_ANY.  The other macros here
   * should usually be used only by tuple assembly/disassembly code and
   * code that specifically wants to work with still-toasted Datums.
+  *
+  * WARNING: It is only safe to use VARDATA_ANY() -- typically with
+  * PG_DETOAST_DATUM_PACKED() -- if you really don't care about the alignment,
+  * either because you're working with something like text where the alignment
+  * doesn't matter or because you're not going to access its constituent parts
+  * and will just use things like memcpy on it anyway.
   */
  #define VARDATA(PTR)						VARDATA_4B(PTR)
  #define VARSIZE(PTR)						VARSIZE_4B(PTR)
***************
*** 265,270 ****
--- 271,277 ----
  	  VARSIZE_4B(PTR)-4))
  
  /* caution: this will not work on an external or compressed-in-line Datum */
+ /* caution: this will return a possibly unaligned pointer */
  #define VARDATA_ANY(PTR) \
  	 (VARATT_IS_1B(PTR) ? VARDATA_1B(PTR) : VARDATA_4B(PTR))

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com