Re: [PATCHES] TODO Item - Return compressed length of TOAST datatypes (WIP)

2005-06-18 Thread Tom Lane
Mark Kirkwood [EMAIL PROTECTED] writes:
> I thought I would have a look at:
> (Datatypes) Add function to return compressed length of TOAST data values.

My recollection of that discussion is that we just wanted something
that would return the actual VARSIZE() of the datum.  You're building
something way too confusing ...

A more interesting point is that the way you've declared the function,
it will only work on text values, which is pretty limiting.  Ideally
we'd define this thing as pg_datum_length(any-varlena-type) returns int
but there is no pseudotype corresponding to any varlena type.

I was thinking about this the other day in connection with my proposal
to make something that could return the TOAST value OID of an
out-of-line datum.  I think the only non-restrictive way to do it would
be to declare the function as taking any, and then add a runtime check
using the get_fn_expr_argtype() mechanism to test that what you've been
given is in fact a varlena datatype.
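
A minimal, stand-alone sketch of the runtime check described above, offered only as an illustration. It does not use the backend API: the toy catalog, the ToyTypeInfo struct, and the toy_* function names are all hypothetical. In the backend the argument type OID would come from get_fn_expr_argtype(), as mentioned above; here it is simply passed in. The sketch assumes that a type length of -1 marks a varlena type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a type catalog row: only what the check needs. */
typedef struct
{
    uint32_t type_oid;
    int16_t  typlen;    /* assumption: -1 marks a variable-length (varlena) type */
} ToyTypeInfo;

/* Toy catalog with three well-known type OIDs. */
static const ToyTypeInfo toy_catalog[] = {
    {17, -1},   /* bytea: varlena */
    {23, 4},    /* int4: fixed length */
    {25, -1},   /* text: varlena */
};

/* Look up a type by OID; returns NULL when the OID is unknown. */
const ToyTypeInfo *
toy_lookup_type(uint32_t type_oid)
{
    size_t n = sizeof(toy_catalog) / sizeof(toy_catalog[0]);

    for (size_t i = 0; i < n; i++)
        if (toy_catalog[i].type_oid == type_oid)
            return &toy_catalog[i];
    return NULL;
}

/*
 * Accept an argument of "any" type, but report a length only when the
 * actual argument type is a varlena.  Returns 0 on success and stores the
 * size in *out_len; returns -1 for unknown or fixed-length types.
 */
int
toy_datum_length(uint32_t arg_type_oid, uint32_t stored_size, uint32_t *out_len)
{
    const ToyTypeInfo *info = toy_lookup_type(arg_type_oid);

    assert(out_len != NULL);
    if (info == NULL || info->typlen != -1)
        return -1;      /* the real function would raise an error here */
    *out_len = stored_size;
    return 0;
}
```

With this shape, a text or bytea argument yields a length, while an int4 argument is rejected at call time rather than being misread as a varlena.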

regards, tom lane



[PATCHES] TODO Item - Return compressed length of TOAST datatypes (WIP)

2005-06-17 Thread Mark Kirkwood

I thought I would have a look at:

(Datatypes) Add function to return compressed length of TOAST data values.

A WIP patch is attached for comment (wanted to check I hadn't bitten off
more than I could chew *before* asking questions!).

A few questions come to mind:

1) The name - I have called it 'toast_compressed_length'. Seems longish
- I'm wondering if just 'compressed_length' is ok?

2) What should be returned for toasted data that is not compressed (or
plain stored data for that matter)? The WIP patch just gives the
uncompressed size (I notice I may need to subtract VARHDRSZ in some cases).

3) What should be returned for non-varlena types? The WIP patch is
treating everything as a varlena, so is returning incorrect information
for that case.

4) The builtin is declared as immutable - I am not so sure about that (I
am wondering if altering a column's storage from MAIN -> EXTENDED and
then updating the column to be itself will fool it).

5) Any multi-byte locale considerations?
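
To make question 2 concrete, here is a small stand-alone model of a length word that counts its own header, which is why VARHDRSZ may need subtracting to get the payload size. This is a sketch under stated assumptions, not the backend's definitions: the TOY_* constants and toy_* functions are hypothetical, and the layout (a 4-byte word with the top two bits used as external/compressed flags) is assumed for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants; the real definitions live in the backend headers. */
#define TOY_VARHDRSZ         4u            /* size of the length word itself */
#define TOY_FLAG_EXTERNAL    0x80000000u   /* assumed flag bit: stored out of line */
#define TOY_FLAG_COMPRESSED  0x40000000u   /* assumed flag bit: payload is compressed */
#define TOY_MASK_SIZE        0x3fffffffu   /* remaining bits hold the total size */

/* Build a header word for a payload of payload_len bytes. */
uint32_t
toy_make_header(uint32_t payload_len, int compressed, int external)
{
    uint32_t word = payload_len + TOY_VARHDRSZ;    /* size includes the header */

    assert(word <= TOY_MASK_SIZE);
    if (compressed)
        word |= TOY_FLAG_COMPRESSED;
    if (external)
        word |= TOY_FLAG_EXTERNAL;
    return word;
}

/* Total stored size, header included (what a VARSIZE-style macro reports). */
uint32_t
toy_varsize(uint32_t header)
{
    return header & TOY_MASK_SIZE;
}

/* Payload size only: total size minus the header word. */
uint32_t
toy_payload_size(uint32_t header)
{
    return toy_varsize(header) - TOY_VARHDRSZ;
}

int
toy_is_compressed(uint32_t header)
{
    return (header & TOY_FLAG_COMPRESSED) != 0;
}

int
toy_is_external(uint32_t header)
{
    return (header & TOY_FLAG_EXTERNAL) != 0;
}
```

Whichever of the two sizes the function ends up returning, it should return the same one for plain, compressed, and external values so that results are comparable.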

regards

Mark


diff -Nacr src/include/catalog/pg_proc.h.orig src/include/catalog/pg_proc.h
*** src/include/catalog/pg_proc.h.orig  Fri Jun 17 15:30:17 2005
--- src/include/catalog/pg_proc.h   Fri Jun 17 17:08:18 2005
***************
*** 3655,3660 ****
--- 3655,3664 ----
  DATA(insert OID = 2560 (  pg_postmaster_start_time PGNSP PGUID 12 f f t f s 0 1184 "" _null_ _null_ _null_ pgsql_postmaster_start_time - _null_ ));
  DESCR("postmaster start time");
  
+ /* Toast compressed length */
+ DATA(insert OID = 2561 (  toast_compressed_length PGNSP PGUID 12 f f t f i 1 23 "25" _null_ _null_ _null_  toast_compressed_length - _null_ ));
+ DESCR("toast compressed length");
+ 
  
  /*
   * Symbolic values for provolatile column: these indicate whether the result
diff -Nacr src/include/access/tuptoaster.h.orig src/include/access/tuptoaster.h
*** src/include/access/tuptoaster.h.orig    Thu Jun 16 21:12:57 2005
--- src/include/access/tuptoaster.h Thu Jun 16 21:14:06 2005
***************
*** 138,141 ****
--- 138,149 ----
   */
  extern Size toast_raw_datum_size(Datum value);
  
+ /* --
+  * toast_compressed_datum_size -
+  *
+  *    Return the compressed (toasted) size of a varlena datum
+  * --
+  */
+ extern Size toast_compressed_datum_size(Datum value);
+ 
  #endif   /* TUPTOASTER_H */
diff -Nacr src/include/utils/pg_lzcompress.h.orig src/include/utils/pg_lzcompress.h
*** src/include/utils/pg_lzcompress.h.orig  Thu Jun 16 21:21:37 2005
--- src/include/utils/pg_lzcompress.h   Thu Jun 16 21:21:11 2005
***************
*** 228,231 ****
--- 228,238 ----
  extern int pglz_get_next_decomp_char_from_lzdata(PGLZ_DecompState *dstate);
  extern int pglz_get_next_decomp_char_from_plain(PGLZ_DecompState *dstate);
  
+ /* --
+  * Function to get compressed size.
+  * Internal use only.
+  * --
+  */
+ extern int pglz_fetch_size(PGLZ_Header *source);
+ 
  #endif   /* _PG_LZCOMPRESS_H_ */
diff -Nacr src/include/utils/builtins.h.orig src/include/utils/builtins.h
*** src/include/utils/builtins.h.orig   Fri Jun 17 15:25:01 2005
--- src/include/utils/builtins.h    Fri Jun 17 15:27:30 2005
***************
*** 828,831 ****
--- 828,834 ----
  /* catalog/pg_conversion.c */
  extern Datum pg_convert_using(PG_FUNCTION_ARGS);
  
+ /* toastfuncs.c */
+ Datum   toast_compressed_length(PG_FUNCTION_ARGS);
+ 
  #endif   /* BUILTINS_H */
diff -Nacr src/backend/access/heap/tuptoaster.c.orig src/backend/access/heap/tuptoaster.c
*** src/backend/access/heap/tuptoaster.c.orig   Thu Jun 16 20:56:59 2005
--- src/backend/access/heap/tuptoaster.c    Fri Jun 17 15:12:30 2005
***************
*** 1436,1438 ****
--- 1436,1499 ----
  
      return result;
  }
+ 
+ /* --
+  * toast_compressed_datum_size
+  *
+  *    Show the compressed size of a datum
+  * --
+  */
+ Size
+ toast_compressed_datum_size(Datum value)
+ {
+ 
+ 
+     Size        size;
+     varattrib  *attr = (varattrib *) DatumGetPointer(value);
+ 
+     if (!PointerIsValid(attr))
+     {
+         /*
+          * No storage or NULL.
+          */
+         size = 0;
+     }
+     else if (VARATT_IS_EXTERNAL(attr))
+     {
+         /*
+          * Attribute is stored externally
+          * If it is compressed too, then we need to get the external datum
+          * and interrogate *its* compressed size
+          * otherwise just use the external rawsize (i.e. no compression)
+          */
+         if (VARATT_IS_COMPRESSED(attr))
+         {
+             varattrib  *attrext = toast_fetch_datum(attr);
+             size = pglz_fetch_size((PGLZ_Header *) attrext);
+             pfree(attrext);
+         }
+         else
+         {
+ 
+             size = attr->va_content.va_external.va_rawsize;
+         }
+     }
+     else if