Re: [HACKERS] Reducing overhead for repeat de-TOASTing

2008-06-18 Thread Simon Riggs

On Mon, 2008-06-16 at 15:35 -0400, Tom Lane wrote:
 Recent discussions with the PostGIS hackers led me to think about ways
 to reduce overhead when the same TOAST value is repeatedly detoasted.
 In the example shown here
 http://archives.postgresql.org/pgsql-hackers/2008-06/msg00384.php
 90% of the runtime is being consumed by repeated detoastings of a single
 datum.  That is certainly an outlier case, but we've heard complaints of
 TOAST being slow before.  The solution I'm about to propose should fix
 this, and as a bonus it will reduce the problem of memory leakage when
 a detoasted value doesn't get pfreed.
 
 What I am imagining is that the tuple toaster will maintain a cache of
 recently-detoasted values...

 Comments, better ideas?  Anyone think this is too much trouble to take
 for the problem?

You've not covered the idea of altering the execution so that we detoast
just once. If we tried harder to reduce the number of detoastings
then we would benefit all of the cases you mention, including internal
decompression. We would use memory, yes, but then so would a cache of
recently detoasted values.

If we see that the index scan key is toastable/toasted then the lowest levels
of the plan can create an expanded copy of the tuple and pass that
upwards. We may need to do this in a longer lived context and explicitly
free previous tuples to avoid memory bloat, but we'd have the same
memory usage and same memory freeing issues as with caching. It just
seems more direct and more obvious, especially since it is just an
internal version of the workaround, which was to create a function to
perform early detoasting. Maybe this could be done inside the IndexScan
node when a tuple arrives with toasted value(s) for the scan key
attribute(s).

I presume there's various reasons why you've ruled that out, but with
such a complex proposal it seems worth revisiting the alternatives, even
if just to document them for the archives.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reducing overhead for repeat de-TOASTing

2008-06-18 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Agreed. Yet I'm thinking that a more coherent approach to optimising the
 tuple memory usage in the executor tree might be better than the special
 cases we seem to have in various places. I don't know what that is, or
 even if it's possible though.

Yeah.  I had tried to think of a way to manage the cached detoasted
value as part of the TupleTableSlot in which the toasted datum is
(normally) stored, but there seems no way to know which slot that is
at the point where pg_detoast_datum is invoked --- and as mentioned
earlier, speculatively detoasting things at the point of the slot access
seems a loser.

[ thinks a bit more ... ]  But there's always more than one way to
skin a cat.  Right now, when you fetch a toasted attribute value
out of a Slot, what you get is a pointer to a stored-on-disk TOAST
pointer, ie

0x80 or 0x01
length (18)
struct varatt_external

Now the Slot knows which attributes are varlena (it has a tupdesc)
so it could easily check whether it's about to return one of these.
It could instead return a pointer to, say

0x80 or 0x01
length (more than 18)
struct varatt_external
pointer to Slot
pointer to detoasted value, or NULL if not detoasted yet

and that pointer-to-Slot would give us the hook we need to manage
the detoasting when and if pg_detoast_datum gets called.  Both
this struct and the ultimately decompressed value would be auxiliary
memory belonging to the Slot, and would go away at slot clear.
(This is certain to work since a not-toasted pass-by-ref datum
in the tuple would have that same lifetime.)

Come to think of it, if Slots are going to manage detoasted copies
of attributes, we could have them auto-detoast inline-compressed
Datums at the time of fetch.  The argument that this might be
wasted work has a lot less force for that case.

I am not sure this is a better scheme than the backend-wide cache,
but it's worth thinking about.  It would have a lot less management
overhead.  On the other hand it couldn't amortize detoastings across
repeated tuple fetches (such as could happen in a join, or successive
queries on the same value).

regards, tom lane



Re: [HACKERS] Reducing overhead for repeat de-TOASTing

2008-06-17 Thread Jeff


On Jun 16, 2008, at 3:35 PM, Tom Lane wrote:

 to a cache entry rather than a freshly palloc'd value.  The cache lookup
 key is the toast table OID plus value OID.  Now pg_detoast_datum() has no
 ...
 the result of decompressing an inline-compressed datum, because those have
 no unique ID that could be used for a lookup key.  This puts a bit of a


Wouldn't the tid fit this? or table oid + tid?

--
Jeff Trout [EMAIL PROTECTED]
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/






Re: [HACKERS] Reducing overhead for repeat de-TOASTing

2008-06-17 Thread Tom Lane
Jeff [EMAIL PROTECTED] writes:
 On Jun 16, 2008, at 3:35 PM, Tom Lane wrote:
 the result of decompressing an inline-compressed datum, because those
 have no unique ID that could be used for a lookup key.  This puts a
 bit of a

 Wouldn't the tid fit this? or table oid + tid?

No.  The killer reason why not is that at the time we need to decompress
a datum, we don't know what row it came from.  There are some other
problems too...

regards, tom lane



Re: [HACKERS] Reducing overhead for repeat de-TOASTing

2008-06-17 Thread Teodor Sigaev



But we can resolve that by ruling that the required lifetime is the same
as the value would have had if it'd really been palloc'd --- IOW, until
the memory context that was current at the time gets deleted or reset.


Many support functions of GiST/GIN live in a very short-lived memory context - one 
that exists only for a single call. So that cache invalidation technique doesn't 
give any advantage without rearranging this part.


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Reducing overhead for repeat de-TOASTing

2008-06-17 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes:
 But we can resolve that by ruling that the required lifetime is the same
 as the value would have had if it'd really been palloc'd --- IOW, until
 the memory context that was current at the time gets deleted or reset.

 Many support functions of GiST/GIN live in a very short-lived memory context - 
 one that exists only for a single call. So that cache invalidation technique 
 doesn't give any advantage without rearranging this part.

Right, but I think I've got that covered.  The memory context reset
won't actually flush the toast cache entry, it effectively just drops its
reference count.  We'll only drop cache entries when under memory
pressure (or if they're invalidated by toast table updates/deletes).

regards, tom lane



Re: [HACKERS] Reducing overhead for repeat de-TOASTing

2008-06-17 Thread Greg Stark
  I definitely think it's worth it, even if it doesn't handle an
  inline-compressed datum.

 Yeah.  I'm not certain how much benefit we could get there anyway.
 If the datum isn't out-of-line then there's a small upper limit on how
 big it can be and hence a small upper limit on how long it takes to
 decompress.  It's not clear that a complicated caching scheme would
 pay for itself.

Well, there's a small upper limit per instance, but the aggregate could still be 
significant if you have a situation like btree scans which are repeatedly 
detoasting the same datum. Note that the inline-compressed case includes 
packed varlenas which are being copied just to get their alignment right. It 
would be nice to get rid of that palloc/pfree bandwidth.
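
The alignment copy being referred to can be illustrated standalone. This is
a deliberately simplified model, not the real varlena encoding: a "packed"
value here is one length byte plus data at an arbitrary offset, and
unpacking it means a malloc-plus-memcpy into an aligned 4-byte-header form,
which is the palloc/memcpy bandwidth the message would like to avoid.

```c
/* Toy packed-varlena model: a 1-byte-header value at any byte offset must
 * be copied into aligned storage before a 4-byte header can be used. */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Build a "packed" varlena: 1 header byte (total length), then the data. */
static size_t pack_varlena(unsigned char *out, const char *data)
{
    size_t len = strlen(data);
    out[0] = (unsigned char) (len + 1);     /* total length incl. header */
    memcpy(out + 1, data, len);
    return len + 1;
}

/* "Unpack": copy into a malloc'd, aligned 4-byte-header form. */
static char *unpack_varlena(const unsigned char *packed)
{
    size_t datalen = packed[0] - 1;
    uint32_t *aligned = malloc(4 + datalen + 1);
    aligned[0] = (uint32_t) (4 + datalen);  /* header write needs alignment */
    memcpy((char *) aligned + 4, packed + 1, datalen);
    ((char *) aligned)[4 + datalen] = '\0';
    return (char *) aligned;
}
```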

I don't really see a way to do this though. If we hook into the original 
datum's mcxt we could use the pointer itself as a key. But if the original 
datum comes from a buffer that doesn't work.

One thought I had -- which doesn't seem to go anywhere, but I thought was worth 
mentioning in case you see a way to leverage it that I don't -- is that if the 
toast key is already in the cache then deform_tuple could substitute the cached 
value directly instead of waiting for someone to detoast it. That means we can 
save all the subsequent trips to the toast cache manager. I'm not sure that 
would give us a convenient way to know when to unpin the toast cache entry 
though. It's possible that some code is aware that deform_tuple doesn't 
allocate anything currently and therefore doesn't set the memory context to 
anything that will live as long as the data it returns.


Incidentally, I'm on vacation and reading this via an awful webmail interface. 
So I'm likely to miss some interesting stuff for a couple weeks. I suppose the 
S/N ratio of the list is likely to move but I'm not sure in which direction...


Re: [HACKERS] Reducing overhead for repeat de-TOASTing

2008-06-16 Thread Stephen Frost
* Tom Lane ([EMAIL PROTECTED]) wrote:
 One unsolved problem is that this scheme doesn't provide any way to cache
 the result of decompressing an inline-compressed datum, because those have
 no unique ID that could be used for a lookup key.

That's pretty unfortunate.

 Ideas?

Not at the moment, but given the situation it really does strike me as
something we want to solve.  Inventing an ID would likely be overkill
or wouldn't solve the problem anyway, I'm guessing...

 Comments, better ideas?  Anyone think this is too much trouble to take
 for the problem?

I definitely think it's worth it, even if it doesn't handle an
inline-compressed datum.  PostGIS is certainly a good use case for why,
but I doubt it's the only one.

Thanks!

Stephen




Re: [HACKERS] Reducing overhead for repeat de-TOASTing

2008-06-16 Thread Tom Lane
Stephen Frost [EMAIL PROTECTED] writes:
 * Tom Lane ([EMAIL PROTECTED]) wrote:
 Comments, better ideas?  Anyone think this is too much trouble to take
 for the problem?

 I definitely think it's worth it, even if it doesn't handle an
 inline-compressed datum.

Yeah.  I'm not certain how much benefit we could get there anyway.
If the datum isn't out-of-line then there's a small upper limit on how
big it can be and hence a small upper limit on how long it takes to
decompress.  It's not clear that a complicated caching scheme would
pay for itself.

The profile shown here:
http://postgis.refractions.net/pipermail/postgis-devel/2008-June/003081.html
shows that the problem the PostGIS guys are looking at is definitely an
out-of-line case (in fact, it looks like the datum wasn't even compressed).

regards, tom lane
