Following the discussion on reducing the size of the heap tuple header, I've started working on the phantom cid idea.

I'm thinking of having an array of cmin,cmax pairs, indexed by phantom cid number. Looking up the cmin,cmax of a phantom cid is then a simple array lookup. To allow reuse of phantom cids, a hash table maps each cmin,cmax pair back to its phantom cid.
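To make the two structures concrete, here is a rough standalone sketch, not the actual patch. The names (CidPair, get_phantom_cid, phantom_cid_lookup) are made up for illustration, and a small open-addressing table stands in for the backend's real hash table so the example compiles on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t CommandId;              /* assumed 4-byte command id */

typedef struct { CommandId cmin, cmax; } CidPair;

/* Hypothetical backend-private state; all names are illustrative. */
static CidPair  *pairs = NULL;           /* indexed by phantom cid */
static uint32_t  npairs = 0, pairs_cap = 0;
static uint32_t *slots = NULL;           /* stores phantom cid + 1; 0 = empty */
static uint32_t  nslots = 0;             /* always zero or a power of two */

static uint32_t hash_pair(CommandId cmin, CommandId cmax)
{
    uint32_t h = (cmin * 2654435761u) ^ (cmax + 0x9e3779b9u);
    return h ^ (h >> 16);
}

/* Resize the lookup table and re-insert every known pair. */
static void rebuild_slots(uint32_t newsize)
{
    free(slots);
    slots = calloc(newsize, sizeof(uint32_t));
    assert(slots != NULL);
    nslots = newsize;
    for (uint32_t i = 0; i < npairs; i++) {
        uint32_t s = hash_pair(pairs[i].cmin, pairs[i].cmax) & (nslots - 1);
        while (slots[s] != 0)
            s = (s + 1) & (nslots - 1);
        slots[s] = i + 1;
    }
}

/* Return the phantom cid for (cmin, cmax), reusing an existing one if any. */
uint32_t get_phantom_cid(CommandId cmin, CommandId cmax)
{
    /* Keep the table at most half full so probing always terminates. */
    if (nslots == 0 || (npairs + 1) * 2 > nslots)
        rebuild_slots(nslots ? nslots * 2 : 16);

    uint32_t s = hash_pair(cmin, cmax) & (nslots - 1);
    while (slots[s] != 0) {
        CidPair *p = &pairs[slots[s] - 1];
        if (p->cmin == cmin && p->cmax == cmax)
            return slots[s] - 1;         /* pair already has a phantom cid */
        s = (s + 1) & (nslots - 1);
    }

    /* Not found: append to the array and remember it in the table. */
    if (npairs == pairs_cap) {
        pairs_cap = pairs_cap ? pairs_cap * 2 : 16;
        pairs = realloc(pairs, pairs_cap * sizeof(CidPair));
        assert(pairs != NULL);
    }
    pairs[npairs].cmin = cmin;
    pairs[npairs].cmax = cmax;
    slots[s] = npairs + 1;
    return npairs++;
}

/* Forward lookup: phantom cid -> (cmin, cmax) is a plain array access. */
CidPair phantom_cid_lookup(uint32_t phantom)
{
    assert(phantom < npairs);
    return pairs[phantom];
}
```

Calling get_phantom_cid() twice with the same pair returns the same number, which is the reuse the hash table exists for; phantom_cid_lookup() never touches the hash table at all.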

A big question is, do we need to implement spilling to disk?

With the above data structures, each phantom cid is going to take 28 bytes of backend-private memory (see the math below). Transactions that actually need phantom cids are not that common, but I suppose that applications that make heavy use of plpgsql functions or do a lot of repeated UPDATEs of the same rows might need millions of them.


[quick sizing math]
array element = sizeof(cmin) + sizeof(cmax) = 4 + 4 = 8
hash table key + data + hash element overhead = sizeof(cmin) + sizeof(cmax) + sizeof(phantomcid) + sizeof(HASHELEMENT) = 4 + 4 + 4 + 8 = 20
Total: 28 bytes (or 32 if MAXALIGN is 8 bytes)

This excludes the overhead of the hash table buckets etc.
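The same arithmetic as a small C sketch, in case anyone wants to plug in other numbers. The function names are made up, the 8-byte hash element overhead is simply what the 20-byte figure above implies, and rounding the element header and the entry up to MAXALIGN separately is my assumption about how the hash table lays out entries.

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t len, size_t align)
{
    return (len + align - 1) & ~(align - 1);
}

/*
 * Bytes of backend-private memory per phantom cid.
 * maxalign:           platform MAXALIGN (4 or 8)
 * hash_elem_overhead: per-entry hash table header, assumed 8 bytes above
 */
size_t phantom_cid_bytes(size_t maxalign, size_t hash_elem_overhead)
{
    size_t array_elem = 4 + 4;           /* cmin + cmax */
    size_t hash_entry = 4 + 4 + 4;       /* cmin + cmax + phantom cid */

    return array_elem
        + align_up(hash_elem_overhead, maxalign)
        + align_up(hash_entry, maxalign);
}

/* Total bytes for n phantom cids, still excluding hash bucket overhead. */
size_t phantom_cid_total_bytes(size_t n, size_t maxalign, size_t hash_elem_overhead)
{
    return n * phantom_cid_bytes(maxalign, hash_elem_overhead);
}
```

With these assumptions, a million phantom cids come to 28,000,000 bytes with 4-byte MAXALIGN and 32,000,000 bytes with 8-byte MAXALIGN, so "millions" means tens of megabytes per backend.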

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
