Re: [HACKERS] Pluggable storage

Andres Freund Mon, 14 Aug 2017 23:54:36 -0700

Hi,


On 2017-06-13 11:50:27 +1000, Haribabu Kommi wrote:
> Here I attached WIP patches to support pluggable storage. The patch series
> are may not work individually. Still so many things are under development.
> These patches are just to share the approach of the current development.

Making a pass through the patchset to get a feel where this at, and
where this is headed.  I previously skimmed the thread to get a rough
sense on what's discused, but not in a very detailed manner.


General:

- I think one important discussion we need to have is what kind of
  performance impact we're going to accept introducing this. It seems
  very likely that this'll cause some slowdown.  We can kind of
  alleviate that by doing some optimizations at the same time, but
  nevertheless, this abstraction is going to cost.

- I don't think we should introduce this without a user besides
  heapam. The likelihood that API will be usable by anything else
  without a testcase seems fairly remote.  I think some src/test/modules
  type implementation of a per-session, in-memory storage - relatively
  easy to implement - or such is necessary.

- I think, and detailed some of that, we're going to need some cleanups
  that go in before this, to decrease the size / increase the quality of
  the new APIs.  It's going to get more painful to change APIs
  subsequently.

- We'll have to document clearly that these APIs are going to change for
  a while, even after the release introducing them.


StorageAm - Scan stuff:

- I think API isn't quite right. There's a lot of granular callback
  functionality like scan_begin_function / scan_begin_catalog /
  scan_begin_bm - these largely are convenience wrappers around the same
  function, and I don't think that would, or rather should, change in
  any abstracted storage layer.  So I think there needs to be some
  unification first (pretty close w/ beginscan_internal already, but
  perhaps we should get rid of a few of these wrappers).

- Some of the exposed functionality, e.g. scan_getpage,
  scan_update_snapshot, scan_rescan_set_params looks like it should just
  be excised, i.e. there's no reason for it to exist.

- Minor: don't think the _function suffix for Storis necessary, just
  makes things long, and every member has it. Besides that, it's also
  easy to misunderstand - for a second I understood
  scan_getnext_function to be about getting the next function...

- Scans are still represented as HeapScanDesc - I don't think that's
  going to fly. Either this needs to be an opaque type (i.e. a struct
  that's not defined, just forward declared), or it needs to be a base
  struct that individual AMs embed in their own structs.  Individual AMs
  definitely are going to need different pieces of data.


Storage AM - tuple stuff:

- tuple_get_{xmin, updated_xid, cmin, itempointer, ctid, heaponly} are
  each individual functions, that seems pretty painful to maintain, and
  v/ likely to just grow and grow. Not sure what the solution is, but
  this seems like a hard sell.

- The three *speculative* functions don't quite seem right to me, nor do
  I understand:
+        *
+        * Setting a tuple's speculative token is a slot-only operation, so no 
need
+        * for a storage AM method, but after inserting a tuple containing a
+        * speculative token, the insertion must be completed by these routines:
+        */
  I don't see anything related to slots, here?


Storage AM - slot stuff:


- I think there's a number of wrapper functions (slot_getattr,
  slot_getallattrs, getsomeattrs, attisnull) around the same
  functionality - that bloats the API and causes slowdowns. Instead we
  need something like slot_virtualize_tuple(int16 upto), and the rest
  should just be wrappers.

- I think it's wrong to have the slot functionality defined on the
  StorageAm level.  That'll cause even more indirect function calls (=>
  slowness), and besides that the TupleTableSlot struct will need
  members for each and every Am.

  I think the solution is to instead have an Am level "create slot"
  function, and the returned slot is allocated by the Am, with a base
  member of TupleTableSlot with basically just tts_nvalid, tts_values,
  tts_isnull as members.  Those are the only members that can be
  accessed without functions.

  Access to the individual functions (say store_tuple) would then be
  direct members of the TupleTableSlot interface. While that costs a bit
  of memory, it removes one indirection from an already performance
  critical path.

- MinimalTuples should be one type of slot for the above, except it's
  not created by an StorageAm but by a function returning a
  TupleTableSlot.

  This should remove the need for the slot_copy_min_tuple,
  slot_is_physical_tuple functions.

- Right now TupleTableSlots are an executor datastructure, but
  these patches (rightly!) make it much more widely used. So I think it
  needs to be moved outside of executor/, and probably renamed to
  something like TupleHolder or something.

- The oid integration seems wrong - without an accessor oids won't be
  queryable with this unless you break through the API.  But from a
  higher level view I do wonder if it's not time to remove "special" oid
  columns and replace them with a normal column.  We should be hesitant
  enshrining crusty old concepts in new APIs.


Executor integration:

- I'm quite fearful that this'll cause slowdowns in a few tight paths.
  The most likely cases here seem to be a) bitmap indexscans b)
  indexscans c) highly selective sequential scans.   I do wonder if
  that can be partially addressed by switching out the individual
  executor routines in the relevant scan nodes by something using or
  similar to the infrastructure in cc9f08b6b8


Regards,

Andres


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Pluggable storage

Reply via email to