What you proposed here sounds reasonable to me. -- Darren Duncan
On 2014-08-24, 7:15 PM, Chris Travers wrote:
I think composite types should be treated in exactly the same manner as
tuples in regular queries. Whenever we look up information to parse the
outermost tuples of query results, do the inner tuples/types then too.
Sure. There is of course a question of responsibility here. For OO stuff, I
don't mind saying "it's the object's responsibility to serialize" but we can't
put that in DBD::Pg. That is, however, the assumption in PGObject.
Because it will make the lazy attribute problems clearer, I will describe the
PGObject framework here.
PGObject.pm is essentially glue that ties objects to the db via commonly
defined interfaces (essentially duck typing). The actual object
implementations (currently limited to PGObject::Simple for blessed hashrefs and
PGObject::Simple::Role for /Moo(se)?$/) then provide mapping services between
stored procedures and application methods. Because of the common interfaces,
object-responsible serialization means one can have a custom datetime handler
which serializes and deserializes appropriately (for simple types).
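To make the duck typing concrete: a value class opts in just by providing the
serialization methods, and the glue never cares what class it actually is. A
minimal sketch (the to_db/from_db names follow the PGObject convention; the
classes here are invented for illustration):

    use strict;
    use warnings;
    use Scalar::Util ();

    # Illustrative value class: it opts in to serialization simply by
    # providing to_db (and from_db for the return trip).
    package My::Timestamp;
    sub new     { my ($class, %args) = @_; return bless { %args }, $class }
    sub to_db   { my $self = shift; return "$self->{date} $self->{time}" }
    sub from_db { my ($class, $str) = @_;
                  my ($date, $time) = split / /, $str;
                  return $class->new(date => $date, time => $time) }

    # The glue side: duck typing means we only ask "can you serialize
    # yourself?" rather than checking class membership.
    package My::Glue;
    sub serialize_param {
        my ($val) = @_;
        return $val->to_db
            if Scalar::Util::blessed($val) && $val->can('to_db');
        return $val;   # plain scalars pass through untouched
    }

    1;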
If it isn't possible to do that at prepare() time, then do it at
execute() time. Do not repeat for each row fetched from the result of an
execute(), assuming data is homogeneous. This is the simplest way to know
that the type definitions we're working with are still valid for the
results.
So basically, if we see a hashref, we need to know what type it is, and that
information is going to come from outside SQL. We can do this with stored procs
because we can look up the expected input types. I don't see how you can do the
same on an insert or update statement, though, without the application
specifying the type, and I don't see how that could happen outside a
bind_param call. Once the type is known there, nested tuples cease to be a
major problem.
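For illustration, here is roughly what that looks like today, with the
application doing the flattening and naming the type via a cast. The helper is
made up; on_hand/inventory_item are the composite-type examples from the
PostgreSQL manual:

    use strict;
    use warnings;
    use DBI;

    # Hypothetical helper: flatten a hashref into a composite literal
    # such as ("fuzzy dice","42","1.99").  The caller supplies the
    # field order because a bare hashref carries no type information,
    # which is exactly the problem described above.
    sub hashref_to_composite {
        my ($href, @fields) = @_;
        my @vals = map {
            my $v = $href->{$_};
            defined $v
                ? do { (my $e = $v) =~ s/(["\\])/\\$1/g; qq{"$e"} }
                : '';   # an empty field is how a row literal spells NULL
        } @fields;
        return '(' . join(',', @vals) . ')';
    }

    my $dbh = DBI->connect('dbi:Pg:dbname=test', '', '', { RaiseError => 1 });

    # The application names the type with a cast; the server parses the literal.
    my $sth = $dbh->prepare(
        'INSERT INTO on_hand (item) VALUES (?::inventory_item)');
    $sth->bind_param(1, hashref_to_composite(
        { name => 'fuzzy dice', supplier_id => 42, price => 1.99 },
        qw(name supplier_id price),
    ));
    $sth->execute;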
We do not want to add any non-core dependencies. All that should be
required to use DBD::Pg is Perl itself, DBI, and DBD::Pg.
Memoize is core and has been since at least 5.8.9, according to what I have
been able to find. That being said, memoization is dangerous when it comes to
db stuff, so it should definitely be optional if used at all.
Of course, with a public API, there's no reason one can't memoize a wrapper
function.
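For example, this memoizes a wrapper without DBD::Pg knowing anything about it
(pg_type_definition is a hypothetical method name, not an existing DBD::Pg
call):

    use strict;
    use warnings;
    use Memoize;   # core, so no new dependency

    sub lookup_composite_type {
        my ($dbh, $typename) = @_;
        return $dbh->pg_type_definition($typename);   # hypothetical public API
    }

    # Key the cache on the type name alone; the $dbh object would make
    # a poor hash key.
    memoize('lookup_composite_type', NORMALIZER => sub { $_[1] });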
Any extra dependencies should be optional and user-defined, eg the user
of DBD::Pg registers some sort of handler against DBI or DBD::Pg to eg
override behavior to memoize for speed, but DBD::Pg has no knowledge of this
other than to make the relevant parts possible to override using said
generic API.
Right.
I won't promise proper Moose handling of things like lazy attributes in this
version, but it should be able to handle hashrefs. Does that sound like a
reasonable plan?
I don't know what lazy attributes have to do with this, unless you're
talking about only deserializing a composite-typed value if it is actually
accessed.
No. The problem is that you have a Moose object with a lazy attribute. When you
want to serialize it, that attribute may not have been initialized yet. There
are a number of solutions to this problem:
1. Make it the object's responsibility to initialize lazy attributes before
calling db methods. That is what we did in LedgerSMB 1.4, but we found it
error-prone.
2. Go through accessors if available. This is what I recently did in
PGObject::Simple::Role, and it makes things work for other object systems which
don't map to simple hashrefs (a sketch follows below).
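To sketch option 2 for Moose specifically: walk the metaclass and read every
attribute through its accessor, which forces lazy builders to fire before
flattening. This mirrors the spirit of what PGObject::Simple::Role does; the
function name is made up, but the meta calls are standard Moose:

    package My::Serializer;
    use strict;
    use warnings;

    sub moose_object_to_hashref {
        my ($obj) = @_;
        my %out;
        for my $attr ($obj->meta->get_all_attributes) {
            my $reader = $attr->get_read_method
                or next;                          # skip write-only attributes
            $out{ $attr->name } = $obj->$reader;  # the reader triggers lazy build
        }
        return \%out;
    }

    1;

Going through the reader is the whole point: peeking at the underlying hashref
directly would miss any lazy attribute that has not been built yet.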
I think the obvious way to do this would be to have DBD::Pg *only* work on
hashrefs, but make the catalog lookups into a public API. That would let other
implementations handle serialization and just hand in a string, and it would
keep DBD::Pg's responsibility here to a minimum.
Either way, DBI/DBD::Pg itself should know nothing about Moose or any other
non-core object systems and just represent things with what tools the core
provides; your framework can do that as you see fit though.
I am actually thinking that a well-designed public API handling only hashrefs
would be the answer here. This way you could ask for a type definition and get
back a hashref with the appropriate information. That hashref could then be
passed back in through the same public API, so the catalog never needs to be
hit again.
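The lookup such an API would wrap is roughly the following. The shape of the
returned hashref is invented, but the pg_type/pg_attribute query is the
standard way to enumerate a composite type's fields:

    use strict;
    use warnings;
    use DBI;

    sub fetch_type_definition {
        my ($dbh, $typename) = @_;
        my $sql = q{
            SELECT a.attname, a.atttypid::regtype::text AS atttype, a.attnum
              FROM pg_catalog.pg_type t
              JOIN pg_catalog.pg_attribute a ON a.attrelid = t.typrelid
             WHERE t.typname = ?
               AND a.attnum > 0
               AND NOT a.attisdropped
             ORDER BY a.attnum
        };
        # one row per field, in declaration order
        my $rows = $dbh->selectall_arrayref($sql, { Slice => {} }, $typename);
        return { name => $typename, attributes => $rows };
    }

The application could memoize that hashref and hand it straight back to the
serializer, so the catalogs are only consulted on the first use of each type.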
I don't know if it does this (please tell me), but you know what would be an
extremely useful feature in Postgres itself? A simple, fast way to ask the
database whether it has had any schema changes (anything DDL does) since it
was last asked. This could take the form of a LISTEN/NOTIFY on some
system-defined channel.
http://www.postgresql.org/docs/9.3/static/sql-createeventtrigger.html is the way
to do it. That's outside the scope of DBD::Pg, though with a public API, an
application could listen and ask to requery.
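A rough sketch of both halves on 9.3 (the function and channel names are
arbitrary, creating event triggers requires superuser, and pg_notifies is
DBD::Pg's notification poll):

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:Pg:dbname=test', '', '', { RaiseError => 1 });

    # One-time setup: raise a NOTIFY on every DDL command.
    $dbh->do(q{
        CREATE OR REPLACE FUNCTION notify_ddl() RETURNS event_trigger
        LANGUAGE plpgsql AS $$
        BEGIN
            PERFORM pg_notify('schema_changed', tg_tag);
        END;
        $$
    });
    $dbh->do(q{
        CREATE EVENT TRIGGER ddl_notify ON ddl_command_end
        EXECUTE PROCEDURE notify_ddl()
    });

    # Application side: listen, then drain pending notifications at a
    # convenient moment and requery the catalogs if anything arrived.
    sub flush_type_cache { warn "schema changed; flushing type info\n" }

    $dbh->do('LISTEN schema_changed');
    while (my $notify = $dbh->pg_notifies) {
        my ($channel, $pid, $payload) = @$notify;
        flush_type_cache() if $channel eq 'schema_changed';
    }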
If we assume in practice that schema changes are infrequent, then we can
safely memoize any schema knowledge we have and know it is still valid until
we get such a message saying there was a change, and then we flush the cache
and check what we need to know again.
Right. What I do with PGObject.pm's stored proc lookups is make memoization
optional (because I don't know how an app developer wants to handle db schema
upgrades), but where it is memoized, we flush the cache when we get an error
about bad arguments and we provide a public API to flush the cache. This allows
developers to decide how to handle these changes with a bit of a failsafe if
they forget.
I think memoization should be optional. There are a lot of traps with it that
are way outside DBD::Pg's scope both on the db side and on the app development
side (oops, we used shift on the return value).
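For what it's worth, the failsafe I described reduces to something like this
(all names are made up, not PGObject's actual API):

    package My::TypeCache;
    use strict;
    use warnings;

    my %CACHE;

    # The public escape hatch: the app flushes whenever it knows best.
    sub flush { %CACHE = () }

    # Cache a catalog lookup; $lookup is a coderef that hits the catalogs.
    sub cached_lookup {
        my ($dbh, $key, $lookup) = @_;
        $CACHE{$key} = $lookup->($dbh) unless exists $CACHE{$key};
        return $CACHE{$key};
    }

    # On what looks like a stale-signature error, flush the cache and
    # retry exactly once before giving up.
    sub with_retry {
        my ($code) = @_;
        my $result = eval { $code->() };
        return $result unless $@;
        die $@ unless $@ =~ /does not exist|wrong number of arguments/i;
        flush();
        return $code->();
    }

    1;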