Re: [sqlalchemy] Grouping related Columns in table definitions and Python object reconstruction

gamcil Tue, 02 Jul 2019 19:46:37 -0700

Awesome thanks for the extensive reply.

This is the first time I've played with descriptor classes, so I'm sorry if 
this is really basic stuff.


I set up my view class:

class View:
    def __init__(self, gene, attrs):
        self.gene = gene
        self.attrs = attrs

    def keys(self):
        return iter(self.attrs)

    def items(self):
        return ((attr, getattr(self.gene, attr)) for attr in self.attrs)

    def __getitem__(self, key):
        return getattr(self.gene, key)

    def __setitem__(self, key, value):
        setattr(self.gene, key, value)

and a descriptor class:
class Descriptor:
    def __init__(self, *attrs):
        self.attrs = attrs

    def __get__(self, instance, owner):
        return View(instance, self.attrs)

    def __set__(self, instance, value):
        self.attrs = value

then the Gene class using __declare_first__ becomes:
class Gene(Base):
    __tablename__ = 'Gene'
    id = Column(Integer, primary_key=True)
    identifiers = Descriptor('locus', 'protein')
    features = Descriptor('gene', 'mRNA', 'CDS')

    def __init__(self, **kwargs):
        for dic in kwargs.values():
            for key, value in dic.items():
                setattr(self, key, value)

    @classmethod
    def __declare_first__(cls):
        for value in list(cls.__dict__.values()).copy():
            if isinstance(value, Descriptor):
                for attr in value.attrs:
                    setattr(cls, key, Column(attr, String))

This seems to give me close to the interface I would like. A couple of 
questions:
1) So I guess this is still just directly mapping columns to Class 
attributes, accessable by directly getting an attribute, i.e. 
gene.features['mRNA'] is equivalent to gene.mRNA. Would you then mask the 
class attributes by e.g. prepending with underscore?

2) Currently, every time a method is called on the descriptor, a new 
instance of the View class is returned. Is that by design, or is there a 
way of persisting the one View per attribute (identifiers, features, ...) 
for each Gene instance? Is that even an overhead I should be worried about? 
It seems like the equivalent of creating a new dictionary every time I want 
to access attributes on the Gene object.

3) When I call Base.metadata.create_all(bind=engine) without first creating 
a Gene instance, the generated SQL is:
CREATE TABLE "gene" (
    id INTEGER NOT NULL,
    PRIMARY KEY (id)
)

missing all of the other mappings. After calling e.g. gene = Gene(), it's 
as expected:
CREATE TABLE "gene" (
        id INTEGER NOT NULL,
        locus VARCHAR,
        protein VARCHAR,
        gene VARCHAR,
        "mRNA" VARCHAR,
        "CDS" VARCHAR,
        PRIMARY KEY (id)
)

When I'm first initialising the database, when should I be calling 
create_all? Do I need to actually create objects before I create the schema 
and initialise the database?

Thanks again


On Monday, 1 July 2019 23:25:38 UTC+8, Mike Bayer wrote:
>
>
>
> On Mon, Jul 1, 2019, at 5:56 AM, gamcil wrote:
>
> Hi,
>
> I'm new to SQLAlchemy/ORMs - previously I had just been interacting 
> directly with the builtin SQLite driver and had built up my own mess of a 
> mapping system.
>
> 1) I'm starting to convert some classes in a Python package to SQLAlchemy 
> models. . A paired down example of what I'm doing:
>
> class Gene:
>     def __init__(self, identifiers, features):
>         self.identifiers = identifiers
>         self.features = features
>
> gene = Gene(identifiers={'locus': 'GENE_0001', 'protein_id': 'PROT_0001'},
>             features={'mRNA': '1..300,400..500', 'CDS': '1..300,400..500'})
>
> In reality, I'm dealing with many attributes that are all related to the 
> Gene and stored in the same table, which is why to simplify the interface 
> of my classes, I grouped related instance attributes in dictionaries. For 
> example, the location of the mRNA feature of this Gene can then be accessed 
> by gene.features['mRNA']. Each value in the dictionary refers to a unique 
> database column.
>  
> However, when creating SQLAlchemy models, it's required to explicitly map 
> Columns directly as class attributes, e.g.:
> class Gene:
>     id = Column(Integer, primary_key=True)
>     locus = Column(String)
>     protein_id = Column(String)
>     mRNA = Column(String)
>     CDS = Column(String)
>
> Is there a simple way to provide the dictionary functionality? Something 
> like:
> class Gene:
>     id = Column(Integer, primary_key=True)
>     identifiers = {
>         'locus': Column(String),
>         'protein': Column(String)
>     }
>     ...
> Where a change in Gene.identifiers['locus'] would then result in an update 
> to the 'locus' column. I saw MutableDict in the documentation but it seems 
> non-trivial to implement for something like this.
> Is this even a good idea?
>
>
> I'm going to focus on #1 here, so, this kind of thing is pretty 
> straightforward in Python, throughout SQLAlchemy's examples / extensions 
> there are all kinds of "magic dictionary" kinds of examples going on, and 
> this one is...another one :)     To give a high level for this kind of 
> thing, as far as setting up mappings / tables, that can always be done 
> "magically", that is, you can write event hooks to read any kind of 
> configuration you want that will set up the mapper() / Table with whatever 
> columns are needed.  For the "get / set" a dictionary part, we always use a 
> Python descriptor for that kind of thing, so in this case I'd likely be 
> building a custom kind of descriptor class that also serves as the 
> information to generate the table definition.   Finally, you want the 
> dictionary to be "live", that is, not get /set of the whole dictionary, you 
> want changes in the dictionary to affect the model.    So for the last 
> part, yes there needs to be a custom dictionary object of some kind.   It 
> would use the same Python techniques as MutableDict but it likely would be 
> specific to the case here, and overall this case is not that "hard".    In 
> this case I would create a Python object that is not in fact a "dict", but 
> just an object that has as dict-like interface, e.g. __getitem__(), 
> __setitem__(), keys(), etc., and then it just uses the object that it comes 
> from as the source of data, like:
>
>
> class GeneView:
>     def __init__(self, gene, attrs):
>         self.gene = gene
>         self.attrs = attrs
>
>     def keys(self):
>         return iter(self.attrs)
>
>    def items(self):
>         return ((attr, getattr(self.parent, attr)) for attr in self.attrs)
>
>    def __getitem__(self, key):
>        return getattr(self.parent, key)   # maybe you want to check 
> self.attrs too
>
>    def __setitem__(self, key, value):
>        setattr(self.parent, key, value)  # mutability
>
>
> The above thing comes from a descriptor, a hardcoded version would look 
> like:
>
> class Gene(...):
>     @property
>     def identifiers(self):
>         return GeneView(self, ['locus', 'protein'])
>
>
> I'd then want to make the above "hardcoded" pattern more generic making a 
> custom descriptor class (
> https://docs.python.org/3/howto/descriptor.html#descriptor-example) 
> <https://docs.python.org/3/howto/descriptor.html#descriptor-example> so 
> that the mapping looks like:
>
>
> class Gene(...):
>     identifiers = gene_view_attr('locus', 'protein')
>
>
> to map it, a simple event hook like before_mapper_configured (or 
> declararive's declare_first / declare_last hooks: 
> https://docs.sqlalchemy.org/en/13/orm/extensions/declarative/api.html?highlight=declare_last#declare-last)
>  
> can 
> <https://docs.sqlalchemy.org/en/13/orm/extensions/declarative/api.html?highlight=declare_last#declare-last>
>  
> be used, in this case it probably doesn't matter where in the mapping phase 
> we are since columns can be added at any time:
>
> from sqlalchemy import event
>
> @event.listens_for(Base, "before_mapper_configured", propagate=True)
> def setup_view_attrs(mapper, class_):
>     for value in class_.__dict__.values():
>         if isinstance(value, gene_view_attr):   # look for our descriptor
>             for attr in value.attrs:
>                 setattr(class_, attr, Column(attr, String))  # map columns
>
>
> That's all you need for #1, let me know if you need more detail.
>
>
>
>
>
> 2) Say for the example above I have created a plain Python Location class 
> to represent a location of a feature in self.features:
> class Location:
>     def __init__(self, intervals, ...):
>         self.intervals = intervals
>         ...
>
>     @classmethod
>     def from_flat(cls, flat):
>         # convert flat location to list of tuples
>         # e.g. [(1, 300), (400, 500)]
>
> This class has a classmethod that takes the stored flat string (e.g. 
> 1..300,400..500) and converts it to a list of interval tuples, and has some 
> extra logic in it for use in my applications. I understand I can use 
> orm.reconstructor to instantiate these Location objects from the stored 
> attributes upon object loading, i.e.
> @reconstructor
> def init_on_load(self):
>     self.mRNA = Location.from_flat(self.mRNA)
>     ...
>
>
> However, what I don't get is how I should be shuttling changes in these 
> objects back to the database as updates.
> Is there some way to map this (instantiate Location on loading; mutate 
> Location; convert Location back to flat format to update the Gene 
> attribute)? Do i need to write setter methods for each that directly set 
> the underlying column attribute to the flat representation of the Location?
>
>
> So I usually use descriptors for this kind of thing too, like above.   
> There is also the possibility of using custom datatypes that deal with the 
> tuples and convert to strings for the database side.  If your application 
> never wants to see the "string" form and just wants to see tuples, you can 
> make a TypeDecorator that does this, see the examples at 
> https://docs.sqlalchemy.org/en/13/core/custom_types.html#typedecorator-recipes
>  
> .       Then again if you are going to use the dictionary approach above 
> for these attributes, you can build the coercion between string / tuple 
> into the GeneView class directly.    I'd probably do it that way in this 
> case assuming that's the place you want to see the tuples.
>
>
>
>
> Any help is appreciated, cheers
>
>
> --
> SQLAlchemy - 
> The Python SQL Toolkit and Object Relational Mapper
>  
> http://www.sqlalchemy.org/
>  
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> --- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlal...@googlegroups.com <javascript:>.
> To post to this group, send email to sqlal...@googlegroups.com 
> <javascript:>.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/f3105bee-b6dc-489c-82e0-c6d36710a462%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/sqlalchemy/f3105bee-b6dc-489c-82e0-c6d36710a462%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>
>
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/20e0b0d3-ab6b-4def-b708-47e761926e9e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Grouping related Columns in table definitions and Python object reconstruction

Reply via email to