Awesome thanks for the extensive reply. This is the first time I've played with descriptor classes, so I'm sorry if this is really basic stuff.
I set up my view class: class View: def __init__(self, gene, attrs): self.gene = gene self.attrs = attrs def keys(self): return iter(self.attrs) def items(self): return ((attr, getattr(self.gene, attr)) for attr in self.attrs) def __getitem__(self, key): return getattr(self.gene, key) def __setitem__(self, key, value): setattr(self.gene, key, value) and a descriptor class: class Descriptor: def __init__(self, *attrs): self.attrs = attrs def __get__(self, instance, owner): return View(instance, self.attrs) def __set__(self, instance, value): self.attrs = value then the Gene class using __declare_first__ becomes: class Gene(Base): __tablename__ = 'Gene' id = Column(Integer, primary_key=True) identifiers = Descriptor('locus', 'protein') features = Descriptor('gene', 'mRNA', 'CDS') def __init__(self, **kwargs): for dic in kwargs.values(): for key, value in dic.items(): setattr(self, key, value) @classmethod def __declare_first__(cls): for value in list(cls.__dict__.values()).copy(): if isinstance(value, Descriptor): for attr in value.attrs: setattr(cls, key, Column(attr, String)) This seems to give me close to the interface I would like. A couple of questions: 1) So I guess this is still just directly mapping columns to Class attributes, accessable by directly getting an attribute, i.e. gene.features['mRNA'] is equivalent to gene.mRNA. Would you then mask the class attributes by e.g. prepending with underscore? 2) Currently, every time a method is called on the descriptor, a new instance of the View class is returned. Is that by design, or is there a way of persisting the one View per attribute (identifiers, features, ...) for each Gene instance? Is that even an overhead I should be worried about? It seems like the equivalent of creating a new dictionary every time I want to access attributes on the Gene object. 3) When I call Base.metadata.create_all(bind=engine) without first creating a Gene instance, the generated SQL is: CREATE TABLE "gene" ( id INTEGER NOT NULL, PRIMARY KEY (id) ) missing all of the other mappings. After calling e.g. gene = Gene(), it's as expected: CREATE TABLE "gene" ( id INTEGER NOT NULL, locus VARCHAR, protein VARCHAR, gene VARCHAR, "mRNA" VARCHAR, "CDS" VARCHAR, PRIMARY KEY (id) ) When I'm first initialising the database, when should I be calling create_all? Do I need to actually create objects before I create the schema and initialise the database? Thanks again On Monday, 1 July 2019 23:25:38 UTC+8, Mike Bayer wrote: > > > > On Mon, Jul 1, 2019, at 5:56 AM, gamcil wrote: > > Hi, > > I'm new to SQLAlchemy/ORMs - previously I had just been interacting > directly with the builtin SQLite driver and had built up my own mess of a > mapping system. > > 1) I'm starting to convert some classes in a Python package to SQLAlchemy > models. . A paired down example of what I'm doing: > > class Gene: > def __init__(self, identifiers, features): > self.identifiers = identifiers > self.features = features > > gene = Gene(identifiers={'locus': 'GENE_0001', 'protein_id': 'PROT_0001'}, > features={'mRNA': '1..300,400..500', 'CDS': '1..300,400..500'}) > > In reality, I'm dealing with many attributes that are all related to the > Gene and stored in the same table, which is why to simplify the interface > of my classes, I grouped related instance attributes in dictionaries. For > example, the location of the mRNA feature of this Gene can then be accessed > by gene.features['mRNA']. Each value in the dictionary refers to a unique > database column. > > However, when creating SQLAlchemy models, it's required to explicitly map > Columns directly as class attributes, e.g.: > class Gene: > id = Column(Integer, primary_key=True) > locus = Column(String) > protein_id = Column(String) > mRNA = Column(String) > CDS = Column(String) > > Is there a simple way to provide the dictionary functionality? Something > like: > class Gene: > id = Column(Integer, primary_key=True) > identifiers = { > 'locus': Column(String), > 'protein': Column(String) > } > ... > Where a change in Gene.identifiers['locus'] would then result in an update > to the 'locus' column. I saw MutableDict in the documentation but it seems > non-trivial to implement for something like this. > Is this even a good idea? > > > I'm going to focus on #1 here, so, this kind of thing is pretty > straightforward in Python, throughout SQLAlchemy's examples / extensions > there are all kinds of "magic dictionary" kinds of examples going on, and > this one is...another one :) To give a high level for this kind of > thing, as far as setting up mappings / tables, that can always be done > "magically", that is, you can write event hooks to read any kind of > configuration you want that will set up the mapper() / Table with whatever > columns are needed. For the "get / set" a dictionary part, we always use a > Python descriptor for that kind of thing, so in this case I'd likely be > building a custom kind of descriptor class that also serves as the > information to generate the table definition. Finally, you want the > dictionary to be "live", that is, not get /set of the whole dictionary, you > want changes in the dictionary to affect the model. So for the last > part, yes there needs to be a custom dictionary object of some kind. It > would use the same Python techniques as MutableDict but it likely would be > specific to the case here, and overall this case is not that "hard". In > this case I would create a Python object that is not in fact a "dict", but > just an object that has as dict-like interface, e.g. __getitem__(), > __setitem__(), keys(), etc., and then it just uses the object that it comes > from as the source of data, like: > > > class GeneView: > def __init__(self, gene, attrs): > self.gene = gene > self.attrs = attrs > > def keys(self): > return iter(self.attrs) > > def items(self): > return ((attr, getattr(self.parent, attr)) for attr in self.attrs) > > def __getitem__(self, key): > return getattr(self.parent, key) # maybe you want to check > self.attrs too > > def __setitem__(self, key, value): > setattr(self.parent, key, value) # mutability > > > The above thing comes from a descriptor, a hardcoded version would look > like: > > class Gene(...): > @property > def identifiers(self): > return GeneView(self, ['locus', 'protein']) > > > I'd then want to make the above "hardcoded" pattern more generic making a > custom descriptor class ( > https://docs.python.org/3/howto/descriptor.html#descriptor-example) > <https://docs.python.org/3/howto/descriptor.html#descriptor-example> so > that the mapping looks like: > > > class Gene(...): > identifiers = gene_view_attr('locus', 'protein') > > > to map it, a simple event hook like before_mapper_configured (or > declararive's declare_first / declare_last hooks: > https://docs.sqlalchemy.org/en/13/orm/extensions/declarative/api.html?highlight=declare_last#declare-last) > > can > <https://docs.sqlalchemy.org/en/13/orm/extensions/declarative/api.html?highlight=declare_last#declare-last> > > be used, in this case it probably doesn't matter where in the mapping phase > we are since columns can be added at any time: > > from sqlalchemy import event > > @event.listens_for(Base, "before_mapper_configured", propagate=True) > def setup_view_attrs(mapper, class_): > for value in class_.__dict__.values(): > if isinstance(value, gene_view_attr): # look for our descriptor > for attr in value.attrs: > setattr(class_, attr, Column(attr, String)) # map columns > > > That's all you need for #1, let me know if you need more detail. > > > > > > 2) Say for the example above I have created a plain Python Location class > to represent a location of a feature in self.features: > class Location: > def __init__(self, intervals, ...): > self.intervals = intervals > ... > > @classmethod > def from_flat(cls, flat): > # convert flat location to list of tuples > # e.g. [(1, 300), (400, 500)] > > This class has a classmethod that takes the stored flat string (e.g. > 1..300,400..500) and converts it to a list of interval tuples, and has some > extra logic in it for use in my applications. I understand I can use > orm.reconstructor to instantiate these Location objects from the stored > attributes upon object loading, i.e. > @reconstructor > def init_on_load(self): > self.mRNA = Location.from_flat(self.mRNA) > ... > > > However, what I don't get is how I should be shuttling changes in these > objects back to the database as updates. > Is there some way to map this (instantiate Location on loading; mutate > Location; convert Location back to flat format to update the Gene > attribute)? Do i need to write setter methods for each that directly set > the underlying column attribute to the flat representation of the Location? > > > So I usually use descriptors for this kind of thing too, like above. > There is also the possibility of using custom datatypes that deal with the > tuples and convert to strings for the database side. If your application > never wants to see the "string" form and just wants to see tuples, you can > make a TypeDecorator that does this, see the examples at > https://docs.sqlalchemy.org/en/13/core/custom_types.html#typedecorator-recipes > > . Then again if you are going to use the dictionary approach above > for these attributes, you can build the coercion between string / tuple > into the GeneView class directly. I'd probably do it that way in this > case assuming that's the place you want to see the tuples. > > > > > Any help is appreciated, cheers > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to sqlal...@googlegroups.com <javascript:>. > To post to this group, send email to sqlal...@googlegroups.com > <javascript:>. > Visit this group at https://groups.google.com/group/sqlalchemy. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/f3105bee-b6dc-489c-82e0-c6d36710a462%40googlegroups.com > > <https://groups.google.com/d/msgid/sqlalchemy/f3105bee-b6dc-489c-82e0-c6d36710a462%40googlegroups.com?utm_medium=email&utm_source=footer> > . > For more options, visit https://groups.google.com/d/optout. > > > -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at https://groups.google.com/group/sqlalchemy. To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/20e0b0d3-ab6b-4def-b708-47e761926e9e%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.