[sqlalchemy] Re: Mapping lots of similar tables / database design

diverman Thu, 03 Apr 2014 04:40:31 -0700

Hello, I used similar approach for not thousands, but for undreds of tables.


At first I created shared base class for all of them:

class TimeSeriesBase(Base):
    __abstract__ = True

    @declared_attr
    def id(cls):
        return Column('id', Integer, primary_key=True)

    @declared_attr
    def locid(cls):
        return Column('loc_id', Integer, ForeignKey(Location.id), 
primary_key=True)

    # here put common stuff


Secondly, a little *type()* magic is needed:

series_tables = {}

def get_or_create_timeseries(name):
    if name in series_tables:
        return series_table[name]
        
    class TimeSeriesMixin(object):
        __tablename__ = name

        # here put name-specific stuff


    series_tables[name] = type(name + 'Model', ( TimeSeriesBase, 
TimeSeriesMixin ), {})

    return series_tables[name]

You can call the function any times and it will always return existing 
class or create new class.
That should work.

P

Dne středa, 2. dubna 2014 0:56:54 UTC+2 Peter Stensmyr napsal(a):
>
> I posted this question on Stack Overflow a few days ago, and got some 
> response but nothing that really solves my problem. I'm hoping that I can 
> get some more input here. The initial recommendation was to keep all the 
> data in two tables (one meta and one data table), but this might become 
> unwieldy with the number of rows (I estimate it will be about 6 billion 
> rows in total).
>
> I have many (~2000) locations with time series data. Each time series has 
> millions of rows. I would like to store these in a Postgres database. My 
> current approach is to have a table for each location time series, and a 
> meta table which stores information about each location (coordinates, 
> elevation etc). I am using SQLAlchemy to create and query the tables. I 
> would like to have a relationship between the meta table and each time 
> series table to do queries like "select all locations that have data 
> between date A and date B" and "select all data for date A and export a csv 
> with coordinates". What is the best way to create many tables with the same 
> structure (only the name is different) and have a relationship with a meta 
> table? Or should I use a different database design?
>
> Currently I am using this type of approach to generate a lot of similar 
> mappings:
>
> from sqlalchemy import create_engine, MetaDatafrom sqlalchemy.types import 
> Float, String, DateTime, Integerfrom sqlalchemy import Column, ForeignKeyfrom 
> sqlalchemy.ext.declarative import declarative_basefrom sqlalchemy.orm import 
> sessionmaker, relationship, backref
> Base = declarative_base()
>
> def make_timeseries(name):
>     class TimeSeries(Base):
>
>         __tablename__ = name
>         table_name = Column(String(50), ForeignKey('locations.table_name'))
>         datetime = Column(DateTime, primary_key=True)
>         value = Column(Float)
>
>         location = relationship('Location', backref=backref('timeseries',
>                                 lazy='dynamic'))
>
>         def __init__(self, table_name, datetime, value):
>             self.table_name = table_name
>             self.datetime = datetime
>             self.value = value
>
>         def __repr__(self):
>             return "{}: {}".format(self.datetime, self.value)
>
>     return TimeSeries
>
> class Location(Base):
>
>     __tablename__ = 'locations'
>     id = Column(Integer, primary_key=True)
>     table_name = Column(String(50), unique=True)
>     lon = Column(Float)
>     lat = Column(Float)
> if __name__ == '__main__':
>     connection_string = 'postgresql://user:pw@localhost/location_test'
>     engine = create_engine(connection_string)
>     metadata = MetaData(bind=engine)
>     Session = sessionmaker(bind=engine)
>     session = Session()
>
>     TS1 = make_timeseries('ts1')
>     # TS2 = make_timeseries('ts2')   # this breaks because of the foreign key
>     Base.metadata.create_all(engine)
>     session.add(TS1("ts1", "2001-01-01", 999))
>     session.add(TS1("ts1", "2001-01-02", -555))
>
>     qs = session.query(Location).first()
>     print qs.timeseries.all()
>
>
> This approach has some problems, most notably that if I create more than 
> one TimeSeries object the foreign key doesn't work. Previously I've used 
> some work-arounds (such as not defining a foreign key), but it all seems 
> like a big hack and I feel that there must be a better way of doing this. 
> How should I organise and access my data?
>
> Any advice appreciated, 
>
> Peter
>

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

[sqlalchemy] Re: Mapping lots of similar tables / database design

Reply via email to