subject:"\[sqlalchemy\] Polymorphic Query All regardless of subclass"

Re: [sqlalchemy] Polymorphic Query All regardless of subclass

2019-02-28 Thread Mike Bayer

On Thu, Feb 28, 2019 at 1:34 PM Andrew Martin  wrote:
>
> Hi All,
>
> Trying something out here for an analytics app here, and I'm not sure any of 
> this is a good idea. I'm trying to create a data model that's flexible enough 
> to handle a lot of different types of analysis. From plain CSVs to  time 
> series to survey questions with complex hierarchies. My approach is to have a 
> pretty plain model and query as needed to construct dataframes that will end 
> up in pandas and sci-kit learn.
>
> The basic idea is that there's a dataset that has variables associated with 
> it. Variables have values associated. I want to retain type information on 
> the values. If I'm reading a file and loading it, I want to store an int as 
> an int or a datetime as a datetime and not stuff everything into a string and 
> have to guess what things are based on variable metadata. That's where I'm 
> kind of stuck. I think I have the multi-table inheritance set up correctly to 
> store different types of value. I'm totally unsure how to query without 
> having to know the subclass. I'd like to be able to extract values for pandas 
> processing with something like this:
>
> vals = Values.value.filter_by(variable_id=123456).all()
>
> without having to know per variable which subclass to choose. I.e., I don't 
> want to have to do
> vals = IntValue.value.filter_by(variable_id=123456).all()
>

so "Values.value.filter_by(variable_id=123456).all()" is not
meaningful to me because I don't know what "Values.value" is, is that
a dynamic relationship hence you're getting a Query ?   What SQL do
you see this rendering?


>
> The second part of my question is only relevant if this whole design isn't 
> hosed, which I'm fine if you tell me that it is and I need to go back to the 
> drawing board.
>
> The second part is how to extract an entire dataset by ID in a way that would 
> be comfortable in pandas. If I were doing this in raw SQL, I'd use a CTE to 
> get the variables by dataset id and use a pivot to create the 
> data-frame-shaped table. I'm confounded by how to do this in alchemy, 
> especially with the polymorphic setup.

can you show me that ?


FTR we usually reduce the JOINs by putting all the "value" columns in
one table but otherwise similar approach, see:
https://docs.sqlalchemy.org/en/latest/_modules/examples/vertical/dictlike-polymorphic.html



>
>
> Here are my current model definitions:
>
>
> from sqlalchemy import Column, Index, Integer, Text, DateTime, Float, 
> ForeignKey
> from sqlalchemy.orm import relationship
>
> from .meta import Base
>
>
> class DataSet(Base):
> __tablename__ = 'datasets'
> id = Column(Integer, primary_key=True)
> name = Column(Text, nullable=False)
> description = Column(Text, nullable=True)
>
>
>
> class Variable(Base):
> __tablename__ = 'variables'
> id = Column(Integer, primary_key=True)
> dataset_id = Column(Integer, ForeignKey('datasets.id'), Nullable=False)
> name = Column(Text, nullable=False)
> description = Column(Text, nullable=True)
> group_var_col_id = Column(Integer, ForeignKey('variables.id'), 
> nullable=True)
> group_var_row_id = Column(Integer, ForeignKey('variables.id'), 
> nullable=True)
>
> value_map = Column(Text, nullable=True) #change to JSONB when move from 
> SQLite to Postgres
>
> dataset = relationship('DataSet', backref='variables')
>
>
> class Value(Base):
> __tablename__ = 'values'
> id = Column(Integer, primary_key=True)
> variable_id = Column(Integer, ForeignKey('variables.id'), Nullable=False)
> observation_id = Column(Text, nullable=False)
> value_type = Column(Text, nullable=False)
>
> variable = relationship('Variable', backref='values')
>
> __mapper_args__ = {'polymorphic_on': value_type}
>
>
> class IntValue(Value):
> __tablename__ = 'int_values'
> id = Column(None, ForeignKey('values.id'), primary_key=True)
> value = Column(Integer, nullable=True)
>
> __mapper_args__ = {'polymorphic_identity': 'int'}
>
>
> class StringValue(Value):
> __tablename__ = 'string_values'
> id = Column(None, ForeignKey('values.id'), primary_key=True)
> value = Column(Text, nullable=True)
>
> __mapper_args__ = {'polymorphic_identity': 'string'}
>
>
> class FloatValue(Value):
> __tablename__ = 'float_values'
> id = Column(None, ForeignKey('values.id'), primary_key=True)
> value = Column(Float, nullable=True)
>
> __mapper_args__ = {'polymorphic_identity': 'float'}
>
>
> class DateTimeValue(Value):
> __tablename__ = 'datetime_values'
> id = Column(None, ForeignKey('values.id'), primary_key=True)
> value = Column(DateTime, nullable=True)
>
> __mapper_args__ = {'polymorphic_identity': 'datetime'}
>
> Thanks in advance for any suggestions you might have!
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and 
>

[sqlalchemy] Polymorphic Query All regardless of subclass

2019-02-28 Thread Andrew Martin

Hi All,

Trying something out here for an analytics app here, and I'm not sure any 
of this is a good idea. I'm trying to create a data model that's flexible 
enough to handle a lot of different types of analysis. From plain CSVs to  
time series to survey questions with complex hierarchies. My approach is to 
have a pretty plain model and query as needed to construct dataframes that 
will end up in pandas and sci-kit learn.

The basic idea is that there's a dataset that has variables associated with 
it. Variables have values associated. I want to retain type information on 
the values. If I'm reading a file and loading it, I want to store an int as 
an int or a datetime as a datetime and not stuff everything into a string 
and have to guess what things are based on variable metadata. That's where 
I'm kind of stuck. I think I have the multi-table inheritance set up 
correctly to store different types of value. I'm totally unsure how to 
query without having to know the subclass. I'd like to be able to extract 
values for pandas processing with something like this:

vals = Values.value.filter_by(variable_id=123456).all()

without having to know per variable which subclass to choose. I.e., I don't 
want to have to do
vals = IntValue.value.filter_by(variable_id=123456).all()


The second part of my question is only relevant if this whole design isn't 
hosed, which I'm fine if you tell me that it is and I need to go back to 
the drawing board.

The second part is how to extract an entire dataset by ID in a way that 
would be comfortable in pandas. If I were doing this in raw SQL, I'd use a 
CTE to get the variables by dataset id and use a pivot to create the 
data-frame-shaped table. I'm confounded by how to do this in alchemy, 
especially with the polymorphic setup.


Here are my current model definitions:


from sqlalchemy import Column, Index, Integer, Text, DateTime, Float, 
ForeignKey
from sqlalchemy.orm import relationship

from .meta import Base


class DataSet(Base):
__tablename__ = 'datasets'
id = Column(Integer, primary_key=True)
name = Column(Text, nullable=False)
description = Column(Text, nullable=True)



class Variable(Base):
__tablename__ = 'variables'
id = Column(Integer, primary_key=True)
dataset_id = Column(Integer, ForeignKey('datasets.id'), Nullable=False)
name = Column(Text, nullable=False)
description = Column(Text, nullable=True)
group_var_col_id = Column(Integer, ForeignKey('variables.id'), nullable=
True)
group_var_row_id = Column(Integer, ForeignKey('variables.id'), nullable=
True)

value_map = Column(Text, nullable=True) #change to JSONB when move from 
SQLite to Postgres

dataset = relationship('DataSet', backref='variables')


class Value(Base):
__tablename__ = 'values'
id = Column(Integer, primary_key=True)
variable_id = Column(Integer, ForeignKey('variables.id'), Nullable=False
)
observation_id = Column(Text, nullable=False)
value_type = Column(Text, nullable=False)

variable = relationship('Variable', backref='values')

__mapper_args__ = {'polymorphic_on': value_type}


class IntValue(Value):
__tablename__ = 'int_values'
id = Column(None, ForeignKey('values.id'), primary_key=True)
value = Column(Integer, nullable=True)

__mapper_args__ = {'polymorphic_identity': 'int'}


class StringValue(Value):
__tablename__ = 'string_values'
id = Column(None, ForeignKey('values.id'), primary_key=True)
value = Column(Text, nullable=True)

__mapper_args__ = {'polymorphic_identity': 'string'}


class FloatValue(Value):
__tablename__ = 'float_values'
id = Column(None, ForeignKey('values.id'), primary_key=True)
value = Column(Float, nullable=True)

__mapper_args__ = {'polymorphic_identity': 'float'}


class DateTimeValue(Value):
__tablename__ = 'datetime_values'
id = Column(None, ForeignKey('values.id'), primary_key=True)
value = Column(DateTime, nullable=True)

__mapper_args__ = {'polymorphic_identity': 'datetime'}

Thanks in advance for any suggestions you might have!

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Polymorphic Query All regardless of subclass

[sqlalchemy] Polymorphic Query All regardless of subclass

2 matches

Site Navigation

Mail list logo

Footer information