Re: [sqlalchemy] Soft Delete Pattern

2023-07-03 Thread Andrew Martin
This is awesome!

Thank you, Mike!

-andrew

On Monday, July 3, 2023 at 11:05:28 PM UTC-5 Mike Bayer wrote:

> this is a major area of functionality that begins with the 
> "with_loader_criteria" feature: 
> https://docs.sqlalchemy.org/en/20/orm/queryguide/api.html#sqlalchemy.orm.with_loader_criteria
>
> integration to make the criteria automatic follows at 
> https://docs.sqlalchemy.org/en/20/orm/session_events.html#adding-global-where-on-criteria
>  
>
> there's an example of this here: 
> https://docs.sqlalchemy.org/en/20/_modules/examples/extending_query/filter_public.html
>
>
> On Mon, Jul 3, 2023, at 11:13 PM, Andrew Martin wrote:
>
> Hello,
>
> I have a base class I tend to use that includes a Boolean is_deleted field 
> so that pretty much every object has that available.
>
> Is there a good pattern for setting a filter on these objects that 
> automatically adds a WHERE is_deleted = 'false'?
>
> Or does that just have to be added as a filter on every .query(...) 
> statement?
>
>
> Thanks,
>
> andrew
>
>
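For readers landing here later, the combination Mike links to (`with_loader_criteria` applied automatically from a `do_orm_execute` session event) can be sketched roughly like this; the mixin and the criteria are illustrative, not code from the thread:

```python
from sqlalchemy import Boolean, Column, create_engine, event
from sqlalchemy.orm import Session, declarative_base, with_loader_criteria

Base = declarative_base()


class SoftDeleteMixin:
    # the Boolean flag from the base class described in the question
    is_deleted = Column(Boolean, default=False, nullable=False)


@event.listens_for(Session, "do_orm_execute")
def _filter_soft_deleted(execute_state):
    # Apply the criteria to every ORM SELECT; skip internal column /
    # relationship loads so lazy loading keeps working normally.
    if (
        execute_state.is_select
        and not execute_state.is_column_load
        and not execute_state.is_relationship_load
    ):
        execute_state.statement = execute_state.statement.options(
            with_loader_criteria(
                SoftDeleteMixin,
                lambda cls: cls.is_deleted == False,  # noqa: E712
                include_aliases=True,
            )
        )
```

Every model inheriting the mixin then gets `WHERE is_deleted = false` added automatically; the linked filter_public example also shows how to let individual queries opt out via an execution option.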

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/70e5a166-efaa-430f-a2fc-85852ccd59ccn%40googlegroups.com.


[sqlalchemy] Soft Delete Pattern

2023-07-03 Thread Andrew Martin
Hello,

I have a base class I tend to use that includes a Boolean is_deleted field 
so that pretty much every object has that available.

Is there a good pattern for setting a filter on these objects that 
automatically adds a WHERE is_deleted = 'false'?

Or does that just have to be added as a filter on every .query(...) 
statement?


Thanks,

andrew



Re: [sqlalchemy] Session management for general functions within a class

2022-04-29 Thread Andrew Martin
Hi Simon, thank you for your thoughts. Sorry about the incomplete code. 
This project has gotten out of control, and I'm tired and a little burned 
out and have a launch deadline for Monday, and I was hoping this would be 
enough to uncover some basic stupidity in my approach. As long as I don't 
care about code being duplicated in lots of places, it all works. But 
refactoring to clean things up is uncovering some fundamental lack of 
understanding about how this stuff works.

The session is about as basic as it gets from a utility function.

from sqlalchemy import create_engine
from sqlalchemy.orm import Session


def get_db_session() -> Session:
    engine = create_engine(
        f"postgresql://{settings.PG_USER}:{settings.PG_PASS}@{settings.PG_DSN}:{settings.PG_PORT}/{settings.DATABLENDER_DB}"  # noqa: E501
    )
    session = Session(engine)
    return session
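As an aside, the usual shape for a utility like this is to build the engine once at module import and hand sessions out from a sessionmaker, rather than creating a new engine (and connection pool) on every call. A sketch, with sqlite standing in for the postgresql URL above and `SessionLocal` being an invented name:

```python
from sqlalchemy import create_engine
from sqlalchemy.orm import Session, sessionmaker

# Build the engine (and its connection pool) once, at import time.
engine = create_engine("sqlite://")  # stand-in for the postgresql:// URL built from settings

SessionLocal = sessionmaker(bind=engine)


def get_db_session() -> Session:
    # Each call returns a fresh Session sharing the one pooled engine.
    return SessionLocal()
```
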

So in my Mixin class there's just 

from app.utils import get_db_session

and self.db_session = get_db_session()

Everything else is working from that. I'll get a complete working example 
up tonight or tomorrow. It's gotten complex because some of the logic is 
distributed and in Airflow DAGs and tasks and stuff. It's not super easy to 
pull out a class and some models and have something to demonstrate. And I 
sure as hell didn't want to just dump the whole thing on people here and be 
like, "Hey can you fix this for me?" lol!

On Friday, April 29, 2022 at 5:51:26 AM UTC-5 Simon King wrote:

> It's difficult to debug this without a script that we can run to reproduce 
> the problem. What kind of object is self.db_session? You use it as a 
> context manager without calling it, so I don't think it can be a 
> sessionmaker or a session.
>
> You're nesting calls to the context manager:
>
> # in load_new_data
> with self.db_session as outersession:
>     # add new_obj to outersession
>     # call move_to_error
>     with self.db_session as innersession:
>         # add new_obj to innersession
>
> Are innersession and outersession supposed to be the same object? If they 
> are different sessions, you're trying to add new_obj to both of them, which 
> is going to be a problem.
>
> If it were me, I would explicitly pass the session to the move_to_error 
> method. If you don't like that, you can also use 
> sqlalchemy.orm.object_session to get the session that new_obj already 
> belongs to.
>
> Hope that helps,
>
> Simon
>
> On Fri, Apr 29, 2022 at 5:10 AM Andrew Martin  wrote:
>
>> Hi all, I'm struggling a bit with best practices for my ETL application.
>>
>> Each part of the ETL app is completely separate from the others, but I 
>> have a MixIn for some common functions that each of them need to do, like 
>> move this record to error if there's a data integrity problem. Or move this 
>> record to manual review if there's insufficient data to move it along to 
>> the next stage of the ETL.
>>
>> The problem I'm having is that I don't understand the correct way to pass 
>> an object to a function, update it, and eventually commit it.
>>
>> I have for example:
>>
>> class DataMoverMixin:
>>     def __init__(self) -> None:
>>         self.db_session = get_db_session()
>>
>>     def move_to_error(self, obj: Any, error_stage: str, traceback: Exception) -> bool:
>>         logger.info("Moving object to error.")
>>         json_data = json.dumps(obj, cls=AlchemyEncoder)
>>         e = Error(
>>             id=obj.id,
>>             error_stage=error_stage,
>>             error_message=repr(traceback),
>>             error_data=json_data,
>>         )
>>         obj.status = "error"
>>         with self.db_session as session:
>>             session.add(e)
>>             session.add(obj)
>>             session.commit()
>>         logger.info("Successfully moved object to error.")
>>         return True
>>
>> class IngestDataManager(DataMoverMixin):
>>     def __init__(self):
>>         super().__init__()
>>
>>     def load_new_data(self, accounts: List[Dict]) -> bool:
>>         for acc in accounts:
>>             new_obj = NewObj(**acc)
>>             with self.db_session as session:
>>                 session.add(new_obj)
>>                 session.commit()
>>                 # now the raw data is loaded, I need to check if it
>>                 # conforms and do some stuff with the newly created id.
>>                 session.refresh(new_obj)
>>                 if not new_obj.important_stuff:
>>                     self.move_to_error(new_obj, "ingest_integrity_error", f"missing {important stuff} for account_id: {new
[sqlalchemy] Session management for general functions within a class

2022-04-28 Thread Andrew Martin
Hi all, I'm struggling a bit with best practices for my ETL application.

Each part of the ETL app is completely separate from the others, but I have 
a MixIn for some common functions that each of them need to do, like move 
this record to error if there's a data integrity problem. Or move this 
record to manual review if there's insufficient data to move it along to 
the next stage of the ETL.

The problem I'm having is that I don't understand the correct way to pass 
an object to a function, update it, and eventually commit it.

I have for example:

class DataMoverMixin:
    def __init__(self) -> None:
        self.db_session = get_db_session()

    def move_to_error(self, obj: Any, error_stage: str, traceback: Exception) -> bool:
        logger.info("Moving object to error.")
        json_data = json.dumps(obj, cls=AlchemyEncoder)
        e = Error(
            id=obj.id,
            error_stage=error_stage,
            error_message=repr(traceback),
            error_data=json_data,
        )
        obj.status = "error"
        with self.db_session as session:
            session.add(e)
            session.add(obj)
            session.commit()
        logger.info("Successfully moved object to error.")
        return True


class IngestDataManager(DataMoverMixin):
    def __init__(self):
        super().__init__()

    def load_new_data(self, accounts: List[Dict]) -> bool:
        for acc in accounts:
            new_obj = NewObj(**acc)
            with self.db_session as session:
                session.add(new_obj)
                session.commit()
                # now the raw data is loaded, I need to check if it
                # conforms and do some stuff with the newly created id.
                session.refresh(new_obj)
                if not new_obj.important_stuff:
                    self.move_to_error(
                        new_obj,
                        "ingest_integrity_error",
                        f"missing important_stuff for account_id: {new_obj.id}",
                    )


This is the simplest example of what does and doesn't work. I can tell from 
the errors that I must be doing something very anti-pattern, but I can't 
quite figure out what.

This pattern gives me a DetachedInstanceError.

So if I change Mixin.move_to_error like so:

. . .
with self.db_session as session:
    session.refresh(obj)
    obj.status = "error"
    session.add(e)
    session.add(obj)
    session.commit()
. . .

I get no error. But also the changes to the obj are not actually committed 
to the DB.
The new record for error is committed.
 
My expectation was that by attaching the session to the class, any method on 
the class would reference the same session, and that using the context 
manager was just good practice for opening and closing it. But that doesn't 
seem to be the case.

I might certainly be wrong, but it appears that when you pass an SQLAlchemy 
object to a function inside of a session context manager, it does not carry 
the session with it?

And also reopening what I think is the session in a context manager fixes 
that but also then doesn't allow me to update the object?

I guess I'm just kinda confused, and I'm sure there's a better way to do 
this.

I've searched around a lot to try and understand this problem, but for 
whatever reason, nothing has clicked for me about what I'm doing wrong.

Appreciate any help from people.

-andrew
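A sketch of the explicit-session approach Simon suggested in his reply: create one Session per unit of work and pass it into the helper, so the object stays attached and there are no nested context managers. The `Account` model and the simplified `move_to_error` below are hypothetical stand-ins for the code above:

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Account(Base):
    __tablename__ = "account"
    id = Column(Integer, primary_key=True)
    status = Column(String, default="new")


class DataMoverMixin:
    def move_to_error(self, session: Session, obj, error_stage: str) -> bool:
        # Reuse the caller's session: obj stays attached, no nested
        # context managers, and the commit stays with the caller.
        obj.status = "error"
        session.add(obj)
        return True


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    acc = Account()
    session.add(acc)
    session.flush()  # assigns acc.id without ending the transaction
    DataMoverMixin().move_to_error(session, acc, "ingest_integrity_error")
    session.commit()  # one commit for the whole unit of work
```

Because only one session ever owns the object, there is no DetachedInstanceError and no refresh that silently discards pending changes.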




[sqlalchemy] Idiomatic way to track state during ETL processing

2022-02-28 Thread Andrew Martin
I haven't found any topics here that address this, so it may mean that the 
answer is so simple that I'm just overthinking here.

Context: I'm the lone developer on a new tech team at a company that's 
never had any team in place before. Everything is greenfield. Which is 
great because I get to do everything from scratch with no legacy stuff to 
account for. But it's also kind of a lot because I *have* to do everything 
from scratch.

The ETL pipeline I've prototyped takes a bunch of garbage data from 
SalesForce and tries to make it not-garbage and put it back into 
SalesForce. We start by normalizing addresses through shipping APIs, then 
use the normalized addresses to go to various different data broker APIs 
(FirstAmerican, Experian, Whitepages, etc.) to get owner information, 
followed by contact information.

I'm using apache airflow as a central automation manager and have it set up 
so that DAGs don't do anything more than call methods on a set of objects 
inside an "app".

Each API that we deal with is its own object and has methods for dealing 
with what's there based on current state. I load all the data that needs to 
be processed into postgres so we aren't constantly bumping into the 
SalesForce API, and I track state inside postgres with a state table.

Problem: what's the idiomatic way to manage this? I have something that 
works, but because this is all new, and our team will be growing soon, I 
want to lay down a solid foundation for when we move beyond proof of 
concept.

Model examples:

After the raw SalesForce data is loaded into a dumb table with no 
restrictions, we load the kickoff table that looks like this:

class AccountIngestedData(Base):
    __tablename__ = "account_ingested_data"
    __table_args__ = {"schema": "datablender"}

    account_ingested_data_id = Column(
        BigInteger,
        primary_key=True,
        server_default=text("nextval('account_ingested_data_id_seq'::regclass)"),
    )
    salesforce_id = Column(ForeignKey(AccountRawData.id), nullable=False)
    account_data_source_id = Column(
        ForeignKey(AccountDataSource.account_data_source_id), nullable=False
    )
    account_enrichment_source_field_mapper_id = Column(
        ForeignKey(
            AccountEnrichmentSourceFieldMapper.account_enrichment_source_field_mapper_id
        ),
        nullable=False,
    )
    account_state_id = Column(
        ForeignKey(AccountState.account_state_id), nullable=False
    )
    account_experian_file_id = Column(
        ForeignKey(AccountExperianFile.account_experian_file_id), nullable=True
    )

    account_name = Column(Text, nullable=False)
    account_address1 = Column(Text, nullable=False)
    account_address2 = Column(Text, nullable=False)
    account_city = Column(Text, nullable=False)
    account_state = Column(Text, nullable=False)
    account_zip = Column(Text, nullable=False)

    account_data_source = relationship("AccountDataSource")
    account_enrichment_source_field_mapper = relationship(
        "AccountEnrichmentSourceFieldMapper"
    )
    account_experian_file = relationship("AccountExperianFile")
    account_state = relationship("AccountState")
    salesforce = relationship("AccountRawData")

And we have a state table:

class AccountState(Base):
    __tablename__ = "account_state"
    __table_args__ = {"schema": "datablender"}

    account_state_id = Column(
        BigInteger,
        primary_key=True,
        server_default=text("nextval('account_state_id_seq'::regclass)"),
    )
    account_state_description = Column(Text, nullable=False)

Example values for AccountState are like

10, "ingested"
11, "address valid"
12, "address invalid"
13, "address error"
14, "address manual review"
15, "owner info found"
16, "owner info ambiguous"
17, "owner info error"
18, "owner info manual review"

Of course this list goes on and on as we add more and more integrations. At 
each stage of the automated pipeline, there's a query to get the records 
that correspond to each state.


My question is this: what's the cleanest way to assign the Foreign Key IDs 
to the accounts table based on what's happening?

I feel like there must be a really obvious solution to this that doesn't 
involve a global python dictionary that has to be updated every time a new 
status is added, and also doesn't have to bang another query on the status 
table for every record that gets loaded. But my brain can't figure out how 
to make this happen.

Really curious if anyone has thoughts about this pattern. It's obviously 
common, I just haven't seen the SQLA implementation before.
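On the "no global dict, no per-record query" point: one hedged middle ground is to query the state table once at startup and cache the description-to-id mapping, so adding a new status is just a new row, not a code change. A sketch using the models above (the schema argument is dropped so the sketch runs anywhere):

```python
from sqlalchemy import BigInteger, Column, Text
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class AccountState(Base):
    __tablename__ = "account_state"

    account_state_id = Column(BigInteger, primary_key=True)
    account_state_description = Column(Text, nullable=False)


def load_state_map(session: Session) -> dict:
    # One query at startup; every later lookup is a dict hit, not a round trip.
    return {
        s.account_state_description: s.account_state_id
        for s in session.query(AccountState)
    }
```

`record.account_state_id = state_map["address valid"]` then replaces both the hand-maintained global dict and the per-record status query; the cache only needs rebuilding when a new state row is inserted.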



[sqlalchemy] Comparing an SQLAlchemy object to a similar object

2021-04-10 Thread Andrew Martin
Hi everyone,

This might be a totally premature optimization, but I want to at least 
start off in the right direction.

I'm pulling messages from a slack channel into an Airflow DAG and writing 
events to a table if certain attributes of the slack message have changed. 
Slack message structures are really variable depending on how a message was 
posted (i.e., you just type something, you share a message from another 
channel, it has a file attachment, or you share it and add a message, etc.; 
the dictionary you get is totally different and unreliable). So I have a 
SlackMessage class that susses all of this out and creates an object with 
standardized attributes that I can rely on. It looks like this:

class SlackMessage:
    def __init__(self, message):
        # print(message + "\n")
        self.reactions = []
        self.ts = ""
        self.uuid = ""
        self.datetime = ""
        self.teams = []
        self.codename = ""
        self.needs_ticket = ""
        self.triage_status = ""
        self.resolution_type = ""
        self.message_type = ""
        self.message_text = ""
        self.file_urls = []
        self.message_url = ""
        self.local_url = ""
        self.jira_title = ""
        self.datetime_resolved = ""

        self.parse_message(message)

The init creates the attributes so I don't have to check for them to exist, 
and the parse_message method populates anything that can be at the time.

I'm pushing these objects to a triage_message table in which I want to add 
new records when a message's reactions, triage_status, ticket_status, or 
teams values have changed. This is sort of intentionally not upsert. I want 
to preserve event data so that I can do analytics, like how long from the 
time the message was created until it obtained the "complete" reaction.

My SQLAlchemy model is different from the slack_message class and has some 
additional fields like a generated bigint ID, a foreign key to my data 
warehouse date dimension, and some automated date fields for creation and 
update time (although in this design, there really shouldn't be an updated 
date.)

The model looks like this:

class TriageMessage(Base):
    __tablename__ = "triage_message"

    uuid = Column(Text, nullable=False)
    dim_date_id = Column(
        ForeignKey("dim_date.id", ondelete="CASCADE", onupdate="CASCADE"),
        nullable=False,
    )
    ts = Column(Integer, nullable=False)
    datetime = Column(Date, nullable=False)
    datetime_resolved = Column(Date)
    codename = Column(Text, nullable=False)
    jira_title = Column(Text, nullable=False)
    local_url = Column(Text, nullable=False)
    message_type = Column(Text, nullable=False)
    message_url = Column(Text, nullable=False)
    needs_ticket = Column(Boolean, nullable=False)
    reactions = Column(ARRAY(Text()), nullable=False)
    teams = Column(ARRAY(Text()), nullable=False)
    triage_status = Column(Text, nullable=False)
    file_urls = Column(ARRAY(Text()))
    resolution_type = Column(Text)

I'm trying to figure out what the best way is to decide if a new record 
should be created. These aren't the same class, so it doesn't immediately 
make sense to me to create __eq__ and __ne__ method overrides on these.

There are two different approaches that do seem reasonable to me.

1. a comparison function

def slack_msg_eq_sqla_msg(slack_msg, sqla_msg):


2. convert the slack message object to the sqla object and test for 
equality.

Does one of these approaches make more sense than the other to the group 
here?
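For what it's worth, approach 1 can stay small if it compares only the attributes that define "changed". The field names below follow the post; treat the exact list as an assumption:

```python
# Attributes whose change should produce a new event row.
WATCHED_FIELDS = ("reactions", "triage_status", "needs_ticket", "teams")


def slack_msg_eq_sqla_msg(slack_msg, sqla_msg) -> bool:
    # True when no watched attribute differs, i.e. no new record is needed.
    return all(
        getattr(slack_msg, f) == getattr(sqla_msg, f) for f in WATCHED_FIELDS
    )
```

Because it only reads attributes by name, the two objects can stay different classes; no `__eq__` override or conversion step is required.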

Also bonus question! If the uuid of the message doesn't exist in the 
database at all, I can skip all of this and just write the new event. If 
I'm pulling 20k messages at a time from the slack channel and postgres, is 
it worth it to create a dict with uuid as the key so that I'm searching a 
hash table instead of doing a list comprehension to gather new items based 
on the uuid value?
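On the bonus question: yes — at 20k messages, set/dict membership (O(1) per lookup) beats rescanning a list (O(n) per lookup). A minimal sketch of the partitioning step, with invented names:

```python
def partition_new(messages: list, existing_uuids) -> list:
    # Build the set once; each membership check is then O(1) on average,
    # versus O(n) for `uuid in some_list` inside a comprehension.
    existing = set(existing_uuids)
    return [m for m in messages if m["uuid"] not in existing]
```
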



Re: [sqlalchemy] Generic File Loader issue with Booleans

2020-10-28 Thread Andrew Martin
I will reproduce this when I get done with work this evening and give the 
specifics.

On Wednesday, October 28, 2020 at 7:46:26 AM UTC-5, Mike Bayer wrote:
>
>
>
> On Tue, Oct 27, 2020, at 11:56 PM, Andrew Martin wrote:
>
> This is probably a weirdly specific question, but I have a workflow that 
> involves loading lots of CSVs into Postgres. Some of them are very large, 
> so I want to cut overhead and not use CSV Dictreader. Wanted to use named 
> tuples instead.
>
> Here's a header and a line from a file:
>
> date_dim_id,date_actual,epoch,day_suffix,day_name,day_of_week,day_of_month
> ,day_of_quarter,day_of_year,week_of_month,week_of_year,week_of_year_iso,
> month_actual,month_name,month_name_abbreviated,quarter_actual,quarter_name
> ,year_actual,first_day_of_week,last_day_of_week,first_day_of_month,
> last_day_of_month,first_day_of_quarter,last_day_of_quarter,
> first_day_of_year,last_day_of_year,mm,mmdd,weekend_indr
>
>
> Here's a line:
>
> 20150101,1/1/15,1420070400,1st,Thursday ,4,1,1,1,1,1,2015-W01-4,1,January 
>  
> ,Jan,1,First,2015,12/29/14,1/4/15,1/1/15,1/31/15,1/1/15,3/31/15,1/1/15,12/31/15,12015,1012015,0
>
>
> The idea was a generic file loader that takes a filename and a class and 
> then loads it. Here's the class:
>
>
> class DimDate(Base):
>     __tablename__ = "dim_date"
> 
>     date_dim_id = Column(Integer, primary_key=True)
>     date_actual = Column(Date, nullable=False, index=True)
>     epoch = Column(BigInteger, nullable=False)
>     day_suffix = Column(String(4), nullable=False)
>     day_name = Column(String(9), nullable=False)
>     day_of_week = Column(Integer, nullable=False)
>     day_of_month = Column(Integer, nullable=False)
>     day_of_quarter = Column(Integer, nullable=False)
>     day_of_year = Column(Integer, nullable=False)
>     week_of_month = Column(Integer, nullable=False)
>     week_of_year = Column(Integer, nullable=False)
>     week_of_year_iso = Column(CHAR(10), nullable=False)
>     month_actual = Column(Integer, nullable=False)
>     month_name = Column(String(9), nullable=False)
>     month_name_abbreviated = Column(CHAR(3), nullable=False)
>     quarter_actual = Column(Integer, nullable=False)
>     quarter_name = Column(String(9), nullable=False)
>     year_actual = Column(Integer, nullable=False)
>     first_day_of_week = Column(Date, nullable=False)
>     last_day_of_week = Column(Date, nullable=False)
>     first_day_of_month = Column(Date, nullable=False)
>     last_day_of_month = Column(Date, nullable=False)
>     first_day_of_quarter = Column(Date, nullable=False)
>     last_day_of_quarter = Column(Date, nullable=False)
>     first_day_of_year = Column(Date, nullable=False)
>     last_day_of_year = Column(Date, nullable=False)
>     mm = Column(CHAR(6), nullable=False)
>     mmdd = Column(CHAR(10), nullable=False)
>     weekend_indr = Column(Boolean, nullable=False)
>
>
> What I ran into is a couple of things. 1. Python True/False throws errors 
> when writing to a Boolean. 2. You seem to have to int() a 1 or 0 to avoid a 
> dbapi error. 
>
> Is this expected?
>
>
> no, it's not expected, boolean True/False works on all backends without 
> issue, assuming the target column is using the Boolean datatype at the 
> SQLAlchemy level as well as that the actual column in the database is of 
> the appropriate type, for PostgreSQL this would be BOOLEAN.   when you say 
> "throws errors" please share complete stack traces and complete error 
> messages, thanks.
>
> can't do much more here without more specifics.  
>
>
>
>
>
>
> And does it have more to do with Postgres than it does SQLA?
>
> Here's the code I came up with for generic file loader that actually 
> works. I'm sure it's garbage, and I'd like to refactor it, but it works.
>
> def load_file(file_name, cls):
>     """assumes that csv file has headers and that headers match the
>     names of the columns for a given sqla class. Also assumes that the
>     sqla class is inheriting from meta.Base"""
>     inst = inspect(cls)
>     # attr_names = sorted([c_attr.key for c_attr in inst.mapper.column_attrs])
>     attr_vals = sorted([(c.name, c.type) for c in inst.c])
> 
>     s = get_sqla_session()
> 
>     with open(file_name, newline="") as infile:
>         reader = csv.reader(infile)
>         Data = namedtuple("Data", next(reader))
>         tuple_fields = Data._fields
>         for data in map(Data._make, [(c.strip() for c in row) for row in reader]):
>             new_obj = cls()
>             for attr in attr_vals:
>                 for field in tuple_fie

[sqlalchemy] Generic File Loader issue with Booleans

2020-10-27 Thread Andrew Martin
This is probably a weirdly specific question, but I have a workflow that 
involves loading lots of CSVs into Postgres. Some of them are very large, 
so I want to cut overhead and not use CSV Dictreader. Wanted to use named 
tuples instead.

Here's a header and a line from a file:

date_dim_id,date_actual,epoch,day_suffix,day_name,day_of_week,day_of_month,
day_of_quarter,day_of_year,week_of_month,week_of_year,week_of_year_iso,
month_actual,month_name,month_name_abbreviated,quarter_actual,quarter_name,
year_actual,first_day_of_week,last_day_of_week,first_day_of_month,
last_day_of_month,first_day_of_quarter,last_day_of_quarter,first_day_of_year
,last_day_of_year,mm,mmdd,weekend_indr


Here's a line:

20150101,1/1/15,1420070400,1st,Thursday ,4,1,1,1,1,1,2015-W01-4,1,January  ,Jan,1,First,2015,12/29/14,1/4/15,1/1/15,1/31/15,1/1/15,3/31/15,1/1/15,12/31/15,12015,1012015,0


The idea was a generic file loader that takes a filename and a class and 
then loads it. Here's the class:


class DimDate(Base):
    __tablename__ = "dim_date"

    date_dim_id = Column(Integer, primary_key=True)
    date_actual = Column(Date, nullable=False, index=True)
    epoch = Column(BigInteger, nullable=False)
    day_suffix = Column(String(4), nullable=False)
    day_name = Column(String(9), nullable=False)
    day_of_week = Column(Integer, nullable=False)
    day_of_month = Column(Integer, nullable=False)
    day_of_quarter = Column(Integer, nullable=False)
    day_of_year = Column(Integer, nullable=False)
    week_of_month = Column(Integer, nullable=False)
    week_of_year = Column(Integer, nullable=False)
    week_of_year_iso = Column(CHAR(10), nullable=False)
    month_actual = Column(Integer, nullable=False)
    month_name = Column(String(9), nullable=False)
    month_name_abbreviated = Column(CHAR(3), nullable=False)
    quarter_actual = Column(Integer, nullable=False)
    quarter_name = Column(String(9), nullable=False)
    year_actual = Column(Integer, nullable=False)
    first_day_of_week = Column(Date, nullable=False)
    last_day_of_week = Column(Date, nullable=False)
    first_day_of_month = Column(Date, nullable=False)
    last_day_of_month = Column(Date, nullable=False)
    first_day_of_quarter = Column(Date, nullable=False)
    last_day_of_quarter = Column(Date, nullable=False)
    first_day_of_year = Column(Date, nullable=False)
    last_day_of_year = Column(Date, nullable=False)
    mm = Column(CHAR(6), nullable=False)
    mmdd = Column(CHAR(10), nullable=False)
    weekend_indr = Column(Boolean, nullable=False)


What I ran into is a couple of things. 1. Python True/False throws errors 
when writing to a Boolean. 2. You seem to have to int() a 1 or 0 to avoid a 
dbapi error. 

Is this expected? And does it have more to do with Postgres than it does 
SQLA?

Here's the code I came up with for generic file loader that actually works. 
I'm sure it's garbage, and I'd like to refactor it, but it works.

def load_file(file_name, cls):
    """assumes that csv file has headers and that headers match the
    names of the columns for a given sqla class. Also assumes that the
    sqla class is inheriting from meta.Base"""
    inst = inspect(cls)
    # attr_names = sorted([c_attr.key for c_attr in inst.mapper.column_attrs])
    attr_vals = sorted([(c.name, c.type) for c in inst.c])

    s = get_sqla_session()

    with open(file_name, newline="") as infile:
        reader = csv.reader(infile)
        Data = namedtuple("Data", next(reader))
        tuple_fields = Data._fields
        for data in map(Data._make, [(c.strip() for c in row) for row in reader]):
            new_obj = cls()
            for attr in attr_vals:
                for field in tuple_fields:
                    if attr[0] == field:
                        if str(attr[1]) == "BOOLEAN":
                            setattr(new_obj, field, int(getattr(data, field)))
                        else:
                            # print(data, attr[1])
                            setattr(new_obj, field, getattr(data, field))

            new_obj.id = IDService.create_snowflake_id()
            new_obj.is_deleted = 0

            s.add(new_obj)
            s.commit()
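One observation (not a diagnosis from the thread): `csv.reader` yields every field as a string, so `weekend_indr` arrives as `"0"`/`"1"`, not as a Python bool. Coercing at parse time avoids the `int()` workaround entirely:

```python
def csv_to_bool(value: str) -> bool:
    # csv.reader yields strings; map the usual truthy spellings to real booleans
    return value.strip().lower() in ("1", "true", "t", "yes")
```

A real `True`/`False` produced this way is exactly what a SQLAlchemy `Boolean` column expects on PostgreSQL.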


This isn't really a big deal, and maybe not even appropriate, but I had an 
absolute shit 12 hours trying to figure out why SQLA booleans are such a 
hassle.

If anyone has thoughts, I'm happy to hear them.

Thank you!


[sqlalchemy] Re: Regarding Data type consideration

2019-08-18 Thread Andrew Martin
Hi Mohan,

Could you give us some more information about the structure of your 
Dataframe and how you are using SqlAlchemy?

On Sunday, August 18, 2019 at 4:24:09 AM UTC-5, Mohan Raj wrote:
>
> Hi,
>
> I used "sqlalchemy" in my python code for inserting a Dataframe from 
> Pandas(0.25.0) to Microsoft SQL Server 2014.
>
> While inserting the data, all the columns are considered as varchar type, 
> instead there are integer values in the column. I need to consider the data 
> type based on second row of Dataframe instead of first row.
>
> Help me to resolve the above mentioned query.
>
> Regards
>
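If the root cause turns out to be pandas inferring object dtype (hence varchar) for the columns, `DataFrame.to_sql` accepts an explicit per-column type mapping. A hedged sketch — sqlite stands in for SQL Server 2014, and the column names are invented:

```python
import pandas as pd
from sqlalchemy import Integer, Text, create_engine

df = pd.DataFrame({"account_id": [1, 2], "name": ["a", "b"]})

engine = create_engine("sqlite://")  # stand-in for the SQL Server connection
df.to_sql(
    "accounts",
    engine,
    index=False,
    if_exists="replace",
    dtype={"account_id": Integer(), "name": Text()},  # force types per column
)
```

With `dtype=` supplied, the table is created with the given SQLAlchemy types instead of whatever pandas infers from the first rows.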



Re: [sqlalchemy] How to inspect filters

2019-06-27 Thread Andrew Martin
Oh that's really interesting. Thank you for that. I'll definitely tuck that
away in my back pocket. My background is really heavy in raw SQL, and
meta-programming raw SQL is *awful.* Debugging sql that writes sql and
execs it is not fun. I'm not allowed to use sqlalchemy at work because no
one else on the team uses python, and we can't go around implementing stuff
in a way that only one person knows how to work on. But I really want to
get away from the SQL-only approach in my personal/side projects. For some
reason I often find myself really blocked when it comes to sqlalchemy.
Every time I approach my databases I just flip to sql mode and totally
forget that everything in sqla is just plain python, and I can treat it
that way. I see the obvious-level mapping between the two and just kind of
hit a block beyond that. I should probably sit down and read the source
code to try and get past the block, so that I'm not so surprised by the answers
that Mike and people like you give me. Anyway, probably TMI. Cheers and
thanks!

On Thu, Jun 27, 2019 at 11:09 AM Jonathan Vanasco 
wrote:

>
>
> On Wednesday, June 26, 2019 at 2:43:44 PM UTC-4, Andrew Martin wrote:
>>
>> That's very interesting, Jonathan. Could you show me a quick example of
>> that approach? I'm not sure I *need* to do that, but I think I would learn
>> about SQLAlchemy from such an example and trying to understand it.
>>
>
> One large project has an 'internal api' that tries to centralize the
> sqlalchemy interface.
>
> Let's say we're searching for a "user".  I would create a dict for the
> data like this:
>
> query_metadata = {
>     'requester': 'site-admin',  # admin interface, user interface, user api, etc.
>     'filters': {'Username=': 'foo'},  # just an internal notation
>     'query_data': {},  # information related to what is in the query as it is built
> }
>
>
> This payload is basically the same stuff you'd pass as queryargs to one
> of your functions above.  We essentially pass it to our version of your
> CRUD service which then acts on it to generate the query.
>
> We don't implement this approach on every query - just on a handful of
> queries that have intense logic with dozens of if/else statements and that
> connect to multiple "filters".  Stashing this in an easily accessible
> manner has just been much easier than constantly trying to examine the
> query to act on it.
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to a topic in the
> Google Groups "sqlalchemy" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/sqlalchemy/chGVkNwmKyQ/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sqlalchemy/39a4a326-c384-4c91-909a-40d9c3acb323%40googlegroups.com
> <https://groups.google.com/d/msgid/sqlalchemy/39a4a326-c384-4c91-909a-40d9c3acb323%40googlegroups.com?utm_medium=email_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>
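[Editor's note: a minimal sketch of how such a metadata dict might be consumed.
The 'Username=' notation, the operator table, and the stub classes are assumptions
about the internal convention Jonathan describes, not his actual code.]

```python
# One function translates the internal 'Name=' / 'Name!' filter notation into
# criteria by calling comparison methods on model attributes. It works against
# real SQLAlchemy columns or, as here, any object implementing the operators.
OPS = {"=": "__eq__", "!": "__ne__"}

def build_criteria(model, query_metadata):
    criteria = []
    for key, value in query_metadata["filters"].items():
        name, op = key[:-1], key[-1]         # 'Username=' -> ('Username', '=')
        attr = getattr(model, name.lower())  # assumes a matching attribute
        criteria.append(getattr(attr, OPS[op])(value))
    return criteria

# stand-in "column" so the sketch is runnable without a database
class FakeColumn:
    def __init__(self, name):
        self.name = name
    def __eq__(self, other):
        return (self.name, "=", other)

class FakeUser:
    username = FakeColumn("username")

criteria = build_criteria(FakeUser, {"filters": {"Username=": "foo"}})
```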

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/CAOVGraLheEcKaoKuwZRUBRQEgGaQ5dG7UuvV2YuW5czhzFLvqg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] How to inspect filters

2019-06-26 Thread Andrew Martin
That's very interesting, Jonathan. Could you show me a quick example of
that approach? I'm not sure I *need* to do that, but I think I would learn
about SQLAlchemy from such an example and trying to understand it.

On Wed, Jun 26, 2019 at 11:00 AM Jonathan Vanasco 
wrote:

> FWIW, I found a better approach to a similar problem was to create a
> dict/object I used to log metadata about the query I wanted... then build
> the query or analyze it based on that metadata.  All the information is in
> the sqlalchemy query, but both execution performance and development time
> improved considerably once I stopped analyzing the query and just consulted
> the dict.
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to a topic in the
> Google Groups "sqlalchemy" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/sqlalchemy/chGVkNwmKyQ/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sqlalchemy/18bed722-c8c3-42a3-97de-a5f58987fdcf%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/CAOVGraJ1QapxWARpnS86pN3ez5fHykA60bRRRgboRXertBXSxA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] How to inspect filters

2019-06-25 Thread Andrew Martin
Thank you, as always, for both the specific answer and the general advice. 
Much appreciated!

On Sunday, June 23, 2019 at 5:19:49 PM UTC-5, Mike Bayer wrote:
>
>
>
> On Sun, Jun 23, 2019, at 5:45 PM, Andrew Martin wrote:
>
> I have a generic CRUDService for my web app. It's a pattern that was 
> loosely suggested to me by Mike a while back. I've probably not implemented 
> it the way he intended, but it works pretty well in a pretty small amount 
> of code.
>
>
> The main work, however, that needs to be done is in the read_all method. 
> It needs to be able to handle different filters depending on which view is 
> calling it. Which it can as it is now. But I want to fidget with the 
> defaults a little. Basically, I almost never want to show objects that have 
> been soft deleted. Only admin users/views should be able to fidget with 
> things that we've "deleted". The way the method works now is that if I 
> don't pass any filters in at all, it sets a filter to 
> cls.is_deleted==False. But I'd like to be able to inspect incoming filters. 
> Say the view looks at the resource context and determines that a view 
> should only be allowed to see objects based on some ACL rule. I might 
> create a filter for that view controller that looks like 
> filters = (Project.usergroup==User.usergroup)
>
> (defining the applicable usergroup for a given context is a separate 
> service, but the relationship should be obvious, I think?)
>
> and then have the view query with CRUDService.read_all(Project, 
> request.dbsession, filters). The way I've got it now, that would just 
> replace the default filter, and I would have to add 
> Project.is_deleted==False to the tuple of filters every time I call this. I 
> would like to be able to inspect the filters that are passed into the 
> read_all (which, honestly, I should rename and call it read_many) method 
> and say, if there isn't any reference to the is_deleted column in the 
> filters, then set that as false, otherwise, accept the value in the 
> function parameter. But I find the filter's tuple values quite opaque. Once 
> they are instantiated, they are class-specific, and I'm trying to get at a 
> general way of understanding them. I'll note that this column is guaranteed 
> to be present in the framework. I've modified the declarative base such 
> that every object has the is_deleted. I appreciate any help you might be 
> able to give. It's very likely I'm overthinking this like a numbskull and 
> the answer is painfully obvious. Here's the service class:
>
>
> you can inspect a ColumnElement structure to look for a certain column 
> like this:
>
> from sqlalchemy import Column  # needed for the isinstance() check
> from sqlalchemy.sql import visitors
>
> for elem in visitors.iterate(Project.usergroup == User.usergroup, {}):
>     if isinstance(elem, Column) and elem.name == "is_deleted":
>         return True
>
> internally, the ORM usually uses visitors.traverse(expression, {}, 
> {"column": visit_column}) but the above is easier to illustrate quickly.
>
> If it were me, I'd likely keep this as an explicit option on read_all(), 
> e.g. read_all(... include_deleted=True), much simpler and won't be called 
> "magic" by anyone, but it can go either way.
>
> also if you know me at all you'd know I would not do the "s.commit()" 
> inside all of the CR/U/D methods.   Your methods that do a "read()" are 
> beginning a transaction and leaving it open implicitly (there's no 
> rollback() or commit() at the end, nor should there be), whereas your CRUD 
> methods are fully committing the transaction for individual 
> INSERT/UPDATE/DELETE statements, and this inconsistency and the difficulty 
> it introduces in what happens when multiple CRUDService methods are called 
> sequentially will eventually lead to architectural problems.
>  
>
>
>
> from pyramid.request import Request
> from crudites.models.meta import Base
> from sqlalchemy.orm.session import Session
> from uuid import UUID
> from typing import List, Tuple, Dict
> from crudites.services.id_service import IDService
>
>
> class CRUDService:
>     '''TODO: docstring'''
>
>     @staticmethod
>     def create(cls: Base, s: Session) -> bool:
>         unique_id, public_id = IDService.create_db_ids()
>         cls.unique_id = unique_id
>         cls.public_id = public_id
>         s.add(cls)
>         s.commit()
>         return True
>
>     @staticmethod
>     def read_by_pkid(cls: Base, s: Session, id: int) -> Base:
>         q = s.query(cls).get(id)
>         return q
>
>     @staticmethod
>     def read_by_unique_id(cls: Base, s: Session, id: UUID) -> Base:
>         q = s.query(cls).filter_by(unique_id=id).first()
>         return q
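[Editor's note: a runnable version of the visitors.iterate() traversal shown
above, wrapped in a predicate. The tiny Project model is illustrative, not the
poster's schema.]

```python
# Walk a criterion expression and report whether any Column named "is_deleted"
# appears anywhere inside it -- the check Mike sketches, as a reusable function.
from sqlalchemy import Boolean, Column, Integer
from sqlalchemy.orm import declarative_base
from sqlalchemy.sql import visitors

Base = declarative_base()

class Project(Base):
    __tablename__ = "projects"
    id = Column(Integer, primary_key=True)
    is_deleted = Column(Boolean, default=False)

def references_is_deleted(criterion):
    for elem in visitors.iterate(criterion, {}):
        if isinstance(elem, Column) and elem.name == "is_deleted":
            return True
    return False
```

With this in hand, read_all() could append the default soft-delete criterion
only when `references_is_deleted(...)` is False for every incoming filter.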

[sqlalchemy] How to inspect filters

2019-06-23 Thread Andrew Martin
I have a generic CRUDService for my web app. It's a pattern that was 
loosely suggested to me by Mike a while back. I've probably not implemented 
it the way he intended, but it works pretty well in a pretty small amount 
of code.


The main work, however, that needs to be done is in the read_all method. It 
needs to be able to handle different filters depending on which view is 
calling it. Which it can as it is now. But I want to fidget with the 
defaults a little. Basically, I almost never want to show objects that have 
been soft deleted. Only admin users/views should be able to fidget with 
things that we've "deleted". The way the method works now is that if I 
don't pass any filters in at all, it sets a filter to 
cls.is_deleted==False. But I'd like to be able to inspect incoming filters. 
Say the view looks at the resource context and determines that a view 
should only be allowed to see objects based on some ACL rule. I might 
create a filter for that view controller that looks like 
filters = (Project.usergroup==User.usergroup)

(defining the applicable usergroup for a given context is a separate 
service, but the relationship should be obvious, I think?)

and then have the view query with CRUDService.read_all(Project, 
request.dbsession, filters). The way I've got it now, that would just 
replace the default filter, and I would have to add 
Project.is_deleted==False to the tuple of filters every time I call this. I 
would like to be able to inspect the filters that are passed into the 
read_all (which, honestly, I should rename and call it read_many) method 
and say, if there isn't any reference to the is_deleted column in the 
filters, then set that as false, otherwise, accept the value in the 
function parameter. But I find the filter's tuple values quite opaque. Once 
they are instantiated, they are class-specific, and I'm trying to get at a 
general way of understanding them. I'll note that this column is guaranteed 
to be present in the framework. I've modified the declarative base such 
that every object has the is_deleted. I appreciate any help you might be 
able to give. It's very likely I'm overthinking this like a numbskull and 
the answer is painfully obvious. Here's the service class:

from pyramid.request import Request
from crudites.models.meta import Base
from sqlalchemy.orm.session import Session
from uuid import UUID
from typing import List, Tuple, Dict
from crudites.services.id_service import IDService


class CRUDService:
    '''TODO: docstring'''

    @staticmethod
    def create(cls: Base, s: Session) -> bool:
        unique_id, public_id = IDService.create_db_ids()
        cls.unique_id = unique_id
        cls.public_id = public_id
        s.add(cls)
        s.commit()
        return True

    @staticmethod
    def read_by_pkid(cls: Base, s: Session, id: int) -> Base:
        q = s.query(cls).get(id)
        return q

    @staticmethod
    def read_by_unique_id(cls: Base, s: Session, id: UUID) -> Base:
        q = s.query(cls).filter_by(unique_id=id).first()
        return q

    @staticmethod
    def read_by_public_id(cls: Base, s: Session, id: str) -> Base:
        q = s.query(cls).filter_by(public_id=id).first()
        return q

    @staticmethod
    def read_all(cls: Base, s: Session, filters: Tuple = None) -> List[Base]:
        if not filters:
            filters = (cls.is_deleted == False,)
        q = s.query(cls).filter(*filters).all()
        return q

    @staticmethod
    def update(cls: Base, s: Session) -> bool:
        s.add(cls)
        s.commit()
        return True

    @staticmethod
    def delete(cls: Base, s: Session) -> bool:
        cls.is_deleted = True
        s.add(cls)
        s.commit()
        return True
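[Editor's note: the explicit-flag alternative Mike recommends in his reply,
sketched end to end. The Widget model and in-memory SQLite engine are
illustrative only, not part of the original service.]

```python
# Rather than inspecting incoming filter expressions, read_all accepts an
# include_deleted flag and appends the soft-delete criterion by default.
from sqlalchemy import Boolean, Column, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Widget(Base):
    __tablename__ = "widgets"
    id = Column(Integer, primary_key=True)
    is_deleted = Column(Boolean, default=False, nullable=False)

def read_all(cls, s, filters=(), include_deleted=False):
    criteria = list(filters)
    if not include_deleted:
        criteria.append(cls.is_deleted == False)  # noqa: E712
    return s.query(cls).filter(*criteria).all()

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as s:
    s.add_all([Widget(), Widget(is_deleted=True)])
    s.commit()
    live = read_all(Widget, s)                          # soft-deleted hidden
    everything = read_all(Widget, s, include_deleted=True)
```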




Thanks,
-andrew

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/da43b503-1e7a-4a0e-bd32-619d96163c1c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] Re: Better Implementation of Query Mixin?

2019-03-24 Thread Andrew Martin
I just wanted to give you a shout-out. Your comment really made something 
click for me. What I was doing in my first post sort of gets me a few steps 
towards what I'm trying to do. (Side note, I'm not really building a 
specific app at the moment. I'm building a fairly opinionated framework for 
making data science/analytics apps. So the over-planning trap is real, but 
also expected).

When I thought about your comment, I realized that I don't need or want a 
UserService and a CommentService and a GroupService, etc. etc. with each 
getting CRUD and Form functionality from mixins. 

The mixin approach is basically just shifting my veggies around the plate. 
It surfaces and separates the operations on models . . . a little. The 
methods are still really tightly coupled to classes, and for no very good 
reason at all.

What I really want is a generic CRUD service, a Form service, a Login (or 
maybe it's Auth) service, etc. Basically, the service doesn't care about 
the model until you tell it to. And because services are stateless as you 
said, you can call them from one another. Like the CRUD service can use the 
Form service for populating data. I really like where this is going.

So, thanks again for popping in with your thought. It really nudged me in a 
good direction.
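[Editor's note: the shape being described might look like this pure-Python
sketch. The stub session exists only to make it executable; all names are
illustrative, not from the thread.]

```python
# Stateless services that compose: neither class holds a model, request, or
# session -- everything is passed in, so one service can call another freely.
class CRUDService:
    @staticmethod
    def get_all(model, dbsession):
        return dbsession.query(model).all()

class FormService:
    @staticmethod
    def populate(form, model, dbsession):
        # a stateless service can delegate to another with no shared setup
        form["rows"] = CRUDService.get_all(model, dbsession)
        return form

# stand-ins for a real SQLAlchemy Session, so the sketch runs as-is
class StubQuery:
    def __init__(self, rows):
        self.rows = rows
    def all(self):
        return self.rows

class StubSession:
    def query(self, model):
        return StubQuery([f"{model}-1", f"{model}-2"])

form = FormService.populate({}, "User", StubSession())
```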

On Saturday, March 23, 2019 at 1:55:34 PM UTC-5, Mike Bayer wrote:
>
> Just FTR i was thinking about this Q today and I usually would opt to 
> make a Service object stateless, and things like "request" and User 
> would be passed into individual methods explicitly.   I usually start 
> an application with a pattern that might be more verbose to start with, 
> but once you write half a dozen cases front to back, the most 
> appropriate pattern usually makes itself apparent and you then 
> refactor. 
>
> e.g. don't overplan ahead, do it stupidly simple then expect to 
> refactor a few times.but also don't create state where it isnt 
> needed, if that makes sense. 
>
>
> On Sat, Mar 23, 2019 at 11:24 AM Andrew Martin  > wrote: 
> > 
> > I just realized that almost everything about how I'm using mixins here 
> is pretty much wrong. This is probably a garbage question and can be 
> deleted, but I don't want to do it myself in case someone is writing a 
> response to tell me that. 
> > 
> > On Saturday, March 23, 2019 at 9:00:49 AM UTC-5, Andrew Martin wrote: 
> >> 
> >> I like to keep my models separate from actions on it, so I only use 
> them for defining tables, relationships, and indexes. To perform actions on 
> a model I use a service that inherits from the model and provides . . . 
> well. services. It's an interface pattern. I'm making these more generic, 
> and separating out repeated code into a mixin. It works fine, but I kind of 
> hate the implementation because it feels wrong and fragile to me. I was 
> wondering if anyone had some suggestions to improve how I'm doing this. 
> >> 
> >> Here are some examples. 
> >> 
> >> 
> >> class User(Base): 
> >> __tablename__ = 'users' 
> >> __table_args__ = {'sqlite_autoincrement': True} 
> >> 
> >> # postgres implementation for later 
> >> # user_id_seq = Sequence('user_id_seq', metadata=Base.metadata) 
> >> # id = Column(BigInteger, user_id_seq, 
> server_default=user_id_seq.next_value(), primary_key=True) 
> >> id = Column(Integer, primary_key=True) 
> >> resource_uid = Column(Text, nullable=False) 
> >> username = Column(Text, nullable=False, unique=True) 
> >> hashed_password = Column(Text, nullable=True) 
> >> is_enabled = Column(Integer, default=1, nullable=False) 
> >> 
> >> 
> >> class CRUDMixIn: 
> >> def __init__(self): 
> >> # super().__init__() 
> >> print('initing crud mixin') 
> >> # This assumes that there are only two MixIns used in the 
> service in this order 
> >> # e.g.: class XService(FormMixIn, CRUDMixIn, User): 
> >> self.model = self.__class__.mro()[3] 
> >> 
> >> def get_one_by_id(self, id): 
> >> one_row = self.request.dbsession.query(self.model).filter(
> self.model.id == id).first() 
> >> return one_row 
> >> 
> >> def get_all(self): 
> >> all_rows = self.request.dbsession.query(self.model).all() 
> >> return all_rows 
> >> 
> >> 
> >> class UserService(FormMixIn, CRUDMixIn, User): 
> >> def __init__(self, request: Request): 
> >> super().__init__() 
> >> self.request = request 
> >> 
> >> # other user related met

Re: [sqlalchemy] Re: Better Implementation of Query Mixin?

2019-03-23 Thread Andrew Martin
Yeah, I'm kind of rethinking my approach here. Appreciate the thoughts.

On Saturday, March 23, 2019 at 1:55:34 PM UTC-5, Mike Bayer wrote:
>
> Just FTR i was thinking about this Q today and I usually would opt to 
> make a Service object stateless, and things like "request" and User 
> would be passed into individual methods explicitly.   I usually start 
> an application with a pattern that might be more verbose to start with, 
> but once you write half a dozen cases front to back, the most 
> appropriate pattern usually makes itself apparent and you then 
> refactor. 
>
> e.g. don't overplan ahead, do it stupidly simple then expect to 
> refactor a few times.but also don't create state where it isnt 
> needed, if that makes sense. 
>
>
> On Sat, Mar 23, 2019 at 11:24 AM Andrew Martin  > wrote: 
> > 
> > I just realized that almost everything about how I'm using mixins here 
> is pretty much wrong. This is probably a garbage question and can be 
> deleted, but I don't want to do it myself in case someone is writing a 
> response to tell me that. 
> > 
> > On Saturday, March 23, 2019 at 9:00:49 AM UTC-5, Andrew Martin wrote: 
> >> 
> >> I like to keep my models separate from actions on it, so I only use 
> them for defining tables, relationships, and indexes. To perform actions on 
> a model I use a service that inherits from the model and provides . . . 
> well. services. It's an interface pattern. I'm making these more generic, 
> and separating out repeated code into a mixin. It works fine, but I kind of 
> hate the implementation because it feels wrong and fragile to me. I was 
> wondering if anyone had some suggestions to improve how I'm doing this. 
> >> 
> >> Here are some examples. 
> >> 
> >> 
> >> class User(Base): 
> >> __tablename__ = 'users' 
> >> __table_args__ = {'sqlite_autoincrement': True} 
> >> 
> >> # postgres implementation for later 
> >> # user_id_seq = Sequence('user_id_seq', metadata=Base.metadata) 
> >> # id = Column(BigInteger, user_id_seq, 
> server_default=user_id_seq.next_value(), primary_key=True) 
> >> id = Column(Integer, primary_key=True) 
> >> resource_uid = Column(Text, nullable=False) 
> >> username = Column(Text, nullable=False, unique=True) 
> >> hashed_password = Column(Text, nullable=True) 
> >> is_enabled = Column(Integer, default=1, nullable=False) 
> >> 
> >> 
> >> class CRUDMixIn: 
> >> def __init__(self): 
> >> # super().__init__() 
> >> print('initing crud mixin') 
> >> # This assumes that there are only two MixIns used in the 
> service in this order 
> >> # e.g.: class XService(FormMixIn, CRUDMixIn, User): 
> >> self.model = self.__class__.mro()[3] 
> >> 
> >> def get_one_by_id(self, id): 
> >> one_row = self.request.dbsession.query(self.model).filter(
> self.model.id == id).first() 
> >> return one_row 
> >> 
> >> def get_all(self): 
> >> all_rows = self.request.dbsession.query(self.model).all() 
> >> return all_rows 
> >> 
> >> 
> >> class UserService(FormMixIn, CRUDMixIn, User): 
> >> def __init__(self, request: Request): 
> >> super().__init__() 
> >> self.request = request 
> >> 
> >> # other user related methods and business logic 
> >> 
> >> 
> >> 
> >> What is obviously really gross about this is getting the class for the 
> MixIn. Relying on the MRO means that anyone using it has to keep the same 
> order, and that feels wrong. But it doesn't feel as wrong repeating a bunch 
> of boilerplate CRUD code. I've looked at more than a few web/CRUD 
> frameworks, and I don't see people doing things like this. Most often what 
> I see is people putting generic CRUD functions in the Declarative Base, and 
> I really don't like that coupling there. I'd much prefer to have the model 
> layer separated from its actions. I had thought about setting the model in 
> the UserService like this: 
> >> 
> >> class UserService(FormMixIn, CRUDMixIn, User): 
> >> def __init__(self, request: Request): 
> >> super().__init__() 
> >> self.request = request 
> >> self.model = User 
> >> 
> >> 
> >> But that returns a <class 'sqlalchemy.ext.declarative.api.DeclarativeMeta'> 
> >> instead of <class 'models.User'>, so I still have to get to the MRO there 
> >> to get the user model to q

[sqlalchemy] Re: Better Implementation of Query Mixin?

2019-03-23 Thread Andrew Martin
I just realized that almost everything about how I'm using mixins here is 
pretty much wrong. This is probably a garbage question and can be deleted, 
but I don't want to do it myself in case someone is writing a response to 
tell me that.

On Saturday, March 23, 2019 at 9:00:49 AM UTC-5, Andrew Martin wrote:
>
> I like to keep my models separate from actions on it, so I only use them 
> for defining tables, relationships, and indexes. To perform actions on a 
> model I use a service that inherits from the model and provides . . . well. 
> services. It's an interface pattern. I'm making these more generic, and 
> separating out repeated code into a mixin. It works fine, but I kind of 
> hate the implementation because it feels wrong and fragile to me. I was 
> wondering if anyone had some suggestions to improve how I'm doing this.
>
> Here are some examples.
>
>
> class User(Base):
> __tablename__ = 'users'
> __table_args__ = {'sqlite_autoincrement': True}
>
> # postgres implementation for later
> # user_id_seq = Sequence('user_id_seq', metadata=Base.metadata)
> # id = Column(BigInteger, user_id_seq, 
> server_default=user_id_seq.next_value(), primary_key=True)
> id = Column(Integer, primary_key=True)
> resource_uid = Column(Text, nullable=False)
> username = Column(Text, nullable=False, unique=True)
> hashed_password = Column(Text, nullable=True)
> is_enabled = Column(Integer, default=1, nullable=False)
>
>
> class CRUDMixIn:
> def __init__(self):
> # super().__init__()
> print('initing crud mixin')
> # This assumes that there are only two MixIns used in the service in 
> this order
> # e.g.: class XService(FormMixIn, CRUDMixIn, User):
> self.model = self.__class__.mro()[3]
>
> def get_one_by_id(self, id):
> one_row = 
> self.request.dbsession.query(self.model).filter(self.model.id == id).first()
> return one_row
>
> def get_all(self):
> all_rows = self.request.dbsession.query(self.model).all()
> return all_rows
>
>
> class UserService(FormMixIn, CRUDMixIn, User):
> def __init__(self, request: Request):
> super().__init__()
> self.request = request
>
> # other user related methods and business logic
>
>
>
> What is obviously really gross about this is getting the class for the 
> MixIn. Relying on the MRO means that anyone using it has to keep the same 
> order, and that feels wrong. But it doesn't feel as wrong repeating a bunch 
> of boilerplate CRUD code. I've looked at more than a few web/CRUD 
> frameworks, and I don't see people doing things like this. Most often what 
> I see is people putting generic CRUD functions in the Declarative Base, and 
> I really don't like that coupling there. I'd much prefer to have the model 
> layer separated from its actions. I had thought about setting the model in 
> the UserService like this:
>
> class UserService(FormMixIn, CRUDMixIn, User):
> def __init__(self, request: Request):
> super().__init__()
> self.request = request
> self.model = User
>
>
> But that returns a <class 'sqlalchemy.ext.declarative.api.DeclarativeMeta'> 
> instead of <class 'models.User'>, so I still have to get to the MRO there to 
> get the user model 
> to query and it ends up being just as ugly. Although, I guess that's more 
> stable than what I'm doing now because the model MRO isn't going to change 
> often (or ever, maybe?).
>
> Anyway, I'm curious if anyone has thoughts about how I can make this 
> better or less fragile.
>
> thanks!
>
>
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


[sqlalchemy] Better Implementation of Query Mixin?

2019-03-23 Thread Andrew Martin
I like to keep my models separate from actions on it, so I only use them 
for defining tables, relationships, and indexes. To perform actions on a 
model I use a service that inherits from the model and provides . . . well. 
services. It's an interface pattern. I'm making these more generic, and 
separating out repeated code into a mixin. It works fine, but I kind of 
hate the implementation because it feels wrong and fragile to me. I was 
wondering if anyone had some suggestions to improve how I'm doing this.

Here are some examples.


class User(Base):
    __tablename__ = 'users'
    __table_args__ = {'sqlite_autoincrement': True}

    # postgres implementation for later
    # user_id_seq = Sequence('user_id_seq', metadata=Base.metadata)
    # id = Column(BigInteger, user_id_seq,
    #             server_default=user_id_seq.next_value(), primary_key=True)
    id = Column(Integer, primary_key=True)
    resource_uid = Column(Text, nullable=False)
    username = Column(Text, nullable=False, unique=True)
    hashed_password = Column(Text, nullable=True)
    is_enabled = Column(Integer, default=1, nullable=False)


class CRUDMixIn:
    def __init__(self):
        # super().__init__()
        print('initing crud mixin')
        # This assumes that there are only two MixIns used in the service
        # in this order, e.g.: class XService(FormMixIn, CRUDMixIn, User):
        self.model = self.__class__.mro()[3]

    def get_one_by_id(self, id):
        one_row = self.request.dbsession.query(self.model).filter(
            self.model.id == id).first()
        return one_row

    def get_all(self):
        all_rows = self.request.dbsession.query(self.model).all()
        return all_rows


class UserService(FormMixIn, CRUDMixIn, User):
    def __init__(self, request: Request):
        super().__init__()
        self.request = request

    # other user related methods and business logic



What is obviously really gross about this is getting the class for the 
MixIn. Relying on the MRO means that anyone using it has to keep the same 
order, and that feels wrong. But it doesn't feel as wrong repeating a bunch 
of boilerplate CRUD code. I've looked at more than a few web/CRUD 
frameworks, and I don't see people doing things like this. Most often what 
I see is people putting generic CRUD functions in the Declarative Base, and 
I really don't like that coupling there. I'd much prefer to have the model 
layer separated from its actions. I had thought about setting the model in 
the UserService like this:

class UserService(FormMixIn, CRUDMixIn, User):
    def __init__(self, request: Request):
        super().__init__()
        self.request = request
        self.model = User


But that returns a <class 'sqlalchemy.ext.declarative.api.DeclarativeMeta'> 
instead of <class 'models.User'>, so I still have to get to the MRO there to 
get the user model to query and it ends up being just as ugly. Although, I 
guess that's more stable than what I'm doing now because the model MRO 
isn't going to change often (or ever, maybe?).

Anyway, I'm curious if anyone has thoughts about how I can make this better 
or less fragile.
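[Editor's note: one common way to drop the MRO lookup entirely is composition
rather than inheritance: the service names its model as a class attribute. A
hedged sketch with stand-in classes, not the poster's code.]

```python
# Each concrete service declares its model as a class attribute, so no mro()
# indexing is needed and the mixin order no longer matters.
class User:  # stand-in for the mapped class
    pass

class CRUDMixIn:
    model = None  # each concrete service sets this

    @classmethod
    def model_class(cls):
        return cls.model

class UserService(CRUDMixIn):
    model = User
```

Because `model` is looked up through normal class-attribute resolution,
`UserService.model` is the actual class, not anything metaclass-dependent.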

thanks!


-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] Re: Polymorphic Query All regardless of subclass

2019-03-01 Thread Andrew Martin
Yeah, for dynamic pivots there are variations on that theme, but for both 
postgres and SQL Server, they boil down to doing something to get the 
columns you want to pivot and then generating the actual pivot query from 
that. It's always gross, and I agree--kind of wrong. It's one of those 
hacks that feels kind of cool at first, and then a year later, you're 
working with 20,000 lines of SQL that's generating and executing SQL and 
it's impossible to debug, and you want to go back in time and murder 
yourself for ever telling your boss that this was possible. Yay!

I have it in my head that the SQLAlchemy layer should make building that 
kind of query both safer and less wrong, but I'll maybe take a look at it 
later after I make more progress on other things.

For now I think I'll be able to work with your example to get me where I 
need to go though, so I'll leave this alone unless I discover some great 
epiphany that might be useful to anyone else in the group.

Thank you again!

-andrew

On Friday, March 1, 2019 at 12:21:18 PM UTC-6, Mike Bayer wrote:
>
> OK the first part of everything I get, that was the model you have, OK. 
>
> The second part with that new table "single_values", is the idea that 
> that is a simplified version of the more complex model you want to do? 
>  Also the way that CTE is being used, that is, get data from it then 
> use that to *render* SQL, that's kind of wrong, there has to be 
> another way to do what you want with that.   I see some XML stuff 
> going on and I don't know what that has to do with anything, so I'd 
> need to see what the *query* you want to do looks like, e.g. what is 
> the input/output you want. 
>
> But if you can work with the example I gave you, do that, I don't have 
> a pressing need to figure out the other way you were doing it if you 
> dont. 
>
>
> On Fri, Mar 1, 2019 at 12:23 PM Andrew Martin wrote: 
> > 
> > Thanks for the reply, Mike. I should've offered the SQL equivalent of 
> what I was going for to begin with. This is for a side project and I'm at 
> work, so I didn't put as much thought into the question as I should've. 
> > 
> > I think the link to the Polymorphic valued vertical table is what I'm 
> after. I'll have to dig into to understand it fully, but it looks like what 
> I want to do. 
> > 
> > Let me see if I can clear up what I was originally after. I only have 
> SQL server at work, but the first part is basically the same in postgres 
> (which is what I'll be using at some point when I get outside the bounds of 
> what SQLite can handle) 
> > 
> > The way I had the tables defined before with the Values class and the 
> subclasses leads me to this raw SQL for what I'm wanting. Note, 
> I've renamed values to vals because shadowing is bad. Here's the whole 
> creation/populating and the queries I'm looking at. 
> > 
> > CREATE TABLE datasets ( 
> > id INTEGER NOT NULL, 
> > uid varchar(50) NOT NULL, 
> > name varchar(50) NOT NULL, 
> > description varchar(50), 
> > CONSTRAINT pk_datasets PRIMARY KEY (id) 
> > ) 
> > 
> > CREATE TABLE variables ( 
> > id INTEGER NOT NULL, 
> > uid varchar(50) NOT NULL, 
> > dataset_id INT NOT NULL, 
> > name varchar(50) NOT NULL, 
> > description varchar(50), 
> > group_var_col_id INTEGER, 
> > group_var_row_id INTEGER, 
> > value_map varchar(50), 
> > CONSTRAINT pk_variables PRIMARY KEY (id), 
> > CONSTRAINT fk_variables_dataset_id_datasets FOREIGN KEY(dataset_id) 
> REFERENCES datasets (id), 
> > CONSTRAINT fk_variables_group_var_col_id_variables FOREIGN 
> KEY(group_var_col_id) REFERENCES variables (id), 
> > CONSTRAINT fk_variables_group_var_row_id_variables FOREIGN 
> KEY(group_var_row_id) REFERENCES variables (id) 
> > ) 
> > 
> > INSERT INTO datasets (id, uid, name, description) 
> > values 
> > (1, 'a1', 'ds1', 'test1') 
> > 
> > INSERT INTO variables (id, uid, dataset_id, name, description) 
> > values 
> > (2, 'av2', 1, 'v2', 'testvar2'), 
> > (1, 'av1', 1, 'v1', 'testvar1'), 
> > (3, 'av3', 1, 'v3', 'testvar3') 
> > 
> > 
> > CREATE TABLE vals ( 
> > id INTEGER NOT NULL, 
> > uid varchar(50) NOT NULL, 
> > variable_id INT NOT NULL, 
> > observation_id varchar(50) NOT NULL, 
> > val_type varchar(50) NOT NULL, 
> > CONSTRAINT pk_values PRIMARY KEY (id), 
> > CONSTRAINT fk_values_variable_id_variables FOREIGN KEY(variable_id) 
> REFERENCES variables (id) 
> > ) 
> > 
> > INSERT INTO vals (id, uid, variable_id, observation_id, val_type) 
> >

[sqlalchemy] Re: Polymorphic Query All regardless of subclass

2019-03-01 Thread Andrew Martin
'),
(3, 'val3', 1, 'respid3', 'int', '3'),
(4, 'val4', 1, 'respid4', 'int', '4'),
(5, 'val5', 1, 'respid5', 'int', '5'),
(11, 'val11', 3, 'respid1', 'float', '1.0'),
(12, 'val12', 3, 'respid2', 'float', '2.0'),
(13, 'val13', 3, 'respid3', 'float', '3.0'),
(14, 'val14', 3, 'respid4', 'float', '4.0'),
(15, 'val15', 3, 'respid5', 'float', '5.0')


DECLARE @DynamicPivotQuery AS NVARCHAR(MAX)

;WITH ColumnNames (names) AS (
SELECT TOP 1 STUFF((SELECT ',' + QUOTENAME(name)
FROM variables
WHERE dataset_id = 1
FOR XML PATH('')), 1,1,'') [cols]
FROM (SELECT DISTINCT name FROM variables WHERE dataset_id = 1) n
)

SELECT @DynamicPivotQuery = 
  N'SELECT observation_id, ' +  (SELECT names FROM ColumnNames) + '
FROM 
(SELECT sv.observation_id, v.name, sv.value
FROM single_values sv
INNER JOIN variables v on v.id = sv.variable_id
WHERE v.dataset_id = 1) s
PIVOT(MAX(value) 
  FOR name IN (' + (SELECT names FROM ColumnNames) + ')) p'
--print @DynamicPivotQuery
EXEC sp_executesql @DynamicPivotQuery

Like I said, it's not only ugly, it's also gross. When I get home from work 
today, I can show how I would do it in postgres if that's at all useful to 
anyone.

Anyway, like I said above, I think your polymorphic example is where I want 
to go with the data model. I just need to understand it because I hate not 
understanding code I'm using. I suspect that for some of these cases, I 
need to either use pandas to do the pivot, which is fine as long as I'm 
keeping types, or call a stored procedure to do the dynamic pivot 
if I just need to grab the whole dataset in that format.
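The pandas route mentioned above can be sketched roughly like this; the rows are hypothetical stand-ins for what the vals/variables join would return, not data from the thread:

```python
import pandas as pd

# hypothetical (observation_id, variable name, value) rows, i.e. the
# shape of the inner SELECT in the dynamic pivot above
rows = [
    ("respid1", "v1", 1), ("respid1", "v3", 1.0),
    ("respid2", "v1", 2), ("respid2", "v3", 2.0),
]
df = pd.DataFrame(rows, columns=["observation_id", "name", "value"])

# pivot turns one row per (observation, variable) into one column per
# variable -- no SQL string building, and numeric dtypes are preserved
wide = df.pivot(index="observation_id", columns="name", values="value")
print(wide.loc["respid1", "v1"])  # → 1.0
```

Unlike the dynamic SQL above, nothing here generates and executes more SQL, so there's nothing string-built to debug.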

Thanks!



On Thursday, February 28, 2019 at 12:34:16 PM UTC-6, Andrew Martin wrote:
>
> Hi All,
>
> Trying something out for an analytics app here, and I'm not sure any 
> of this is a good idea. I'm trying to create a data model that's flexible 
> enough to handle a lot of different types of analysis. From plain CSVs to  
> time series to survey questions with complex hierarchies. My approach is to 
> have a pretty plain model and query as needed to construct dataframes that 
> will end up in pandas and scikit-learn.
>
> The basic idea is that there's a dataset that has variables associated 
> with it. Variables have values associated. I want to retain type 
> information on the values. If I'm reading a file and loading it, I want to 
> store an int as an int or a datetime as a datetime and not stuff everything 
> into a string and have to guess what things are based on variable metadata. 
> That's where I'm kind of stuck. I think I have the multi-table inheritance 
> set up correctly to store different types of value. I'm totally unsure how 
> to query without having to know the subclass. I'd like to be able to 
> extract values for pandas processing with something like this:
>
> vals = Values.value.filter_by(variable_id=123456).all()
>
> without having to know per variable which subclass to choose. I.e., I 
> don't want to have to do
> vals = IntValue.value.filter_by(variable_id=123456).all()
>
>
> The second part of my question is only relevant if this whole design isn't 
> hosed, which I'm fine if you tell me that it is and I need to go back to 
> the drawing board.
>
> The second part is how to extract an entire dataset by ID in a way that 
> would be comfortable in pandas. If I were doing this in raw SQL, I'd use a 
> CTE to get the variables by dataset id and use a pivot to create the 
> data-frame-shaped table. I'm confounded by how to do this in alchemy, 
> especially with the polymorphic setup.
>
>
> Here are my current model definitions:
>
>
> from sqlalchemy import Column, Index, Integer, Text, DateTime, Float, 
> ForeignKey
> from sqlalchemy.orm import relationship
>
> from .meta import Base
>
>
> class DataSet(Base):
> __tablename__ = 'datasets'
> id = Column(Integer, primary_key=True)
> name = Column(Text, nullable=False)
> description = Column(Text, nullable=True)
>
>
>
> class Variable(Base):
> __tablename__ = 'variables'
> id = Column(Integer, primary_key=True)
> dataset_id = Column(Integer, ForeignKey('datasets.id'), nullable=False)
> name = Column(Text, nullable=False)
> description = Column(Text, nullable=True)
> group_var_col_id = Column(Integer, ForeignKey('variables.id'), 
> nullable=True)
> group_var_row_id = Column(Integer, ForeignKey('variables.id'), 
> nullable=True)
>
> value_map = Column(Text, nullable=True) #change to JSONB when move 
> from SQLite to Postgres
>
> dataset = relationship('DataSet', backref='variables')
>
>
> class Value(Base):
> __tablename__ = 'values'
> id = Column(Integer, primary_key=True)
> variable_id = Column(Integer, ForeignKey('varia

[sqlalchemy] Polymorphic Query All regardless of subclass

2019-02-28 Thread Andrew Martin
Hi All,

Trying something out for an analytics app here, and I'm not sure any 
of this is a good idea. I'm trying to create a data model that's flexible 
enough to handle a lot of different types of analysis. From plain CSVs to  
time series to survey questions with complex hierarchies. My approach is to 
have a pretty plain model and query as needed to construct dataframes that 
will end up in pandas and scikit-learn.

The basic idea is that there's a dataset that has variables associated with 
it. Variables have values associated. I want to retain type information on 
the values. If I'm reading a file and loading it, I want to store an int as 
an int or a datetime as a datetime and not stuff everything into a string 
and have to guess what things are based on variable metadata. That's where 
I'm kind of stuck. I think I have the multi-table inheritance set up 
correctly to store different types of value. I'm totally unsure how to 
query without having to know the subclass. I'd like to be able to extract 
values for pandas processing with something like this:

vals = Values.value.filter_by(variable_id=123456).all()

without having to know per variable which subclass to choose. I.e., I don't 
want to have to do
vals = IntValue.value.filter_by(variable_id=123456).all()


The second part of my question is only relevant if this whole design isn't 
hosed, which I'm fine if you tell me that it is and I need to go back to 
the drawing board.

The second part is how to extract an entire dataset by ID in a way that 
would be comfortable in pandas. If I were doing this in raw SQL, I'd use a 
CTE to get the variables by dataset id and use a pivot to create the 
data-frame-shaped table. I'm confounded by how to do this in alchemy, 
especially with the polymorphic setup.


Here are my current model definitions:


from sqlalchemy import Column, Index, Integer, Text, DateTime, Float, ForeignKey
from sqlalchemy.orm import relationship

from .meta import Base


class DataSet(Base):
    __tablename__ = 'datasets'
    id = Column(Integer, primary_key=True)
    name = Column(Text, nullable=False)
    description = Column(Text, nullable=True)


class Variable(Base):
    __tablename__ = 'variables'
    id = Column(Integer, primary_key=True)
    dataset_id = Column(Integer, ForeignKey('datasets.id'), nullable=False)
    name = Column(Text, nullable=False)
    description = Column(Text, nullable=True)
    group_var_col_id = Column(Integer, ForeignKey('variables.id'), nullable=True)
    group_var_row_id = Column(Integer, ForeignKey('variables.id'), nullable=True)

    value_map = Column(Text, nullable=True)  # change to JSONB when moving from SQLite to Postgres

    dataset = relationship('DataSet', backref='variables')


class Value(Base):
    __tablename__ = 'values'
    id = Column(Integer, primary_key=True)
    variable_id = Column(Integer, ForeignKey('variables.id'), nullable=False)
    observation_id = Column(Text, nullable=False)
    value_type = Column(Text, nullable=False)

    variable = relationship('Variable', backref='values')

    __mapper_args__ = {'polymorphic_on': value_type}


class IntValue(Value):
    __tablename__ = 'int_values'
    id = Column(None, ForeignKey('values.id'), primary_key=True)
    value = Column(Integer, nullable=True)

    __mapper_args__ = {'polymorphic_identity': 'int'}


class StringValue(Value):
    __tablename__ = 'string_values'
    id = Column(None, ForeignKey('values.id'), primary_key=True)
    value = Column(Text, nullable=True)

    __mapper_args__ = {'polymorphic_identity': 'string'}


class FloatValue(Value):
    __tablename__ = 'float_values'
    id = Column(None, ForeignKey('values.id'), primary_key=True)
    value = Column(Float, nullable=True)

    __mapper_args__ = {'polymorphic_identity': 'float'}


class DateTimeValue(Value):
    __tablename__ = 'datetime_values'
    id = Column(None, ForeignKey('values.id'), primary_key=True)
    value = Column(DateTime, nullable=True)

    __mapper_args__ = {'polymorphic_identity': 'datetime'}
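For reference, here is a self-contained sketch of the first question, pared down to one subclass: with polymorphic_on set and joined-table inheritance, querying the base class returns instances of the right subclass, and with_polymorphic joins the subtables up front so the value columns load in one query. The in-memory SQLite engine and single IntValue class are illustration only, not part of the post:

```python
from sqlalchemy import Column, Integer, Text, ForeignKey, create_engine
from sqlalchemy.orm import Session, declarative_base, with_polymorphic

Base = declarative_base()


class Value(Base):
    __tablename__ = "values"
    id = Column(Integer, primary_key=True)
    variable_id = Column(Integer, nullable=False)
    value_type = Column(Text, nullable=False)
    __mapper_args__ = {"polymorphic_on": value_type,
                       "polymorphic_identity": "base"}


class IntValue(Value):
    __tablename__ = "int_values"
    id = Column(Integer, ForeignKey("values.id"), primary_key=True)
    value = Column(Integer)
    __mapper_args__ = {"polymorphic_identity": "int"}


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(IntValue(variable_id=123456, value=7))
    session.commit()

    # query the base class; rows come back as the right subclass,
    # and "*" joins every subtable so .value is already loaded
    poly = with_polymorphic(Value, "*")
    vals = session.query(poly).filter_by(variable_id=123456).all()
    print(type(vals[0]).__name__, vals[0].value)  # → IntValue 7
```

No per-variable knowledge of the subclass is needed; the value_type discriminator picks it for you.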

Thanks in advance for any suggestions you might have!

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


[sqlalchemy] Re: Best use of engine_from_config

2016-09-04 Thread Andrew Martin
Also, to be clear, what I have works. I'm passing the request object to the 
model and then creating the engine based on the request. It works, but I 
have a nagging feeling that this is not good and definitely not best.

On Sunday, September 4, 2016 at 10:39:55 AM UTC-4, Andrew Martin wrote:
>
> I have a question about creating engines with pretty much any 
> Python web framework, but I'm specifically working with Pyramid. For 
> obvious reasons. :) 
>
> The sqlalchemy docs state this:
>
> "The typical usage of create_engine() 
> <http://docs.sqlalchemy.org/en/latest/core/engines.html#sqlalchemy.create_engine>
>  is 
> once per particular database URL, held globally for the lifetime of a 
> single application process. A single Engine 
> <http://docs.sqlalchemy.org/en/latest/core/connections.html#sqlalchemy.engine.Engine>
>  manages 
> many individual DBAPI connections on behalf of the process and is intended 
> to be called upon in a concurrent fashion. The Engine 
> <http://docs.sqlalchemy.org/en/latest/core/connections.html#sqlalchemy.engine.Engine>
>  is *not* synonymous to the DBAPI connect function, which represents just 
> one connection resource - the Engine 
> <http://docs.sqlalchemy.org/en/latest/core/connections.html#sqlalchemy.engine.Engine>
>  is 
> most efficient when created just once at the module level of an 
> application, not per-object or per-function call."
>
> But with most frameworks I've worked with--and with Pyramid 
> specifically--the best way to access the .ini settings is through the 
> request object, which means that the engine only lives for the lifetime of 
> the request? I think?
>
> I'm unsure about this.
>
> My sqlalchemy setup is a little weird (read: dumb) in the sense that I'm 
> not using the ORM layer at all. I use the engine to execute functions on a 
> postgres server which returns only a single json_aggs result. Then I either 
> make it a dict if it's going to a template or stick it as json if it's 
> going to an API that expects json. It probably sounds goofy, but it works 
> extremely well. And it's fast as hell.
>
> But I'm uncomfortable with the way I'm using engine_from_config. According 
> to the docs, I should create one and only one engine that exists for the 
> lifetime of the application. But the only reasonable way for me to access 
> the settings object is to go through a request object. And I know I must be 
> misunderstanding something here because, now that I think about it, there's 
> no effing way any web framework is reading the config from a .ini file for 
> every request. I mean, php would do that, but we are Python here.
>
> So, all that aside, what's the best way to do this?
>
> In my mind, if it's a per-request read of the settings, then it belongs 
> inside an object as part of the constructor. But if it's not, where do I 
> put it while keeping with best practices for web frameworks?
>



Re: [sqlalchemy] Best use of engine_from_config

2016-09-04 Thread Andrew Martin
That’s a really interesting approach, and I think I might try it.

My concern here is whether I’m using things in the appropriate way. I think I’m 
failing to understand something about the way sqlalchemy works, and that’s the 
reason I’m asking this. For myself and for posterity. :)

-andrew

> On Sep 4, 2016, at 11:22 AM, Michał Dobrzański <mike.dobrzan...@gmail.com> 
> wrote:
> 
> For a Pyramid application I cache engines in a global dictionary keyed by 
> connection string, creating each engine the first time it's needed. This way, 
> each time I need an engine I reach into that dictionary instead of creating 
> one with each request.
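The engine-cache idea from this reply can be sketched in a few lines; get_engine is a hypothetical helper name, not from the thread:

```python
from sqlalchemy import create_engine

# module-level cache: one Engine per connection string, per process
_engines = {}


def get_engine(url):
    # create the engine on first use, then reuse it for every request
    if url not in _engines:
        _engines[url] = create_engine(url)
    return _engines[url]


e1 = get_engine("sqlite://")
e2 = get_engine("sqlite://")
print(e1 is e2)  # → True
```

Because the dict lives at module scope, the engine survives across requests even though each request re-reads its URL from settings.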
> 
> On Sun, Sep 4, 2016 at 4:39 PM, Andrew Martin <agmar...@gmail.com 
> <mailto:agmar...@gmail.com>> wrote:
> I have a question about creating engines with pretty much any 
> Python web framework, but I'm specifically working with Pyramid. For obvious 
> reasons. :) 
> 
> The sqlalchemy docs state this:
> 
> "The typical usage of create_engine() 
> <http://docs.sqlalchemy.org/en/latest/core/engines.html#sqlalchemy.create_engine>
>  is once per particular database URL, held globally for the lifetime of a 
> single application process. A single Engine 
> <http://docs.sqlalchemy.org/en/latest/core/connections.html#sqlalchemy.engine.Engine>
>  manages many individual DBAPI connections on behalf of the process and is 
> intended to be called upon in a concurrent fashion. The Engine 
> <http://docs.sqlalchemy.org/en/latest/core/connections.html#sqlalchemy.engine.Engine>
>  is not synonymous to the DBAPI connect function, which represents just one 
> connection resource - the Engine 
> <http://docs.sqlalchemy.org/en/latest/core/connections.html#sqlalchemy.engine.Engine>
>  is most efficient when created just once at the module level of an 
> application, not per-object or per-function call."
> 
> But with most frameworks I've worked with--and with Pyramid specifically--the 
> best way to access the .ini settings is through the request object, which 
> means that the engine only lives for the lifetime of the request? I think?
> 
> I'm unsure about this.
> 
> My sqlalchemy setup is a little weird (read: dumb) in the sense that I'm not 
> using the ORM layer at all. I use the engine to execute functions on a 
> postgres server which returns only a single json_aggs result. Then I either 
> make it a dict if it's going to a template or stick it as json if it's going 
> to an API that expects json. It probably sounds goofy, but it works extremely 
> well. And it's fast as hell.
> 
> But I'm uncomfortable with the way I'm using engine_from_config. According to 
> the docs, I should create one and only one engine that exists for the 
> lifetime of the application. But the only reasonable way for me to access the 
> settings object is to go through a request object. And I know I must be 
> misunderstanding something here because, now that I think about it, there's 
> no effing way any web framework is reading the config from a .ini file for 
> every request. I mean, php would do that, but we are Python here.
> 
> So, all that aside, what's the best way to do this?
> 
> In my mind, if it's a per-request read of the settings, then it belongs inside 
> an object as part of the constructor. But if it's not, where do I put it 
> while keeping with best practices for web frameworks?
> 

[sqlalchemy] Best use of engine_from_config

2016-09-04 Thread Andrew Martin
I have a question about creating engines with pretty much any 
Python web framework, but I'm specifically working with Pyramid. For 
obvious reasons. :) 

The sqlalchemy docs state this:

"The typical usage of create_engine() is once per particular database URL, 
held globally for the lifetime of a single application process. A single 
Engine manages many individual DBAPI connections on behalf of the process 
and is intended to be called upon in a concurrent fashion. The Engine is 
*not* synonymous to the DBAPI connect function, which represents just one 
connection resource - the Engine is most efficient when created just once 
at the module level of an application, not per-object or per-function call."

But with most frameworks I've worked with--and with Pyramid 
specifically--the best way to access the .ini settings is through the 
request object, which means that the engine only lives for the lifetime of 
the request? I think?

I'm unsure about this.

My sqlalchemy setup is a little weird (read: dumb) in the sense that I'm 
not using the ORM layer at all. I use the engine to execute functions on a 
postgres server which returns only a single json_aggs result. Then I either 
make it a dict if it's going to a template or stick it as json if it's 
going to an API that expects json. It probably sounds goofy, but it works 
extremely well. And it's fast as hell.

But I'm uncomfortable with the way I'm using engine_from_config. According 
to the docs, I should create one and only one engine that exists for the 
lifetime of the application. But the only reasonable way for me to access 
the settings object is to go through a request object. And I know I must be 
misunderstanding something here because, now that I think about it, there's 
no effing way any web framework is reading the config from a .ini file for 
every request. I mean, php would do that, but we are Python here.

So, all that aside, what's the best way to do this?

In my mind, if it's a per-request read of the settings, then it belongs 
inside an object as part of the constructor. But if it's not, where do I 
put it while keeping with best practices for web frameworks?
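For reference, the usual pattern can be sketched like this: the settings dict is parsed once at startup (as Pyramid does before calling main()), and the engine is created once at application scope, never per request. The SQLite URL is illustrative only:

```python
from sqlalchemy import engine_from_config

# what Pyramid's main(global_config, **settings) receives after parsing
# the .ini file once at process start
settings = {"sqlalchemy.url": "sqlite://"}

# created once, at application/module scope -- not per request
ENGINE = engine_from_config(settings, prefix="sqlalchemy.")


def handle_request():
    # each request borrows a pooled DBAPI connection from the shared engine
    with ENGINE.connect() as conn:
        return conn.exec_driver_sql("SELECT 1").scalar()


print(handle_request())  # → 1
```

The engine owns the connection pool, so per-request code only checks connections out and returns them; nothing is rebuilt from the .ini on each request.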



[sqlalchemy] bulk_update_mapping stale data error

2016-05-03 Thread Andrew Martin
This is an admittedly wonky situation. And I don't have much code to offer 
because I think the problem is conceptual on my part, rather than 
code-related.

We're basically using google spreadsheets as a quick and dirty interface 
for certain types of data entry. I know how terrible this is. Please just 
bear with me. I have to work with what it is now while I make it better.

Each morning, we take whatever is in the google spreadsheet and update one 
column in a table in Postgres. But Postgres is being updated by a 
different service in real time. The particular table that we are looking at 
here has an odd behavior. If someone runs a delete operation from the 
GUI, it actually deletes the row instead of doing something sane like 
flagging it as deleted.

So we have some situations where there are more rows in the previous day's 
spreadsheet than there are in the Postgres table, and I *think* that's 
what is causing the error. I get a traceback that looks like this:

  File "/opt/smsport/assessment_goal_sync.py", line 51, in get_current_goals
self.session.bulk_update_mappings(Assessment, current_vals)
  File 
"/data/anaconda/lib/python2.7/site-packages/sqlalchemy/orm/session.py", 
line 2329, in bulk_update_mappings
self._bulk_save_mappings(mapper, mappings, True, False, False, False)
  File 
"/data/anaconda/lib/python2.7/site-packages/sqlalchemy/orm/session.py", 
line 2351, in _bulk_save_mappings
transaction.rollback(_capture_exception=True)
  File 
"/data/anaconda/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py", 
line 60, in __exit__
compat.reraise(exc_type, exc_value, exc_tb)
  File 
"/data/anaconda/lib/python2.7/site-packages/sqlalchemy/orm/session.py", 
line 2343, in _bulk_save_mappings
isstates, update_changed_only)
  File 
"/data/anaconda/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", 
line 121, in _bulk_update
bookkeeping=False)
  File 
"/data/anaconda/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", 
line 716, in _emit_update_statements
(table.description, len(records), rows))
StaleDataError: UPDATE statement on table 'assessment' expected to update 
786 row(s); 785 were matched.



My code that is causing the problem is literally this:

self.session.bulk_update_mappings(Assessment, current_vals)
self.session.commit()

Where the argument current_vals is a list of dictionaries, each containing 
the primary key and the value to be updated.

This works in most cases. But it fails if a record in the database has been 
deleted.

So I have 3 questions:

Am I interpreting this error correctly?
If so, is there some way I can work around this to tell 
bulk_update_mappings to ignore that situation and just update what is 
matched?

Is there a better, more idiomatic way to handle this situation?
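One possible workaround for the second question, sketched with plain data: filter the mappings down to rows that still exist before calling bulk_update_mappings, so the UPDATE's expected row count matches what actually gets matched. Assessment and current_vals are the names from above; the data here is simulated so the filtering logic stands alone:

```python
current_vals = [
    {"id": 1, "goal": "a"},
    {"id": 2, "goal": "b"},
    {"id": 3, "goal": "c"},  # pretend row 3 was deleted via the GUI
]

# in real code this would be something like:
#   existing_ids = {pk for (pk,) in session.query(Assessment.id)}
existing_ids = {1, 2}

# drop mappings whose target row no longer exists, so the bulk UPDATE
# never tries to touch a deleted row
live_vals = [m for m in current_vals if m["id"] in existing_ids]
print(len(live_vals))  # → 2

# then:
#   session.bulk_update_mappings(Assessment, live_vals)
#   session.commit()
```

This trades one extra primary-key SELECT for avoiding the StaleDataError; there is still a race if the other service deletes a row between the SELECT and the UPDATE.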

Thanks!

-andrew
