It's difficult to debug this without a script that we can run to reproduce
the problem. What kind of object is self.db_session? You use it as a
context manager without calling it, so I don't think it can be a
sessionmaker or a session.
You're nesting calls to the context manager:
# in load_new_data
with self.db_session as outersession:
# add new_obj to outersession
# call move_to_error
with self.db_session as innersession:
# add new_obj to innersession
Are innersession and outersession supposed to be the same object? If they
are different sessions, you're trying to add new_obj to both of them, which
is going to be a problem.
If it were me, I would explicitly pass the session to the move_to_error
method. If you don't like that, you can also use
sqlalchemy.orm.object_session to get the session that new_obj already
belongs to.
Hope that helps,
Simon
On Fri, Apr 29, 2022 at 5:10 AM Andrew Martin <[email protected]> wrote:
> Hi all, I'm struggling a bit with best practices for my ETL application.
>
> Each part of the ETL app is completely separate from the others, but I
> have a MixIn for some common functions that each of them need to do, like
> move this record to error if there's a data integrity problem. Or move this
> record to manual review if there's insufficient data to move it along to
> the next stage of the ETL.
>
> The problem I'm having is that I don't understand the correct way to pass
> an object to a function, update it, and eventually commit it.
>
> I have for example:
>
> class DataMoverMixin:
> def __init__(self) -> None:
> self.db_session = get_db_session()
> <insert a number of other things here that all my ETL classes share>
>
> self.move_to_error(obj: Any, error_stage: str, traceback: Exception)
> -> bool:
> logger.info("Moving object to error.")
> json_data = json.dumps(obj, cls=AlchemyEncoder)
> e = Error(
> id=obj.id,
> error_stage=error_stage,
> error_message=repr(traceback),
> error_data=json_data,
> )
> obj.status = "error"
> with self.db_session as session:
> session.add(e)
> session.add(obj)
> session.commit()
> logger.info("Successfully moved object to error.")
> return True
>
> class IngestDataManager(DataMoverMixin):
> def __init__(self):
> super().__init__()
> <insert some class-specific things here>
>
>
> def load_new_data(self, accounts: List[Dict]) -> bool:
> for acc in accounts:
> new_obj = NewObj(**acc)
> with self.db_session as session:
> session.add(new_obj)
> session.commit()
> # now the raw data is loaded, I need to check if it
> conforms and do some stuff with the newly created id.
> session.refresh(new_obj)
> if not new_obj.important_stuff:
> self.move_to_error(new_obj, "ingest_integrity_error",
> f"missing {important stuff} for account_id: {new_obj.id}
>
>
> This is the simplest example of what does and doesn't work. And I can tell
> from the errors that I must be doing something very anti pattern, but I
> can't quite figure out what.
>
> This pattern gives me a DetachedInstanceError.
>
> So if I change Mixin.move_to_error like so:
>
> . . .
> with self.db_session as session:
> session.refresh(obj)
> obj.status = "error"
> session.add(e)
> session.add(obj)
> session.commit()
> . . .
>
> I get no error. But also the changes to the obj are not actually committed
> to the DB.
> The new record for error is committed.
>
> My expectation was that by attaching the session to the class that any
> method on the class would reference the same session, and that using the
> context manager was just a good practice to open and close it. But that
> doesn't seem to be the case.
>
> I might certainly be wrong, but it appears that when you pass an
> SQLAlchemy object to a function inside of a session context manager, it
> does not carry the session with it?
>
> And also reopening what I think is the session in a context manager fixes
> that but also then doesn't allow me to update the object?
>
> I guess I'm just kinda confused, and I'm sure there's a better way to do
> this.
>
> I've searched around a lot to try and understand this problem, but for
> whatever reason, nothing has clicked for me about what I'm doing wrong.
>
> Appreciate any help from people.
>
> -andrew
>
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sqlalchemy/628f6d67-51ce-4251-a90e-9f27341b793cn%40googlegroups.com
> <https://groups.google.com/d/msgid/sqlalchemy/628f6d67-51ce-4251-a90e-9f27341b793cn%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
http://www.sqlalchemy.org/
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable
Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/sqlalchemy/CAFHwexdT2Fhh1ysSXX5DapBETB1ziWP86uSh2hDBwp6%2BB%2BKFLg%40mail.gmail.com.