Hey Mike, thanks for your speedy reply.
I guess I left out some key details and simplified my examples a little too
much, which is probably why you're confused.
For one, I have actually implemented downstream_edges and upstream_edges as
dictionary collections, i.e.:
src_node = relationship(
"Node",
foreign_keys=[src_node_id],
backref=backref(
"downstream_edges",
collection_class=attribute_mapped_collection("downstream_key"),
)
)
@property
def downstream_key(self):
return self.dst_node, self.color
The purpose of this is to provide an easy way of doing "upserts" -- that
is, checking if an edge already exists before inserting, since the edge
table has a UNIQUE KEY (`src_node_id`,`dst_node_id`,`color`) that would be
violated otherwise.
This allows me to do:
downstream_key = (dst_node, color)
try:
edge = node.downstream_edges[downstream_key]
except KeyError:
edge = Edge(dst_node=dst_node, color=color)
node.downstream_edges[downstream_key] = edge
I realize that this means self.downstream_edges is being fully loaded every
time I do this "upsert". But this is why I can't follow your suggestion of
simply adding Edges with Edge(src_node=src_node, dst_node=dst_node,
color=color), because there's no check to prevent violating the UNIQUE KEY.
So the question then is what is the most performant way of achieving this
"upsert" behavior without triggering a full load of the downstream_edges.
There is a second problem I'm running into when trying to make use of
dynamic relationships, or any relationship definition which requires flush
+ query to load. When I add a new edge to a collection, if I want to have
event listeners respond to those modifications, for example by using
@event.listens_for(Session, "before_flush")
then I don't seem to be allowed to reference any relationship inside that
event listener which requires database querying to fetch. I'm assuming this
is because the "before_flush" event listener cannot itself load a
relationship that would require a flush to load, because we are already in
mid-flush.
Given that, if I've added a new edge to downstream_edges, which is
lazy="dynamic" let's say, how can I see all the items in that collection,
or in a subset of that collection (which could be a lazy="dynamic"
relationship or a primaryjoin + viewonly=True relationship) without
requiring an additional flush at an a point in time when one is not allowed?
Note that I have also tried to accomplish this using "after_flush" and
"after_flush_postexec" event listeners, but the problem with these is I
don't seem to be allowed to make further ORM modifications inside these
listeners which would require more UPDATE statements, as I get an SAWarning
telling me what I'm doing is not recommended.
Thanks,
-Dane
On Thursday, December 30, 2021 at 5:33:38 PM UTC-8 Mike Bayer wrote:
>
>
> On Thu, Dec 30, 2021, at 8:10 PM, Dane K Barney wrote:
>
> Given the following example model:
>
> class Node(Base):
> __tablename__ = "node"
> id = Column(Integer, primary_key=True, autoincrement=True)
>
> class Edge(Base):
> __tablename__ = "edge"
> id = Column(Integer, primary_key=True, autoincrement=True)
> color = Column(Enum(Color))
> src_node_id = Column(Integer, ForeignKey("node.id"))
> dst_node_id = Column(Integer, ForeignKey("node.id"))
> src_node = relationship(
> "Node",
> foreign_keys=[src_node_id],
> backref=backref("downstream_edges")
> )
> dst_node = relationship(
> "Node",
> foreign_keys=[dst_node_id],
> backref=backref("upstream_edges")
> )
>
> The collections Node.downstream_edges and Node.upstream_edges are so large
> that it is detrimental to load them entirely on a regular basis (although I
> still need the ability to do so). So for performance reasons, I am trying
> to come up with a way of sub-dividing these relationships based on the
> "color" attribute. One way I have tried to do this is by defining
> additional relationships on the Node class which filter the query using a
> custom primaryjoin. For example:
>
> blue_downstream_edges = relationship(
> "Edge",
> primaryjoin="and_(Node.id == Edge.src_node_id, Edge.color == 'blue'",
> viewonly=True,
> )
>
> The downside to this approach is that the viewonly=True attribute means I
> cannot directly make modifications to this collection. When I want to add
> new Edges, I have to do it on the complete downstream_edges relationship,
> which will trigger the full load that I'm trying to avoid.
>
>
> I would try to add new Edge objects without appending them to any list,
> just create a new Edge(src_node=n1, dest_node=n2).
>
>
>
> Another idea I had was to use the lazy="dynamic" attribute on the
> downstream_edges and upstream_edges relationships, so that I could do
> something like:
>
> blue_downstream_edges = node.downstream_edges.filter(Edge.color == 'blue')
>
>
> yes, you should put "dynamic" on these relationships.
>
>
> As I understand it, I would then be able to append items to this subset
> collection. However, those changes would not be seen in the complete
> downstream_edges collection.
>
>
> they would ? they need to be flushed first, but that's how it works, if
> you want to see the "complete collection" then you are loading that
> complete collection.
>
> OTOH if you want to just have the "complete collection" all in memory at
> once, OK, then ...you wouldn't use dynamic? is it kind of like two
> different modes of operation you want to use, one where collections are
> fully in memory and another where they arent ?
>
>
>
> In short, what I'm trying to achieve is a way to provide subsets of a
> large collection, which are modifiable, and those modifications are visible
> in both the larger collection and the subset collection.
>
> I'm guessing this would need to involve some sort of shared access to the
> same in-memory ORM mapper class, because a modification made to one
> representation of the collection that has not yet been flushed/persisted to
> the database would not be visible to another representation of that
> collection, assuming it requires a database query to load.
>
> Is this possible at all?
>
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sqlalchemy/d834ff0d-654b-4b55-906c-0f062c7bb26fn%40googlegroups.com
>
> <https://groups.google.com/d/msgid/sqlalchemy/d834ff0d-654b-4b55-906c-0f062c7bb26fn%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
>
>
--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
http://www.sqlalchemy.org/
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable
Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/sqlalchemy/1820d488-7511-4909-b3b3-c961dc32dba7n%40googlegroups.com.