On Friday, January 12, 2018 at 10:54:49 AM UTC+1, Simon King wrote:
>
> If I understand your code correctly, scheduler.propagate() creates a
> large number of Passage instances, and you only want a small subset of
> them to be added to the database. Is that correct?
>
Correct!

> I would guess that the passages are getting added to the session
> because you are setting their 'satellite' property to point to a
> Satellite which is already in the database. This then causes the
> passages to be added to the session due to the default cascade rules
> on the relationship
> (http://docs.sqlalchemy.org/en/latest/orm/cascades.html).
>
> If that really is the case, you can change the cascade rules for that
> relationship, and then you'll probably need to explicitly add the
> passages you want to *keep* to the session instead.
>
> Hope that helps,
>
> Simon
>

Dear Simon,

thank you, that was indeed the case. Changing the cascade (disabling the
backref cascade) keeps the passages out of the DB at propagation time.
But now, when I manually add the subset of passages after the
optimization, I get:

    InvalidRequestError: Can't attach instance <Satellite at 0x108c4dfd0>;
    another instance with key [...] is already present in this session.

So I suppose that, with the backref cascade disabled, SQLAlchemy no longer
recognizes the already loaded Satellite objects... Should I maybe merge
somewhere? (A sketch of what I have in mind is below, after the quoted
recipe.)

>
> On Fri, Jan 12, 2018 at 2:10 AM, Mike Bayer <mik...@zzzcomputing.com> wrote:
> > I can't give you much detail except to say the unique object recipe is
> > doing an .add() when it finds an identity that isn't taken; if you
> > don't want those persisted then take out the part of the recipe doing
> > the add(). However, you'd need to alter the recipe further: if the
> > program asks for that same identity again, one which you didn't want
> > to flush to the DB, and you'd like to use the same object, you need to
> > pull it from some kind of local dictionary of "pending" objects with
> > those identities, if that makes sense.
> >
> > The second email with the after_attach thing implies you are already
> > adding an object to the Session.
> >
> > Neither of these code examples shows an example of use, where you are
> > doing things that make objects and you'd like them not to be
> > persisted. If you need to create unique objects in memory without
> > persisting, you just need to store them in some dictionary that sets
> > up the in-memory uniqueness you are looking for.
> >
> >
> > On Thu, Jan 11, 2018 at 11:37 AM, Ruben Di Battista
> > <rubendi...@gmail.com> wrote:
> >> Last copy paste went wrong.
> >>
> >> The uniqueness is ensured by:
> >>
> >>     @event.listens_for(orm.session.Session, "after_attach")
> >>     def after_attach(session, instance):
> >>         # When ConstrainedSatellite objects are attached to a Session,
> >>         # figure out whether the requested Constraint is already in the
> >>         # database; if yes, reuse that object, if not, create a new one.
> >>         # This is an adaptation of the UniqueObject pattern suggested
> >>         # by the SQLAlchemy documentation:
> >>         # https://bitbucket.org/zzzeek/sqlalchemy/wiki/UsageRecipes/UniqueObject
> >>         if isinstance(instance, UniqueAssociationProxy):
> >>             instance = instance.ensure_unicity(session)
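A minimal sketch of the merge idea above, reusing the `session` and
`scheduled_passages` names from the script quoted further down. Whether
merge() is the right fix for this schema is only a guess; the point is
simply to reconcile the duplicate Satellite with the instance already in
the identity map instead of attaching a second copy of it:

    # Hedged sketch: re-attach only the passages selected by the optimizer.
    # merge() returns the Session's own copy of each object and, through the
    # default 'merge' cascade, reconciles the related Satellite against the
    # identity map instead of raising "another instance with key ... is
    # already present in this session".
    kept_passages = []
    for p in scheduled_passages:
        kept_passages.append(session.merge(p))  # the returned copy is what
                                                 # gets persisted, not p

    session.commit()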
> >>
> >> On Thursday, January 4, 2018 at 6:05:38 PM UTC+1, Ruben Di Battista wrote:
> >>>
> >>> Hello,
> >>> I'm writing a satellite passage scheduler that has a database
> >>> persistence layer to store the scheduled passages.
> >>>
> >>> The DB schema is organized as follows:
> >>> - A table storing the satellites (using the NORAD number as primary key)
> >>> - A table storing the ground stations where to compute the passages
> >>>   of the satellites
> >>> - A table storing the passages of these satellites, with two foreign
> >>>   keys linking each passage to a ground station and a satellite
> >>> - A table storing all the types of constraints a satellite can have
> >>> - A table storing all the types of weights (used to perform the
> >>>   scheduling) that can be assigned to each satellite
> >>>
> >>> Then I configured some association proxies (with the related `Unique
> >>> Object` pattern) in order to assign the weights and the constraints
> >>> to the satellites as a dictionary:
> >>>
> >>>     sat.constraints['min_elevation'] = 10
> >>>
> >>> The details of the relationships are here below:
> >>>
> >>>     # Relationship definitions
> >>>     orm.mapper(Satellite, satellite, properties={
> >>>         'passages': orm.relationship(Passage,
> >>>                                      backref='satellite',
> >>>                                      order_by=passage.c.aos,
> >>>                                      cascade='all, delete-orphan'),
> >>>
> >>>         '_constraints': orm.relationship(
> >>>             ConstrainedSatellite, backref='satellite',
> >>>             collection_class=orm.collections.attribute_mapped_collection('name'),
> >>>             cascade='all, delete-orphan'),
> >>>
> >>>         '_weights': orm.relationship(
> >>>             WeightedSatellite, backref='satellite',
> >>>             collection_class=orm.collections.attribute_mapped_collection('name'),
> >>>             lazy='joined',
> >>>             cascade='all, delete-orphan'),
> >>>
> >>>         '_tle': satellite.c.tle
> >>>     })
> >>>
> >>>     orm.mapper(Constraint, constraint, properties={
> >>>         'satellites': orm.relationship(ConstrainedSatellite,
> >>>                                        backref='constraint')
> >>>     })
> >>>
> >>>     orm.mapper(Weight, weight, properties={
> >>>         'satellites': orm.relationship(WeightedSatellite, backref='weight')
> >>>     })
> >>>
> >>>     orm.mapper(ConstrainedSatellite, constraint_satellite)
> >>>
> >>>     orm.mapper(WeightedSatellite, weight_satellite)
> >>>
> >>>     orm.mapper(PassageData, passage_data)
> >>>
> >>>     orm.mapper(Passage, passage, properties={
> >>>         'angles': orm.relationship(PassageData, backref='passage',
> >>>                                    order_by=passage_data.c.time,
> >>>                                    cascade='all, delete-orphan')
> >>>     }, confirm_deleted_rows=False)
> >>>
> >>>     orm.mapper(GroundStation, ground_station, properties={
> >>>         'passages': orm.relationship(Passage, backref='ground_station',
> >>>                                      order_by=passage.c.aos,
> >>>                                      cascade='all, delete-orphan')
> >>>     })
> >>>
> >>>     # Association Proxies
> >>>     Satellite.constraints = association_proxy(
> >>>         '_constraints', 'value', creator=constrained_sat_creator
> >>>     )
> >>>
> >>>     Satellite.weights = association_proxy(
> >>>         '_weights', 'value', creator=weighted_sat_creator
> >>>     )
> >>>
> >>>     ConstrainedSatellite.constraint_name = association_proxy('constraint', 'name')
> >>>     WeightedSatellite.weight_name = association_proxy('weight', 'name')
> >>>
> >>> From the tests, everything is working as expected. The problem is that
> >>> I'm getting some performance issues while performing the scheduling
> >>> procedure: this procedure needs to read the `constraints` and `weights`
> >>> associated with each satellite to select the best passages among all
> >>> the ones possible on each ground station.
> >>> While reading the values of the `constraints` and `weights`,
> >>> SQLAlchemy needs to store the Passage objects in the database. So all
> >>> the passages are stored in the database during the propagation of all
> >>> the possible passages, and then I need to manually expunge or delete
> >>> from the DB the passages that haven't been scheduled.
> >>>
> >>> What I would like to achieve is to "shut down" the SQLAlchemy
> >>> persistence after selecting the ground stations and the satellites for
> >>> which to perform the optimization, so that the optimization procedure
> >>> runs only on Python objects, without SQLAlchemy storing them in the
> >>> DB, and then to store efficiently just the optimized, smaller list of
> >>> passages computed by the scheduling algorithm.
> >>>
> >>> Currently the scheduling script looks like this (schematically):
> >>>
> >>>     # Create a DB session
> >>>     session_factory = orm.sessionmaker(db)
> >>>     session = orm.scoped_session(session_factory)
> >>>
> >>>     # Retrieve satellites from the DB
> >>>     sats = session.query(Satellite).all()
> >>>
> >>>     # Retrieve ground stations
> >>>     ground_stations = session.query(GroundStation).all()
> >>>
> >>>     # Init Scheduler instance
> >>>     scheduler = Scheduler(sats, ground_stations, start_day)
> >>>
> >>>     # This method generates all the possible passages of all satellites
> >>>     # on all ground stations. It needs to read `constraints` and
> >>>     # `weights` for each satellite to perform what it needs.
> >>>     # Currently all the passages get stored in the DB.
> >>>     all_passages = scheduler.propagate()
> >>>
> >>>     # This method selects, from the totality of all_passages, a subset
> >>>     # that fulfills all the constraints.
> >>>     scheduled_passages = scheduler.schedule(all_passages, iterations, pool)
> >>>
> >>>     # Remove from the session the passages that were not scheduled.
> >>>     # === This is the part I would like to avoid since it's slow ===
> >>>     for passage in all_passages:
> >>>         if passage not in scheduled_passages:
> >>>             session.expunge(passage)
> >>>             session.delete(passage)
> >>>
> >>>     session.commit()
> >>>     session.close()
> >>>
> >>> As you can see, I need to manually expunge or delete the passages that
> >>> are not scheduled by the algorithm. If the number of passages is huge,
> >>> that means performing a DELETE query for each passage, which is slow.
> >>> What I would like to achieve is to have the `scheduler.propagate` and
> >>> `scheduler.schedule` methods operate only on Python objects (fully
> >>> loaded with all the correct parameters from the DB), avoiding the need
> >>> to store and then delete all the passages in the DB. Can I "shut down"
> >>> the DB persistence temporarily?
> >>>
> >>> Thanks for your help,
> >>> RdB
> >>>
> >>> PS: I'm on MySQL (MariaDB)
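To make the last question above concrete, here is a small self-contained
sketch (not the project code; simplified tables, in-memory SQLite,
SQLAlchemy 1.x classical mappings) of one possible shape for it: with
cascade_backrefs=False on the 'passages' relationship, assigning
`p.satellite = sat` no longer pulls the Passage into the Session, so
propagation stays purely in memory and only the scheduled subset is
explicitly added at the end. Whether this fits the real schema and the
association proxies is an assumption.

    from sqlalchemy import (Column, ForeignKey, Integer, MetaData, Table,
                            create_engine, orm)

    metadata = MetaData()
    satellite = Table('satellite', metadata,
                      Column('norad_no', Integer, primary_key=True))
    passage = Table('passage', metadata,
                    Column('id', Integer, primary_key=True),
                    Column('satellite_id', ForeignKey('satellite.norad_no')),
                    Column('aos', Integer))

    class Satellite(object):
        pass

    class Passage(object):
        pass

    orm.mapper(Satellite, satellite, properties={
        'passages': orm.relationship(
            Passage,
            backref='satellite',
            order_by=passage.c.aos,
            cascade='all, delete-orphan',
            # assignments made through the backref (p.satellite = sat) no
            # longer cascade the Passage into the Session automatically
            cascade_backrefs=False),
    })
    orm.mapper(Passage, passage)

    engine = create_engine('sqlite://')
    metadata.create_all(engine)
    session = orm.Session(bind=engine)

    sat = Satellite()
    sat.norad_no = 25544
    session.add(sat)
    session.commit()

    # "propagation": plain Python objects, nothing is attached to the session
    all_passages = []
    for aos in range(10):
        p = Passage()
        p.aos = aos
        p.satellite = sat                    # does NOT add p to the session
        all_passages.append(p)

    scheduled_passages = all_passages[:3]    # stand-in for the optimizer

    # drop the discarded candidates from the in-memory collection so they
    # never reach the unit of work
    for p in all_passages:
        if p not in scheduled_passages:
            sat.passages.remove(p)

    session.add_all(scheduled_passages)      # one explicit add for the keepers
    session.commit()
    print(session.query(Passage).count())    # -> 3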
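And, along the same lines, a sketch of the purely in-memory uniqueness Mike
suggests further up: a plain dictionary provides the identity map, and
nothing touches a Session until an object is explicitly kept. The
UniqueCache helper and the stand-in Constraint class here are made up for
the illustration; they are not part of SQLAlchemy or of the project code.

    class UniqueCache(object):
        """Hypothetical in-memory replacement for the Session-backed
        unique-object recipe: the dictionary enforces uniqueness,
        persistence happens only for the objects that are kept."""

        def __init__(self):
            self._objects = {}

        def get(self, cls, key, create):
            # Return the single in-memory instance of `cls` for `key`,
            # building it with `create()` on first use (no Session involved).
            try:
                return self._objects[(cls, key)]
            except KeyError:
                obj = create()
                self._objects[(cls, key)] = obj
                return obj

    # toy usage with a stand-in class (the real Constraint is the mapped one)
    class Constraint(object):
        def __init__(self, name):
            self.name = name

    cache = UniqueCache()
    c1 = cache.get(Constraint, 'min_elevation',
                   lambda: Constraint('min_elevation'))
    c2 = cache.get(Constraint, 'min_elevation',
                   lambda: Constraint('min_elevation'))
    assert c1 is c2   # same transient object; nothing has been persisted
    # later, only the objects to keep get session.add()ed or session.merge()d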