Re: [sqlalchemy] Re: Temporarily disable DB persistence for optimization routine
I confirm what I said. The run in multiprocessing was regenerating instances because, after deserialization, they were getting new IDs. I tried to implement a custom __hash__, but it seems that SQLAlchemy does not pick it up.

What I did was disable the backref cascade for the `Satellite` and `GroundStation` objects and then, after the optimization, do:

    for passage in results:
        # I need to merge since, when coming from multiprocessing, the
        # instance IDs change.
        passage.satellite = session.merge(passage.satellite)
        passage.ground_station = session.merge(passage.ground_station)
        session.add(passage)

This seems to be working as expected. Thanks to Mike and Simon for pointing me in the right direction!

-- 
SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.
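The merge step above can be sketched end to end with a minimal, self-contained model. The `Satellite`/`Passage` classes and column names below are illustrative stand-ins, not the poster's actual schema; the pickle round trip stands in for what multiprocessing does when it ships results back to the parent process:

```python
import pickle

from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Satellite(Base):
    __tablename__ = "satellite"
    norad_no = Column(Integer, primary_key=True)


class Passage(Base):
    __tablename__ = "passage"
    id = Column(Integer, primary_key=True)
    satellite_id = Column(Integer, ForeignKey("satellite.norad_no"))
    satellite = relationship(Satellite)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

session = Session(engine, expire_on_commit=False)
session.add(Satellite(norad_no=25544))
session.commit()

sat = session.query(Satellite).one()

# A worker process hands back a pickled *copy* of the satellite:
detached_copy = pickle.loads(pickle.dumps(sat))

passage = Passage()
# merge() reconciles the copy with the instance the session already
# knows about, so no duplicate Satellite row is created:
passage.satellite = session.merge(detached_copy)
session.add(passage)
session.commit()

assert session.query(Satellite).count() == 1
assert session.query(Passage).count() == 1
```

`expire_on_commit=False` keeps the primary-key attribute loaded on the detached copy, which is what `merge()` uses to find the existing row.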
Re: [sqlalchemy] Re: Temporarily disable DB persistence for optimization routine
I was not able to find anything generating new instances. In fact, the culprit was that the `prop.schedule()` method uses multiprocessing. The `Passage` objects that live in different processes get a different `id` (and so do the referenced instances of related objects such as GroundStation or Satellite), so when I "join" the results from all the processes, the Passage objects contain references to objects that have become "new", since they were created in a separate process. Indeed, once I removed the parallel multiprocessing part, everything worked as expected.

How should I cope with multiprocessing (I'm already using a scoped session)? By merging the objects when joining the results from the different processes (I tried that on the fly; so far I'm getting other errors and haven't investigated enough)? Are there best practices?

On Mon, Jan 15, 2018 at 11:53 AM, Simon King wrote:
> Yes, if you can't find where you are creating new Satellite instances,
> I'd probably stick an assert statement in Satellite.__init__ and see
> where it gets triggered.
>
> Simon
>
> On Mon, Jan 15, 2018 at 10:34 AM, Ruben Di Battista wrote:
> > Dear Simon,
> >
> > thanks again for your kind help.
> >
> > Actually the creation of new instances is not intended. But I'm not
> > getting where they are created...
> > [Scheduler / propagate() / _generate_passage code snipped - it is
> > quoted verbatim from Ruben's message, reproduced below in the thread]
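The behaviour Ruben describes is ordinary pickling behaviour: `multiprocessing` serializes objects across process boundaries, and deserialization always builds brand-new Python objects with new `id()`s. A stdlib-only illustration (single process, but the same round trip the worker pipes perform):

```python
import pickle


class GroundStation:
    def __init__(self, name):
        self.name = name


gs = GroundStation("Kourou")

# multiprocessing does exactly this to ship objects between processes:
clone = pickle.loads(pickle.dumps(gs))

assert clone.name == gs.name   # same data...
assert clone is not gs         # ...but a brand-new object
assert id(clone) != id(gs)     # hence a different identity / default hash
```

This is why a custom `__hash__` alone cannot help: SQLAlchemy's identity map tracks ORM identity (class + primary key + owning session state), not Python object hashes.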
Re: [sqlalchemy] Re: Temporarily disable DB persistence for optimization routine
Yes, if you can't find where you are creating new Satellite instances, I'd probably stick an assert statement in Satellite.__init__ and see where it gets triggered.

Simon

On Mon, Jan 15, 2018 at 10:34 AM, Ruben Di Battista wrote:
> Dear Simon,
>
> thanks again for your kind help.
>
> Actually the creation of new instances is not intended. But I'm not getting
> where they are created...
>
> [Scheduler / propagate() / _generate_passage code snipped - it is
> reproduced in Ruben's message below in the thread]
>
> `self` should be a reference to the instance of `Satellite` already loaded
> from the DB. I will try to dive more into the code...
>
> Thanks a lot for the kind help of all of you.
>
> [earlier quoted exchange snipped; see the messages below in the thread]
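Simon's assert-in-`__init__` trick can be made slightly friendlier by printing the call stack instead of raising, so every unexpected construction site gets reported. A sketch with a hypothetical `Satellite` class (the decorator name is illustrative):

```python
import functools
import traceback


def trace_instantiation(cls):
    """Wrap cls.__init__ so every construction reports where it happened."""
    original_init = cls.__init__

    @functools.wraps(original_init)
    def traced_init(self, *args, **kwargs):
        print(f"--- new {cls.__name__} instance created here:")
        traceback.print_stack(limit=5)
        original_init(self, *args, **kwargs)

    cls.__init__ = traced_init
    return cls


@trace_instantiation
class Satellite:
    def __init__(self, norad_no):
        self.norad_no = norad_no


sat = Satellite(25544)   # prints the stack that led to the constructor
assert sat.norad_no == 25544
```

Remove the decorator once the stray construction site has been found; it is a debugging aid, not production code.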
Re: [sqlalchemy] Re: Temporarily disable DB persistence for optimization routine
Dear Simon,

thanks again for your kind help.

Actually the creation of new instances is not intended. But I'm not getting where they are created...

Let me give you more insight. This is the scheduler object with the associated propagate() method:

class Scheduler(six.with_metaclass(abc.ABCMeta)):
    """ This class gets a list of GroundStation objects and a list of
    Satellite objects and computes all the passages of the Satellites
    over the GroundStations

    Args:
        sat_list(list): List of Satellite objects
        gs_list(list): List of GroundStation objects
        start_day(datetime): The datetime object representing the day from
            which to start the propagation
        time_of_propagation(int): Number of hours to propagate [default: 24]
        deltaT(float): Time step to use for angles retrieval, in seconds
    """

    def __init__(self, sat_list, gs_list, start_day, time_of_propagation=24,
                 deltaT=0.05):
        # Monkey patch the Satellite class with the cost_function specific
        # to the scheduler algorithm.
        sat_class = type(sat_list[0])
        sat_class.cost_function = self.cost_function

        self.sat_list = sat_list
        self.gs_list = gs_list
        self.start_day = start_day
        self.time_of_propagation = time_of_propagation
        self.deltaT = deltaT

    def propagate(self):
        """ This method computes all the passages of the Satellites over
        the GroundStations

        Returns:
            all_passages(PassageList): A list, ordered from the earliest
                passage, of all passages

        Raises:
            ModelNotAvailable: When a satellite is too far from Earth and
                the models available in Orbital are not good, a
                ModelNotAvailable is raised
        """
        all_passages = PassageList()

        # Loop over the gs list
        for gs in self.gs_list:
            # Loop over the satellites list
            for sat in self.sat_list:
                # Compute all the passages in the specified period
                passages = sat.get_next_passes(gs, self.start_day,
                                               self.time_of_propagation,
                                               deltaT=self.deltaT)

                # Unfold the list of passages into a flat list
                all_passages = all_passages + passages

        return all_passages

It basically just loops over all the ground stations and satellites and generates all the passages. Maybe the fact that I monkey patch the satellite class induces the creation of a new instance of `Satellite`?

The details of the `get_next_passes` method of the `Satellite` class, for what concerns the `Passage` instance creation (skipping the algorithmic part), are:

def _generate_passage(self, next_pass, ground_station, deltaT):
    """ This method returns a Passage object from the data returned by
    the original Orbital.get_next_passes method.
    """
    aos, los, tca = next_pass

    return Passage(satellite=self,
                   ground_station=ground_station,
                   aos=aos, los=los, tca=tca,
                   deltaT=deltaT)

`self` should be a reference to the instance of `Satellite` already loaded from the DB. I will try to dive more into the code...

Thanks a lot for the kind help of all of you.

On Monday, January 15, 2018 at 10:06:24 AM UTC+1, Simon King wrote:
> [earlier exchange snipped; see the messages below in the thread]
Re: [sqlalchemy] Re: Temporarily disable DB persistence for optimization routine
On Sat, Jan 13, 2018 at 3:31 PM, Ruben Di Battista wrote:
> [earlier exchange snipped; see Ruben's message below in the thread]
>
> But now, when I manually add the subset of passages after the
> optimization, I get a:
>
>     InvalidRequestError: Can't attach instance <...>;
>     another instance with key [...] is already present in this session.
>
> So, I suppose that with the backref cascade disabled, SQLAlchemy is no
> longer able to recognize the already loaded Satellite objects...
>
> Should I maybe merge somewhere?

It sounds like your propagate() function is creating new Satellite instances with the same primary key as instances that have been loaded from the database. Merging is one way to avoid that, and the UniqueObject pattern is another. It's difficult to say which is more appropriate for your usage without seeing more code.

But if propagate() is only supposed to be calculating passages, I don't understand why new Satellites are being created at all - shouldn't all the Satellites already be loaded?

Simon
Re: [sqlalchemy] Re: Temporarily disable DB persistence for optimization routine
On Friday, January 12, 2018 at 10:54:49 AM UTC+1, Simon King wrote:
>
> If I understand your code correctly, scheduler.propagate() creates a
> large number of Passage instances, and you only want a small subset of
> them to be added to the database. Is that correct?

Correct!

> I would guess that the passages are getting added to the session
> because you are setting their 'satellite' property to point to a
> Satellite which is already in the database. This then causes the
> passages to be added to the session due to the default cascade rules
> on the relationship
> (http://docs.sqlalchemy.org/en/latest/orm/cascades.html).
>
> If that really is the case, you can change the cascade rules for that
> relationship, and then you'll probably need to explicitly add the
> passages you want to *keep* to the session instead.
>
> Hope that helps,
>
> Simon

Dear Simon,
thank you. That was the case. Modifying the cascade to disable the backref cascade no longer loads the passages into the DB at propagation time.

But now, when I manually add the subset of passages after the optimization, I get a:

    InvalidRequestError: Can't attach instance <...>;
    another instance with key [...] is already present in this session.

So, I suppose that with the backref cascade disabled, SQLAlchemy is no longer able to recognize the already loaded Satellite objects...

Should I maybe merge somewhere?

[Mike's reply and the original question snipped; both are reproduced below in the thread]
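For reference, "changing the cascade rules" amounts to removing `save-update` from the relationship's cascade, after which related Passage objects are no longer swept into the flush and must be `add()`ed explicitly. A minimal sketch with an illustrative model (not the thread's schema); note SQLAlchemy may emit a warning about the non-cascaded object at the first flush:

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Satellite(Base):
    __tablename__ = "satellite"
    norad_no = Column(Integer, primary_key=True)
    # Only 'merge' cascade: assigning passage.satellite no longer drags
    # the passage into the flush via save-update.
    passages = relationship("Passage", backref="satellite", cascade="merge")


class Passage(Base):
    __tablename__ = "passage"
    id = Column(Integer, primary_key=True)
    satellite_id = Column(Integer, ForeignKey("satellite.norad_no"))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

sat = Satellite(norad_no=25544)
session.add(sat)
session.commit()

discarded = Passage()
discarded.satellite = sat          # NOT persisted on commit
session.commit()
assert session.query(Passage).count() == 0

kept = Passage()
kept.satellite = sat
session.add(kept)                  # explicit opt-in for the ones to keep
session.commit()
assert session.query(Passage).count() == 1
```

The trade-off Ruben then hits: with the cascade loosened, nothing reconciles a detached Satellite copy automatically, which is where `session.merge()` comes back in.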
Re: [sqlalchemy] Re: Temporarily disable DB persistence for optimization routine
If I understand your code correctly, scheduler.propagate() creates a large number of Passage instances, and you only want a small subset of them to be added to the database. Is that correct?

I would guess that the passages are getting added to the session because you are setting their 'satellite' property to point to a Satellite which is already in the database. This then causes the passages to be added to the session due to the default cascade rules on the relationship (http://docs.sqlalchemy.org/en/latest/orm/cascades.html).

If that really is the case, you can change the cascade rules for that relationship, and then you'll probably need to explicitly add the passages you want to *keep* to the session instead.

Hope that helps,

Simon

On Fri, Jan 12, 2018 at 2:10 AM, Mike Bayer wrote:
> [Mike's reply and the quoted original question snipped; both are
> reproduced below in the thread]
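Simon's diagnosis can be reproduced in a few lines: with the default relationship cascade (`save-update, merge`), merely pointing a new object at a persistent parent is enough to get it flushed, with no explicit `add()`. (Minimal illustrative model, not the thread's schema.)

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Satellite(Base):
    __tablename__ = "satellite"
    norad_no = Column(Integer, primary_key=True)
    passages = relationship("Passage", backref="satellite")  # default cascade


class Passage(Base):
    __tablename__ = "passage"
    id = Column(Integer, primary_key=True)
    satellite_id = Column(Integer, ForeignKey("satellite.norad_no"))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

sat = Satellite(norad_no=25544)
session.add(sat)
session.commit()

passage = Passage()
passage.satellite = sat   # never session.add(passage)...
session.commit()

# ...yet the save-update cascade persisted it anyway:
assert session.query(Passage).count() == 1
```

This is exactly why all of Ruben's propagated passages were landing in the database during the optimization run.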
Re: [sqlalchemy] Re: Temporarily disable DB persistence for optimization routine
I can't give you much detail except to say that the unique object recipe is doing an .add() when it finds an identity that isn't taken; if you don't want those persisted then take out the part of the recipe doing the add(). However, you'd need to alter the recipe further such that if the program asks for that same identity again - one which you didn't want to flush to the DB - and you'd like to use the same object, you need to pull it from some kind of local dictionary of "pending" objects with those identities, if that makes sense.

The second email with the after_attach thing implies you are already adding an object to the Session.

Neither of these code examples shows an example of use, where you are doing things that make objects and you'd like them not to be persisted. If you need to create unique objects in memory without persisting, you just need to store them in some dictionary that sets up the in-memory uniqueness you are looking for.

On Thu, Jan 11, 2018 at 11:37 AM, Ruben Di Battista wrote:
> [the after_attach listener and the original question snipped; both are
> reproduced below in the thread]
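Mike's "local dictionary of pending objects" can be as small as this (class and method names below are illustrative, not from the thread):

```python
class PendingUnique:
    """In-memory get-or-create: at most one object per key, no Session involved."""

    def __init__(self, factory):
        self._factory = factory
        self._pending = {}

    def get(self, key):
        # Construct lazily so a lookup never builds throwaway objects.
        if key not in self._pending:
            self._pending[key] = self._factory(key)
        return self._pending[key]


class Constraint:
    def __init__(self, name):
        self.name = name


registry = PendingUnique(Constraint)

a = registry.get("min_elevation")
b = registry.get("min_elevation")
assert a is b                          # uniqueness in memory, nothing persisted
assert registry.get("max_slew") is not a
```

Only the objects you ultimately decide to keep would then be handed to `session.add()` (or `session.merge()`), keeping the optimization run entirely database-free.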
[sqlalchemy] Re: Temporarily disable DB persistence for optimization routine
Last copy paste went wrong. The uniqueness is ensured by:

    @event.listens_for(orm.session.Session, "after_attach")
    def after_attach(session, instance):
        # When ConstrainedSatellite objects are attached to a Session,
        # check whether the requested Constraint already exists in the
        # database: if yes, reuse that object; if not, create a new one.
        # This is an adaptation of the UniqueObject pattern suggested by
        # the SQLAlchemy documentation:
        # https://bitbucket.org/zzzeek/sqlalchemy/wiki/UsageRecipes/UniqueObject
        if isinstance(instance, UniqueAssociationProxy):
            instance = instance.ensure_unicity(session)

On Thursday, January 4, 2018 at 6:05:38 PM UTC+1, Ruben Di Battista wrote:
>
> Hello,
> I'm writing a satellite passage scheduler that has a database persistence
> layer to store the scheduled passages.
>
> The DB schema is organized as follows:
> - A table storing the satellites (using the NORAD No as primary key)
> - A table storing the ground stations where to compute the passages of
>   the satellites
> - A table storing the passages of these satellites, with two foreign keys
>   linking each passage to a ground station and a satellite
> - A table storing all the types of constraints a satellite can have
> - A table storing all the types of weights (used to perform the
>   scheduling) that can be assigned to each satellite
>
> Then I configured some association proxies (with the related UniqueObject
> pattern) in order to assign the weights and the constraints to the
> satellites as a dictionary:
>
>     sat.constraints['min_elevation'] = 10
>
> The details of the relationships are here below:
>
>     # Relationship definitions
>     orm.mapper(Satellite, satellite, properties={
>         'passages': orm.relationship(Passage,
>                                      backref='satellite',
>                                      order_by=passage.c.aos,
>                                      cascade='all, delete-orphan'),
>
>         '_constraints': orm.relationship(
>             ConstrainedSatellite, backref='satellite',
>             collection_class=orm.collections.attribute_mapped_collection('name'),
>             cascade='all, delete-orphan'),
>
>         '_weights': orm.relationship(
>             WeightedSatellite, backref='satellite',
>             collection_class=orm.collections.attribute_mapped_collection('name'),
>             lazy='joined',
>             cascade='all, delete-orphan'),
>
>         '_tle': satellite.c.tle
>     })
>
>     orm.mapper(Constraint, constraint, properties={
>         'satellites': orm.relationship(ConstrainedSatellite,
>                                        backref='constraint')
>     })
>
>     orm.mapper(Weight, weight, properties={
>         'satellites': orm.relationship(WeightedSatellite, backref='weight')
>     })
>
>     orm.mapper(ConstrainedSatellite, constraint_satellite)
>     orm.mapper(WeightedSatellite, weight_satellite)
>     orm.mapper(PassageData, passage_data)
>
>     orm.mapper(Passage, passage, properties={
>         'angles': orm.relationship(PassageData, backref='passage',
>                                    order_by=passage_data.c.time,
>                                    cascade='all, delete-orphan')
>         },
>         confirm_deleted_rows=False
>     )
>
>     orm.mapper(GroundStation, ground_station, properties={
>         'passages': orm.relationship(Passage, backref='ground_station',
>                                      order_by=passage.c.aos,
>                                      cascade='all, delete-orphan')
>     })
>
>     # Association Proxies
>     Satellite.constraints = association_proxy(
>         '_constraints', 'value', creator=constrained_sat_creator
>     )
>
>     Satellite.weights = association_proxy(
>         '_weights', 'value', creator=weighted_sat_creator
>     )
>
>     ConstrainedSatellite.constraint_name = association_proxy('constraint',
>                                                              'name')
>     WeightedSatellite.weight_name = association_proxy('weight', 'name')
>
> From the tests everything is working as expected. The problem is that I'm
> getting some performance issues while performing the scheduling procedure:
> this procedure needs to read the `constraints` and `weights` associated to
> each satellite in order to select the best passages among all the possible
> ones on each ground station. While reading the values of the `constraints`
> and `weights`, SQLAlchemy needs to store the Passage objects in the
> database. So all the passages are stored in the database during the
> propagation of all the possible passages, and then I need to manually
> expunge or delete the passages that haven't been scheduled from the DB.
>
> What I would like to achieve is to "shut down" the SQLAlchemy persistence
> after selecting the ground stations and the satellites for which to
> perform the optimization, so that the optimization procedure runs on plain
> Python objects only, without SQLAlchemy storing them in the DB, and then
> to efficiently store just the optimized, smaller list of passages computed
> by the scheduling algorithm.
>
> Currently the scheduling script is like this (schematically):
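The "shut down persistence" idea described above can be sketched without the ORM at all: generate the candidate passages as plain Python objects that are never attached to a Session, run the optimizer over them, and only persist the winners. Below is a minimal stdlib-only sketch of that two-phase approach; the `CandidatePassage` and `schedule` names, the toy scoring rule, and the use of `sqlite3` as a stand-in for the real persistence layer are all illustrative assumptions, not the poster's actual API:

```python
import sqlite3


# Plain candidate object: never touches a Session during optimization.
class CandidatePassage:
    def __init__(self, satellite, aos, elevation):
        self.satellite = satellite
        self.aos = aos
        self.elevation = elevation


def schedule(candidates, min_elevation, max_passages):
    """Toy optimizer: filter by a constraint, keep the best few."""
    eligible = [c for c in candidates if c.elevation >= min_elevation]
    eligible.sort(key=lambda c: c.elevation, reverse=True)
    return eligible[:max_passages]


# Phase 1: propagate all possible passages purely in memory (no DB involved).
candidates = [
    CandidatePassage('NOAA-19', '2018-01-15T10:00', 12.0),
    CandidatePassage('NOAA-19', '2018-01-15T12:00', 45.0),
    CandidatePassage('METOP-B', '2018-01-15T11:00', 5.0),
]

# Phase 2: persist only the optimized, smaller list.
selected = schedule(candidates, min_elevation=10.0, max_passages=1)

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE passage (satellite TEXT, aos TEXT, elevation REAL)')
conn.executemany(
    'INSERT INTO passage VALUES (?, ?, ?)',
    [(c.satellite, c.aos, c.elevation) for c in selected],
)
stored = conn.execute('SELECT COUNT(*) FROM passage').fetchone()[0]
print(stored)  # only the scheduled passages reach the database
```

In the real application, phase 2 would add (or merge) only the selected `Passage` objects into the Session, so passages that were never scheduled never have to be expunged or deleted from the DB.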
[sqlalchemy] Re: Temporarily disable DB persistence for optimization routine
Dear Mike,

thank you for the fast response, as usual. Your comment made me think. Actually I was not adding things to the session directly. I revised my code and I believe the behaviour I'm describing is related to my application of the UniqueObject pattern described in the documentation. What I'm doing is, having this mixin:

    """
    https://bitbucket.org/zzzeek/sqlalchemy/wiki/UsageRecipes/UniqueObject
    """
    import abc


    def _unique(session, instance, hash_key, query_func):
        cache = getattr(session, '_unique_cache', None)
        if cache is None:
            session._unique_cache = cache = {}

        key = (type(instance), hash_key())
        if key in cache:
            return cache[key]
        else:
            with session.no_autoflush:
                q = query_func(session)
                obj = q.first()
                if not obj:
                    obj = instance
                    if isinstance(instance, UniqueAssociationProxy):
                        session.add(obj)
            cache[key] = obj
            return obj


    class UniqueAssociationProxy(object):
        @abc.abstractmethod
        def unique_hash(self, *arg, **kw):
            raise NotImplementedError()

        @abc.abstractmethod
        def unique_filter(self, query, *arg, **kw):
            raise NotImplementedError()

        @abc.abstractmethod
        def ensure_unicity(self, session):
            raise NotImplementedError()

applied to the many-to-many mappings:

    class ConstrainedSatellite(UniqueAssociationProxy):
        """
        This class is meant to be mapped with SQLAlchemy in the
        association object.
        """

        def __init__(self, constraint_name, value):
            # Temporarily assign the name to a plain string attribute
            # in order to ensure uniqueness
            # https://goo.gl/LbJ7wf
            self._constraint_name = constraint_name
            self.value = value

        @property
        def name(self):
            if self.constraint is not None:
                return self.constraint.name
            else:
                return self._constraint_name

        def unique_hash(self):
            return self._constraint_name

        def unique_filter(self, session):
            return session.query(Constraint).filter_by(
                name=self._constraint_name)

        def ensure_unicity(self, session):
            instance = Constraint(name=self._constraint_name)
            self.constraint = _unique(session, instance, self.unique_hash,
                                      self.unique_filter)
            return self


    class WeightedSatellite(UniqueAssociationProxy):
        """
        This class is meant to be mapped with SQLAlchemy in the
        association proxy with the Weights.
        """

        def __init__(self, weight_name, value):
            # Temporarily assign the name to a plain string attribute
            # in order to ensure uniqueness
            # https://goo.gl/LbJ7wf
            self._weight_name = weight_name
            self.value = value

        @property
        def name(self):
            if self.weight is not None:
                return self.weight.name
            else:
                return self._weight_name

        def unique_hash(self):
            return self._weight_name

        def unique_filter(self, session):
            return session.query(Weight).filter_by(name=self._weight_name)

        def ensure_unicity(self, session):
            instance = Weight(name=self._weight_name)
            self.weight = _unique(session, instance, self.unique_hash,
                                  self.unique_filter)
            return self

(In the previous message the mappings are reported.) Then the uniqueness is ensured by:

    """
    https://bitbucket.org/zzzeek/sqlalchemy/wiki/UsageRecipes/UniqueObject
    """
    import abc


    def _unique(session, instance, hash_key, query_func):
        cache = getattr(session, '_unique_cache', None)
        if cache is None:
            session._unique_cache = cache = {}

        key = (type(instance), hash_key())
        if key in cache:
            return cache[key]
        else:
            with session.no_autoflush:
                q = query_func(session)
                obj = q.first()
                if not obj:
                    obj = instance
                    if isinstance(instance, UniqueAssociationProxy):
                        session.add(obj)
            cache[key] = obj
            return obj


    class UniqueAssociationProxy(object):
        @abc.abstractmethod
        def unique_hash(self, *arg, **kw):
            raise NotImplementedError()

        @abc.abstractmethod
        def unique_filter(self, query, *arg, **kw):
            raise NotImplementedError()

        @abc.abstractmethod
        def ensure_unicity(self, session):
            raise NotImplementedError()

In this way I'm making SQLAlchemy store everything at instance init time, if I understand correctly. Could you please help me understand how to improve the situation? Thanks in advance.
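Stripped of the ORM details, the `_unique` helper above is a get-or-create cache keyed on `(type, key)`. Here is a small stdlib-only model of that caching behaviour; the `get_unique` name and the toy `Constraint` class are made up for illustration, and the real version additionally falls back to a database query on a cache miss before creating a new instance:

```python
class Constraint:
    def __init__(self, name):
        self.name = name


def get_unique(cache, cls, key, factory):
    """Return the cached instance for (cls, key), creating it at most once.

    This mirrors what _unique does with session._unique_cache, minus the
    session.no_autoflush database lookup that runs on a cache miss.
    """
    cache_key = (cls, key)
    if cache_key not in cache:
        cache[cache_key] = factory()
    return cache[cache_key]


cache = {}
a = get_unique(cache, Constraint, 'min_elevation',
               lambda: Constraint('min_elevation'))
b = get_unique(cache, Constraint, 'min_elevation',
               lambda: Constraint('min_elevation'))
print(a is b)  # the second lookup returns the very same instance
```

The point of the pattern is exactly this identity guarantee: two lookups with the same key yield one shared instance, which is what prevents duplicate `Constraint`/`Weight` rows from being created.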
On Thursday, January 4, 2018 at 6:05:38 PM UTC+1, Ruben Di Battista wrote:
>
> Hello,
> I'm writing a satellite passage scheduler that has a database persistence
> layer to store the scheduled passages.
>
> The DB schema is organized as follows:
> - A table storing the satellites (using the NORAD No as primary key)
> - A table storing the