[sqlalchemy] why query_chooser rather than shard_chooser in this case?
Hello again,

I'm getting errors in a certain case, which leads me to suspect that I'm missing some big-picture sharding concept. To better understand sharding, I'm playing with the SQLAlchemy sharding unit tests (sqlalchemy/test/orm/sharding/test_shard.py).

Here's one of the investigative tests I've added in order to better understand query_chooser:

    def test_read(self):
        session = create_session()
        query = session.query(WeatherLocation)

        print "get tokyo:"
        # query_chooser returns: ['asia']
        tokyo = query.filter_by(city='Tokyo').filter_by(continent='Asia').first()

        print "access tokyo:"
        # query_chooser returns: ['north_america', 'asia', 'europe', 'south_america']
        assert tokyo.city == "Tokyo"

My question: if we already have an instance of tokyo from the 'get tokyo' code snippet, why is a new query_cls being instantiated to refresh the tokyo object on access (thus having to traverse all 4 shards), rather than using shard_chooser and the instance we already got to compute the shard based on its continent value?

Is there some way I can optimize this case, perhaps by setting the shard_id somewhere, so that 4 queries aren't executed?

Hope that was clear enough. Thanks again for your time,

--diana

--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To post to this group, send email to sqlalch...@googlegroups.com.
To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
Re: [sqlalchemy] why query_chooser rather than shard_chooser in this case?
diana wrote:
> def test_read(self):
>     session = create_session()
>     query = session.query(WeatherLocation)
>
>     print "get tokyo:"
>     # query_chooser returns: ['asia']
>     tokyo = query.filter_by(city='Tokyo').filter_by(continent='Asia').first()
>
>     print "access tokyo:"
>     # query_chooser returns: ['north_america', 'asia', 'europe', 'south_america']
>     assert tokyo.city == "Tokyo"
>
> My question: if we already have an instance of tokyo from the 'get tokyo'
> code snippet, why is a new query_cls being instantiated to refresh the tokyo
> object on access (thus having to traverse all 4 shards), rather than using
> shard_chooser and the instance we already got to compute the shard based on
> its continent value?

You just got a new tokyo from the DB, and I assume no inherited tables are in effect. The session is brand new, so no SQL should be emitted when accessing tokyo.city, which I am assuming is a textual field. The key "city" should be present in tokyo.__dict__, and no Session should be accessed.

Nothing I can see in the above code indicates that a second SQL statement should be emitted. Of course, the details of the mapping might say something totally different (e.g. deferred(), joined table inheritance, etc.).
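The point about tokyo.__dict__ can be illustrated with a toy model. The sketch below is plain Python 3, not SQLAlchemy code, and every name in it (ToyInstance, fake_select, _extra_loads) is hypothetical. It only shows the general mechanism: an attribute that is already in the instance __dict__ is returned directly, while a missing (deferred) one triggers a loader exactly once.

```python
class ToyInstance(object):
    """Toy stand-in for a mapped object; this is not SQLAlchemy code."""

    def __init__(self, loaded, load_column):
        # Columns fetched by the original SELECT live in the instance __dict__.
        self.__dict__.update(loaded)
        self.__dict__["_load_column"] = load_column
        self.__dict__["_extra_loads"] = 0

    def __getattr__(self, name):
        # Python only calls __getattr__ when normal lookup fails, that is,
        # when `name` is not already present in self.__dict__.
        if name.startswith("_"):
            raise AttributeError(name)
        self.__dict__["_extra_loads"] += 1   # count one more simulated SELECT
        value = self._load_column(name)
        self.__dict__[name] = value          # cache it for later accesses
        return value


def fake_select(column):
    # Hypothetical loader standing in for a round trip to the database.
    return {"city": "Tokyo"}[column]


# "city" plays the role of a deferred column: it is absent from the first row.
tokyo = ToyInstance({"id": 1, "continent": "Asia"}, fake_select)
```

Reading tokyo.continent touches only __dict__, whereas the first read of tokyo.city goes through the loader. In the sharded setup from the thread, that extra load is the step that ends up consulting query_chooser.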
Re: [sqlalchemy] why query_chooser rather than shard_chooser in this case?
Ah, deferred (new to me), thanks!

In sqlalchemy/test/orm/sharding/test_shard.py:

    mapper(WeatherLocation, weather_locations, properties={
        'reports': relation(Report, backref='location'),
        'city': deferred(weather_locations.c.city),
    })

When I comment out the deferred property, it behaves as I would expect (one query_chooser call).

OK, that answers Question #1. Question #2 is similar, but with session.add(). I'll send a new email for Question #2.

Thanks,

--diana

On Mon, Jan 11, 2010 at 3:25 PM, Michael Bayer mike...@zzzcomputing.com wrote:
> Of course, the details of the mapping might say something totally
> different (e.g. deferred(), joined table inheritance, etc.).
Re: [sqlalchemy] why query_chooser rather than shard_chooser in this case?
Again, this investigative test is loosely based on SQLAlchemy's sharding test (sqlalchemy/test/orm/sharding/test_shard.py):

    def test_update(self):
        print "\n"
        session = create_session()
        query = session.query(WeatherLocation)

        # query_chooser returns: ['asia']
        print "get tokyo:"
        tokyo = query.filter_by(city='Tokyo').filter_by(continent='Asia').first()

        # no new SQL
        print "access tokyo:"
        assert tokyo.city == "Tokyo"

        # no new SQL
        print "change tokyo:"
        tokyo.city = "Tokyo_city_name_changed"

        # uses shard_chooser by instance
        print "save tokyo:"
        session.add(tokyo)
        session.commit()

        # query_chooser returns: ['north_america', 'asia', 'europe', 'south_america']
        print "access tokyo 2:"
        assert tokyo.city == "Tokyo_city_name_changed"

My question #2: if we already have an instance of tokyo from the 'save tokyo' code snippet, why is a new query_cls being instantiated to refresh the tokyo object in 'access tokyo 2' (thus having to traverse all 4 shards), rather than using shard_chooser and the instance we already got to compute the shard based on its continent value?

Is there some way I can optimize this case, perhaps by setting the shard_id somewhere, so that 4 queries aren't executed?

Thanks,

--diana

On Mon, Jan 11, 2010 at 3:38 PM, Diana Clarke diana.joan.cla...@gmail.com wrote:
> Question #2 is similar, but with session.add(). I'll send a new email for
> Question #2.
Re: [sqlalchemy] why query_chooser rather than shard_chooser in this case?
Diana Clarke wrote:
> My question #2: if we already have an instance of tokyo from the 'save
> tokyo' code snippet, why is a new query_cls being instantiated to refresh
> the tokyo object in 'access tokyo 2' (thus having to traverse all 4
> shards), rather than using shard_chooser and the instance we already got
> to compute the shard based on its continent value?
>
> Is there some way I can optimize this case, perhaps by setting the
> shard_id somewhere, so that 4 queries aren't executed?

Well, there are two things, one left over from the previous question.

One is that commit() expires all attributes in the session. That is why new SQL is emitted. Check the docs for the rationale there.

But also, the loading of deferred attributes (as earlier) and of expired attributes (as here) does have the primary key available, so it's a bug that query_chooser is being run here. The internal function doing that load calls query._get(), whereas ShardedQuery is being simple and only overrides get(). You might want to change ShardedQuery to override _get() instead (which leads me further towards pulling the trigger on moving shard.py out to examples altogether for 0.6, since it really is not supportable as a core element, just FYI).
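The get() versus _get() remark describes a general override pitfall, which the following toy model tries to make concrete. It is plain Python 3 and only borrows the method names from the thread; the class bodies, the refresh() method, SHARDS, PK_TO_SHARD and id_chooser are all assumptions made for illustration and do not reproduce the real ShardedQuery.

```python
SHARDS = {
    "asia": {1: "Tokyo"},
    "europe": {2: "London"},
    "north_america": {3: "New York"},
    "south_america": {4: "Sao Paulo"},
}

# Hypothetical: pretend the application can map a primary key to its shard.
PK_TO_SHARD = {1: "asia", 2: "europe", 3: "north_america", 4: "south_america"}


def id_chooser(ident):
    return [PK_TO_SHARD[ident]]


class Query(object):
    """Toy base query: the public get() delegates to the internal _get()."""

    def __init__(self):
        self.visited = []  # shard ids touched, recorded for inspection

    def get(self, ident):
        return self._get(ident)

    def _get(self, ident):
        # With no shard hint, every shard is queried.
        found = None
        for shard_id in sorted(SHARDS):
            self.visited.append(shard_id)
            found = SHARDS[shard_id].get(ident, found)
        return found

    def refresh(self, ident):
        # Stand-in for loading expired or deferred attributes. It calls
        # _get() directly, so a subclass overriding only get() is bypassed.
        return self._get(ident)


class GetOnlyShardedQuery(Query):
    """Overrides get() only, as the thread describes."""

    def _lookup(self, ident):
        for shard_id in id_chooser(ident):
            self.visited.append(shard_id)
            row = SHARDS[shard_id].get(ident)
            if row is not None:
                return row
        return None

    def get(self, ident):
        return self._lookup(ident)


class UnderscoreGetShardedQuery(GetOnlyShardedQuery):
    """Also overrides _get(), so internal loads use the id-based routing."""

    def _get(self, ident):
        return self._lookup(ident)


naive = GetOnlyShardedQuery()
naive_row = naive.refresh(1)

fixed = UnderscoreGetShardedQuery()
fixed_row = fixed.refresh(1)
```

Both queries find the same row, but naive.visited lists all four shards while fixed.visited contains only 'asia'. Whether overriding _get() is the right fix in a given SQLAlchemy version depends on that version's internals, which this sketch does not model.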
It's also possibly worth getting your query_chooser to the point where it can recognize what is effectively a get() based on the filtering criterion. You can do this by imitating the approach in the example FindContinent chooser in examples/sharding/attribute_shard.py.
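As a rough picture of what such a chooser does, here is a simplified sketch in plain Python 3. It does not traverse SQLAlchemy expression objects the way the FindContinent example does; instead it assumes the filter criteria have already been reduced to (column, operator, value) triples. SHARD_LOOKUP, PK_TO_SHARD and that triple format are hypothetical.

```python
ALL_SHARDS = ["north_america", "asia", "europe", "south_america"]

# Hypothetical continent name -> shard id table.
SHARD_LOOKUP = {
    "North America": "north_america",
    "Asia": "asia",
    "Europe": "europe",
    "South America": "south_america",
}

# Hypothetical primary key -> shard id table, for keys the app has seen.
PK_TO_SHARD = {1: "asia"}


def query_chooser(criteria):
    """Return the shard ids to search for a list of (column, op, value)."""
    ids = []
    for column, op, value in criteria:
        if column == "continent" and op == "==":
            ids.append(SHARD_LOOKUP[value])
        elif column == "continent" and op == "in":
            ids.extend(SHARD_LOOKUP[v] for v in value)
        elif column == "id" and op == "==" and value in PK_TO_SHARD:
            # Effectively a get(): the primary key pins down one shard.
            ids.append(PK_TO_SHARD[value])
    # No usable criterion was found, so fall back to every shard.
    return ids or list(ALL_SHARDS)
```

With this shape, a filter on continent or on a known primary key narrows the search to one shard, and anything else falls back to all four, which matches the two query_chooser results reported earlier in the thread.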
Re: [sqlalchemy] why query_chooser rather than shard_chooser in this case?
Thanks, Michael.

This will take me a bit to digest, and I'm about to start the second shift as wife and mother... tomorrow maybe.

Thanks again for the quick responses -- greatly exceeding expectations!

Cheers,

--diana

On Mon, Jan 11, 2010 at 5:14 PM, Michael Bayer mike...@zzzcomputing.com wrote:
> Well, there are two things, one left over from the previous question.
>
> One is that commit() expires all attributes in the session. That is why
> new SQL is emitted. Check the docs for the rationale there.
>
> But also, the loading of deferred attributes (as earlier) and of expired
> attributes (as here) does have the primary key available, so it's a bug
> that query_chooser is being run here. The internal function doing that
> load calls query._get(), whereas ShardedQuery is being simple and only
> overrides get(). You might want to change ShardedQuery to override _get()
> instead (which leads me further towards pulling the trigger on moving
> shard.py out to examples altogether for 0.6, since it really is not
> supportable as a core element, just FYI).
>
> It's also possibly worth getting your query_chooser to the point where it
> can recognize what is effectively a get() based on the filtering
> criterion. You can do this by imitating the approach in the example
> FindContinent chooser in examples/sharding/attribute_shard.py.