[sqlalchemy] why query_chooser rather than shard_chooser in this case?

2010-01-11 Thread diana
Hello again,

I'm getting errors in a certain case which lead me to suspect that I'm
missing some big picture sharding concept, so to better understand
sharding I'm playing with the SQLAlchemy sharding unit tests
(sqlalchemy/test/orm/sharding/test_shard.py).

Here's one of the investigative tests I've added in order to better
understand query_chooser:

    def test_read(self):
        session = create_session()
        query = session.query(WeatherLocation)

        print "get tokyo:"
        # query_chooser returns: ['asia']
        tokyo = query.filter_by(city='Tokyo').filter_by(continent='Asia').first()

        print "access tokyo:"
        # query_chooser returns: ['north_america', 'asia', 'europe', 'south_america']
        assert tokyo.city == 'Tokyo'

My question: if we already have an instance of tokyo from the 'get
tokyo' code snippet, why is a new query_cls being instantiated to
refresh the tokyo object on access (thus having to traverse all 4
shards), rather than using shard_chooser and the instance we already
have to compute the shard based on its continent value? Is there some
way I can optimize this case, perhaps by setting the shard_id
somewhere, so that 4 queries aren't executed? Hope that was clear
enough.

Thanks again for your time,

--diana
-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalch...@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.




Re: [sqlalchemy] why query_chooser rather than shard_chooser in this case?

2010-01-11 Thread Michael Bayer

You just got a new tokyo from the DB, and I assume no inherited tables are
in effect. The session is brand new, so no SQL should be emitted when
accessing tokyo.city, which I am assuming is a textual field. The key
'city' should be present in tokyo.__dict__, and no Session should be
accessed. Nothing I can see from the above code indicates that a second
SQL statement should be emitted.

Of course, the details of the mapping might say something totally different
(e.g. deferred(), joined table inheritance, etc.).
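
To make the __dict__ point concrete, here is a minimal toy sketch in plain Python. It is not SQLAlchemy's implementation; LazyColumn, ToyLocation and _load are hypothetical names invented for illustration. It only shows the general idea that an attribute already present in the instance's __dict__ is returned without any load, while a missing (deferred) one triggers a load on first access.

```python
# Toy model only: this is NOT how SQLAlchemy is implemented.
class LazyColumn:
    """Non-data descriptor: consulted only when the value is absent from __dict__."""

    def __init__(self, name):
        self.name = name

    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        # Stand-in for "emit a second SELECT to fetch this column".
        obj.__dict__[self.name] = obj._load(self.name)
        return obj.__dict__[self.name]


class ToyLocation:
    city = LazyColumn("city")
    continent = LazyColumn("continent")

    def __init__(self, row, eager):
        self._row = row      # pretend this dict is the database row
        self.loads = []      # column names that needed a lazy load
        for name in eager:   # columns populated by the initial query
            self.__dict__[name] = row[name]

    def _load(self, name):
        self.loads.append(name)
        return self._row[name]


row = {"city": "Tokyo", "continent": "Asia"}

# Plain mapping: 'city' is already in __dict__, so access costs nothing.
plain = ToyLocation(row, eager=["city", "continent"])
assert "city" in plain.__dict__
assert plain.city == "Tokyo" and plain.loads == []

# 'city' deferred: first access triggers exactly one extra load.
deferred_city = ToyLocation(row, eager=["continent"])
assert "city" not in deferred_city.__dict__
assert deferred_city.city == "Tokyo" and deferred_city.loads == ["city"]
```

In the sharded case, that extra load is the point at which a shard (or all shards) has to be chosen again.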







Re: [sqlalchemy] why query_chooser rather than shard_chooser in this case?

2010-01-11 Thread Diana Clarke
Ah, deferred (new to me), thanks!

In sqlalchemy/test/orm/sharding/test_shard.py:

    mapper(WeatherLocation, weather_locations, properties={
        'reports': relation(Report, backref='location'),
        'city': deferred(weather_locations.c.city),
    })

When I comment out the deferred property, it behaves as I would
expect (one query_chooser call).

Ok, that answers Question #1.

Question #2 is similar, but with session.add(). I'll send a new email
for Question #2.

Thanks,

--diana





Re: [sqlalchemy] why query_chooser rather than shard_chooser in this case?

2010-01-11 Thread Diana Clarke
Again, this investigative test is loosely based on SQLAlchemy's
sharding test: sqlalchemy/test/orm/sharding/test_shard.py

    def test_update(self):
        print "\n"
        session = create_session()
        query = session.query(WeatherLocation)

        # query_chooser returns: ['asia']
        print "get tokyo:"
        tokyo = query.filter_by(city='Tokyo').filter_by(continent='Asia').first()

        # no new SQL
        print "access tokyo:"
        assert tokyo.city == 'Tokyo'

        # no new SQL
        print "change tokyo:"
        tokyo.city = 'Tokyo_city_name_changed'

        # uses shard_chooser by instance
        print "save tokyo:"
        session.add(tokyo)
        session.commit()

        # query_chooser returns: ['north_america', 'asia', 'europe', 'south_america']
        print "access tokyo 2:"
        assert tokyo.city == 'Tokyo_city_name_changed'

My question #2: if we already have an instance of tokyo from the 'save
tokyo' code snippet, why is a new query_cls being instantiated to
refresh the tokyo object in 'access tokyo 2' (thus having to traverse
all 4 shards), rather than using shard_chooser and the instance we
already have to compute the shard based on its continent value? Is
there some way I can optimize this case, perhaps by setting the
shard_id somewhere, so that 4 queries aren't executed?

Thanks,

--diana





Re: [sqlalchemy] why query_chooser rather than shard_chooser in this case?

2010-01-11 Thread Michael Bayer

Well, there are two things here, one left over from the previous question.
The first is that commit() expires all attributes in the session, which is
why new SQL is emitted. Check the docs for the rationale there.
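
As a rough illustration of expire-on-commit, here is a toy sketch in plain Python. ToySession and its methods are hypothetical stand-ins, not SQLAlchemy's Session; the sketch only models the idea that commit() discards in-memory attribute state, so the next attribute access has to go back to the database.

```python
class ToySession:
    """Toy stand-in for a session; NOT SQLAlchemy's Session."""

    def __init__(self, db):
        self.db = db        # {primary_key: row dict}, pretend database
        self.loaded = {}    # {primary_key: attributes currently in memory}
        self.selects = 0    # how many reloads were needed

    def get_attr(self, pk, name):
        if name not in self.loaded.setdefault(pk, {}):
            self.selects += 1                    # attribute is missing: reload the row
            self.loaded[pk] = dict(self.db[pk])
        return self.loaded[pk][name]

    def set_attr(self, pk, name, value):
        self.loaded.setdefault(pk, {})[name] = value

    def commit(self):
        for pk, attrs in self.loaded.items():
            self.db[pk].update(attrs)            # flush pending changes
        # Expire everything: the in-memory state is thrown away.
        self.loaded = {pk: {} for pk in self.loaded}


session = ToySession({1: {"city": "Tokyo", "continent": "Asia"}})

assert session.get_attr(1, "city") == "Tokyo"    # initial load
assert session.get_attr(1, "city") == "Tokyo"    # served from memory
assert session.selects == 1

session.set_attr(1, "city", "Tokyo_city_name_changed")
session.commit()

# After commit the attribute is expired, so this access reloads the row.
assert session.get_attr(1, "city") == "Tokyo_city_name_changed"
assert session.selects == 2
```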

But also, the loading of deferred attributes (as earlier) and of expired
attributes (as here) does have the primary key, so it's a bug that
query_chooser is being run here. The internal function doing that load
calls query._get(), whereas ShardedQuery is being simple and only
overrides get(). You might want to change ShardedQuery to override _get()
instead (which pushes me further towards pulling the trigger on moving
shard.py out to examples altogether for 0.6, since it really is not
supportable as a core element, just FYI).
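
The get() versus _get() issue can be sketched with a toy class hierarchy in plain Python. None of this is SQLAlchemy code: ToyQuery, refresh() and shard_for() are hypothetical, and the identity is assumed to carry its continent purely for illustration. The sketch shows why overriding only the public method leaves internal callers on the unsharded path.

```python
class ToyQuery:
    """Toy base class; names echo the thread but this is not SQLAlchemy code."""

    shards = ["north_america", "asia", "europe", "south_america"]

    def __init__(self):
        self.visited = []   # shards touched so far

    def _get(self, ident):
        # Internal primary-key load: knows nothing about shards, so it fans out.
        self.visited.extend(self.shards)
        return ident

    def get(self, ident):
        return self._get(ident)

    def refresh(self, ident):
        # Internal callers (deferred/expired loads) use _get(), not get().
        return self._get(ident)


def shard_for(ident):
    # Hypothetical id chooser: the toy identity is (pk, shard_id).
    return ident[1]


class OverridesGet(ToyQuery):
    def get(self, ident):
        self.visited.append(shard_for(ident))
        return ident


class OverridesUnderscoreGet(ToyQuery):
    def _get(self, ident):
        self.visited.append(shard_for(ident))
        return ident


ident = (1, "asia")

a = OverridesGet()
a.get(ident)        # sharded path: one shard
a.refresh(ident)    # bypasses the override: all four shards
assert a.visited == ["asia", "north_america", "asia", "europe", "south_america"]

b = OverridesUnderscoreGet()
b.get(ident)
b.refresh(ident)    # both paths now go through the override
assert b.visited == ["asia", "asia"]
```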

It's also possibly worth getting your query_chooser to the point where it
can recognize what is effectively a get() based on the filtering criteria.
You can do this by imitating the approach of the FindContinent chooser in
the example at examples/sharding/attribute_shard.py.
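
A very rough sketch of that idea in plain Python follows. It does not use SQLAlchemy or reproduce FindContinent; it assumes the filter criteria have already been reduced to a dict of equality tests, and toy_query_chooser, SHARD_FOR_CONTINENT and known_ids are hypothetical names invented for illustration.

```python
SHARDS = ["north_america", "asia", "europe", "south_america"]

# Hypothetical mapping from a continent value to a shard id.
SHARD_FOR_CONTINENT = {
    "North America": "north_america",
    "Asia": "asia",
    "Europe": "europe",
    "South America": "south_america",
}


def toy_query_chooser(criteria, known_ids=None):
    """Return the shard ids to search, given a dict of equality filters.

    known_ids is a hypothetical {primary_key: shard_id} registry that an
    application could fill in as objects are loaded.
    """
    known_ids = known_ids or {}

    # A filter on the primary key is effectively a get().
    pk = criteria.get("id")
    if pk in known_ids:
        return [known_ids[pk]]

    # A filter on continent pins the query to a single shard.
    continent = criteria.get("continent")
    if continent in SHARD_FOR_CONTINENT:
        return [SHARD_FOR_CONTINENT[continent]]

    # No usable hint: fall back to searching every shard.
    return list(SHARDS)


assert toy_query_chooser({"city": "Tokyo", "continent": "Asia"}) == ["asia"]
assert toy_query_chooser({"id": 7}, known_ids={7: "asia"}) == ["asia"]
assert toy_query_chooser({"city": "Tokyo"}) == SHARDS
```

With real queries the criteria would have to be extracted from the query's expression tree rather than handed over as a dict.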






Re: [sqlalchemy] why query_chooser rather than shard_chooser in this case?

2010-01-11 Thread Diana Clarke
Thanks, Michael.

This will take me a bit to digest, and I'm about to start the second
shift as wife and mother... tomorrow maybe.

Thanks again for the quick responses -- greatly exceeding expectations!

Cheers,

--diana
