Re: [sqlalchemy] ORM events order

2013-12-06 Thread Tim Kersten
Thank you very much

On Thursday, December 5, 2013 5:20:57 PM UTC, Michael Bayer wrote:


 On Dec 5, 2013, at 12:14 PM, Tim Kersten t...@io41.com wrote: 

  thank you. 
  
  What of the relative ordering of the different ORM event types? i.e. 
  
  before_flush 
  before_delete 
  after_flush 
  etc 
  
  When looking at before_flush I see that before_delete has not yet been 
 fired, yet it has been fired by the time after_flush runs. Is this guaranteed 
 to always be the case? 

 yes, before_flush and after_flush provide boundaries around the mechanics 
 of the flush itself.  before_delete, as well as the other mapper-level 
 events like before_update, before_insert, after_update, etc., are all within 
 the flush mechanics. 

 you can’t necessarily rely upon the ordering of insert/update/delete 
 events within the flush however, relative to different objects and 
 especially across different kinds of objects.  The mapper-level flush 
 events are fired right as individual batches of objects are being prepared 
 for INSERT/UPDATE/DELETE statements. 
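 (A minimal sketch, not from the original thread, illustrating the ordering 
 described above; it assumes an in-memory SQLite database and the 0.8/0.9-era 
 declarative API.) 

 from sqlalchemy import Column, Integer, create_engine, event 
 from sqlalchemy.ext.declarative import declarative_base 
 from sqlalchemy.orm import Session 

 Base = declarative_base() 

 class Thing(Base): 
     __tablename__ = 'thing' 
     id = Column(Integer, primary_key=True) 

 fired = [] 

 @event.listens_for(Session, 'before_flush') 
 def before_flush(session, flush_context, instances): 
     fired.append('before_flush') 

 @event.listens_for(Thing, 'before_delete') 
 def before_delete(mapper, connection, target): 
     fired.append('before_delete') 

 @event.listens_for(Session, 'after_flush') 
 def after_flush(session, flush_context): 
     fired.append('after_flush') 

 e = create_engine('sqlite://') 
 Base.metadata.create_all(e) 
 session = Session(e) 
 thing = Thing() 
 session.add(thing) 
 session.commit() 
 session.delete(thing) 
 session.commit() 

 # the mapper-level delete event fired inside the flush boundaries 
 assert fired[-3:] == ['before_flush', 'before_delete', 'after_flush'] 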




  
  
  On 5 Dec 2013, at 16:01, Michael Bayer mik...@zzzcomputing.com wrote: 
  
  
  On Dec 5, 2013, at 10:51 AM, Tim Kersten t...@io41.com wrote: 
  
  Hi Folks, 
  
  Is the order that ORM events ( 
 http://docs.sqlalchemy.org/en/rel_0_9/orm/events.html ) are fired in 
 deterministic and guaranteed to be the same every time? I've searched the 
 docs and Google but couldn't find anything relating to their relative order. 
  
  
  the events are ordered.   when you call event.listen() it appends the 
 event listener to a list of listeners.   the events are fired from the 
 beginning of the list on forward.   there’s actually an undocumented 
 argument to event.listen(), “insert=True”, that will cause the listener to be 
 inserted at position zero rather than appended. 
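 (A short sketch, not from the thread, of the registration behavior just 
 described; the listener names are illustrative.) 

 from sqlalchemy import event 
 from sqlalchemy.orm import Session 

 def runs_second(session, flush_context, instances): 
     pass 

 def runs_first(session, flush_context, instances): 
     pass 

 event.listen(Session, 'before_flush', runs_second)  # appended to the list 
 event.listen(Session, 'before_flush', runs_first, insert=True)  # position zero 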
  
  the reason the order of events is not really mentioned much is because 
 there’s complex cases where the order of listener application has not been 
 evaluated or tested.  When you make use of mapper or instrumentation events 
 against un-mapped base classes and such, the actual append() operation 
 doesn’t occur until later, when classes are actually mapped, and this works 
 by shuttling the event listener functions around to those classes.  In 
 these cases we don’t as yet have guarantees in place as to the order of the 
 listeners being first applied, e.g. if you had a class that is a product of 
 two mixins, and each mixin has listeners applied to it, that sort of thing. 
  
  however, the order of listeners once applied should definitely be the 
 same each time assuming no changes to the listener collections. 
  
  
  


Re: [sqlalchemy] postgres schema per model

2013-12-06 Thread Richard Gerd Kuesters
well ... i think i was right when i said that this implementation 
doesn't work, and i don't know exactly why.


using abstract, declared_attr or __table_args__ via inheritance is not 
creating my models in another schema.


if i put it explicitly in the model, it works fine. inherited? nope, no 
way. one question that can be raised is: am i already using 
__table_args__ in those inherited ones? yes, i am, but not in all of 
them, and even so all tables are created in the public schema.


i'm using sa 0.8.3, psycopg2 and postgres 9.3.
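
(A hedged guess, not confirmed in the thread: in declarative, a subclass 
that defines its own __table_args__ shadows the mixin's plain attribute, so 
the schema entry is silently lost. Merging the two through a declared_attr 
is one workaround; __extra_table_args__ below is an illustrative convention, 
not an sa name.)

from sqlalchemy import Column, Integer, UniqueConstraint
from sqlalchemy.ext.declarative import declarative_base, declared_attr

Base = declarative_base()

class AltSchema(object):
    __extra_table_args__ = ()

    @declared_attr
    def __table_args__(cls):
        # positional constraints first, the schema dict last
        return tuple(cls.__extra_table_args__) + ({'schema': 'test_schema'},)

class A(AltSchema, Base):
    __tablename__ = 'a'
    id = Column(Integer, primary_key=True)
    code = Column(Integer)
    # contributed instead of defining __table_args__ directly
    __extra_table_args__ = (UniqueConstraint('code'),)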



On 12/05/2013 01:37 PM, Michael Bayer wrote:

here’s an example, works on this end:

from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class AltSchema(object):
    __table_args__ = {'schema': 'test_schema'}

class A(AltSchema, Base):
    __tablename__ = 'a'

    id = Column(Integer, primary_key=True)
    bs = relationship("B")

class B(AltSchema, Base):
    __tablename__ = 'b'

    id = Column(Integer, primary_key=True)
    a_id = Column(Integer, ForeignKey('test_schema.a.id'))

e = create_engine("postgresql://scott:tiger@localhost/test", echo=True)
Base.metadata.create_all(e)

echo=True will tell all here...



On Dec 5, 2013, at 8:20 AM, Richard Gerd Kuesters 
rich...@humantech.com.br wrote:



hi all!

another question: i have a postgres database, and i would like to 
work with schemas for module models. so far so good, but i didn't 
find much information in the sa docs besides the fact that it's available.


so, i came into this: 
http://stackoverflow.com/questions/9298296/sqlalchemy-support-of-postgres-schemas


but, i would like to declare it at the model level, not the metadata 
level. is it possible?


i tried using *__table_args__ = {'schema': 'foo'}* in an *__abstract__* 
base, with and without the *declared_attr* decorator, and also without 
*__abstract__*; neither worked, and all tables were created in the public 
schema.


any tips? :)

i'm asking this because i have a LOT of tables, and declaring 
*__table_args__* in all of them just for the schema seems kinda 
weird, since we can mix in almost everything in sa.



thanks in advance!
richard.







[sqlalchemy] Parallel connection attempts in QueuePool

2013-12-06 Thread Sunil Adapa
Hello,

I have faced a problem on my production server (gevent based): when a 
connection attempt is made and the MySQL server does not respond (due to the 
listen backlog being full), the whole application hangs. This seems to be 
because SQLAlchemy's QueuePool does not allow multiple connection attempts 
simultaneously; it waits on the overflow count lock. I suggest that we 
allow multiple connection attempts at the same time, as I don't see any side 
effects of doing so. Details follow.

Details of the problem:

python-sqlalchemy 0.7.4-1
python 2.7.3-0ubuntu2
python-gevent 1.0~b1-1
mysql-server  5.5.34-0ubuntu0.12.0

I opened a gevent backdoor connection to the hung server and created a test 
method and ran it.

def test():
  import pdb
  pdb.set_trace()
  import sqlalchemy
  import mysql.connector
  p = sqlalchemy.pool.manage(mysql.connector, pool_size=128)
  p.connect(host='myhost', port=3306, user='myuser', password='mypassword', 
buffered=True)

It ran with following trace (excerpt):

(Pdb) step
--Call--
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(931)connect()
-> def connect(self, *args, **kw):

[...]

(Pdb) 
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(210)connect()
-> return _ConnectionFairy(self).checkout()

[...]

(Pdb) 
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(371)__init__()
-> rec = self._connection_record = pool._do_get()

[...]

(Pdb) 
--Call--
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(671)_do_get()
-> def _do_get(self):

[...]

(Pdb) 
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(676)_do_get()
-> except sqla_queue.Empty:
(Pdb) 
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(677)_do_get()
-> if self._max_overflow > -1 and \
(Pdb) 
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(678)_do_get()
->         self._overflow >= self._max_overflow:
(Pdb) 
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(687)_do_get()
-> if self._overflow_lock is not None:
(Pdb) 
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(688)_do_get()
-> self._overflow_lock.acquire()
(Pdb) 
> /usr/lib/pymodules/python2.7/gevent/coros.py(98)acquire()
-> def acquire(self, blocking=True, timeout=None):

Analysis:

Before making a connection attempt the overflow counter lock is obtained, 
and it is released only after the connection either succeeds or 
fails. In my case, a connection attempt remained hung, possibly because of a 
surge in new DB connections that overflowed the SYN backlog on the database 
server (I have since added a timeout and tuned my database server to have a 
much higher backlog). While this connection didn't respond, any new 
connection attempt, as seen in the above trace, waited trying to acquire the 
overflow lock. The whole application became incapable of serving requests. 
The cause is this code:

class QueuePool(Pool):
    def _do_get(self):

        [...]

        if self._overflow_lock is not None:
            self._overflow_lock.acquire()

        if self._max_overflow > -1 and \
                self._overflow >= self._max_overflow:
            if self._overflow_lock is not None:
                self._overflow_lock.release()
            return self._do_get()

        try:
            con = self._create_connection()
            self._overflow += 1
        finally:
            if self._overflow_lock is not None:
                self._overflow_lock.release()
        return con

Changeset 5f0a7bb cleaned up this code but does not seem to have changed 
the flow (behaviour should be the same on trunk). Since disabling the 
overflow with max_overflow = -1 does not use the lock at all, this behaviour 
is possibly an oversight rather than intended.

Possible solution:

Since the overflow lock seems to be there only to maintain the overflow 
count, I suggest that we increment the counter *before* the connection 
attempt, not hold the lock during the connection attempt, and then decrement 
the counter in case of an error (sketched below). If there is interest in 
doing this, I shall find time for a patch and possibly a test case.
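
(A minimal sketch of that proposed flow; the structure is hypothetical, and 
the actual patch appears later in this thread.)

def _do_get(self):
    with self._overflow_lock:              # short critical section
        if self._overflow >= self._max_overflow:
            return self._do_get()          # full: wait/retry as before
        self._overflow += 1                # reserve a slot optimistically
    try:
        return self._create_connection()   # no lock held while connecting
    except Exception:
        with self._overflow_lock:
            self._overflow -= 1            # give the slot back on failure
        raise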

Thank you,

-- 
Sunil Mohan Adapa



Re: [sqlalchemy] fake models ?

2013-12-06 Thread Michael Bayer
I can only do this by making an example for you and for that I need the actual 
bits I’d be working with.  The examples at 
http://docs.sqlalchemy.org/en/rel_0_9/orm/examples.html#module-examples.vertical
 already show how to create simple “queryable” properties if you only need to 
see the basic idea.




On Dec 6, 2013, at 6:06 AM, Richard Gerd Kuesters rich...@humantech.com.br 
wrote:

 well ... i was trying to be a bit lazy describing my models, because i'm 
 interested in querying python classes that were not mapped to any one table, 
 but to a lot of tables. 
 
 my model works, it doesn't have all the flaws pointed out here. i just thought 
 it would be clearer to see. but no problem. let's just forget about 
 schematics and go to a simple class where i can save to the database any 
 attribute that doesn't start with an underscore -- a pretty common pattern 
 among pythonistas :) 
 
 so, i have two tables that define a structure (entity and attribute), and 
 some other tables that contain their values: 
 instance, to tell me what entity i'm using; 
 types and values, to tell me what i have persisted, from what instance it is, 
 the type of the attribute and, from there, any validation i may find 
 necessary. 
 
 now we can start again? :)
 
 what i would like to do is get this persisted pseudo-model and add mappings 
 so i can query on using the sqlalchemy orm api, even better if i would also 
 be able to insert, update, delete ...
 
 
 my best regards,
 richard.
 
 
 
 On 12/05/2013 07:44 PM, Michael Bayer wrote:
 
 Overall the different pieces aren’t making sense entirely.
 
 We have the notion of a “schema”, stored in the database - that is, 
 Entity/Attribute.  Those tables are fixed per type.  It can tell me for 
 example that there’s a “Person” type with two attributes associated, “name” 
 and “website”.
 
 So what happens when I do this:
 
  class Person(Model): 
      name = StringType(required=True) 
      website = URLType() 
 
 is there a metaclass that’s writing to the database at that point the 
 Entity/Attribute rows for that type?  It’s not clear when I say 
 Person.website, am I just going by the fact that we have Person.website in 
 the Python model, and if so what am I getting with the Attribute or even the 
 Value table?  There could just be a table called “url_values” and I join 
 from Instance to that.   The schema seems to be stated twice here in two 
 very different ways.
 
  Also not clear what the purpose of Instance.valid_attributes is; this seems 
  redundant vs. Entity already referring to Attribute. 
 
 
 
 
 On Dec 5, 2013, at 11:48 AM, Richard Gerd Kuesters 
 rich...@humantech.com.br wrote:
 
  the vertical mapper example was kind of a base for me to develop my eav 
  system. i have also added a table that defines the structure of the entity; 
  since then i use schematics (and its json schema import/export utility). 
  it's very handy. 
 
  i think that pseudo-code can explain it a little better, in sa terms 
  (i'll just write it simplified): 
 
 
  class Entity(Base): 
      id = Column(int, pk) 
      name = Column(str, unique) 
  
  
  class Attribute(Base): 
      id = Column(int, pk) 
      name = Column(str) 
      discriminator = Column(enum)  # bool, string, integer, blob, etc 
      entity_id = Column(fk(Entity.id)) 
  
      entity = relationship(Entity) 
  
      tbl_args = uniqueconstraint(name, entity_id) 
 
 
  basically, with those classes i can define what my eav objects look like (of 
  course, they're much more complete, but for now this gives the idea). so, 
  when i have this schematics model: 
 
 
  class Person(Model): 
      name = StringType(required=True) 
      website = URLType() 
 
 
 it will be interpreted as:
 
 
  person_entity = Entity() 
  person_entity.name = 'person' 
  
  session.add(person_entity) 
  session.commit() 
  
  name_attr = Attribute() 
  name_attr.name = 'name' 
  name_attr.discriminator = TypeDiscriminator.STRING  # i used here a 
  # slightly modified code from decl_enum, a post you wrote in your blog 
  name_attr.entity = person_entity 
  
  session.add(name_attr) 
  session.commit() 
  
  website_attr = Attribute() 
  ... 
 
 
 i think this can start to illustrate better what the Person model really 
 is. now, let's go to the values (in poor code again, lol):
 
 
  class Instance(Base): 
      id = Column(int, pk) 
      entity_id = Column(fk(Entity.id)) 
  
      entity = relationship(Entity) 
      valid_attributes = relationship(Attribute, 
          primaryjoin=entity_id == Attribute.entity_id) 
  
  
  class Value(Base): 
      id = Column(int, pk) 
      attribute_id = Column(fk(Attribute.id)) 
      discriminator = Column(enum) 
  
      __mapper_args__ = dict(polymorphic_on=discriminator) 
  
  
  class StringValue(Value): 
      id = Column(fk(Value.id)) 
      value = Column(string) 
  
      __mapper_args__ = dict(polymorphic_identity=TypeDiscriminator.STRING) 
  
  
  class BoolValue(Value): 
      ... 
 
 
  then, with a dozen ifs and elses, i can translate the values given to a 
  Person instance from schematics directly to the database. 
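 
  (A hedged sketch, my own and not Richard's code, of that translation step: 
  a dispatch table from discriminator to Value subclass in place of the ifs 
  and elses; the names reference the pseudo-code above.) 
 
  VALUE_CLASS_BY_DISCRIMINATOR = { 
      TypeDiscriminator.STRING: StringValue, 
      TypeDiscriminator.BOOL: BoolValue, 
      # ... one entry per discriminator 
  } 
 
  def persist_attribute(session, attribute, raw_value): 
      # pick the polymorphic subclass that matches the attribute's type 
      value_cls = VALUE_CLASS_BY_DISCRIMINATOR[attribute.discriminator] 
      value = value_cls(attribute_id=attribute.id, value=raw_value) 
      session.add(value) 
      return value 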
 

[sqlalchemy] Re: Parallel connection attempts in QueuePool

2013-12-06 Thread Jonathan Vanasco


On Friday, December 6, 2013 10:16:21 AM UTC-5, Sunil Adapa wrote:

 Before making a connection attempt the overflow counter lock is obtained 
 and it is being released only after the connection either succeeds or 
 fails. In my case, a connection remained hung possibly because of a surge 
 in new DB connections and SYN backlog overflew on the database server (I 
 have since added a timeout and tuned my database server to have much higher 
 backlog). While this connection didn't respond, any new connection attempt 
 as seen in the above trace waited trying to acquire overflow lock. The 
 whole application became in capable of serving requests. Cause is this code:


It sounds to me like you have an anti-pattern somewhere.  Your entire 
application shouldn't hang because of a database connectivity issue. 
 Specific requests to code that is trying to interact with the database 
should hang -- but the application itself shouldn't.  SqlAlchemy shouldn't 
even connect to the database on a request that doesn't have DB logic in it.

The immediate concern I see with your fix -- at least in your situation -- 
is that it could create a further surge and backlog on the MySQL server. 
 I've worked with a few large online properties where some sort of surge 
like this (where a bunch of app servers kept reconnecting to mysql in a 
short time) ended up creating a self-inflicted denial of service that took 
the sites offline, and in most of those instances the DB got corrupted and 
had to be repaired.

This could be a good fix, Mike is the arbiter of all that stuff -- but this 
seems to me like it would facilitate a different problem.



[sqlalchemy] Re: Many2many, referential integrity and introspection

2013-12-06 Thread Jonathan Vanasco
can you share your existing schema for these relations ?





Re: [sqlalchemy] Parallel connection attempts in QueuePool

2013-12-06 Thread Michael Bayer

On Dec 6, 2013, at 10:16 AM, Sunil Adapa su...@innopark.in wrote:

 Hello,
 
 I have faced a problem in my production server (gevent based); when a 
 connection attempt is made and MySQL server does not respond (due to listen 
 backlog full), the whole application hangs. This seems to be because 
 SQLAlchemy QueuePool does not allow multiple connection attempts 
 simultaneously. It is waiting for overflow count lock. I suggest that we 
 allow multiple connection attempts at the same time as I don't see any side 
 effects of doing so. Details follow.

OK, I see this is with gevent - while I like the idea of gevent, I’m not deeply 
familiar with best practices for it.  The QueuePool specifically uses 
thread-based locks to achieve its work.  I can’t comment on what modifications 
might be needed for it to work with gevent’s model, but overall I’d 
suggest an entirely different pool implementation optimized for gevent. 
When I spent some time trying out gevent I noticed that QueuePool might have 
been having problems, and this is not surprising.

For starters, I’d probably use NullPool with a gevent-based application, if 
there are in fact gevent-specific issues occurring.
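
(A one-liner sketch of that suggestion; the URL is illustrative.)

from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

# NullPool opens and closes a real DBAPI connection per checkout, so no
# pool locks are held across connection attempts
e = create_engine("mysql://scott:tiger@localhost/test", poolclass=NullPool)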

 
 Analysis:
 
 Before making a connection attempt the overflow counter lock is obtained, and 
 it is released only after the connection either succeeds or fails. In 
 my case, a connection attempt remained hung, possibly because of a surge in 
 new DB connections that overflowed the SYN backlog on the database server (I 
 have since added a timeout and tuned my database server to have a much higher 
 backlog). While this connection didn't respond, any new connection attempt, 
 as seen in the above trace, waited trying to acquire the overflow lock. The 
 whole application became incapable of serving requests. The cause is this code:
 
 class QueuePool(Pool): 
     def _do_get(self): 
 
         [...] 
 
         if self._overflow_lock is not None: 
             self._overflow_lock.acquire() 
 
         if self._max_overflow > -1 and \ 
                 self._overflow >= self._max_overflow: 
             if self._overflow_lock is not None: 
                 self._overflow_lock.release() 
             return self._do_get() 
 
         try: 
             con = self._create_connection() 
             self._overflow += 1 
         finally: 
             if self._overflow_lock is not None: 
                 self._overflow_lock.release() 
         return con 

 
 Changeset 5f0a7bb cleaned up this code but does not seem to have changed the 
 flow (behaviour should be the same on trunk). Since disabling the overflow 
 with max_overflow = -1 does not use the lock at all, this behaviour is 
 possibly an oversight rather than intended.

Noting that I haven’t deeply gotten into this code at the moment, overall I’m 
confused about “the application became incapable of serving requests” - if the 
QueuePool serves out as many connections as it’s supposed to, it’s supposed to 
block all callers at that point.    If you set max_overflow to -1, then there 
is no overflow_lock present at all; it’s set to None in the constructor.  
Otherwise, blocking on the call is what it’s supposed to do, in a traditionally 
threaded application.   If when using gevent this means that other workers are 
blocked because the whole thing expects any kind of waiting to be handled 
“async style”, then that suggests we need a totally different approach for 
gevent.

 Since the overflow lock seems to be to only maintain overflow count, I 
 suggest that we increment the counter *before* connection attempt, don't hold 
 the lock during connection attempt and then decrement the counter in case of 
 an error. If there is interest in doing this, I shall find time for a patch 
 and possibly a test case.

How would that work with a traditionally threaded application?   My program 
goes to get a connection, the QueuePool says there’s none available yet and I 
should wait, then the call returns with…what? if it isn’t waiting.   I 
apologize that I have only a fuzzy view of how things work with gevent, and at 
this time of the morning I’m probably not engaging the traditional threading 
model in my head so well either.






[sqlalchemy] Functions sometimes return decimal or float

2013-12-06 Thread Robert Buchholz
Hello,

I'm working with a database that stores mostly integer values or floats
with little (relevant) precision, so I am using all numeric types
with asdecimal=True.
Unfortunately, our database calculations return Decimal objects when run
on MySQL, though they return int/float on SQLite. While I can deal with
either type, the inconsistency is causing some pain -- we only notice
the lack of return value conversion / DB CAST rather late in the
development process.
Generally, I would like the ORM to ensure consistent result types for a
query like:
  DBSession.query(func.sum(Table.int_column))

We often handle this as:
  DBSession.query(cast(func.sum(Table.int_column), Integer))

Is there something we can do to generally avoid this type of bug?
E.g., always convert all DB results to float, issue a warning when an
explicit cast is missing and some DBs may return decimal, ...?

How do you deal with this situation?


Cheers,
Robert



[sqlalchemy] Re: Functions sometimes return decimal or float

2013-12-06 Thread Jonathan Vanasco

what about using custom compiler functions 
( http://docs.sqlalchemy.org/en/rel_0_9/core/compiler.html#utc-timestamp-function 
)?

you could make a custom function...

sum_integer()

...and just call that instead of the default sqlalchemy option.
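
(A sketch of that idea, following the documented compiler-extension pattern; 
the name sum_integer and the CAST rendering are illustrative, not an existing 
sqlalchemy API.)

from sqlalchemy import Integer
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.sql.expression import FunctionElement

class sum_integer(FunctionElement):
    type = Integer()
    name = 'sum_integer'

@compiles(sum_integer)
def compile_sum_integer(element, compiler, **kw):
    # render CAST(SUM(...) AS INTEGER) so every backend returns an integer
    return "CAST(SUM(%s) AS INTEGER)" % compiler.process(element.clauses)

# usage: session.query(sum_integer(Table.int_column))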



Re: [sqlalchemy] Re: Functions sometimes return decimal or float

2013-12-06 Thread Robert Buchholz
Hey Jonathan,

On Fr, 2013-12-06 at 10:12 -0800, Jonathan Vanasco wrote:
 what about using custom compiler functions 

that is definitely a good idea to cast the computation result more
elegantly. Do you also know a way to solve the issue of doing it wrong
by accident and not noticing?
Besides SUM, there are still AVG and other mathematical functions that
you can use without being aware of the impact on other DB engines.

My goal is to make it harder to make a mistake unknowingly.


Cheers,

Robert



Re: [sqlalchemy] Parallel connection attempts in QueuePool

2013-12-06 Thread Michael Bayer

On Dec 6, 2013, at 11:15 AM, Michael Bayer mike...@zzzcomputing.com wrote:

 
 Since the overflow lock seems to be to only maintain overflow count, I 
 suggest that we increment the counter *before* connection attempt, don't 
 hold the lock during connection attempt and then decrement the counter in 
 case of an error. If there is interest in doing this, I shall find time for 
 a patch and possibly a test case.

starting again, I think the part I missed is that you’re saying the creation of 
a new connection is the part that hung, not the waiting for existing 
connections to be available.   So suppose _create_connection() is hanging, but 
there’s plenty of overflow available - other threads should still be able to go 
in and all access _create_connection().

the proposal is simple enough:

diff --git a/lib/sqlalchemy/pool.py b/lib/sqlalchemy/pool.py
index 34681ef..219a98f 100644
--- a/lib/sqlalchemy/pool.py
+++ b/lib/sqlalchemy/pool.py
@@ -812,11 +812,14 @@ class QueuePool(Pool):
                         self._overflow >= self._max_overflow:
                     return self._do_get()
                 else:
-                    con = self._create_connection()
                     self._overflow += 1
-                    return con
         finally:
             self._overflow_lock.release()
+        try:
+            return self._create_connection()
+        except:
+            self._overflow -= 1
+            raise
 
     def recreate(self):
         self.logger.info("Pool recreating")

the hard part is producing a test case.   I noticed just now that even if I 
take overflow_lock out entirely, all the current tests pass, but this is 
because it’s not easy for tests to catch race conditions like that.   To test 
the new change, it should be simpler: inject a mock connection that will hang 
on one attempt and pass on another, then ensure that the second attempt 
successfully connects within the overflow range before the “hanging” one does 
(or errors out).

Also, can you confirm the MySQL behavior here is such that only arbitrary 
connection attempts are hanging?   That is, a subsequent connection attempt 
succeeds while the previous one continues to hang - otherwise I’m not sure how 
this patch improves the situation.







Re: [sqlalchemy] Re: Functions sometimes return decimal or float

2013-12-06 Thread Jonathan Vanasco
I just learned all about the Events model.  You could potentially write a 
listener that can filter the data for you.  I'm not very familiar with the 
events though, and it could be a bit difficult to do -- some items would 
be functions, others column operations, and others 'text' constructs 
-- you might end up needing to apply a filter to every 'number-like' value, 
and that could cause bad side effects.

are there any methods in the databases that would set up per-session 
options?

mysql has session-level options, sqlite has pragma and some other stuff. 
 maybe you could tell the db to format stuff on its end?
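
(A hedged sketch of that session-option idea; the URL and the particular 
variable are illustrative only, and whether a given backend has an option 
that actually solves Robert's problem is an open question in the thread.)

from sqlalchemy import create_engine, event

engine = create_engine("mysql://scott:tiger@localhost/test")

@event.listens_for(engine, "connect")
def set_session_options(dbapi_connection, connection_record):
    # runs once per new DBAPI connection as the pool creates it
    cursor = dbapi_connection.cursor()
    # div_precision_increment is a real MySQL session variable controlling
    # the precision of division results
    cursor.execute("SET SESSION div_precision_increment = 4")
    cursor.close()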




Re: [sqlalchemy] Parallel connection attempts in QueuePool

2013-12-06 Thread Michael Bayer
yeah, sorry I get this entirely now, ticket is 
http://www.sqlalchemy.org/trac/ticket/2880 and here’s a test, has nothing to do 
with gevent.  the patch should be fine and I’ll see if i can get it into 0.8.4 
as well.

from sqlalchemy.pool import QueuePool
from sqlalchemy.testing.mock import Mock, call
import threading
import time

dbapi = Mock()

def hanging_dbapi():
    time.sleep(5)
    return dbapi.connect()

def fast_dbapi():
    return dbapi.connect()

def failing_dbapi():
    time.sleep(5)
    raise Exception("connection failed")

creator = threading.local()

def create():
    return creator.mock_connector()

def run_test(name, should_hang, should_fail):
    print("run test: %s %s %s" % (name, should_hang, should_fail))
    if should_fail:
        creator.mock_connector = failing_dbapi
    elif should_hang:
        creator.mock_connector = hanging_dbapi
    else:
        creator.mock_connector = fast_dbapi

    conn = pool.connect()
    print("connected: %s" % name)
    conn.operation(name)
    time.sleep(3)
    conn.close()

pool = QueuePool(creator=create, pool_size=2, max_overflow=3)

success_one = threading.Thread(target=run_test,
                        args=("success_one", False, False))
success_two = threading.Thread(target=run_test,
                        args=("success_two", False, False))
overflow_one = threading.Thread(target=run_test,
                        args=("overflow_one", True, False))
overflow_two = threading.Thread(target=run_test,
                        args=("overflow_two", False, False))
overflow_three = threading.Thread(target=run_test,
                        args=("overflow_three", False, False))

success_one.start()
time.sleep(.5)

success_two.start()
time.sleep(.5)

overflow_one.start()
time.sleep(.5)

overflow_two.start()
time.sleep(.5)

overflow_three.start()
time.sleep(.5)

overflow_one.join(timeout=10)
assert \
    dbapi.connect().operation.mock_calls == \
    [call("success_one"), call("success_two"),
        call("overflow_two"), call("overflow_three"), call("overflow_one")], \
    dbapi.connect().operation.mock_calls










Re: [sqlalchemy] Functions sometimes return decimal or float

2013-12-06 Thread Michael Bayer

On Dec 6, 2013, at 12:28 PM, Robert Buchholz robert.buchh...@goodpoint.de 
wrote:

 Hello,
 
 I'm working with a database that stores mostly integer values or floats
 with little (relevant) precision, so I am using all numeric types
 with asdecimal=True.
 Unfortunately, our database calculations return Decimal objects when run
 on MySQL, though they return int/float on SQLite. While I can deal with
 either type, the inconsistency is causing some pain -- we only notice
 the lack of return value conversion / DB CAST rather late in the
 development process.
 Generally, I would like the ORM to ensure consistent result types for a
 query like:
  DBSession.query(func.sum(Table.int_column))
 
 We often handle this as:
  DBSession.query(cast(func.sum(Table.int_column), Integer))
 
 Is there something we can do to generally avoid this type of bug?
 E.g., always convert all DB results to float, issue a warning when an
 explicit cast is missing and some DBs may return decimal, ...?
 
 How do you deal with this situation?


the func.XYZ() construct returns an object that is untyped in most cases, 
meaning SQLAlchemy has no opinion about the data that’s returned by the 
database.  This type can be specified using “type_=sometype”.

However, that should not be necessary for sum() - there’s a small subset of 
known SQL functions for which we assign a fixed type.  sum() is actually one of 
them, and should be using the same type as that of the given expression.  So if 
you hand it a Column with type Numeric(asdecimal=True), that will be the return 
type of the func, and the dialect-specific logic will take effect.

A test case (below) confirms this is the case for sum() - the type of column is 
maintained, whether asdecimal is True or False, succeeds on both MySQL and 
SQLite.  If you’re using some function other than sum() which isn’t one of the 
known types, then you need to add type_=yourtype to it, like 
func.myfunc(table.c.column, type_=Float(asdecimal=True)).

from sqlalchemy import func, Table, Column, MetaData, Float, select, \
    create_engine
from decimal import Decimal

# create a table with a Float column not using decimal.
m = MetaData()
t = Table('t', m, Column('data', Float))

dburl = "sqlite://"
#dburl = "mysql://scott:tiger@localhost/test"
e = create_engine(dburl, echo=True)
m.drop_all(e)
m.create_all(e)
e.execute(t.insert(), data=45.67)

# now let's query func.sum() using a Float with asdecimal:
m2 = MetaData()
t_with_decimal = Table('t', m2, Column('data', Float(asdecimal=True)))

# with the Float column, we get float
assert isinstance(
    e.scalar(select([func.sum(t.c.data)])),
    float
)

# with the Float(asdecimal=True) column, we get Decimal
assert isinstance(
    e.scalar(select([func.sum(t_with_decimal.c.data)])),
    Decimal
)






 
 
 Cheers,
 Robert
 





Re: [sqlalchemy] Parallel connection attempts in QueuePool

2013-12-06 Thread Michael Bayer

On Dec 6, 2013, at 6:01 PM, Claudio Freire klaussfre...@gmail.com wrote:

 On Fri, Dec 6, 2013 at 3:50 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 +except:
 +self._overflow -= 1
 +raise
 
 
 That also needs to acquire the lock.

I think I’m going to make a different object out of “overflow” just to make 
this easier to follow.  It just needs to be an atomic counter that goes between 
M and N and tells you “yes” or “no”.   the code right now is very inlined as it 
originates from a certain performance-crazy time in SQLA’s history, but 
connection pool checkouts aren’t really per-statement these days.






Re: [sqlalchemy] Parallel connection attempts in QueuePool

2013-12-06 Thread Michael Bayer

On Dec 6, 2013, at 6:27 PM, Michael Bayer mike...@zzzcomputing.com wrote:

 
 On Dec 6, 2013, at 6:01 PM, Claudio Freire klaussfre...@gmail.com wrote:
 
 On Fri, Dec 6, 2013 at 3:50 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 +except:
 +self._overflow -= 1
 +raise
 
 
  That also needs to acquire the lock. 
 
 I think I’m going to make a different object out of “overflow” just to make 
 this easier to follow.  It just needs to be an atomic counter that goes 
 between M and N and tells you “yes” or “no”.   the code right now is very 
 inlined as it originates from a certain performance-crazy time in SQLA’s 
 history, but connection pool checkouts aren’t really per-statement these days.

OK more or less that, please review my commit at 
https://github.com/zzzeek/sqlalchemy/commit/d1cc78479d988bd9acbcf395483d2130b0873b1c
 which moves handling of “overflow” into _inc_overflow() and _dec_overflow() 
methods - the handling of the _overflow_lock is local to those methods.  The 
_do_get() method is a lot easier to read now.   Also added the existing test 
plus a new one to check the “decrement overflow on connection failed”.
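
(A hedged sketch of the shape described, paraphrasing the commit rather than 
copying it verbatim.)

def _inc_overflow(self):
    if self._max_overflow == -1:
        self._overflow += 1
        return True
    with self._overflow_lock:
        if self._overflow < self._max_overflow:
            self._overflow += 1     # slot reserved
            return True
        return False                # pool full; caller waits/retries

def _dec_overflow(self):
    if self._max_overflow == -1:
        self._overflow -= 1
        return True
    with self._overflow_lock:
        self._overflow -= 1         # slot returned after a failed connect
        return True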

thanks all for the help on this one.






Re: [sqlalchemy] Re: Parallel connection attempts in QueuePool

2013-12-06 Thread Sunil

On Friday 06 December 2013 09:35 PM, Jonathan Vanasco wrote:
[...]

It sounds to me like you have an anti-pattern somewhere.  Your entire
application shouldn't hang because of a database connectivity issue.
  Specific requests to code that is trying to interact with the database
should hang -- but the application itself shouldn't.  SqlAlchemy
shouldn't even connect to the database on a request that doesn't have DB
logic in it.


Sorry, I forgot to mention that all of my application's requests require 
database queries. As expected, other parts of my application are indeed 
working properly. In fact, as mentioned, I was able to connect to the 
gevent backdoor (like twisted's manhole) port and obtain the provided 
trace on the 'hung' server.




The immediate concern I see with your fix -- at least in your situation
-- is that it could create a further surge and backlog on the MySQL
server.  I've worked with a few large online properties where some sort
of surge like this (where a bunch of app servers kept reconnecting to
mysql in a short time) ended up creating a self-inflicted denial of
service that took the sites offline, and in most of those instances the DB
got corrupted and had to be repaired.



Hundreds of connections per second to the database are being created by 
a legacy PHP application. My Python/SQLAlchemy application is much more 
nicely behaved, thanks to connection pooling. It just got caught in the 
overall mess.


My database is able to handle a lot more connections than the temporary 
backlog surge. The proper fix for the backlog problem, of course, is to 
increase the backlog by raising the MySQL backlog parameter and the overall 
system backlog values in the kernel. I have fixed this [1].


However, the current problem is that once a temporary backlog overflow 
is hit (or a few connections are lost for some reason), the application 
does not *ever* serve another database-related request. It certainly 
requires the fix that has been committed.


Links:

1) 
http://www.mysqlperformanceblog.com/2012/01/06/mysql-high-number-connections-per-secon/


Thank you,

--
Sunil



Re: [sqlalchemy] Parallel connection attempts in QueuePool

2013-12-06 Thread Sunil

On Friday 06 December 2013 09:45 PM, Michael Bayer wrote:
[...]

OK, I see this is with gevent - while I like the idea of gevent, I’m
not deeply familiar with best practices for it.  The QueuePool
specifically uses thread-based locks to achieve its work.  I can’t
comment on what modifications might be needed for it to work
with gevent’s model, but overall I’d suggest an entirely different
pool implementation optimized for gevent. When I spent some time
trying out gevent I noticed that QueuePool might have been having
problems, and this is not surprising.

For starters, I’d probably use NullPool with a gevent-based
application, if there are in fact gevent-specific issues occurring.


The threading API is transparently replaced with gevent's own 
lightweight threading implementation using monkey patching. This 
includes the lock implementation. After monkey patching, a library like 
SQLAlchemy, instead of spawning threads, will unknowingly spawn gthreads, 
and instead of using regular thread locks will unknowingly use gthread 
locks. Where in the traditional model locks block a thread while other 
threads continue to run, a gthread lock stops a gthread and returns control 
to the event loop for processing and running other events/gthreads.
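
(A sketch of the monkey-patching setup just described; it must run before 
sqlalchemy or any other user of threading is imported.)

from gevent import monkey
monkey.patch_all()  # threading, socket, time, ... now use gevent primitives

import threading
# threading.Lock is now gevent's lock: acquiring a held lock yields to the
# event loop instead of blocking the whole OS thread
lock = threading.Lock()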


This all usually works fine except in rare situations. I see that there 
is nothing in SQLAlchemy/QueuePool that would make this not work 
properly. I am happy to report that I have been running 
gevent/SQLAlchemy/QueuePool for quite some time in a highly available 
setup with ~2-3k QPS of database load.


As noted by you later, the problem at hand has nothing to do with gevent 
and would occur in a traditional threading model too. Sorry to have 
introduced gevent confusion, but I felt obliged to mention it for the 
purpose of a full report.


[...]



Changeset 5f0a7bb cleaned up this code but does not seem to have
changed the flow (behaviour should be the same on trunk). Since
disabling the overflow with max_overflow = -1 does not use the lock at
all, this behaviour is possibly an oversight rather than intended.


Noting that I haven’t deeply gotten into this code at the moment,
overall I’m confused about “the application became incapable of
serving requests” - if the QueuePool serves out as many connections
as it’s supposed to, it’s supposed to block all callers at that point.
If you set max_overflow to -1, then there is no overflow_lock present
at all; it’s set to None in the constructor.  Otherwise, blocking on
the call is what it’s supposed to do, in a traditionally threaded
application.   If when using gevent this means that other workers are
blocked because the whole thing expects any kind of waiting to be
handled “async style”, then that suggests we need a totally different
approach for gevent.


Since the overflow lock seems to be to only maintain overflow
count, I suggest that we increment the counter *before* connection
attempt, don't hold the lock during connection attempt and then
decrement the counter in case of an error. If there is interest in
doing this, I shall find time for a patch and possibly a test
case.


How would that work with a traditionally threaded application?   My
program goes to get a connection, the QueuePool says there’s none
available yet and I should wait, then the call returns with…what?
if it isn’t waiting.   I apologize that I have only a fuzzy view of
how things work with gevent, and at this time of the morning I’m
probably not engaging the traditional threading model in my head so
well either.




As you predicted in the later mail, this problem has in fact occurred 
way before the pool size was reached.


Pool limit = 128 + 10 overflow
Checked out connections at the time of the problem = 27

--
Sunil



Re: [sqlalchemy] Parallel connection attempts in QueuePool

2013-12-06 Thread Sunil

On Saturday 07 December 2013 12:20 AM, Michael Bayer wrote:
[...]

the hard part is producing a test case.   I noticed just now that
even if I take overflow_lock out entirely, all the current tests
pass, but this is because it’s not easy for tests to catch race
conditions like that.   To test the new change, it should be simpler,
inject a mock connection that will hang on one attempt and pass on
another, then ensure that the second attempt successfully connects
within the overflow range before the “hanging” one does (or errors
out).


I was thinking of writing a dummy listener to point the MySQL python 
connector at for the test case. I see that your test case using a dummy 
connector method is much simpler and more elegant.




Also can you confirm the MySQL behavior here is such that only
arbitrary connection attempts are hanging?   That is, a subsequent
connection attempt succeeds while the previous one continues to hang
- otherwise I’m not sure how this patch improves the situation.


My observation is that subsequent connection attempts went alright. 
Although some of my application instances (3 out of 12) hung, other 
instances (the 9 lucky ones out of 12) continued to work. Legacy PHP code 
without connection pooling continued to work.


Due to the nature of the listen() backlog, this problem, I assume, is not 
specific to MySQL. The references I used to fix the MySQL problem provide 
more detail [1][2][3].


Links:

1) 
https://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_back_log


2) 
http://www.mysqlperformanceblog.com/2012/01/06/mysql-high-number-connections-per-secon/


3) http://man7.org/linux/man-pages/man2/listen.2.html

Thank you,

--
Sunil
