[sqlalchemy] Object inheritance

2012-02-22 Thread Andrea
Hi all,
I have some objects in a pre-existing model. Now we want to add a
persistence layer, and SQLAlchemy/SQLite is our choice.
When I add an object to the session, an UnmappedInstanceError is raised:
Class 'try_sqlalchemy.example2.applib_model.DescriptorBean' is mapped, but
this instance lacks instrumentation.  This occurs when the instance is
created before
sqlalchemy.orm.mapper(try_sqlalchemy.example2.applib_model.DescriptorBean)
was called.

This is the example, three code snippets from three python modules:
http://pastebin.com/KLFFN3ke

While debugging I see that the DescriptorBean instance is created after the
mapping (the mapping is set up in the startup.createSchema method), so I don't
understand the error message.
The same problem occurs if I use declarative and inherit from Base (class
DescriptorBean(Base, _Struct)). There is no problem if I have a class that
inherits directly from object. Maybe it is the _Struct inheritance that
breaks the mapper instrumentation? Any suggestions?

Thanks,
Andrea




[sqlalchemy] Re: Object inheritance

2012-02-22 Thread Andrea
Another update!
Maybe the problem is overriding self.__dict__. Now I set the values
without replacing the whole dict, and it
seems to work:

class _Struct(dict):
    def __init__(self, **kw):
        dict.__init__(self, kw)
        for k, v in kw.iteritems():
            self.__dict__[k] = v

On Feb 22, 1:25 pm, Andrea andrea.dellapie...@gmail.com wrote:
 Little update:
 If I remove self.__dict__ = self from the _Struct definition, the exception
 is not raised.

 This is the original base class:

 class _Struct(dict):
     def __init__(self,**kw):
         dict.__init__(self, kw)
         self.__dict__ = self





Re: [sqlalchemy] Re: Working with large IN lists

2012-02-22 Thread Michael Bayer

When we want to test if a Python program has a leak, we do that by seeing 
how many uncollected objects are present.   This is done via gc:

import gc
print "total number of objects:", len(gc.get_objects())

That's the only real way to measure if the memory used by Python objects is 
growing unbounded.  Looking at the memory usage in top shows what the 
interpreter takes up - the CPython interpreter in more modern releases does 
release memory back, but only occasionally.   Older versions don't.
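
A minimal sketch of that kind of check (do_work below is just a placeholder
for one iteration of whatever the real loop does):

import gc

def do_work():
    # placeholder for one iteration of the real workload
    pass

for i in range(50):
    do_work()
    gc.collect()  # collect first, so only genuinely live objects are counted
    print "iteration %d: %d objects" % (i, len(gc.get_objects()))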

If you're doing an operation that loads thousands of rows, those rows are 
virtually always loaded entirely into memory by the DBAPI, before your program 
or SQLAlchemy is ever given the chance to fetch a single row.   I haven't yet 
looked closely at your case here, but that's often at the core of scripts that 
use much more memory than expected.

There are ways to get *some* DBAPIs to not do this (particularly psycopg2, if 
you're using Postgresql; see 
http://docs.sqlalchemy.org/en/latest/orm/query.html?highlight=yield_per#sqlalchemy.orm.query.Query.yield_per
 and 
http://docs.sqlalchemy.org/en/latest/core/connections.html?highlight=stream_results#sqlalchemy.engine.base.Connection.execution_options),
though the better solution is usually to load records in chunks at a time (one 
such recipe that I use for this is here: 
http://www.sqlalchemy.org/trac/wiki/UsageRecipes/WindowedRangeQuery).  Or 
better yet, consider whether the problem can be solved entirely on the SQL side 
(this depends entirely on exactly what you're trying to do with the data in question).
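
A minimal sketch of the yield_per + stream_results combination mentioned above
(the mapped class, the connection URL and the per-row work are stand-ins
invented for the example, not names from this thread):

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class Record(Base):                      # illustrative mapped class
    __tablename__ = 'record'
    id = Column(Integer, primary_key=True)
    payload = Column(String(200))

engine = create_engine('postgresql://scott:tiger@localhost/test')  # assumed DSN
session = sessionmaker(bind=engine)()

q = (
    session.query(Record)
    .execution_options(stream_results=True)  # psycopg2: use a server-side cursor
    .yield_per(1000)                          # hand rows to the ORM in batches of 1000
)
for record in q:
    pass   # per-row work goes here; the full result set is not held in memory at once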


On Feb 22, 2012, at 9:46 AM, Vlad K. wrote:

 
 Okay, after several test cases, various join combinations with or without 
 relationships, with or without cherry-picking the columns that are really used 
 from the joined models, I've come to the conclusion that the only problem I'm 
 having here is that there is no garbage collection. Python memory use just 
 keeps growing at a rate that, of course, depends on the size of the models used 
 and the data queried, but it just keeps growing, regardless of releasing/deleting 
 instances or isolating each row's processing in its own committed transaction.
 
 I also found this:
 
 http://permalink.gmane.org/gmane.comp.python.sqlalchemy.user/30087
 
 
 So it appears I'm having the same problem.
 
 
 Am I understanding correctly that, because of this, the SQLAlchemy ORM is in my 
 case useless if I have to process thousands of rows, because the memory used 
 to process each row (along with the corresponding joined models etc.) will not 
 be released? So basically I'd have to use SQLA without the ORM for this 
 particular use case?
 
 Or is this some memory leak bug?
 
 If so, any suggestions or examples on how I can switch from ORM use to non-ORM 
 if I want to retain the named tuples returned by queries and avoid rewriting 
 half the app?
 
 
 Thanks.
 
 
 .oO V Oo.
 
 




Re: [sqlalchemy] Object inheritance

2012-02-22 Thread Michael Bayer
A few things:

1. The Python dict class cannot be mapped.  Classes can only extend from 
object or from other classes that in turn extend from object.

2. SQLAlchemy instrumentation relies upon Python descriptors (see 
http://docs.python.org/howto/descriptor.html) to intercept changes in state on 
an object, that is, setting and getting attributes.   So techniques which 
involve direct access to self.__dict__ will fail to work correctly with a 
mapped class.

3. The mapping of a class is considered to be an integral part of that class' 
definition.   It's not a valid use case to map classes at some arbitrary point 
in time after many objects of that class have been created.  It's for this 
reason that modern SQLAlchemy strongly recommends the use of the Declarative 
pattern introduced at http://docs.sqlalchemy.org/en/latest/orm/tutorial.html as 
the primary means of mapping classes to database tables; it eliminates the 
confusion whereby the mapping and the class itself are treated as two 
independent things.  While classes can be mapped and unmapped, this is not a 
regular operation and isn't appropriate except in special testing circumstances.
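
A minimal sketch of that Declarative pattern (the columns shown are
illustrative; the thread's pastebin with DescriptorBean's real attributes is
not reproduced here):

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class DescriptorBean(Base):          # extends Base (and therefore object), not dict
    __tablename__ = 'descriptor_bean'

    id = Column(Integer, primary_key=True)
    name = Column(String(50))

    def __init__(self, name=None):
        # plain attribute assignment goes through SQLAlchemy's instrumented
        # descriptors, unlike writing straight into self.__dict__
        self.name = name

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add(DescriptorBean(name='example'))
session.commit()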


On Feb 22, 2012, at 6:20 AM, Andrea wrote:

 Hi all,
 I have some objects in a pre-existing model. Now we want to add a
 persistence layer, and SQLAlchemy/SQLite is our choice.
 When I add an object to the session, an UnmappedInstanceError is raised:
 Class 'try_sqlalchemy.example2.applib_model.DescriptorBean' is mapped, but
 this instance lacks instrumentation.  This occurs when the instance is
 created before
 sqlalchemy.orm.mapper(try_sqlalchemy.example2.applib_model.DescriptorBean)
 was called.
 
 This is the example, three code snippets from three python modules:
 http://pastebin.com/KLFFN3ke
 
 While debugging I see that the DescriptorBean instance is created after the
 mapping (the mapping is set up in the startup.createSchema method), so I don't
 understand the error message.
 The same problem occurs if I use declarative and inherit from Base (class
 DescriptorBean(Base, _Struct)). There is no problem if I have a class that
 inherits directly from object. Maybe it is the _Struct inheritance that
 breaks the mapper instrumentation? Any suggestions?
 
 Thanks,
 Andrea
 




Re: [sqlalchemy] Re: Working with large IN lists

2012-02-22 Thread Vlad K.


Hi,

thanks for your reply. I haven't yet tested this with a profiler to see 
exactly what is happening, but the bottom line is that the overall 
memory use grows with each iteration (or transaction processed), 
to the point of grinding the server to a halt, and top shows only the 
Python process involved consuming all the memory.


I've already modified the code to read one row at a time, by first creating 
a list of the IDs to be affected, then going through that list and selecting 
+ updating/inserting one transaction at a time.


I suppose I can solve the problem entirely on the SQL side with a stored 
function, but that's a maintenance overhead I'd like to avoid if possible.


Meanwhile I've gotten rid of convenience relationships and in some 
places decided on lazy="select" instead of "subquery" or "joined", which has 
brought down total memory use; now the entire process can finish within 
the amount of RAM available on the server, but it still shows linear 
growth from the start to the end of the process.
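
A small sketch of that kind of mapping change (the Customer/Invoice names are
illustrative, not from this thread):

from sqlalchemy import Column, Integer, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Customer(Base):
    __tablename__ = 'customer'
    id = Column(Integer, primary_key=True)

class Invoice(Base):
    __tablename__ = 'invoice'
    id = Column(Integer, primary_key=True)
    customer_id = Column(Integer, ForeignKey('customer.id'))

    # lazy="select" defers loading the related Customer until the attribute is
    # accessed, instead of joining or subquery-loading it on every Invoice query
    customer = relationship(Customer, lazy='select')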


.oO V Oo.







Re: [sqlalchemy] Re: Working with large IN lists

2012-02-22 Thread Vlad K.


Yes, it's definitely growing, at a rate of 700-800 objects per iteration.

.oO V Oo.


On 02/22/2012 07:23 PM, Michael Bayer wrote:

When we want to test if a Python program has a leak, we do that via seeing 
how many uncollected objects are present.   This is done via gc:

import gc
print "total number of objects:", len(gc.get_objects())





[sqlalchemy] problem with dynamic tables/classes and inheritance

2012-02-22 Thread lars van gemerden
I am trying to generate tables/classes dynamically. The code below is
my latest attempt, but I cannot get it to work.

-
class TableName(object):
    @declared_attr
    def __tablename__(cls): return cls.__name__

class Inherit(object):
    @declared_attr
    def id(cls):  # <= is not called for S
        base = cls.__bases__[len(cls.__bases__) - 1]
        print "class, base:", cls.__name__, base.__name__
        return Column(Integer, ForeignKey(base.__name__ + '.id'),
                      primary_key = True)

    @declared_attr
    def __mapper_args__(cls):
        return {'polymorphic_identity': cls.__name__}

class Object(Base, TableName):

    association_tables = {}

    id = Column(Integer, primary_key = True)
    type_name = Column(String(50), nullable = False)
    __mapper_args__ = {'polymorphic_on': type_name}



if __name__ == '__main__':
    session = setup(engine)

    T = type('T', (Inherit, Object), {'Tdata': Column(String(50))})
    S = type('S', (T,), {'Sdata': Column(String(50))})  # <= Error
    session.commit()
    print S.__table__.c
-
the output is:
-
class, base: T Object
class, base: T Object
class, base: T Object
Traceback (most recent call last):
  File "D:\Documents\Code\Eclipse\workspace\SQLAdata\src\test4.py",
line 55, in <module>
    S = type('S', (T,), {'Sdata': Column(String(50))})
  File "C:\Python27\lib\site-packages\sqlalchemy\ext\declarative.py",
line 1336, in __init__
    _as_declarative(cls, classname, cls.__dict__)
  File "C:\Python27\lib\site-packages\sqlalchemy\ext\declarative.py",
line 1329, in _as_declarative
    **mapper_args)
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\__init__.py",
line 1116, in mapper
    return Mapper(class_, local_table, *args, **params)
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\mapper.py", line
197, in __init__
    self._configure_inheritance()
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\mapper.py", line
473, in _configure_inheritance
    self.local_table)
  File "C:\Python27\lib\site-packages\sqlalchemy\sql\util.py", line
303, in join_condition
    "between '%s' and '%s'.%s" % (a.description, b.description, hint))
sqlalchemy.exc.ArgumentError: Can't find any foreign key relationships
between 'T' and 'S'.
-
What is wrong with this approach? Is there a good way to approach this
problem (I have tried a couple already)?

Also: why is Inherit.id() called 3 times for T?

Please help!

Lars




Re: [sqlalchemy] Re: Working with large IN lists

2012-02-22 Thread Claudio Freire
On Wed, Feb 22, 2012 at 4:29 PM, Michael Bayer mike...@zzzcomputing.com wrote:
 thanks for your reply. I haven't yet tested this with a profiler to see 
 exactly what is happening, but the bottom line is that the overall 
 memory use grows with each iteration (or transaction processed), to the 
 point of grinding the server to a halt, and top shows only the Python 
 process involved consuming all the memory.

 yeah like I said that tells you almost nothing until you start looking at 
 gc.get_objects().  If the size of gc.get_objects() grows continuously for 50 
 iterations or more, never decreasing even when gc.collect() is called, then 
 it's a leak.  Otherwise it's just too much data being loaded at once.

I've noticed that compiling queries (either explicitly or implicitly) tends
to *fragment* memory. There seem to be long-lived caches in the PG
compiler at least. I can't remember exactly where, but I could take
another look.

I'm talking about rather old versions of SQLA, 0.3 and 0.5.




Re: [sqlalchemy] Re: Working with large IN lists

2012-02-22 Thread Michael Bayer

On Feb 22, 2012, at 2:46 PM, Claudio Freire wrote:

 I've noticed compiling queries (either explicitly or implicitly) tends
 to *fragment* memory. There seem to be long-lived caches in the PG
 compiler at least. I can't remember exactly where, but I could take
 another look.
 
 I'm talking of rather old versions of SQLA, 0.3 and 0.5.


0.3's code is entirely gone, years ago.  I wouldn't even know what silly things 
it was doing.

In 0.5 and beyond, there's a cache of identifiers for quoting purposes.   If you 
are creating perhaps thousands of tables with hundreds of columns, all names 
being unique, then this cache might start to become a blip on the radar.   For 
the expected use case of a schema with at most several hundred tables, this 
should not be a significant size.

I don't know much about what it means for a Python script to fragment memory, and I 
don't really think there's some set of Python programming practices 
that deterministically determines whether or not a script fragments a lot.  Alex 
Martelli talks about it here: 
http://stackoverflow.com/questions/1316767/how-can-i-explicitly-free-memory-in-python
 . The suggestion there is that if you truly need to load tons of data into 
memory, doing it in a subprocess is the only way to guarantee that the memory is 
freed back to the OS.

As it stands, there are no known memory leaks in SQLAlchemy itself, and if you 
look at our tests under aaa_profiling/test_memusage.py you can see we 
exhaustively ensure that the size of gc.get_objects() does not grow unbounded 
for all sorts of awkward situations.  To illustrate potential new memory 
leaks we need succinct test cases that illustrate a simple ascending growth in 
memory usage.
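
A rough sketch of the shape such a test case usually takes: run the suspect
operation repeatedly and check that the live object count levels off (the
operation and the growth threshold here are placeholders):

import gc

def assert_no_object_growth(run_suspect_operation, iterations=50):
    counts = []
    for _ in range(iterations):
        run_suspect_operation()
        gc.collect()
        counts.append(len(gc.get_objects()))
    # after warm-up, the count should flatten out rather than keep climbing
    assert counts[-1] <= counts[iterations // 2] + 100, counts

# example: a no-op "operation" trivially passes
assert_no_object_growth(lambda: None)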







Re: [sqlalchemy] Re: Working with large IN lists

2012-02-22 Thread Michael Bayer

On Feb 22, 2012, at 3:28 PM, Claudio Freire wrote:

 
 Like I said, it's not a leak situation so much as a fragmentation
 situation, where long-lived objects in high memory positions can
 prevent the process' heap from shrinking.
 
 [0] http://revista.python.org.ar/2/en/html/memory-fragmentation.html

Saw that a bit, but looking at the tips at the bottom, concrete 
implementation changes are not coming to mind.   An eternal structure is 
ubiquitous in any programming language.  sys.modules is a big dictionary of all the 
Python modules that have been imported, each one full of functions, classes, and 
other data; these are all eternal structures - sys.modules is normally never 
cleaned out.  I'm not seeing at what point you move beyond things that are in 
these modules into things that are so-called eternal structures that lead to 
inappropriate memory fragmentation.





Re: [sqlalchemy] Re: Working with large IN lists

2012-02-22 Thread Claudio Freire
On Wed, Feb 22, 2012 at 5:40 PM, Michael Bayer mike...@zzzcomputing.com wrote:
 Saw that a bit, but looking at the tips at the bottom, concrete 
 implementation changes are not coming to mind.   An eternal structure is 
 ubiquitous in any programming language.  sys.modules is a big list of all the 
 Python modules that have been imported, each one full of functions, classes, 
 other data, these are all eternal structures - sys.modules is normally 
 never cleaned out.    I'm not seeing at what point you move beyond things 
 that are in these modules into things that are so-called eternal structures 
 that lead to inappropriate memory fragmentation.

The thing to be careful about is when those eternal structures are created.

If they're created at the beginning (as sys.modules, which is
populated with imports, which most of the time happen in the preamble
of .py files), then the resulting objects will have lower memory
locations and thus not get in the way.

But if those structures are created after the program had time to fill
its address space with transient objects (say, lazy imports, caches),
then when the transient objects are deleted, the eternal structures
(with their high addresses) prevent the heap from shrinking.

Such caches, for instance, are better made limited in lifespan (say,
giving them a finite lifetime, making them expire, actively cleaning
them from time to time). Structures that are truly required to be
eternal are better populated at load time, early in the program's
lifecycle. In my backend, for instance, queries are precompiled at
startup, to make sure they have lower memory addresses. This has
mostly solved SQLA-related memory fragmentation issues for me.
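
A minimal sketch of that precompile-at-startup idea using SQLAlchemy Core (the
users table and the SQLite engine are stand-ins invented for the example):

from sqlalchemy import (MetaData, Table, Column, Integer, String, Boolean,
                        create_engine, select)

engine = create_engine('sqlite://')          # stand-in engine
metadata = MetaData()
users = Table('users', metadata,
              Column('id', Integer, primary_key=True),
              Column('name', String(50)),
              Column('active', Boolean))
metadata.create_all(engine)

# Compiled once at import/startup time, so the compiled-statement objects are
# allocated early in the process lifetime and sit at low heap addresses.
ACTIVE_USERS = select([users]).where(users.c.active == True).compile(engine)

def fetch_active_users(conn):
    # execute the already-compiled statement instead of recompiling per call
    return conn.execute(ACTIVE_USERS).fetchall()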




Re: [sqlalchemy] Re: Working with large IN lists

2012-02-22 Thread Claudio Freire
On Wed, Feb 22, 2012 at 5:51 PM, Claudio Freire klaussfre...@gmail.com wrote:
 Such caches, for instance, are better made limited in lifespan (say,
 giving them a finite lifetime, making them expire, actively cleaning
 them from time to time). Structures that are truly required to be
 eternal are better populated at load time, early in the program's
 lifecycle. In my backend, for instance, queries are precompiled at
 startup, to make sure they have lower memory addresses. This has
 mostly solved SQLA-related memory fragmentation issues for me.

One source of trouble I've had here is the inability to use bind
parameters inside .in_(...).

Queries that accept variable lists I therefore had to precompile to a
string, and splice the inside of the IN condition in by string
interpolation.

It's an ugly hack, but it has served me well.
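
A hedged sketch of that workaround, not a SQLAlchemy API (the users table and
engine are invented for the example; it is only safe here because the values
are forced to integers before being spliced into the SQL string):

from sqlalchemy import MetaData, Table, Column, Integer, String, create_engine, select

engine = create_engine('sqlite://')                      # stand-in engine
metadata = MetaData()
users = Table('users', metadata,
              Column('id', Integer, primary_key=True),
              Column('name', String(50)))
metadata.create_all(engine)

# rendered to a string once at startup; the IN list is spliced in per call
BY_IDS_TEMPLATE = str(select([users]).compile(engine)) + " WHERE users.id IN (%s)"

def fetch_users_by_ids(conn, ids):
    literals = ", ".join(str(int(i)) for i in ids)       # integers only, bypasses bind params
    return conn.execute(BY_IDS_TEMPLATE % literals).fetchall()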




Re: [sqlalchemy] Re: Working with large IN lists

2012-02-22 Thread Michael Bayer

On Feb 22, 2012, at 3:51 PM, Claudio Freire wrote:

 The thing to be careful about is when those eternal structures are created.
 
 If they're created at the beginning (as sys.modules, which is
 populated with imports, which most of the time happen in the preamble
 of .py files), then the resulting objects will have lower memory
 locations and thus not get in the way.
 
 But if those structures are created after the program had time to fill
 its address space with transient objects (say, lazy imports, caches),
 then when the transient objects are deleted, the eternal structures
 (with their high addresses) prevent the heap from shrinking.
 
 Such caches, for instance, are better made limited in lifespan (say,
 giving them a finite lifetime, making them expire, actively cleaning
 them from time to time). Structures that are truly required to be
 eternal are better populated at load time, early in the program's
 lifecycle. In my backend, for instance, queries are precompiled at
 startup, to make sure they have lower memory addresses. This has
 mostly solved SQLA-related memory fragmentation issues for me.


IMHO the whole point of using a high-level, interpreted language like Python is 
that we don't have to be bogged down thinking like C programmers.   How come 
I've never had a memory fragmentation issue before?   I've made 
precompilation an option for folks who really wanted it, but I've never had a 
need for such a thing.   And you can be sure I work on some very large and 
sprawling SQLAlchemy models these days.

There are some caches here and there, like the identifier cache as well as 
caches inside of TypeEngine objects, but these caches are all intended to be of 
limited size.




Re: [sqlalchemy] Re: Working with large IN lists

2012-02-22 Thread Claudio Freire
On Wed, Feb 22, 2012 at 6:21 PM, Michael Bayer mike...@zzzcomputing.com wrote:
 IMHO the whole point of using a high level, interpreted language like Python 
 is that we don't have to be bogged down thinking like C programmers.   How 
 come I've never had a memory fragmentation issue before ?      I've made 
 precompilation an option for folks who really wanted it but I've never had 
 a need for such a thing.   And you can be sure I work on some very large and 
 sprawling SQLAlchemy models these days.

Maybe you never used big objects.

Memory fragmentation arises only when the application handles a
mixture of big and small objects, such that holes created by small
objects being freed don't serve big memory requirements.

If your application handles a homogeneous workload (i.e. every request
is pretty much the same), as is usual, then you probably won't
experience fragmentation.

My application does the usual small-object work, interspersed with
intense computation on big objects, hence my troubles.

Python's garbage collector has been a pending issue for a long time,
but, as I noticed in the linked page, past architectural decisions
prevent some widely desired improvements.




Re: [sqlalchemy] Re: Working with large IN lists

2012-02-22 Thread Vlad K.


Okay, thanks to this article:

http://neverfear.org/blog/view/155/Investigating_memory_leaks_in_Python


I made a similar plot of object counts over time, showing the top 50 types. The 
resulting PDF is here (you might want to download it first; Google 
messes it up for me):


https://docs.google.com/open?id=0ByLiBlA59qDwYTY1MGIzYWEtYjMxZi00ZDVlLTk0OTEtOGI2ZjA3NDgyM2Y3


Everything seems to grow linearly in count. Something is keeping all 
those objects referenced somewhere. What could possibly be the cause?
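
A small sketch of the per-type counting behind that kind of plot (do_iteration
is a placeholder for one unit of the real workload):

import gc
from collections import Counter

def snapshot_type_counts(top=50):
    # count live objects per type name, one sample per workload step
    counts = Counter(type(o).__name__ for o in gc.get_objects())
    return counts.most_common(top)

def do_iteration():
    pass   # placeholder for one unit of the real work

for step in range(10):
    do_iteration()
    for name, count in snapshot_type_counts(5):
        print step, name, count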



.oO V Oo.





Re: [sqlalchemy] Re: Working with large IN lists

2012-02-22 Thread Michael Bayer

On Feb 22, 2012, at 6:36 PM, Vlad K. wrote:

 
 Okay, thanks to this article:
 
 http://neverfear.org/blog/view/155/Investigating_memory_leaks_in_Python
 
 
 I made a similar plot of object counts over time, showing the top 50 types. The 
 resulting PDF is here (you might want to download it first; Google messes it 
 up for me):
 
 https://docs.google.com/open?id=0ByLiBlA59qDwYTY1MGIzYWEtYjMxZi00ZDVlLTk0OTEtOGI2ZjA3NDgyM2Y3
 
 
 Everything seems to grow linearly in count. Something is keeping all those 
 objects referenced somewhere. What could possibly be the cause?


Can you provide a self-contained, single-file test case that illustrates the 
memory growth?

