[sqlalchemy] Re: post_processors error during 0.3.10 to 0.4 migration (returning different object type based on db data)

2007-10-30 Thread Ron

 you could just be using one mapper for all the classes here.  its
 almost like you should monkeypatch class_mapper() and object_mapper()
 and just be done with it.

 of course the reason mappers are usually specific to a class is
 because, every class would have completely different attributes and
 relations.  but it seems here that is not the case.


Well, the subclasses actually map to a select on the table while Thing
maps to the whole table.  So the classes were mapped over different sets
of rows even though they had identical attributes and relations.  But the primary
reason for the additional classes is to get additional methods/
functionality so I may be able to drop the added complexity of mapping
to selects.  How do you reuse a mapper on additional classes?

class Foo(object):
  pass

foo_mapper = mapper(...)

class Bar(Foo):
  pass

then what?
foo_mapper.addClass(?) or something ??


 so, you want people to say:

 s = Server(model='apple')

 then later, they ...upgrade ?  by v0.1 - v0.2 you mean a new copy of
 your framework ?  or just hypothetical versions of the user's
 application ?

 then they say:

 s = some_query.get_my_thing(criterion)

 and they get back an AppleServer, which is some kind of improvement
 over Server.


I guess I should explain further what I'm trying to build.  My app is
a tool to help with managing clusters.  This includes everything from
Datacenters, Racks, Servers, Switches, etc.  I don't know ahead of
time the sorts of specific Things and functionality that might be
useful so I'm trying to make things as flexible, generic, and easy to
extend as possible.  So at the core I suppose you could call this
Thing/Attributes abstraction the framework upon which I and others
would code drivers like Server, SunServer, AppleServer, Switch,
CiscoSwitch, CiscoRouter, Pool, Location, etc etc.  All those together
with a command line interface to facilitate scripting is an app called
clusto.  So, I'm a sysadmin and I'm using this tool.  I just bought a
Load Balancer 5000.  I immediately put it into my system as a plain
old LoadBalancer(manufacturer='Load Balancer', model='5000').  Later
on I decide that I'd like clusto to be able to add servers to my fancy
LoadBalancer5000 configuration.  Nobody else has implemented the
functionality yet, so in true open source form, I dive in and do it
myself:

class LoadBalancer5000(LoadBalancer):
    meta_attrs = [('manufacturer',  'Load Balancer'), ('model', '5000')]

    def addServer(self, someserver):
        # magic
        pass

done.  No futzing with the database, no diving into obscure parts of
the code, nothing.  I just plop that class into the right path and it
works.  With a clever command line and scripting interface it may even
be useful.
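
A minimal sketch of the drop-in driver idea described above, in plain Python 3 with no SQLAlchemy involved. Everything here is an assumption made for illustration rather than clusto code: the `DRIVERS` registry, the `best_driver` helper, the `all_meta_attrs` classmethod, and the `('type', 'loadbalancer')` pair on the base `LoadBalancer` class. It shows one way the most specific class could be picked from the key/value pairs stored for a Thing.

```python
# Hypothetical sketch: none of these names come from clusto itself.
DRIVERS = []


class Thing(object):
    meta_attrs = []

    def __init_subclass__(cls, **kwargs):
        # Register every subclass as soon as it is defined.
        super().__init_subclass__(**kwargs)
        DRIVERS.append(cls)

    @classmethod
    def all_meta_attrs(cls):
        # Collect meta_attrs from every class in the MRO that defines them.
        pairs = []
        for klass in cls.mro():
            for pair in vars(klass).get("meta_attrs", []):
                if pair not in pairs:
                    pairs.append(pair)
        return pairs


def best_driver(attrs):
    """Return the most specific registered class whose meta_attrs all match."""
    attrs = set(attrs)
    matches = [d for d in DRIVERS if set(d.all_meta_attrs()) <= attrs]
    if not matches:
        return Thing
    # More required pairs means a more specific driver.
    return max(matches, key=lambda d: len(d.all_meta_attrs()))


class LoadBalancer(Thing):
    meta_attrs = [("type", "loadbalancer")]  # assumed pair, for illustration


class LoadBalancer5000(LoadBalancer):
    meta_attrs = [("manufacturer", "Load Balancer"), ("model", "5000")]


row_attrs = [("type", "loadbalancer"), ("manufacturer", "Load Balancer"),
             ("model", "5000"), ("numports", "2")]
print(best_driver(row_attrs).__name__)  # prints "LoadBalancer5000"
```

Defining a new subclass is enough to register it, so existing rows would start resolving to the new class without any change to the stored data.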



 would they ever get a Server back again ?  if not, why does the
 database need to change ?  why not just map AppleServer to
 server ?

Because there might also be a SunServer and a FooBarServer and an
AlphaServer.

 also, arent you concerned about query overhead here ?
 with all your objects being completely homogenized into a vertical
 structure and all, that is.

Yeah, this tool isn't built for speed or high load.  It's built for
flexibility and usefulness.  If I want speed I'll figure out caching
and optimization later.  Also, it is version 0.0001 so it's acting, in
part, as a proof of concept.

 theres no straightforward way for me to
 get a list of all the AppleServers, for example, since id have to
 query all these different attributes just to identify those objects.


So, underneath the hood, to get all the AppleServers you'd do:

## pseudocode
# all_meta_attrs is a list of all the meta_attrs for that class going
# up the inheritance chain, cls.mro()
for attr in SomeThingClass.all_meta_attrs:
    thingquery += and_(Attribute.c.key==attr[0],
                       Attribute.c.value==attr[1])

select(and_(Thing.c.name==Attribute.c.name, thingquery))
# that should get all the Things that can be managed by the given
# class.  Maybe not straightforward but not terribly complex either.
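
One caveat with the pseudocode: a single attribute row holds one key and one value, so AND-ing several key/value conditions against the same row can never match more than one pair. A hedged sketch of one way around that, using the stdlib `sqlite3` module instead of SQLAlchemy so it runs on its own: OR the pairs together, then keep only the things that matched every pair. The table layout follows the test script's `things` / `thing_attrs` tables; the sample rows and the `things_matching` helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE things (name TEXT PRIMARY KEY);
    CREATE TABLE thing_attrs (
        attr_id INTEGER PRIMARY KEY,
        thing_name TEXT REFERENCES things(name),
        key TEXT,
        value TEXT
    );
""")

# Made-up sample data: (thing_name, key, value)
rows = [
    ("web1", "type", "server"), ("web1", "manufacturer", "apple"),
    ("web2", "type", "server"), ("web2", "manufacturer", "sun"),
    ("sw1", "type", "switch"), ("sw1", "manufacturer", "apple"),
]
conn.executemany("INSERT OR IGNORE INTO things (name) VALUES (?)",
                 [(r[0],) for r in rows])
conn.executemany(
    "INSERT INTO thing_attrs (thing_name, key, value) VALUES (?, ?, ?)", rows)


def things_matching(conn, meta_attrs):
    """Names of things that carry every (key, value) pair in meta_attrs."""
    pairs = sorted(set(meta_attrs))
    if not pairs:
        cur = conn.execute("SELECT name FROM things ORDER BY name")
        return [r[0] for r in cur]
    where = " OR ".join("(key = ? AND value = ?)" for _ in pairs)
    params = [x for pair in pairs for x in pair]
    sql = (
        "SELECT thing_name FROM ("
        "  SELECT DISTINCT thing_name, key, value FROM thing_attrs"
        "  WHERE " + where +
        ") GROUP BY thing_name HAVING COUNT(*) = ? ORDER BY thing_name"
    )
    return [r[0] for r in conn.execute(sql, params + [len(pairs)])]


apple_server = [("type", "server"), ("manufacturer", "apple")]
print(things_matching(conn, apple_server))  # prints ['web1']
```

The same shape (OR the pairs, group by thing, compare the count to the number of pairs) could be expressed as a SQLAlchemy select, but that translation is left out here.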

So, in my implementation, the metaclass mapped each Class to such a
select.  I am mapping against different selectables, and so having
different mappers made sense.  So if I did SA functions like:

AppleServer.select(and_(Attribute.c.key=='numports',
                        Attribute.c.value=='2'))

I'd only get AppleServers with ('numports', '2') and not any other
types of Things.  At one point I got things working as I just
described, but I'm not sure if that was the case in my latest
iteration of the code.



  I'm still soaking in these examples.  I think what I really want is to
  have mapper accept something like polymorphic_func and base_class.  So
  I would pass it my _setProperClass function and Thing.  The mapper
  will build against Thing and then run _setProperClass against the
  instance.  Yeah, I'm cheating, cause that's kind of basically what I'm
  doing now.  I'm just not sure how else to achieve the functionality
  I'm looking for.

 ah well making polymorphic_on optionally a callable 

[sqlalchemy] Re: post_processors error during 0.3.10 to 0.4 migration (returning different object type based on db data)

2007-10-30 Thread Michael Bayer


On Oct 30, 2007, at 1:35 PM, Ron wrote:


 I guess I should explain further what I'm trying to build.  My app is
 a tool to help with managing clusters.  This includes everything from
 Datacenters, Racks, Servers, Switches, etc.  I don't know ahead of
 time the sorts of specific Things and functionality that might be
 useful so I'm trying to make things as flexible, generic, and easy to
 extend as possible.  So at the core I suppose you could call this
 Thing/Attributes abstraction the framework upon which I and others
 would code drivers like Server, SunServer, AppleServer, Switch,
 CiscoSwitch, CiscoRouter, Pool, Location, etc etc.  All those together
 with a command line interface to facilitate scripting is an app called
 clusto.  So, I'm a sysadmin and I'm using this tool.  I just bought a
 Load Balancer 5000.  I immediately put it into my system as a plain
 old LoadBalancer(manufacturer='Load Balancer', model='5000').  Later
 on I decide that I'd like clusto to be able to add servers to my fancy
 LoadBalancer5000 configuration.  Nobody else has implemented the
 functionality yet, so in true open source form, I dive in and do it
 myself:

 class LoadBalancer5000(LoadBalancer):
     meta_attrs = [('manufacturer',  'Load Balancer'), ('model', '5000')]

     def addServer(self, someserver):
         # magic
         pass

 done.  No futzing with the database, no diving into obscure parts of
 the code, nothing.  I just plop that class into the right path and it
 works.  With a clever command line and scripting interface it may even
 be useful.

If i were writing an app like that, i'd actually have some kind of
end-user commands:  create new type - LoadBalancer5000;  convert all
LoadBalancer + model=5000 to LoadBalancer5000.  i.e. i *would*
update the data, but i'd make it easy.  because the database is much
more efficient if you use a single horizontal column to differentiate
types.  if you have to dive into vertical attributes every time, that
greatly limits functionality.  what if i wanted to get a report of
25,000 objects and their types really quickly ?  would you rather
iterate through 25000 rows, or 25000 * total number of attributes,
apply complex rules on the client side to aggregate the attribute rows
together and determine types, etc ?  you're not really making the
best usage of the database in that case.
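
A rough sketch of the "convert" command suggested above, written against the stdlib `sqlite3` module so it is self-contained. The `type` column on `things`, the sample rows, and the `convert_type` helper are all assumptions for illustration. The point is only that a one-off UPDATE can retag existing rows, after which a type report reads a single column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE things (name TEXT PRIMARY KEY, type TEXT NOT NULL);
    CREATE TABLE thing_attrs (
        thing_name TEXT REFERENCES things(name), key TEXT, value TEXT);
    INSERT INTO things VALUES ('lb1', 'LoadBalancer'), ('lb2', 'LoadBalancer'),
                              ('web1', 'Server');
    INSERT INTO thing_attrs VALUES
        ('lb1', 'model', '5000'), ('lb2', 'model', '3000'),
        ('web1', 'model', '5000');
""")


def convert_type(conn, old_type, new_type, key, value):
    """Retag every thing of old_type that has the given attribute pair."""
    cur = conn.execute(
        "UPDATE things SET type = ? "
        "WHERE type = ? AND name IN ("
        "    SELECT thing_name FROM thing_attrs WHERE key = ? AND value = ?)",
        (new_type, old_type, key, value),
    )
    conn.commit()
    return cur.rowcount  # number of rows that were retagged


converted = convert_type(conn, "LoadBalancer", "LoadBalancer5000",
                         "model", "5000")

# The report only touches the horizontal type column, never thing_attrs.
report = conn.execute(
    "SELECT type, COUNT(*) FROM things GROUP BY type ORDER BY type").fetchall()
print(converted, report)
```

Only `lb1` is retagged here: `lb2` has a different model, and `web1` has the right model but is not a LoadBalancer.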

this is actually not a unique scenario at all.  If you work with  
search engines, often you have to configure a combination of  
horizontal and vertical properties for documents which are  
stored.  the horizontal properties are those that can be searched  
very quickly, whereas the vertical are those which require  
secondary queries to retrieve (like the document's full list of  
metatags).

 theres no straightforward way for me to
 get a list of all the AppleServers, for example, since id have to
 query all these different attributes just to identify those objects.


 So, underneath the hood, to get all the AppleServers you'd do:

 ## pseudocode
 # all_meta_attrs is a list of all the meta_attrs for that class going
 # up the inheritance chain, cls.mro()
 for attr in SomeThingClass.all_meta_attrs:
     thingquery += and_(Attribute.c.key==attr[0],
                        Attribute.c.value==attr[1])

 select(and_(Thing.c.name==Attribute.c.name, thingquery))
 # that should get all the Things that can be managed by the given
 # class.  Maybe not straightforward but not terribly complex either.

 So, in my implementation, the metaclass mapped each Class to such a
 select.  I am mapping against different selectables, and so having
 different mappers made sense.  So if I did SA functions like:

 AppleServer.select(and_(Attribute.c.key=='numports',
                         Attribute.c.value=='2'))

 I'd only get AppleServers with ('numports', '2') and not any other
 types of Things.  At one point I got things working as I just
 described, but I'm not sure if that was the case in my latest
 iteration of the code.

you can still have a bunch of selects that you just feed into a Thing  
query.  its not critical to have them mapped.


 however, didn't you say that your class
 attributes come from a different table ?  in that case this is still
 not going to work...if you're relying upon eager loading of related,
 multiple sets of rows, that's not available until well after the
 polymorphic decisions have been made.  the most that polymorphic_func
 could get is the first row with the Thing's primary key in it.


 That's a good point.  I suppose the function could use that primary
 key to select stuff out of the Attributes table and then analyze those
 to determine the proper class.  But that seems like an unhappy hack.
 Why isn't there a hook into a post-populate part of the mapping?  Or
 whatever the absolute very last step of making an instance happens to
 be.  Does such a thing exist and I just missed it?

no, we'd have to add a hook there too.  every hook slows down  
sqlalchemy's load time just a little bit more, not because of the  
hook itself but because 

[sqlalchemy] Re: post_processors error during 0.3.10 to 0.4 migration (returning different object type based on db data)

2007-10-29 Thread Michael Bayer


On Oct 28, 2007, at 11:02 PM, Ron wrote:


 I have a datastore that consists of 3 tables.
  1. Thing table (just a primary-key name column)
  2. Attr table (key/value columns with an id and foreign key to Thing
 table)
  3. Thing-to-Thing relation table (Things can be 'connected' to each
 other)



heres some questions:

does the metaclass create a mapper() for every subclass ?  i.e. is  
there a mapper for Server, SunServer, etc ?  it seems like there is  
(since by the wrong mapper i meant, calling object_mapper(instance)  
returns a Server or SunServer mapper, not the Thing mapper which  
loaded the object).

if there is a distinct mapper per class, cant your metaclass simply  
assign a hidden type attribute to each class, a single string value  
which indicates what class to load ?  then you can just have a single  
type column in the thing table and SA's regular inheritance  
mechanisms take care of the rest.  the metaclass would only need to  
worry about things when the user first creates the object.  im not  
seeing any reasons here why this wouldnt work.
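
A plain-Python sketch of the "hidden type attribute" idea from the paragraph above. It does not use SQLAlchemy: in a real mapping the lookup would be handled by the mapper's single table inheritance support, whereas here a dictionary stands in for it. The `ThingMeta` metaclass, the `_thing_type` attribute, and the `load_row` helper are hypothetical names.

```python
class ThingMeta(type):
    """Hypothetical metaclass: tags each class with a type string."""
    registry = {}

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._thing_type = name.lower()
        ThingMeta.registry[cls._thing_type] = cls


class Thing(metaclass=ThingMeta):
    def __init__(self, name):
        self.name = name
        # Stored once, when the user first creates the object.
        self.type = type(self)._thing_type


class Server(Thing):
    pass


class SunServer(Server):
    pass


def load_row(row):
    """Build an instance of the class named by the row's type column."""
    cls = ThingMeta.registry.get(row["type"], Thing)
    obj = cls.__new__(cls)  # skip __init__, as a loader normally would
    obj.name = row["name"]
    obj.type = row["type"]
    return obj


saved = SunServer("db1")
row = {"name": saved.name, "type": saved.type}  # what the table would hold
loaded = load_row(row)
print(type(loaded).__name__, isinstance(loaded, Server))  # SunServer True
```

The trade-off is the one discussed in the thread: the type is fixed when the row is written, so adding a more specific class later means updating the stored type for existing rows.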

also when you say the user shouldnt have to know about SA, that  
suggests the other way I might do this, that Thing stays as Thing and  
the user-facing object is actually a wrapper for a Thing, and is not  
mapped to the ORM directly.  this makes it less convenient as the  
user-facing object cant be used in SA operations directly, you'd have  
to translate around sessions and queries.
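
A small sketch of the wrapper approach just described, again in plain Python with no ORM. `Thing` here is only a stand-in for the single mapped class, and `Driver`, `ServerDriver`, `accepts` and `wrap` are invented names. It shows the user-facing object holding a Thing and delegating data access to it, so the wrapper class can be chosen freely after the Thing has been fully loaded.

```python
class Thing(object):
    """Stand-in for the one mapped class; it only holds stored data."""

    def __init__(self, name, attrs=()):
        self.name = name
        self.attrs = list(attrs)  # (key, value) pairs


class Driver(object):
    """User-facing wrapper. It is not mapped; it just holds a Thing."""

    meta_attrs = []

    def __init__(self, thing):
        self._thing = thing

    def __getattr__(self, attr):
        # Only reached when normal lookup fails, so stored data such as
        # .name falls through to the wrapped Thing.
        if attr == "_thing":
            raise AttributeError(attr)
        return getattr(self._thing, attr)

    @classmethod
    def accepts(cls, thing):
        return all(pair in thing.attrs for pair in cls.meta_attrs)


class ServerDriver(Driver):
    meta_attrs = [("type", "server")]

    def ssh_command(self):
        return "ssh %s" % self.name


def wrap(thing, drivers):
    """Wrap a Thing in the first driver that accepts it."""
    for driver in drivers:
        if driver.accepts(thing):
            return driver(thing)
    return Driver(thing)


web1 = wrap(Thing("web1", [("type", "server")]), [ServerDriver])
print(type(web1).__name__, web1.ssh_command())  # ServerDriver ssh web1
```

As noted in the message, the cost is that the wrapper cannot be handed to session or query operations directly; code has to unwrap to the underlying Thing first.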

theres three examples in the distro you should look at:

examples/polymorph/single.py - single table inheritance
examples/vertical/vertical.py - stores vertical attributes  
similarly to your Thing
examples/elementtree/adjacency_list.py - this is mostly about
self-referential relationships, but also illustrates how a non-mapped
object can be supplied its internals by a mapped object (i.e. it's a
wrapper).

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: post_processors error during 0.3.10 to 0.4 migration (returning different object type based on db data)

2007-10-28 Thread Ron

Ok, I've figured out the problem but not really sure what the proper
solution is.

Basically, I have Thing objects that can have attributes associated
with them.  I have other classes that are subclasses of the Thing
object.  These classes can provide more specific functionality based
on the type of Thing it is.  Since Thing and its subclasses all share
the same table, I need a way to get the correct class based on what
type of Thing it is.  I do this by examining the Attributes associated
with a thing.  The different subclasses of Thing match different
attributes.  In 0.3 I did this by calling an instance._setProperClass()
function in the populate_instance method of a MapperExtension.  This
seems to make 0.4 angry.  If I call the same _setProperClass() after I
get the object normally everything seems to work fine.

I've attached a simplified version of what I do in my code to
illustrate the problem.


What I did was kind of a hack in 0.3 so I'm not that surprised that it
doesn't work in 0.4, but I'm not sure how else to achieve the
functionality I'm looking for.  Is there a better way to allow for
sqlalchemy to return objects of different types based on the data they
happen to contain?

-Ron

#!/usr/bin/env python

from sqlalchemy import *

from sqlalchemy.ext.sessioncontext import SessionContext
from sqlalchemy.ext.assignmapper import assign_mapper

from sqlalchemy.orm import * #Mapper, MapperExtension
from sqlalchemy.orm.mapper import Mapper

#from clusto.sqlalchemyhelpers import ClustoMapperExtension

import sys
# session context


METADATA = MetaData()

SESSION = scoped_session(sessionmaker(autoflush=True,
                                      transactional=True))

THING_TABLE = Table('things', METADATA,
                    Column('name', String(128), primary_key=True),
                    #Column('thingtype', String(128)),
                    mysql_engine='InnoDB'
                    )

ATTR_TABLE = Table('thing_attrs', METADATA,
                   Column('attr_id', Integer, primary_key=True),
                   Column('thing_name', String(128),
                          ForeignKey('things.name',
                                     ondelete="CASCADE",
                                     onupdate="CASCADE")),
                   Column('key', String(1024)),
                   Column('value', String),
                   mysql_engine='InnoDB'
                   )



class CustomMapperExtension(MapperExtension):

    def populate_instance(self, mapper, selectcontext, row, instance,
                          **flags):

        Mapper.populate_instance(mapper, selectcontext, instance, row,
                                 **flags)

        ## Causes problems if run here!
        instance._setProperClass()
        return EXT_CONTINUE






class Attribute(object):
    """
    Attribute class holds key/value pair backed by DB
    """
    def __init__(self, key, value, thing_name=None):
        self.key = key
        self.value = value

        if thing_name:
            self.thing_name = thing_name

    def __repr__(self):
        return "thingname: %s, keyname: %s, value: %s" % (self.thing_name,
                                                            self.key,
                                                            self.value)

    def delete(self):
        SESSION.delete(self)


SESSION.mapper(Attribute, ATTR_TABLE)


DRIVERLIST = {}

class Thing(object):
    """
    Anything
    """

    someattrs = (('klass', 'server'),)

    def __init__(self, name, *args, **kwargs):

        self.name = name

        for attr in self.someattrs:
            self.addAttr(*attr)

    def _setProperClass(self):
        """
        Set the class for the proper object to the best suited driver
        """

        if self.hasAttr('klass'):
            klass = self.getAttr('klass')

            self.__class__ = DRIVERLIST[klass]

    def getAttr(self, key, justone=True):
        """
        returns the first value of a given key.

        if justone is False then return all values for the given key.
        """

        attrlist = filter(lambda x: x.key == key, self._attrs)

        if not attrlist:
            raise KeyError(key)

        return justone and attrlist[0].value or [a.value for a in
                                                 attrlist]

    def hasAttr(self, key, value=None):

        if value:
            attrlist = filter(lambda x: x.key == key and x.value ==
                              value, self._attrs)
        else:
            attrlist = filter(lambda x: x.key == key, self._attrs)

        return attrlist and True or False

    def addAttr(self, key, value):
        """
        Add an attribute (key/value pair) to this Thing.

        Attribute keys can have multiple values.
        """
        self._attrs.append(Attribute(key, value))


SESSION.mapper(Thing, THING_TABLE,
               properties={'_attrs' : relation(Attribute, lazy=False,
                                               cascade='all, delete-orphan',),
                           },
               extension=CustomMapperExtension())

DRIVERLIST['thing'] = Thing

class Server(Thing):
    someattrs = (('klass', 'server'),)

    pass

DRIVERLIST['server'] = Server


[sqlalchemy] Re: post_processors error during 0.3.10 to 0.4 migration (returning different object type based on db data)

2007-10-28 Thread Michael Bayer


On Oct 28, 2007, at 6:58 PM, Ron wrote:


 Ok, I've figured out the problem but not really sure what the proper
 solution is.

 Basically, I have Thing objects that can have attributes associated
 with them.  I have other classes that are subclasses of the Thing
 object.  These classes can provide more specific functionality based
 on the type of Thing it is.  Since Thing and its subclasses all share
 the same table, I need a way to get the correct class based on what
 type of Thing it is.  I do this by examining the Attributes associated
 with a thing.  The different subclasses of Thing match different
 attributes.  In 0.3 I did this by calling an instance._setProperClass()
 function in the populate_instance method of a MapperExtension.  This
 seems to make 0.4 angry.  If I call the same _setProperClass() after I
 get the object normally everything seems to work fine.

 I've attached a simplified version of what I do in my code to
 illustrate the problem.


 What I did was kind of a hack in 0.3 so I'm not that surprised that it
 doesn't work in 0.4, but I'm not sure how else to achieve the
 functionality I'm looking for.  Is there a better way to allow for
 sqlalchemy to return objects of different types based on the data they
 happen to contain?

OK, there was an original intended way for this to happen via
MapperExtension: you'd do it in create_instance() - just return
whatever type of object you want. however, from looking at the error
you're getting, this is actually not going to fix the problem here.

the class of object determines which mapper is used to populate its  
attributes.   the populate step and the new-in-0.4 post-populate  
step are not communicating here because the official mapper for your  
instance changes midway (its based on class).  while I can make the  
post-populate step ignore the mis-communication and just not fire  
off, or i can change how those two steps communicate, the fact  
remains that the *wrong* mapper populates your class (including in  
your 0.3 version)...so im not sure if its the right approach to  
support wrongish behavior like that.

the official way to have the class and populating mapper selected  
based on attributes is using polymorphic inheritance.  if your  
classes are all in an inheritance hierarchy, and theres a single  
attribute that can determine which is the right class, you can use  
that out of the box.  it seems like your model could conform to that  
since its a single type attribute determining the class.  I'd  
suggest trying to work with that model (single table inheritance).   
Otherwise ill probably have to add another MapperExtension hook to  
support this.








[sqlalchemy] Re: post_processors error during 0.3.10 to 0.4 migration (returning different object type based on db data)

2007-10-28 Thread Ron

So, the code I posted is a much simplified version of what I'm trying
to accomplish only used to illustrate the error I was getting.  What I
actually want to do is select the appropriate class based on any
number of Attributes a Thing might have.  I have a metaclass that is
applied to Thing and all its subclasses.  This metaclass does the
actual call to mapper, creates the select query to map against for the
various subclasses, and builds a DRIVERLIST dictionary with data that
can be used by setProperClass.  In other words, the type of a given
Thing is determined by its attributes at runtime, not when the Thing
is created.

I didn't run into any functional problems doing it this way in 0.3 so
I'm not sure what you mean by wrong mapper (I used assign_mapper if
that makes any difference).  The reason I did the setProperClass at
the end of the populate_instance function is because I wanted to make
use of the attrs that the mapper would populate.  That seemed like the
easiest way to accomplish the goal at the time.

I've read the Mapping Class Inheritance Hierarchies section in the
documentation and it looks like they won't quite do what I'm trying to
accomplish.  Maybe if I explained my app architecture a little more
you could clarify the solution a bit (sorry, this ended up being more
verbose than I intended):

I have a datastore that consists of 3 tables.
 1. Thing table (just a primary-key name column)
 2. Attr table (key/value columns with an id and foreign key to Thing
table)
 3. Thing-to-Thing relation table (Things can be 'connected' to each
other)

The idea for that schema is to maximize the flexibility of what one
can store.  In this vein I created a Thing class.  This class has many
methods for managing attributes, connections between Things,
searching, matching, clever __init__, __str__, __eq__, etc.  The
design is such that subclasses of Thing only need to set class
variables to achieve certain functionality.  For example, there is a
meta_attrs list that will pre-fill the attributes for an object and
there is also a required_attrs var that will let you define required
arguments to init.  My goal was to make subclasses or 'drivers' as
simple as possible.

To expand on the Server example from my testcode, say I had this
class:

class Server(Thing):
    meta_attrs = [('type', 'server')]

    def ssh(self):
        # start an ssh session to this server
        somemagic()

People use that class and do things like:

someserver.addAttr('manufacturer', 'sun')

adding lots of data to the db.  Then later someone decides that sun
servers have some special functionality that should be exposed, say cd
ejection.  They create a new class:

class SunServer(Server):
    meta_attrs = [('manufacturer', 'sun')]

    def ejectCD(self):
        # eject the cd
        pass

It should be that easy.  Now, some things I didn't mention earlier.
The meta_attrs of all parent classes also get applied to new
objects.  So any  s=SunServer()  will have both ('type', 'server') and
('manufacturer', 'sun') attributes.  Also, now that I have this new
SunServer class any time I select something from the database that
matches all its meta_attrs it should return a SunServer object where
it used to return a regular Server object.

ex.

t1 = Thing.query.filter(Thing.c.name == 'someOldSunServer').one()
isinstance(t1, SunServer) == True

This is without updating the database, or making any other change
aside from adding that new SunServer class.  So, where does sqlalchemy
fit into all this?  People developing drivers shouldn't ever have to
know about SA (they, of course, could make use of it if they want).
The Thing class has a __metaclass__ that takes care of all the SA
magic so every subclass of Thing is taken care of.  However, I'm not
sure how to get the polymorphic stuff in the SA mapper to match against
arbitrary attributes of classes (specifically those named in the
meta_attrs).  Should I not rely on SA to return any specific type at
all (just always return Thing) and make my own query/select functions
that call _setProperClass on their own?
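
A minimal sketch of that last option, assuming the loader always hands back plain Thing objects and the class is fixed up afterwards. It uses no SQLAlchemy: `fake_fetch` stands in for a real query and `load_things` is a hypothetical helper name. The `DRIVERLIST` dictionary and `_setProperClass` follow the names in the test script, with the attribute storage simplified to a dict.

```python
DRIVERLIST = {}


class Thing(object):
    def __init__(self, name, attrs):
        self.name = name
        self._attrs = dict(attrs)

    def _setProperClass(self):
        # Swap in the driver class named by the 'klass' attribute, if any.
        klass = self._attrs.get("klass")
        if klass in DRIVERLIST:
            self.__class__ = DRIVERLIST[klass]


class Server(Thing):
    def ssh(self):
        return "ssh " + self.name


DRIVERLIST["thing"] = Thing
DRIVERLIST["server"] = Server


def load_things(fetch):
    """Run any loader that returns plain Things, then fix up each class."""
    things = list(fetch())
    for thing in things:
        thing._setProperClass()
    return things


def fake_fetch():
    # Stand-in for a real query; everything comes back as a base Thing.
    return [Thing("web1", {"klass": "server"}), Thing("misc", {})]


loaded = load_things(fake_fetch)
print([type(t).__name__ for t in loaded])  # ['Server', 'Thing']
```

Because the class swap happens only after loading has finished, nothing inside the load step sees the class change partway through.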

I'm trying to take advantage of as much of the SA magic as possible,
but I'm unsure when I am going beyond its scope and some of the more
advanced topics are not quite documented enough for me to fully
understand how to use them.

Thanks for the help,
-Ron


