[sqlalchemy] Re: post_processors error during 0.3.10 to 0.4 migration (returning different object type based on db data)
you could just be using one mapper for all the classes here. it's almost like you should monkeypatch class_mapper() and object_mapper() and just be done with it. of course the reason mappers are usually specific to a class is because every class would have completely different attributes and relations. but it seems here that is not the case.

Well, the subclasses actually map to a select on the table while Thing maps to the whole table. So the classes mapped over different sets even though they had identical attributes and relations. But the primary reason for the additional classes is to get additional methods/functionality, so I may be able to drop the added complexity of mapping to selects.

How do you reuse a mapper on additional classes?

    class Foo(object):
        pass

    foo_mapper = mapper(...)

    class Bar(Foo):
        pass

then what? foo_mapper.addClass(?) or something ??

so, you want people to say:

    s = Server(model='apple')

then later, they ...upgrade ? by v0.1 - v0.2 you mean a new copy of your framework ? or just hypothetical versions of the user's application ? then they say:

    s = some_query.get_my_thing(criterion)

and they get back an AppleServer, which is some kind of improvement over Server.

I guess I should explain further what I'm trying to build. My app is a tool to help with managing clusters. This includes everything from Datacenters, Racks, Servers, Switches, etc. I don't know ahead of time the sorts of specific Things and functionality that might be useful, so I'm trying to make things as flexible, generic, and easy to extend as possible. So at the core I suppose you could call this Thing/Attributes abstraction the framework upon which I and others would code drivers like Server, SunServer, AppleServer, Switch, CiscoSwitch, CiscoRouter, Pool, Location, etc. All those together with a command line interface to facilitate scripting is an app called clusto.

So, I'm a sysadmin and I'm using this tool. I just bought a Load Balancer 5000.
I immediately put it into my system as a plain old LoadBalancer(manufacturer='Load Balancer', model='5000'). Later on I decide that I'd like clusto to be able to add servers to my fancy LoadBalancer5000 configuration. Nobody else has implemented the functionality yet, so in true open source form, I dive in and do it myself:

    class LoadBalancer5000(LoadBalancer):
        meta_attrs = [('manufacturer', 'Load Balancer'), ('model', '5000')]

        def addServer(self, someserver):
            # magic
            pass

done. No futzing with the database, no diving into obscure parts of the code, nothing. I just plop that class into the right path and it works. With a clever command line and scripting interface it may even be useful.

would they ever get a Server back again ? if not, why does the database need to change ? why not just map AppleServer to server ?

Because there might also be a SunServer and a FooBarServer and an AlphaServer.

also, aren't you concerned about query overhead here ? with all your objects being completely homogenized into a vertical structure and all, that is.

Yeah, this tool isn't built for speed or high load. It's built for flexibility and usefulness. If I want speed I'll figure out caching and optimization later. Also, it is version 0.0001, so it's acting, in part, as a proof of concept.

there's no straightforward way for me to get a list of all the AppleServers, for example, since i'd have to query all these different attributes just to identify those objects.

So, underneath the hood, to get all the AppleServers you'd do:

    ## pseudocode
    for attr in SomeThingClass.all_meta_attrs:
        # all_meta_attrs is a list of all the meta_attrs for that class
        # going up the inheritance chain, cls.mro()
        thingquery += and_(Attribute.c.key==attr[0], Attribute.c.value==attr[1])

    select(and_(Thing.c.name==Attribute.c.name, thingquery))
    # that should get all the Things that can be managed by the given class.

Maybe not straightforward but not terribly complex either.
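A side note on the pseudocode above: a single attribute row cannot satisfy two different key/value tests at once, so AND-ing the per-attribute conditions over one join would stop matching as soon as a class has more than one meta attribute. The following is a minimal sketch of one way around that, written against plain sqlite3 rather than SQLAlchemy. It counts matching attribute rows per thing. The table and column names follow the test script posted in this thread; `things_matching` is a hypothetical helper, not part of clusto or SQLAlchemy.

```python
import sqlite3


def things_matching(conn, meta_attrs):
    """Return sorted names of things carrying every (key, value) pair given."""
    pairs = sorted(set(meta_attrs))
    if not pairs:
        # No constraints, so every thing qualifies.
        return [r[0] for r in conn.execute("SELECT name FROM things ORDER BY name")]
    # One test per pair, OR-ed together; a thing qualifies only when the
    # number of distinct pairs it matched equals the number requested.
    pair_test = " OR ".join(["(key = ? AND value = ?)"] * len(pairs))
    sql = (
        "SELECT thing_name FROM ("
        " SELECT DISTINCT thing_name, key, value FROM thing_attrs"
        " WHERE " + pair_test +
        ") GROUP BY thing_name HAVING COUNT(*) = ? ORDER BY thing_name"
    )
    params = [item for pair in pairs for item in pair] + [len(pairs)]
    return [r[0] for r in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE things (name TEXT PRIMARY KEY);
    CREATE TABLE thing_attrs (
        attr_id INTEGER PRIMARY KEY,
        thing_name TEXT REFERENCES things(name),
        key TEXT,
        value TEXT
    );
""")
rows = [
    ("web1", "type", "server"), ("web1", "manufacturer", "apple"),
    ("web2", "type", "server"), ("web2", "manufacturer", "sun"),
    ("sw1", "type", "switch"), ("sw1", "manufacturer", "apple"),
]
conn.executemany("INSERT OR IGNORE INTO things (name) VALUES (?)",
                 [(r[0],) for r in rows])
conn.executemany(
    "INSERT INTO thing_attrs (thing_name, key, value) VALUES (?, ?, ?)", rows)

# Pairs an AppleServer class would inherit through its MRO (illustrative).
apple_server_attrs = [("type", "server"), ("manufacturer", "apple")]
apple_servers = things_matching(conn, apple_server_attrs)
```

Here only `web1` carries both pairs; `sw1` is made by apple but is not a server, so it is excluded.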
So, in my implementation, the metaclass mapped each class to such a select. I am mapping against different selectables, and so having different mappers made sense. So if I did SA functions like:

    AppleServer.select(and_(Attribute.c.key=='numports', Attribute.c.value=='2'))

I'd only get AppleServers with ('numports', '2') and not any other types of Things. At one point I got things working as I just described, but I'm not sure if that was the case in my latest iteration of the code.

I'm still soaking in these examples. I think what I really want is to have mapper accept something like polymorphic_func and base_class. So I would pass it my _setProperClass function and Thing. The mapper will build against Thing and then run _setProperClass against the instance. Yeah, I'm cheating, cause that's kind of basically what I'm doing now. I'm just not sure how else to achieve the functionality I'm looking for.

ah well making polymorphic_on optionally a callable
[sqlalchemy] Re: post_processors error during 0.3.10 to 0.4 migration (returning different object type based on db data)
On Oct 30, 2007, at 1:35 PM, Ron wrote:

I guess I should explain further what I'm trying to build. My app is a tool to help with managing clusters. This includes everything from Datacenters, Racks, Servers, Switches, etc. I don't know ahead of time the sorts of specific Things and functionality that might be useful, so I'm trying to make things as flexible, generic, and easy to extend as possible. So at the core I suppose you could call this Thing/Attributes abstraction the framework upon which I and others would code drivers like Server, SunServer, AppleServer, Switch, CiscoSwitch, CiscoRouter, Pool, Location, etc. All those together with a command line interface to facilitate scripting is an app called clusto.

So, I'm a sysadmin and I'm using this tool. I just bought a Load Balancer 5000. I immediately put it into my system as a plain old LoadBalancer(manufacturer='Load Balancer', model='5000'). Later on I decide that I'd like clusto to be able to add servers to my fancy LoadBalancer5000 configuration. Nobody else has implemented the functionality yet, so in true open source form, I dive in and do it myself:

    class LoadBalancer5000(LoadBalancer):
        meta_attrs = [('manufacturer', 'Load Balancer'), ('model', '5000')]

        def addServer(self, someserver):
            # magic
            pass

done. No futzing with the database, no diving into obscure parts of the code, nothing. I just plop that class into the right path and it works. With a clever command line and scripting interface it may even be useful.

If i were writing an app like that, i'd actually have some kind of end-user commands: create new type - LoadBalancer5000; convert all LoadBalancer + model=5000 to LoadBalancer5000. i.e. i *would* update the data, but i'd make it easy. because the database is much more efficient if you use a single horizontal column to differentiate types. if you have to dive into vertical attributes every time, that greatly limits functionality.
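The "create new type, then convert" commands suggested above could look roughly like the following. This is only a sketch in plain sqlite3, not SQLAlchemy or clusto code. The `thingtype` column name is borrowed from the commented-out column in the test script posted in this thread, and `convert_type` is a hypothetical helper.

```python
import sqlite3


def convert_type(conn, old_type, new_type, required_attrs):
    """Retag every thing of old_type that has all required_attrs as new_type.

    Returns the number of rows updated.
    """
    sql = "UPDATE things SET thingtype = ? WHERE thingtype = ?"
    params = [new_type, old_type]
    for key, value in required_attrs:
        # One EXISTS test per vertical attribute the thing must carry.
        sql += (
            " AND EXISTS (SELECT 1 FROM thing_attrs a"
            " WHERE a.thing_name = things.name AND a.key = ? AND a.value = ?)"
        )
        params += [key, value]
    return conn.execute(sql, params).rowcount


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE things (name TEXT PRIMARY KEY, thingtype TEXT);
    CREATE TABLE thing_attrs (
        attr_id INTEGER PRIMARY KEY,
        thing_name TEXT REFERENCES things(name),
        key TEXT,
        value TEXT
    );
    INSERT INTO things VALUES ('lb1', 'LoadBalancer');
    INSERT INTO things VALUES ('lb2', 'LoadBalancer');
    INSERT INTO things VALUES ('web1', 'Server');
    INSERT INTO thing_attrs (thing_name, key, value) VALUES
        ('lb1', 'manufacturer', 'Load Balancer'),
        ('lb1', 'model', '5000'),
        ('lb2', 'manufacturer', 'Load Balancer'),
        ('lb2', 'model', '3000');
""")

converted = convert_type(
    conn, "LoadBalancer", "LoadBalancer5000",
    [("manufacturer", "Load Balancer"), ("model", "5000")],
)

# After the one-time conversion, a type report reads a single column
# and never touches the attribute table.
report = dict(conn.execute(
    "SELECT thingtype, COUNT(*) FROM things GROUP BY thingtype"))
```

The vertical attributes are consulted once, at conversion time; afterwards the type of each row is available without joining against the attribute table.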
what if i wanted to get a report of 25,000 objects and their types really quickly ? would you rather iterate through 25000 rows, or 25000 * total number of attributes, apply complex rules on the client side to aggregate the attribute rows together and determine types, etc ? you're not really making the best usage of the database in that case.

this is actually not a unique scenario at all. If you work with search engines, often you have to configure a combination of horizontal and vertical properties for documents which are stored. the horizontal properties are those that can be searched very quickly, whereas the vertical are those which require secondary queries to retrieve (like the document's full list of metatags).

there's no straightforward way for me to get a list of all the AppleServers, for example, since i'd have to query all these different attributes just to identify those objects.

So, underneath the hood, to get all the AppleServers you'd do:

    ## pseudocode
    for attr in SomeThingClass.all_meta_attrs:
        # all_meta_attrs is a list of all the meta_attrs for that class
        # going up the inheritance chain, cls.mro()
        thingquery += and_(Attribute.c.key==attr[0], Attribute.c.value==attr[1])

    select(and_(Thing.c.name==Attribute.c.name, thingquery))
    # that should get all the Things that can be managed by the given class.

Maybe not straightforward but not terribly complex either. So, in my implementation, the metaclass mapped each class to such a select. I am mapping against different selectables, and so having different mappers made sense. So if I did SA functions like:

    AppleServer.select(and_(Attribute.c.key=='numports', Attribute.c.value=='2'))

I'd only get AppleServers with ('numports', '2') and not any other types of Things. At one point I got things working as I just described, but I'm not sure if that was the case in my latest iteration of the code.

you can still have a bunch of selects that you just feed into a Thing query. it's not critical to have them mapped.
however, didn't you say that your class attributes come from a different table ? in that case this is still not going to work...if you're relying upon eager loading of related, multiple sets of rows, that's not available until well after the polymorphic decisions have been made. the most that polymorphic_func could get is the first row with the Thing's primary key in it.

That's a good point. I suppose the function could use that primary key to select stuff out of the Attributes table and then analyze those to determine the proper class. But that seems like an unhappy hack. Why isn't there a hook into a post-populate part of the mapping? Or whatever the absolute very last step of making an instance happens to be. Does such a thing exist and I just missed it?

no, we'd have to add a hook there too. every hook slows down sqlalchemy's load time just a little bit more, not because of the hook itself but because
[sqlalchemy] Re: post_processors error during 0.3.10 to 0.4 migration (returning different object type based on db data)
On Oct 28, 2007, at 11:02 PM, Ron wrote:

I have a datastore that consists of 3 tables.

1. Thing table (just a primary-key name column)
2. Attr table (key/value columns with an id and foreign key to Thing table)
3. Thing-to-Thing relation table (Things can be 'connected' to each other)

here's some questions: does the metaclass create a mapper() for every subclass ? i.e. is there a mapper for Server, SunServer, etc ? it seems like there is (since by the wrong mapper i meant, calling object_mapper(instance) returns a Server or SunServer mapper, not the Thing mapper which loaded the object).

if there is a distinct mapper per class, can't your metaclass simply assign a hidden type attribute to each class, a single string value which indicates what class to load ? then you can just have a single type column in the thing table and SA's regular inheritance mechanisms take care of the rest. the metaclass would only need to worry about things when the user first creates the object. i'm not seeing any reasons here why this wouldn't work.

also when you say the user shouldn't have to know about SA, that suggests the other way I might do this: Thing stays as Thing and the user-facing object is actually a wrapper for a Thing, and is not mapped to the ORM directly. this makes it less convenient as the user-facing object can't be used in SA operations directly; you'd have to translate around sessions and queries.

there's three examples in the distro you should look at:

examples/polymorph/single.py - single table inheritance
examples/vertical/vertical.py - stores vertical attributes similarly to your Thing
examples/elementtree/adjacency_list.py - this is mostly about self-referential relationships, but also illustrates how a non-mapped object can be supplied its internals by a mapped object (i.e. it's a wrapper).

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
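The "hidden type attribute" idea from the message above can be sketched without any ORM. In this toy version a metaclass records each subclass under a string derived from its class name, which is the value a single type column would hold. The names `ThingMeta`, `_thing_type`, `TYPE_REGISTRY` and `load_thing` are assumptions made for illustration, and the dictionary stands in for a database row; SQLAlchemy's own single-table inheritance would perform this lookup itself.

```python
TYPE_REGISTRY = {}


class ThingMeta(type):
    """Give every class a hidden type string and remember it."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._thing_type = name.lower()
        TYPE_REGISTRY[cls._thing_type] = cls


class Thing(metaclass=ThingMeta):
    def __init__(self, name):
        self.name = name
        # Value that would be written to the single type column on save.
        self.thingtype = self._thing_type


class Server(Thing):
    pass


class SunServer(Server):
    def ejectCD(self):
        return "ejecting CD on %s" % self.name


def load_thing(row):
    """Build an instance of the class named by the row's type column."""
    cls = TYPE_REGISTRY.get(row["thingtype"], Thing)
    obj = cls.__new__(cls)  # bypass __init__, as a loader typically would
    obj.name = row["name"]
    obj.thingtype = row["thingtype"]
    return obj


created = SunServer("db7")
loaded = load_thing({"name": "db7", "thingtype": created.thingtype})
```

Driver authors only write the subclass; the metaclass handles the type string, which matches the goal that they never need to touch the persistence layer.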
[sqlalchemy] Re: post_processors error during 0.3.10 to 0.4 migration
On Oct 28, 2007, at 3:39 PM, Ron wrote:

I've been trying to migrate my code to 0.4 and I'm getting stuck on this error. I haven't been able to narrow down what property of my schema or code triggers this, but I thought I'd ask the group in case there was an easy answer.

Here Thing is a class that is mapped to a table with a single column. It has a relation to an attribute table with (thing_id, key, value) columns. I have a subclass of Thing called Server that, instead of mapping directly to the table, maps to a select on the thing table where the thing has certain attributes from the attribute table.

If I create a Server, then add attributes to it, then flush the data, I get no errors. But if I try to query for a Server to which I tried to add attributes, I get the attached error. Adding attributes straight to Things or querying for Servers that I didn't add attributes to does not produce the error.

Not sure if any of that was clear, but it's a start. Any ideas?

you'd have to attach your full table setup and mappings to have any idea how this error is occurring. i'd probably classify this as a bug, since if your mapping has something SA can't handle, it should be raising a specific error at compile time instead of randomly failing at query time.
[sqlalchemy] Re: post_processors error during 0.3.10 to 0.4 migration (returning different object type based on db data)
Ok, I've figured out the problem but am not really sure what the proper solution is.

Basically, I have Thing objects that can have attributes associated with them. I have other classes that are subclasses of the Thing object. These classes can provide more specific functionality based on the type of Thing it is. Since Thing and its subclasses all share the same table, I need a way to get the correct class based on what type of Thing it is. I do this by examining the Attributes associated with a thing. The different subclasses of Thing match different attributes.

In 0.3 I did this by calling an instance._setProperClass() function in the populate_instance method of a MapperExtension. This seems to make 0.4 angry. If I call the same _setProperClass() after I get the object normally, everything seems to work fine. I've attached a simplified version of what I do in my code to illustrate the problem.

What I did was kind of a hack in 0.3, so I'm not that surprised that it doesn't work in 0.4, but I'm not sure how else to achieve the functionality I'm looking for. Is there a better way to allow for sqlalchemy to return objects of different types based on the data they happen to contain?
-Ron

#!/usr/bin/env python
from sqlalchemy import *
from sqlalchemy.ext.sessioncontext import SessionContext
from sqlalchemy.ext.assignmapper import assign_mapper
from sqlalchemy.orm import *  # Mapper, MapperExtension
from sqlalchemy.orm.mapper import Mapper
#from clusto.sqlalchemyhelpers import ClustoMapperExtension
import sys

# session context
METADATA = MetaData()
SESSION = scoped_session(sessionmaker(autoflush=True, transactional=True))

THING_TABLE = Table('things', METADATA,
                    Column('name', String(128), primary_key=True),
                    #Column('thingtype', String(128)),
                    mysql_engine='InnoDB'
                    )

ATTR_TABLE = Table('thing_attrs', METADATA,
                   Column('attr_id', Integer, primary_key=True),
                   Column('thing_name', String(128),
                          ForeignKey('things.name',
                                     ondelete='CASCADE',
                                     onupdate='CASCADE')),
                   Column('key', String(1024)),
                   Column('value', String),
                   mysql_engine='InnoDB'
                   )


class CustomMapperExtension(MapperExtension):

    def populate_instance(self, mapper, selectcontext, row, instance, **flags):
        Mapper.populate_instance(mapper, selectcontext, instance, row, **flags)
        ## Causes problems if run here!
        instance._setProperClass()
        return EXT_CONTINUE


class Attribute(object):
    """
    Attribute class holds key/value pair backed by DB
    """

    def __init__(self, key, value, thing_name=None):
        self.key = key
        self.value = value
        if thing_name:
            self.thing_name = thing_name

    def __repr__(self):
        return "thingname: %s, keyname: %s, value: %s" % (self.thing_name,
                                                          self.key,
                                                          self.value)

    def delete(self):
        SESSION.delete(self)


SESSION.mapper(Attribute, ATTR_TABLE)

# maps a 'klass' attribute value to the class that should handle it
DRIVERLIST = {}


class Thing(object):
    """
    Anything
    """

    someattrs = (('klass', 'server'),)

    def __init__(self, name, *args, **kwargs):
        self.name = name
        for attr in self.someattrs:
            self.addAttr(*attr)

    def _setProperClass(self):
        """
        Set the class for the proper object to the best suited driver
        """
        if self.hasAttr('klass'):
            klass = self.getAttr('klass')
            self.__class__ = DRIVERLIST[klass]

    def getAttr(self, key, justone=True):
        """
        returns the first value of a given key.

        if justone is False then return all values for the given key.
        """
        attrlist = filter(lambda x: x.key == key, self._attrs)
        if not attrlist:
            raise KeyError(key)
        return justone and attrlist[0].value or [a.value for a in attrlist]

    def hasAttr(self, key, value=None):
        if value:
            attrlist = filter(lambda x: x.key == key and x.value == value,
                              self._attrs)
        else:
            attrlist = filter(lambda x: x.key == key, self._attrs)
        return attrlist and True or False

    def addAttr(self, key, value):
        """
        Add an attribute (key/value pair) to this Thing.

        Attribute keys can have multiple values.
        """
        self._attrs.append(Attribute(key, value))


SESSION.mapper(Thing, THING_TABLE,
               properties={'_attrs': relation(Attribute, lazy=False,
                                              cascade='all, delete-orphan',),
                           },
               extension=CustomMapperExtension())

DRIVERLIST['thing'] = Thing


class Server(Thing):
    someattrs = (('klass', 'server'),)
    pass


DRIVERLIST['server'] = Server
[sqlalchemy] Re: post_processors error during 0.3.10 to 0.4 migration (returning different object type based on db data)
On Oct 28, 2007, at 6:58 PM, Ron wrote:

Ok, I've figured out the problem but am not really sure what the proper solution is.

Basically, I have Thing objects that can have attributes associated with them. I have other classes that are subclasses of the Thing object. These classes can provide more specific functionality based on the type of Thing it is. Since Thing and its subclasses all share the same table, I need a way to get the correct class based on what type of Thing it is. I do this by examining the Attributes associated with a thing. The different subclasses of Thing match different attributes.

In 0.3 I did this by calling an instance._setProperClass() function in the populate_instance method of a MapperExtension. This seems to make 0.4 angry. If I call the same _setProperClass() after I get the object normally, everything seems to work fine. I've attached a simplified version of what I do in my code to illustrate the problem.

What I did was kind of a hack in 0.3, so I'm not that surprised that it doesn't work in 0.4, but I'm not sure how else to achieve the functionality I'm looking for. Is there a better way to allow for sqlalchemy to return objects of different types based on the data they happen to contain?

OK, there was an originally intended way for this to happen via MapperExtension: you'd do it in create_instance() - just return whatever type of object you want. however, from looking at the error you're getting, this is actually not going to fix the problem here. the class of object determines which mapper is used to populate its attributes. the populate step and the new-in-0.4 post-populate step are not communicating here because the official mapper for your instance changes midway (it's based on class).
while I can make the post-populate step ignore the mis-communication and just not fire off, or i can change how those two steps communicate, the fact remains that the *wrong* mapper populates your class (including in your 0.3 version)...so i'm not sure if it's the right approach to support wrongish behavior like that.

the official way to have the class and populating mapper selected based on attributes is using polymorphic inheritance. if your classes are all in an inheritance hierarchy, and there's a single attribute that can determine which is the right class, you can use that out of the box. it seems like your model could conform to that since it's a single type attribute determining the class. I'd suggest trying to work with that model (single table inheritance). Otherwise i'll probably have to add another MapperExtension hook to support this.
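To make the "wrong mapper" point above concrete, here is a toy model. It is not SQLAlchemy's actual internals: it only assumes that mappers are looked up by an instance's class, so reassigning `__class__` between the populate step and a later step makes the two steps see different mappers. `ToyMapper`, `MAPPERS` and `mapper_for` are hypothetical names.

```python
class ToyMapper:
    """Stand-in for a per-class mapper; it only remembers its class."""

    def __init__(self, cls):
        self.cls = cls

    def __repr__(self):
        return "ToyMapper(%s)" % self.cls.__name__


class Thing:
    pass


class Server(Thing):
    pass


# One mapper per class, found through the instance's current class.
MAPPERS = {Thing: ToyMapper(Thing), Server: ToyMapper(Server)}


def mapper_for(instance):
    return MAPPERS[type(instance)]


instance = Thing()
populating_mapper = mapper_for(instance)  # mapper that starts loading the row

# Equivalent of calling _setProperClass() in the middle of the load.
instance.__class__ = Server

post_populate_mapper = mapper_for(instance)  # mapper seen by the later step
steps_agree = populating_mapper is post_populate_mapper
```

If the class is instead chosen from a discriminator before the instance exists, as in polymorphic loading, both steps resolve the same mapper and the mismatch cannot arise.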
[sqlalchemy] Re: post_processors error during 0.3.10 to 0.4 migration (returning different object type based on db data)
So, the code I posted is a much simplified version of what I'm trying to accomplish, only used to illustrate the error I was getting. What I actually want to do is select the appropriate class based on any number of Attributes a Thing might have. I have a metaclass that is applied to Thing and all its subclasses. This metaclass does the actual call to mapper, creates the select query to map against for the various subclasses, and builds a DRIVERLIST dictionary with data that can be used by setProperClass. In other words, the type of a given Thing is determined by its attributes at runtime, not when the Thing is created.

I didn't run into any functional problems doing it this way in 0.3, so I'm not sure what you mean by wrong mapper (I used assign_mapper if that makes any difference). The reason I did the setProperClass at the end of the populate_instance function is because I wanted to make use of the attrs that the mapper would populate. That seemed like the easiest way to accomplish the goal at the time.

I've read the Mapping Class Inheritance Hierarchies section in the documentation and it looks like they won't quite do what I'm trying to accomplish. Maybe if I explained my app architecture a little more you could clarify the solution a bit (sorry, this ended up being more verbose than I intended):

I have a datastore that consists of 3 tables.

1. Thing table (just a primary-key name column)
2. Attr table (key/value columns with an id and foreign key to Thing table)
3. Thing-to-Thing relation table (Things can be 'connected' to each other)

The idea for that schema is to maximize the flexibility of what one can store. In this vein I created a Thing class. This class has many methods for managing attributes, connections between Things, searching, matching, clever __init__, __str__, __eq__, etc. The design is such that subclasses of Thing only need to set class variables to achieve certain functionality.
For example, there is a meta_attrs list that will pre-fill the attributes for an object, and there is also a required_attrs var that will let you define required arguments to init. My goal was to make subclasses or 'drivers' as simple as possible.

To expand on the Server example from my test code, say I had this class:

    class Server(Thing):
        meta_attrs = [('type', 'server')]

        def ssh(self):
            # start an ssh session to this server
            somemagic()

People use that class and do things like:

    someserver.addAttr('manufacturer', 'sun')

adding lots of data to the db. Then later someone decides that sun servers have some special functionality that should be exposed, say cd ejection. They create a new class:

    class SunServer(Server):
        meta_attrs = [('manufacturer', 'sun')]

        def ejectCD(self):
            # eject the cd
            pass

It should be that easy.

Now, some things I didn't mention earlier. All the meta_attrs of all parent classes also get applied to new objects. So any s=SunServer() will have both ('type', 'server') and ('manufacturer', 'sun') attributes. Also, now that I have this new SunServer class, any time I select something from the database that matches all its meta_attrs, it should return a SunServer object where it used to return a regular Server object. ex.

    t1 = Thing.query.filter(Thing.c.name == 'someOldSunServer').one()
    isinstance(t1, SunServer) == True

This is without updating the database, or making any other change aside from adding that new SunServer class.

So, where does sqlalchemy fit into all this? People developing drivers shouldn't ever have to know about SA (they, of course, could make use of it if they want). The Thing class has a __metaclass__ that takes care of all the SA magic, so every subclass of Thing is taken care of. However, I'm not sure how to get the polymorphic stuff in the SA mapper to match against arbitrary attributes of classes (specifically those named in the meta_attrs).
Should I not rely on SA to return any specific type at all (just always return Thing) and make my own query/select functions that call _setProperClass on their own? I'm trying to take advantage of as much of the SA magic as possible, but I'm unsure when I am going beyond its scope, and some of the more advanced topics are not quite documented enough for me to fully understand how to use them.

Thanks for the help,
-Ron
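One way to read the matching rule described in this message (a class manages a Thing when all of its inherited meta_attrs are present, and the most specific match wins) is sketched below in plain Python. It is independent of SQLAlchemy; `all_meta_attrs` as a classmethod, `DRIVERS`, `pick_driver` and the "most pairs wins" tie-break are assumptions made for the example, not the clusto implementation.

```python
class Thing:
    meta_attrs = []

    @classmethod
    def all_meta_attrs(cls):
        """Collect meta_attrs from this class and every ancestor in the MRO."""
        pairs = set()
        for klass in cls.mro():
            # vars() looks only at attributes defined directly on klass.
            pairs.update(vars(klass).get("meta_attrs", []))
        return pairs


class Server(Thing):
    meta_attrs = [("type", "server")]


class SunServer(Server):
    meta_attrs = [("manufacturer", "sun")]

    def ejectCD(self):
        return "cd ejected"


# Every known driver; dropping in a new class means adding it here.
DRIVERS = [Thing, Server, SunServer]


def pick_driver(attrs):
    """Return the driver whose inherited meta_attrs all appear in attrs.

    When several drivers match, the one requiring the most pairs wins.
    Thing requires nothing, so there is always at least one match.
    """
    present = set(attrs)
    matches = [d for d in DRIVERS if d.all_meta_attrs() <= present]
    return max(matches, key=lambda d: len(d.all_meta_attrs()))


old_sun_box = [("type", "server"), ("manufacturer", "sun"), ("numports", "2")]
plain_server = [("type", "server"), ("manufacturer", "apple")]
```

A custom query function that always loads plain Things could call `pick_driver` on the loaded attributes afterwards, which is the "call _setProperClass on their own" option raised above.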