Re: [Tutor] dictionary dispatch for object instance attributes question
Ah, see, I should convince my bosses that I need a Python interpreter. Of course, then they'd ask what Python was, and why I was thinking about it at work.

Duh, I was just reading the docs, and I kept thinking that an attribute was just a class variable.

Thanks, Kent, now I have all sorts of interesting experiments to undertake...

On Sun, 20 Feb 2005 07:26:54 -0500, Kent Johnson [EMAIL PROTECTED] wrote:

Liam Clarke wrote:

Hi, just an expansion on Brian's query, is there a variant of getattr for instance methods? i.e.

    class DBRequest:
        def __init__(self, fields, action):
            self.get(fields)

        def get(self, fields):
            print fields

Instead of self.get in __init__, use the value of action to call a function? Or, is it going to have to be dictionary dispatch?

I don't understand your example, but instance methods are attributes too, so getattr() works with them as well as simple values.

An instance method is an attribute of the class whose value is a function. When you access the attribute on an instance you get something called a 'bound method' which holds a reference to the actual function and a reference to the instance (to pass to the function as the 'self' parameter). You can call a bound method just like any other function. So:

    >>> class foo:
    ...     def __init__(self):
    ...         self.bar = 3
    ...     def baz(self):
    ...         print self.bar
    ...
    >>> f=foo()

getattr() of a simple attribute:

    >>> getattr(f, 'bar')
    3

getattr() of an instance method returns a 'bound method':

    >>> getattr(f, 'baz')
    <bound method foo.baz of <__main__.foo instance at 0x008D5FD0>>

Calling the bound method (note the added ()) is the same as calling the instance method directly:

    >>> getattr(f, 'baz')()
    3

Of course you can do the same thing with dot notation for attributes:

    >>> b=f.baz
    >>> b
    <bound method foo.baz of <__main__.foo instance at 0x008D5FD0>>
    >>> b()
    3

Kent
___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor

--
'There is only one basic human right, and that is to do as you damn well please. And with it comes the only basic human duty, to take the consequences.'
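Liam's question (using the value of action to pick the method) can be sketched with getattr() as follows. This is a minimal illustration in Python 3 syntax, not code from the thread: the delete method, the result attribute and the returned strings are hypothetical additions made only so the example has something to show.

```python
class DBRequest:
    """Hypothetical request class; 'delete' and 'result' are illustrative."""

    def __init__(self, fields, action):
        # getattr() returns a bound method, which is then called like a function.
        handler = getattr(self, action)
        self.result = handler(fields)

    def get(self, fields):
        return "get " + ", ".join(fields)

    def delete(self, fields):
        return "delete " + ", ".join(fields)


req = DBRequest(["name", "email"], "get")
print(req.result)
```

An unknown action raises AttributeError here. If action can come from outside the program, it is probably wise to check it against a list of allowed names first, since getattr() will happily return any attribute of the instance.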
Re: [Tutor] dictionary dispatch for object instance attributes question
Brian van den Broek wrote:

My Node class defines a _parse method which separates out the node header, and sends those lines to a _parse_metadata method. This is where the elif chain occurs -- each line of the metadata starts with a tag like dt= and I need to recognize each tag and set the appropriate Node object attribute, such as .document_type. (I do not want to rely on the unhelpful names for the tags in the file format, preferring more self-documenting attribute names.)

I've come up with *a* way to use a dictionary dispatch, but I'll wager a great deal it isn't the *best* way. Here is a minimal illustration of what I have come up with:

<code>
class A:

    def __init__(self):
        self.something = None
        self.something_else = None
        self.still_another_thing = None

    def update(self, data):
        for key in metadata_dict:
            if data.startswith(key):
                exec('''self.%s = """%s"""''' %(metadata_dict[key],
                     data[len(key):]))
                # triple quotes as there may be quotes in metadata
                # values
                break

metadata_dict = {'something_tag=': 'something',
                 '2nd_tag=': 'something_else',
                 'last=': 'still_another_thing'}

a = A()
print a.still_another_thing
a.update('last=the metadata value for the last= metadata tag')
print a.still_another_thing
</code>

<output>
None
the metadata value for the last= metadata tag
</output>

So, it works. Yay :-) But, should I be doing it another way?

Another way to do this is to use dispatch methods. If you have extra processing to do for each tag, this might be a good way to go. I'm going to assume that your data lines have the form 'tag=data'. Then your Node class might have methods that look like this:

    class Node:
        ...
        def parse_metadata(self, line):
            tag, data = line.split('=', 1)
            try:
                handler = getattr(self, 'handle_' + tag)
            except AttributeError:
                print 'Unexpected tag:', tag, data
            else:
                handler(data)

        def handle_something_tag(self, data):
            self.something = int(data) # for example

        def handle_last(self, data):
            try:
                self.another_thing.append(data) # attribute is a list
            except AttributeError:
                self.another_thing = [data]

and so on. This organization avoids any if / else chain and puts all the processing for each tag in a single place.

BTW the try / except / else idiom is used here to avoid catching unexpected exceptions. The naive way to write it would be

    try:
        handler = getattr(self, 'handle_' + tag)
        handler(data)
    except AttributeError:
        print 'Unexpected tag:', tag, data

The problem with this is that if handler() raises AttributeError you will get an unhelpful error message and no stack trace. Putting the call to handler() in an else clause puts it out of the scope of the try / except but it will still be executed only if the getattr succeeds.

Also, I know the general security concerns about things like exec. They make me nervous in using it, even though I am (as yet) the sole user. Am I right in thinking that the constrained way I am using it here protects me? My code uses most of the attributes as a simple storage container for later rewriting of the file, but in a few cases they enter into (safe seeming) conditionals like:

    if 'text' == self.document_type:
        self.do_text_stuff()
    if 'RTF' == self.document_type:
        self.do_RTF_stuff()

Conditionals on a 'type' flag are a code smell that suggests using subclasses. Maybe you should have a TextNode class and an RtfNode class. Then the above becomes just

    self.do_stuff()

and TextNode and RtfNode each have the appropriate implementations of do_stuff().

I'm not saying this is the right choice for you, just something you might consider.

Kent

Thanks and best to all,

Brian vdB
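The point about try / except / else can be seen by running both forms against a handler that itself raises AttributeError. The sketch below is in Python 3 syntax; the errors list, the parse_metadata_naive name and the deliberately broken handle_buggy handler are hypothetical, added only to make the difference observable.

```python
class Node:
    def __init__(self):
        self.errors = []  # hypothetical: collects messages instead of printing

    def parse_metadata(self, line):
        tag, data = line.split("=", 1)
        try:
            handler = getattr(self, "handle_" + tag)
        except AttributeError:
            self.errors.append("Unexpected tag: %s" % tag)
        else:
            # Outside the try: an AttributeError raised in here propagates.
            handler(data)

    def parse_metadata_naive(self, line):
        tag, data = line.split("=", 1)
        try:
            handler = getattr(self, "handle_" + tag)
            handler(data)  # errors raised in here are swallowed below
        except AttributeError:
            self.errors.append("Unexpected tag: %s" % tag)

    def handle_buggy(self, data):
        # Deliberate bug: strings have no such method.
        return data.no_such_method()


naive = Node()
naive.parse_metadata_naive("buggy=x")  # bug misreported as an unknown tag

careful = Node()
try:
    careful.parse_metadata("buggy=x")
    bug_surfaced = False
except AttributeError:
    bug_surfaced = True  # the real error reaches the caller, with a traceback
```

With the naive form the handler's bug is recorded as "Unexpected tag: buggy", even though the tag was recognised; with the else clause the original AttributeError is left alone.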
Re: [Tutor] dictionary dispatch for object instance attributes question
Kent Johnson wrote:

Another way to do this is to use dispatch methods. If you have extra processing to do for each tag, this might be a good way to go. I'm going to assume that your data lines have the form 'tag=data'. Then your Node class might have methods that look like this:

    class Node:
        ...
        def parse_metadata(self, line):
            tag, data = line.split('=', 1)
            try:
                handler = getattr(self, 'handle_' + tag)
            except AttributeError:
                print 'Unexpected tag:', tag, data
            else:
                handler(data)

        def handle_something_tag(self, data):
            self.something = int(data) # for example

        def handle_last(self, data):
            try:
                self.another_thing.append(data) # attribute is a list
            except AttributeError:
                self.another_thing = [data]

and so on. This organization avoids any if / else chain and puts all the processing for each tag in a single place.

One more idea. If you have 20 different tags but only four different ways of processing them, maybe you want to use a dict that maps from the tag name to a tuple of (attribute name, processing method). With this approach you need only four handler methods instead of 20. It would look like this:

    def __init__(self):
        # built per instance, because the values are bound methods of self
        self.metadata_dict = {
            'something_tag' : ( 'something', self.handle_int ),
            'last' : ( 'another_thing', self.handle_list ),
        }

    def parse_metadata(self, line):
        tag, data = line.split('=', 1)
        try:
            attr_name, handler = self.metadata_dict[tag]
        except KeyError:  # a failed dict lookup raises KeyError
            print 'Unexpected tag:', tag, data
        else:
            handler(attr_name, data)

    def handle_int(self, attr_name, data):
        setattr(self, attr_name, int(data))

    def handle_list(self, attr_name, data):
        l = getattr(self, attr_name, [])
        l.append(data)
        setattr(self, attr_name, l)

I-have-to-stop-replying-to-my-own-posts-ly yours,

Kent
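A runnable sketch of the table-driven idea, in Python 3 syntax. It makes two choices that are assumptions rather than anything stated in the thread: the table lives at class level and stores handler names as strings (self does not exist yet when a class-level dict is built), and the lookup catches KeyError, which is what a failed dict lookup raises.

```python
class Node:
    # tag -> (attribute name, handler method name); tags taken from the post
    metadata_dict = {
        "something_tag": ("something", "handle_int"),
        "last": ("another_thing", "handle_list"),
    }

    def parse_metadata(self, line):
        tag, data = line.split("=", 1)
        try:
            attr_name, handler_name = self.metadata_dict[tag]
        except KeyError:
            print("Unexpected tag:", tag, data)
        else:
            # Resolve the name to a bound method only when it is needed.
            getattr(self, handler_name)(attr_name, data)

    def handle_int(self, attr_name, data):
        setattr(self, attr_name, int(data))

    def handle_list(self, attr_name, data):
        values = getattr(self, attr_name, [])  # fresh list on first use
        values.append(data)
        setattr(self, attr_name, values)


n = Node()
for line in ["something_tag=42", "last=first", "last=second=with=equals"]:
    n.parse_metadata(line)
```

Because split('=', 1) splits only on the first '=', values that themselves contain '=' survive intact.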
Re: [Tutor] dictionary dispatch for object instance attributes question
Kent Johnson said unto the world upon 2005-02-16 05:58:

Brian van den Broek wrote:

SNIP Kent's useful explanation of how to use handlers

Also, I know the general security concerns about things like exec. They make me nervous in using it, even though I am (as yet) the sole user. Am I right in thinking that the constrained way I am using it here protects me? My code uses most of the attributes as a simple storage container for later rewriting of the file, but in a few cases they enter into (safe seeming) conditionals like:

    if 'text' == self.document_type:
        self.do_text_stuff()
    if 'RTF' == self.document_type:
        self.do_RTF_stuff()

Conditionals on a 'type' flag are a code smell that suggests using subclasses. Maybe you should have a TextNode class and an RtfNode class. Then the above becomes just

    self.do_stuff()

and TextNode and RtfNode each have the appropriate implementations of do_stuff().

I'm not saying this is the right choice for you, just something you might consider.

Kent

Hi Kent,

thanks for the snipped discussion on handlers -- very useful.

As for the code smell thing, I have a follow-up question. I now get the point of the type-based conditional being a smell for classes. (I get it due to a previous thread that an over-enthusiastic inbox purge prevents me from citing with certainty, but I think it was Bill and Alan who clarified it for me.)

My problem is that I've got a lot of code which was written before I got that point and my code doesn't yet actually do much. (I do have working code for parsing my original source files and storing all of their metadata, etc., but I haven't yet got working code for manipulating the data in the ways I want.)

I had been thinking better to get everything working and then refactor. Is that an unsound approach? My worry about refactoring now is that I feel like I am rearranging deck-chairs when I should be worried about getting the ship to float :-)

Thanks and best to all,

Brian vdB
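The subclass suggestion discussed above might look roughly like this in Python 3. Only the names TextNode, RtfNode and do_stuff come from the thread; the content attribute, the NODE_TYPES table, the make_node helper and the returned strings are hypothetical, chosen just to show where the type decision ends up.

```python
class Node:
    def __init__(self, content):
        self.content = content  # hypothetical attribute for the example

    def do_stuff(self):
        raise NotImplementedError("subclasses supply do_stuff()")


class TextNode(Node):
    def do_stuff(self):
        return "plain text: " + self.content


class RtfNode(Node):
    def do_stuff(self):
        return "RTF: " + self.content


# The type flag is consulted once, when the node is created ...
NODE_TYPES = {"text": TextNode, "RTF": RtfNode}


def make_node(document_type, content):
    return NODE_TYPES[document_type](content)


# ... and never again: callers just call do_stuff().
results = [node.do_stuff() for node in (make_node("text", "hello"),
                                        make_node("RTF", "{\\rtf1 hi}"))]
```

The 'text' / 'RTF' comparison does not vanish entirely; it moves to the single place where the object is built, which is usually easier to maintain than repeating it at every use.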
Re: [Tutor] dictionary dispatch for object instance attributes question
Brian van den Broek wrote:

As for the code smell thing, I have a follow-up question. I now get the point of the type-based conditional being a smell for classes. (I get it due to a previous thread that an over-enthusiastic inbox purge prevents me from citing with certainty, but I think it was Bill and Alan who clarified it for me.)

My problem is that I've got a lot of code which was written before I got that point and my code doesn't yet actually do much. (I do have working code for parsing my original source files and storing all of their metadata, etc., but I haven't yet got working code for manipulating the data in the ways I want.)

I had been thinking better to get everything working and then refactor. Is that an unsound approach? My worry about refactoring now is that I feel like I am rearranging deck-chairs when I should be worried about getting the ship to float :-)

It's a hard question because it really comes down to programming style and judgement.

I like to work in a very incremental style - design a little, code a little, test a little, repeat as needed. I believe in 'Refactor Mercilessly' - another XP slogan. I have many years experience and a well-developed opinion of what is good design and bad design.

One consequence of this style is, I usually have working code and tests to go with it. It may not do very much, but it works.

So for me, if I smell something, and think that refactoring into subclasses - or some other change - is the best design for the problem as I understand it, I will probably do the refactoring. It's not going to be easier tomorrow :-) If it just smells a little, or the refactoring is major, I might think about how to get rid of the smell but put it off until I'm pretty sure it is a good idea.

I don't think of this as rearranging the deck chairs - it's more like building the right foundation. Clean, expressive, well-designed code is a pleasure to work with.

For you, it's probably not so cut-and-dried. If you don't have the experience to judge how bad a smell is, or to think through the possibilities so clearly, it's harder to know how to proceed. If you are in part dabbling with OOP design to learn about it, maybe you want to put off some changes until the code is working; then you could make the change and do a comparison and see which one feels cleaner to you.

I hope this helps at least a little :-)

Kent
Re: [Tutor] dictionary dispatch for object instance attributes question
On Tue, 15 Feb 2005 23:48:31 -0500, Brian van den Broek [EMAIL PROTECTED] wrote:

Jeff Shannon said unto the world upon 2005-02-15 21:20:

On Tue, 15 Feb 2005 17:19:37 -0500, Brian van den Broek [EMAIL PROTECTED] wrote:

For starters, I've made metadata a class attribute rather than an unconnected dictionary. This seems conceptually nicer to me.

The problem is that my Node instances live in a TP_file class instance, and the way my code is now, the TP_file instance also needs to see the metadata dict. There are a few tags, which if present in any Node of the file make me want to treat the entire file a bit differently. (Of course, here is the place where my novice-designer status is most likely to be biting me.) So, that's why I have it as a module level object, rather than within a class. (I do, however, see your point about it being neater.)

Okay, that makes sense. You have two different classes (the TP_file class and the Node class) that need access to the same information, so yes, having it at the module level lets them share it more effectively. (Alternately, since it sounds like the TP_file class is where all of the Node instances are created, you *could* decide that the metadata belongs as part of the TP_file, which would then actively share it with Node... but what you've got sounds like a very reasonable plan, so at this point I wouldn't worry about it.)

In addition, update() can now modify several attributes at once, at the cost of a bit of extra parsing up front.

The metadata all occurs one element to a line in my original file. [...] Maybe I'm still missing a better way, but as I am processing line by line, each line with one element, I don't see how to use this cool looking multiple elements at once approach.

Yes, if you know that you will only have one header per line, then it's reasonable to process them one line at a time. You could alternatively have the TP_file gather all the header lines for a given node into a list, and then process that list to create the Node instance, but given the specifics of your case you probably wouldn't gain anything over your current approach by doing so.

This is what makes programming so interesting -- there's so many different choices possible, and which one is best depends on a large number of factors. When writing a program for some task, the best design for a particular set of circumstances may be completely different than the best design for a somewhat different particular set of circumstances -- and the best design for general usage is probably an altogether different thing still.

Good luck!

Jeff Shannon
Re: [Tutor] dictionary dispatch for object instance attributes question
Kent Johnson said unto the world upon 2005-02-16 15:02:

Brian van den Broek wrote:

SNIP

I had been thinking better to get everything working and then refactor. Is that an unsound approach? My worry about refactoring now is that I feel like I am rearranging deck-chairs when I should be worried about getting the ship to float :-)

It's a hard question because it really comes down to programming style and judgement.

I like to work in a very incremental style - design a little, code a little, test a little, repeat as needed. I believe in 'Refactor Mercilessly' - another XP slogan. I have many years experience and a well-developed opinion of what is good design and bad design.

One consequence of this style is, I usually have working code and tests to go with it. It may not do very much, but it works.

So for me, if I smell something, and think that refactoring into subclasses - or some other change - is the best design for the problem as I understand it, I will probably do the refactoring. It's not going to be easier tomorrow :-) If it just smells a little, or the refactoring is major, I might think about how to get rid of the smell but put it off until I'm pretty sure it is a good idea.

I don't think of this as rearranging the deck chairs - it's more like building the right foundation. Clean, expressive, well-designed code is a pleasure to work with.

For you, it's probably not so cut-and-dried. If you don't have the experience to judge how bad a smell is, or to think through the possibilities so clearly, it's harder to know how to proceed. If you are in part dabbling with OOP design to learn about it, maybe you want to put off some changes until the code is working; then you could make the change and do a comparison and see which one feels cleaner to you.

I hope this helps at least a little :-)

Kent

Hi Kent,

I see my `strike that' msg. didn't get through in time to save you from the reply. But, from the selfish perspective, I'm glad enough about that; the above does indeed help more than a little.

I get, in the abstract at least, how Test Driven Development would make these refactorings much easier to do with confidence. (Somewhere near half the point, isn't it?) The goal of my current project, beyond the given of having useful code, is to write a medium sized project in OOP. At the outset, I felt I had to choose between getting a handle on OOP or TDD. I felt I could only tackle one new paradigm at a time. I went with OOP as I didn't want to spend the effort of getting procedural code down using TDD and then have to redo it in OOP. But, not having tests does make the refactoring more scary than I imagine it would be with tests in hand.

And I would have had the need to redo it, I think. The file format I am working with is from an application I've been using as a PIM/Knowledge manager for several years. So, I've got tons of data and tons of plans. I'm not certain if the term is the right one, but I'm thinking of the code I am working on as a base toolset or `framework' for all the other things I want to do with the files of that format. Thus, subclassing and other OOP techniques are sure to be important for those plans.

But, I think you hit it right on the head -- my inexperience with OOP doesn't provide me with any metric for judgement about these things. Browsing through things like Fowler, Beck, Brant, Opdyke's _Refactoring_, while fun, doesn't help much without my having struggled with my own OOP code first. Hey, yesterday I proved that having read about setattr numerous times is no guarantee I'll remember it the first time a use case comes up :-)

Thanks for the continued efforts to help me `get' it. Best to all,

Brian vdB
Re: [Tutor] dictionary dispatch for object instance attributes question
Jeff Shannon said unto the world upon 2005-02-16 16:09:

On Tue, 15 Feb 2005 23:48:31 -0500, Brian van den Broek [EMAIL PROTECTED] wrote:

SNIP some of Jeff's responses to my evaluation of his earlier suggestions

Yes, if you know that you will only have one header per line, then it's reasonable to process them one line at a time. You could alternatively have the TP_file gather all the header lines for a given node into a list, and then process that list to create the Node instance, but given the specifics of your case you probably wouldn't gain anything over your current approach by doing so.

This is what makes programming so interesting -- there's so many different choices possible, and which one is best depends on a large number of factors. When writing a program for some task, the best design for a particular set of circumstances may be completely different than the best design for a somewhat different particular set of circumstances -- and the best design for general usage is probably an altogether different thing still.

Good luck!

Jeff Shannon

Thanks Jeff,

the confirmation that my assessment made sense is very helpful. Due to my lack of experience (as discussed in my response to Kent) I'm always uncomfortable rejecting a proposed solution -- is my assessment that the solution isn't the best a product of that inexperience, or am I on to something? So, thanks for taking the time to `bless' my assessment.

Best to all,

Brian vdB
[Tutor] dictionary dispatch for object instance attributes question
Hi all,

I'm still plugging away at my project of writing code to process treepad files. (This was the task which I posted about in the recent "help with refactoring needed -- which approach is more Pythonic?" thread.)

My present problem is how best to reorganize a long (20 elements) elif chain. The file format I am dealing with organizes itself with a file header, and then a series of nodes. Each node has a header filled with up to 20 different metadata elements, followed by the node content proper. These metadata elements can be in arbitrary order, and need not all be present.

My Node class defines a _parse method which separates out the node header, and sends those lines to a _parse_metadata method. This is where the elif chain occurs -- each line of the metadata starts with a tag like dt= and I need to recognize each tag and set the appropriate Node object attribute, such as .document_type. (I do not want to rely on the unhelpful names for the tags in the file format, preferring more self-documenting attribute names.)

I've come up with *a* way to use a dictionary dispatch, but I'll wager a great deal it isn't the *best* way. Here is a minimal illustration of what I have come up with:

<code>
class A:

    def __init__(self):
        self.something = None
        self.something_else = None
        self.still_another_thing = None

    def update(self, data):
        for key in metadata_dict:
            if data.startswith(key):
                exec('''self.%s = """%s"""''' %(metadata_dict[key],
                     data[len(key):]))
                # triple quotes as there may be quotes in metadata
                # values
                break

metadata_dict = {'something_tag=': 'something',
                 '2nd_tag=': 'something_else',
                 'last=': 'still_another_thing'}

a = A()
print a.still_another_thing
a.update('last=the metadata value for the last= metadata tag')
print a.still_another_thing
</code>

<output>
None
the metadata value for the last= metadata tag
</output>

So, it works. Yay :-) But, should I be doing it another way?

Also, I know the general security concerns about things like exec. They make me nervous in using it, even though I am (as yet) the sole user. Am I right in thinking that the constrained way I am using it here protects me? My code uses most of the attributes as a simple storage container for later rewriting of the file, but in a few cases they enter into (safe seeming) conditionals like:

    if 'text' == self.document_type:
        self.do_text_stuff()
    if 'RTF' == self.document_type:
        self.do_RTF_stuff()

Thanks and best to all,

Brian vdB
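On the exec question, a small experiment suggests the constraint is weaker than it looks. The sketch below is Python 3 and rests on an assumption: that the value is wrapped in triple double quotes inside the exec string, as the "triple quotes" comment in the post implies. The injected attribute, the update_with_exec name and the hostile string are all hypothetical.

```python
class A:
    def __init__(self):
        self.still_another_thing = None
        self.injected = False  # hypothetical flag a hostile value will flip

    def update_with_exec(self, attr_name, value):
        # Builds and runs source code from the data: the risky part.
        exec('self.%s = """%s"""' % (attr_name, value))


# A metadata value that closes the triple-quoted string early and
# smuggles in a statement of its own.
hostile = 'x"""; self.injected = True; y = """'

a = A()
a.update_with_exec("still_another_thing", hostile)
print(a.still_another_thing, a.injected)
```

Here the smuggled statement only sets a flag, but it could be any Python code. So the quoting protects against accidental quotes in well-meaning data, not against a file that was crafted (or corrupted) to break out of the string.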
Re: [Tutor] dictionary dispatch for object instance attributes question
Liam Clarke said unto the world upon 2005-02-15 18:08:

Hi Brian, why not take it the next step and

    for key in metadata_dict:
        if data.startswith(key):
            exec('''self.%s = """%s"""''' %(metadata_dict[key],
                 data[len(key):]))
            # triple quotes as there may be quotes in metadata
            # values
            break

    self.foo = {}

    for key in metadata_dict.keys(): #? I got confused, so I guessed.
        if data.startswith(key):
            self.foo[metadata_dict[key]]=data[len(key):]

And then instead of self.x (if metadata_dict[key] = x) You just call self.foo['x']

A bit more obfuscated, but it would seem to remove the exec, although I'm not sure how else it impacts your class.

SNIP related Pythoncard example

So yeah, hope that helps a wee bit.

Regards,

Liam Clarke

and Rich Krauter said unto the world upon 2005-02-15 18:09:

Brian,

You could use setattr(self,metadata_dict[key],data[len(key):]).

Rich

Hi Liam, Rich, and all,

thanks for the replies. (And for heroically working through the long question -- if there is a tutee verbosity award, I think it's mine ;-)

Rich: thanks. setattr, yeah, that's the ticket!

Liam: The reason I didn't want to take it this way is: Flat is better than nested :-)

The code I am working on is an improved (I hope ;-) ) and expanded class-based version of some more primitive code I had done purely procedurally. There, I had a dictionary approach for storing the metadata (akin to the self.foo dictionary you suggested above). One of the benefits of going OOP, in my opinion, is that instead of using dictionary access syntax, you can just say things like self.document_type. It might be little more than sugar[*], but I've a sweet tooth, and I'd want to avoid going back to the dictionary syntax if I could.

Though, absent the setattr way that Rich pointed to, I must admit the exec in my originally posted version would have me dithering whether to opt for sugar or safety. Thankfully, it's all moot.

[*] In that, if I've understood correctly, class namespaces are just fancily packaged dictionaries.

Thanks to all,

Brian vdB
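Rich's setattr() suggestion, dropped into the class from the original post, might look like this. It is a sketch in Python 3 syntax (print as a function, dict.items()); everything else keeps the names used in the thread.

```python
metadata_dict = {'something_tag=': 'something',
                 '2nd_tag=': 'something_else',
                 'last=': 'still_another_thing'}


class A:

    def __init__(self):
        self.something = None
        self.something_else = None
        self.still_another_thing = None

    def update(self, data):
        for key, attr_name in metadata_dict.items():
            if data.startswith(key):
                # The value is stored as data; nothing is compiled or executed,
                # so quotes inside the value need no special handling.
                setattr(self, attr_name, data[len(key):])
                break


a = A()
a.update('last=the "metadata" value for the last= metadata tag')
print(a.still_another_thing)

# The footnote about namespaces: instance attributes live in a plain dict.
print(a.__dict__['still_another_thing'] is a.still_another_thing)
```

The last line bears out the footnote for ordinary instances: self.document_type style access and dictionary access reach the same object, so the attribute syntax really is the sweeter spelling of the same lookup.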
Re: [Tutor] dictionary dispatch for object instance attributes question
Jeff Shannon said unto the world upon 2005-02-15 21:20:

On Tue, 15 Feb 2005 17:19:37 -0500, Brian van den Broek [EMAIL PROTECTED] wrote:

My Node class defines a _parse method which separates out the node header, and sends those lines to a _parse_metadata method. This is where the elif chain occurs -- each line of the metadata starts with a tag like dt= and I need to recognize each tag and set the appropriate Node object attribute, such as .document_type. (I do not want to rely on the unhelpful names for the tags in the file format, preferring more self-documenting attribute names.)

In addition to using setattr(), I'd take a slightly different approach to this. (This may just be a matter of personal style, and is not as clearly advantageous as using setattr() instead of exec, but...)

Hi Jeff and all,

I am *pretty* sure I followed what you meant, Jeff. Thank you for the suggestions! I don't think they will fit with my situation, but that I think so might say more about my present level of understanding of OOP and design issues than about either the suggestions or the situation. :-)

    class Node(object):
        metadata = {'dt': 'document_type', 'something': 'some_other_field', ...}
        def __init__(self):
            #
        def update(self, **kwargs):
            for key, value in kwargs.items():
                try:
                    attr_name = self.metadata[key]
                except KeyError:
                    raise ValueError("Invalid field type '%s'" % key)
                setattr(self, attr_name, value)

For starters, I've made metadata a class attribute rather than an unconnected dictionary. This seems conceptually nicer to me.

The problem is that my Node instances live in a TP_file class instance, and the way my code is now, the TP_file instance also needs to see the metadata dict. There are a few tags, which if present in any Node of the file make me want to treat the entire file a bit differently. (Of course, here is the place where my novice-designer status is most likely to be biting me.) So, that's why I have it as a module level object, rather than within a class. (I do, however, see your point about it being neater.)

In addition, update() can now modify several attributes at once, at the cost of a bit of extra parsing up front.

Supposing that your node header looks like this:

    header = "dt=text/plain;something=some_value;last=some_other_thing_here"

Now, we can break that into fields, and then split the fields into a name and a value --

    tags = {}
    for field in header.split(';'):
        name, value = field.split('=')
        tags[name] = value

    n = Node()
    n.update(**tags)

You can even simplify this a bit more, by rewriting the __init__():

    def __init__(self, **kwargs):
        if kwargs:
            self.update(**kwargs)

Now you can create and update in a single step:

    n = Node(**tags)

The metadata all occurs one element to a line in my original file. I've got the TP_file class breaking the nodes up and sending the contents to new Node instances (as Alan suggested in my previous thread). The Node instance has a parse method that reads the node contents line by line and sends the appropriate lines to the parse_metadata method. (All lines before a designated `header-ending' line.) Maybe I'm still missing a better way, but as I am processing line by line, each line with one element, I don't see how to use this cool looking multiple elements at once approach. (The other complication that I didn't mention is that the parse_metadata method has to do more than just store the metadata -- some elements must be converted to ints, others left as strings, and still others can have multiple instances in a single Node, so rather than be set they must be appended to an attribute list, etc. The setattr way has taken me from 20 elifs to just 4, though :-) )

At any rate, my whole code is (perhaps wrongly) organized around logical-line based processing.

You could also put all of the splitting into fields in a method, and when __init__ gets a single string as its argument simply pass it to that method and update with the results...

--Jeff Shannon

Anyway, such are the reasons I'm not sure the suggestions will work in my situation. I'm glad to have seen them, though, and am going to save them for the point where I actually have the whole program working and can think about large-scale refactoring. I may well then find that my current uncertainty is unwarranted. But I'd like to make the beast live before I make it thrive :-)

Thanks again, and best,

Brian vdB
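Jeff's fragments, assembled into one runnable piece, might look like the sketch below (Python 3). Some details are assumptions rather than part of his post: the 'last' entry and its last_field attribute name are invented so the sample header parses (his dict was elided with "..."), parse_header is a hypothetical helper name, and split('=', 1) is used so a value containing '=' would not break the unpacking.

```python
class Node(object):
    # tag -> attribute name; the 'last' entry is a hypothetical addition
    metadata = {'dt': 'document_type',
                'something': 'some_other_field',
                'last': 'last_field'}

    def __init__(self, **kwargs):
        if kwargs:
            self.update(**kwargs)

    def update(self, **kwargs):
        for key, value in kwargs.items():
            try:
                attr_name = self.metadata[key]
            except KeyError:
                raise ValueError("Invalid field type '%s'" % key)
            setattr(self, attr_name, value)


def parse_header(header):
    """Split 'a=1;b=2' into {'a': '1', 'b': '2'} (hypothetical helper)."""
    tags = {}
    for field in header.split(';'):
        name, value = field.split('=', 1)
        tags[name] = value
    return tags


header = "dt=text/plain;something=some_value;last=some_other_thing_here"
n = Node(**parse_header(header))
print(n.document_type, n.some_other_field, n.last_field)
```

For the one-element-per-line layout described in the reply, the same update() still applies: each line yields a one-entry dict, so n.update(**parse_header(line)) would set a single attribute at a time.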