Re: [Tutor] help with refactoring needed -- which approach is morePythonic?
Alan Gauld said unto the world upon 2005-02-12 07:51: Thanks Alan and Kent for the replies. I'm responding to a couple of questions Alan asked me about my problem. I don't think I have any further questions of my own (yet), though. :-) Call it node_linkify. The new thought is to create two new methods for the TP_file class: - One to build a dictionary with Node titles as keys and a list of the corresponding unique ids as values. Which saves searching the node list each time, good idea. This dictionary would of course be part of the Head or Body object which manages the Nodes? That's the plan, yes. - Another to pass that list into a the node_linkify Node method, once for each Node object of the TP_file. Since the node knows its own ID and shouldn't know about the other nodes I wonder if it would be better for the Body/Head object to pass only what the node needs to create the links. Or do the links go to *all* the other related nodes? In which case the dictionary entry is fine. Alan G. The Node object being linkified won't know which other Node objects need to be linked to until the to-be-linkified Node's article text is scanned. So, I think the full dict is needed. The best analogy I can come up with for what I am trying to do is this: Consider a wiki which only supports WikiNames rather than arbitrary page names and arbitrary page links. Quite often, in a wiki, people add text which could/should have been in the form of a wiki link, but they didn't format it that way. For example, there might be a page called ExtantWikiName and, on some other page, someone wrote `extant wiki name' rather than `ExtantWikiName'. So, one might want a function which scanned each page of the wiki, looking for strings which match the name of wiki pages, but don't have wiki link formatting and then so format them. It seems to me that this method, if a method of each wiki page, would need to be given knowledge of all page names in the wiki. That's pretty much my task, save that my target app doesn't support wiki linking, so I need to associate node titles (the analogue of wiki page names) with the unique id's of pages with the title to be able to generate the links. Off to code it up! Thanks and best to all, Brian vdB ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] help with refactoring needed -- which approach is morePythonic?
> It still seems to me that the actual updating of the article should be > a Node method (it is the Node object's article that is being updated, > after all). Yes, the owner of the data should update it. > Call it node_linkify. The new thought is to create two new > methods for the TP_file class: > > - One to build a dictionary with Node titles as keys and >a list of the corresponding unique ids as values. Which saves searching the node list each time, good idea. This dictionary would of course be part of the Head or Body object which manages the Nodes? > - Another to pass that list into a the node_linkify Node method, >once for each Node object of the TP_file. Since the node knows its own ID and shouldn't know about the other nodes I wonder if it would be better for the Body/Head object to pass only what the node needs to create the links. Or do the links go to *all* the other related nodes? In which case the dictionary entry is fine. Alan G. ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] help with refactoring needed -- which approach is morePythonic?
Brian van den Broek wrote: Alan Gauld said unto the world upon 2005-02-10 02:58: Pseudo code: class Body: def __init__(self,content): self.contents = contents self.nodes = [] def parse(self): for line in self.contents: if line == NodeStartTag: node = Node() if line == NodeEndTag: self.nodes.append(node) node.append(line) class Node: def __init__(self,lines=[]): self.lines = lines def append(self,item): self.lines.append(item) def parse(self): # your parsing method here. Hi all, YAQ (Yet Another Question): Following the general pattern, I end up with a Body object which has an attribute .nodes that consists of a list of Node objects. So, something like: My Example Body Node List Node the first Node the second Is there any way to make methods of the Node class access attributes of `parents' of instances? I would like a Node instance such as Node the first above to be aware just what it is a node of and what its siblings are. You have to tell it the parent. ("Explicit is better than implicit.") For example you could pass a reference to Body to the Node in the constructor: def parse(self): for line in self.contents: if line == NodeStartTag: node = Node(self)# HERE if line == NodeEndTag: self.nodes.append(node) node.append(line) In general I think this is a bad design. I try to avoid telling components about their parents in any kind of containment hierarchy. If the component knows about its parent, then the component can't be reused in a different context and it can't be tested without creating the expected context. Is there another way you could accomplish what you want? Kent Does this make sense? Best to all, Brian vdB PS Thanks for the reply to my venting question, Kent. ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] help with refactoring needed -- which approach is morePythonic?
Brian van den Broek wrote: Alan Gauld said unto the world upon 2005-02-10 02:58: Pseudo code: class Body: def __init__(self,content): self.contents = contents self.nodes = [] def parse(self): for line in self.contents: if line == NodeStartTag: node = Node() if line == NodeEndTag: self.nodes.append(node) node.append(line) class Node: def __init__(self,lines=[]): self.lines = lines def append(self,item): self.lines.append(item) def parse(self): # your parsing method here. Hi all, YAQ (Yet Another Question): Following the general pattern, I end up with a Body object which has an attribute .nodes that consists of a list of Node objects. So, something like: My Example Body Node List Node the first Node the second Is there any way to make methods of the Node class access attributes of `parents' of instances? I would like a Node instance such as Node the first above to be aware just what it is a node of and what its siblings are. Does this make sense? I think so. I haven't tested this (pseudo) code which I took from your above post and just modified it, but I think you want something like this: Pseudo code: class Body: def __init__(self,content): self.contents = contents self.nodes = [] def parse(self): for line in self.contents: if line == NodeStartTag: node = Node(self) #when you create a node, pass in the parent object like this if line == NodeEndTag: self.nodes.append(node) node.append(line) class Node: def __init__(self, parent, lines=[]): self.lines = lines self.parent = parent #and store the parent like this def append(self,item): self.lines.append(item) def parse(self): # your parsing method here. def show_siblings(self): print self.parent.nodes # and you can access the parent kinda like this. You don't want to get too carried away with something like this, though. You may want to read up on the Law of Demeter. This (in my opinion) is fine, though. Best to all, Brian vdB PS Thanks for the reply to my venting question, Kent. ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor Jeremy Jones ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] help with refactoring needed -- which approach is morePythonic?
Alan Gauld said unto the world upon 2005-02-10 02:58: Pseudo code: class Body: def __init__(self,content): self.contents = contents self.nodes = [] def parse(self): for line in self.contents: if line == NodeStartTag: node = Node() if line == NodeEndTag: self.nodes.append(node) node.append(line) class Node: def __init__(self,lines=[]): self.lines = lines def append(self,item): self.lines.append(item) def parse(self): # your parsing method here. Hi all, YAQ (Yet Another Question): Following the general pattern, I end up with a Body object which has an attribute .nodes that consists of a list of Node objects. So, something like: My Example Body Node List Node the first Node the second Is there any way to make methods of the Node class access attributes of `parents' of instances? I would like a Node instance such as Node the first above to be aware just what it is a node of and what its siblings are. Does this make sense? Best to all, Brian vdB PS Thanks for the reply to my venting question, Kent. ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] help with refactoring needed -- which approach is morePythonic?
Alan Gauld wrote: The main change in refactoring is moving it to OOP. I have a method that serves as the entry point for parsing the files. Not an object? If you are thinking terms of what the methods do its probably not OOP... I would expect to see an object for the File, another for the Header, a third for the Body and another for the Node. The first (containing a header and bosdy object) is responsible for cracking open the file and reading the lines, recognising where it has a header and sending those lines to the header object and the rest to the bosy object. The body object then reades those lines and creates a Node object per node feeding it lines as appropriate... This is a reasonable approach. Having the Body and Node classes gives a handy place to put functions to do something with that data. But I tend to take the Extreme Programming position of You Aren't Going To Need It. I would probably stick with the list representation of Node, for example, until I had some real work for the Node class to do. It's definitely a judgement call when to introduce classes, there isn't a right way and a wrong way. Some problems cry out for classes, some clearly have no need, and then there is a gray area in the middle. Pseudo code: class Body: def __init__(self,content): self.contents = contents self.nodes = [] def parse(self): for line in self.contents: if line == NodeStartTag: node = Node() if line == NodeEndTag: self.nodes.append(node) node.append(line) def __del__(self): del self.nodes Why is 'del self.nodes' needed? When the Body is del'ed the reference to self.nodes should be lost and the nodes list will be GC'd. Or am I missing something? class Node: def __init__(self,lines=[]): self.lines = lines def append(self,item): self.lines.append(item) def parse(self): # your parsing method here. You might want to extend list so a Node automatically has the behavior of a list. Kent ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] help with refactoring needed -- which approach is morePythonic?
Alan Gauld said unto the world upon 2005-02-10 02:58: The main change in refactoring is moving it to OOP. I have a method that serves as the entry point for parsing the files. Not an object? If you are thinking terms of what the methods do its probably not OOP... Hi Alan, That may well be :-) What I've done is have a class File_in_the_format, and have all the parsing work going on in methods of the class. My class is called with a filepath, the __init__ method calls the _master_parser method which breaks things into head and body, calling appropriate methods and so on. I was thinking it was OOP even with just the one class as I am making use of encapsulation and the class namespace to avoid passing parameters about. I would expect to see an object for the File, another for the Header, a third for the Body and another for the Node. The first (containing a header and bosdy object) is responsible for cracking open the file and reading the lines, recognising where it has a header and sending those lines to the header object and the rest to the bosy object. The body object then reades those lines and creates a Node object per node feeding it lines as appropriate... I originally tried to see how to do it that way, but got stuck. I'll point to where by using your pseudo code below. Pseudo code: class Body: def __init__(self,content): self.contents = contents self.nodes = [] def parse(self): for line in self.contents: if line == NodeStartTag: node = Node()# Stuck here -- BvdB if line == NodeEndTag: self.nodes.append(node) node.append(line) def __del__(self): del self.nodes I got stuck at the place where I added a comment above. I didn't see how I could do that, since I didn't yet have all of the node lines in hand when it would be time to build a node. class Node: def __init__(self,lines=[]): self.lines = lines def append(self,item):# Bing! -- BvdB self.lines.append(item) def parse(self): # your parsing method here. Well, that flipped a switch! I *knew in the abstract* that I can define interfaces [?] like .append and .__add__. The puzzle pieces just hadn't come together yet. They aren't all there yet either, but a half an hour in the interpreter has already cleared away much puzzlement over how to pull this off. So, thanks a lot for the push! Whats the advantage? If you have different node types you can easily subclass Node and not change the File or Body classes, just write a slightl'y different line parser. That way your code remains more stable - you never need to change working code to add a new node type... Well, I don't *think* these considerations are going to enter into my particular task. I'm working with a well documented text format for a shareware app I have where the developer has recently changed the file format to an undocumented binary format. So, I can be fairly confident that my particular target is one I have complete knowledge about. But, I absolutely see the point for the general case. I'm off to scrap the code that made me happy :-( and redo it this way to learn better habits :-) Thanks again, Brian vdB ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] help with refactoring needed -- which approach is morePythonic?
> The main change in refactoring is moving it to OOP. I have a method > that serves as the entry point for parsing the files. Not an object? If you are thinking terms of what the methods do its probably not OOP... I would expect to see an object for the File, another for the Header, a third for the Body and another for the Node. The first (containing a header and bosdy object) is responsible for cracking open the file and reading the lines, recognising where it has a header and sending those lines to the header object and the rest to the bosy object. The body object then reades those lines and creates a Node object per node feeding it lines as appropriate... > I want the body parser to accept a list of lines corresponding to the > nodes portions of my file, separate out each node (everything between > node end tags, the bottommost end tag included in the node), and > send each node's contents to a further method for processing. Or to another object? Nouns are objects, verbs are methods. > .def body_parser(self, body_contents): Pseudo code: class Body: def __init__(self,content): self.contents = contents self.nodes = [] def parse(self): for line in self.contents: if line == NodeStartTag: node = Node() if line == NodeEndTag: self.nodes.append(node) node.append(line) def __del__(self): del self.nodes > .self.node_parser(current_node_contents) class Node: def __init__(self,lines=[]): self.lines = lines def append(self,item): self.lines.append(item) def parse(self): # your parsing method here. Whats the advantage? If you have different node types you can easily subclass Node and not change the File or Body classes, just write a slightl'y different line parser. That way your code remains more stable - you never need to change working code to add a new node type... Just an alternative to consider Alan G Author of the Learn to Program web tutor http://www.freenetpages.co.uk/hp/alan.gauld ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor