On Fri, 31 Dec 2004, kumar s wrote:
> http://www.python.org/doc/lib/dom-example.html
>
> Frankly it looked more complex. could I request you to explain your
> pseudocode. It is confusing when you say call a function within another
> function.

Hi Kumar,

A question, though: can you try to explain what part feels weird about
having a function call another function?  Is it something specific to
XML processing, or a more general problem?

That is, do you already feel comfortable with writing and using "helper"
functions?  If you're feeling uncomfortable with the idea of functions
calling functions, then that's something we should probably concentrate
on, because it's really crucial to use this technique, especially on
structured data like XML.


As a concrete toy example of a function that calls another function, we
can use the overused hypotenuse function.  Given the leg lengths 'a' and
'b' of a right triangle, this function returns the length of the
hypotenuse:

###
def hypotenuse(a, b):
    return (a**2 + b**2)**0.5
###

This definition works, but we can use helper functions to make the
hypotenuse function read a little bit more like English:

###
def sqrt(x):
    return x ** 0.5

def square(x):
    return x * x

def hypotenuse(a, b):
    return sqrt(square(a) + square(b))
###

In this variation, the rewritten hypotenuse() function uses the other
two functions as "helpers".  The key idea is that the functions that we
write can then be used by anything that needs them.

Another thing that happens is that hypotenuse() doesn't have to know how
sqrt() and square() are defined: it just depends on the fact that sqrt()
and square() are out there, and it can just use these as tools.
Computer scientists call this "abstraction".


Here is another example of a "helper" function that comes in handy when
we do XML parsing:

###
def get_children(node, tagName):
    """Returns the child elements of the node that have this
    particular tagName.  This is different from
    getElementsByTagName() because we only look shallowly at the
    immediate children of the given node."""
    children = []
    for n in node.childNodes:
        if n.nodeType == n.ELEMENT_NODE and n.tagName == tagName:
            children.append(n)
    return children
###


For example:

###
>>> import xml.dom.minidom
>>> dom = xml.dom.minidom.parseString("<p><a>hello</a><a>world</a></p>")
>>>
>>> dom.firstChild
<DOM Element: p at 0x50efa8>
>>>
>>> get_children(dom.firstChild, "a")
[<DOM Element: a at 0x50efd0>, <DOM Element: a at 0x516058>]
###


> It is confusing when you say call a function within another function.

Here's a particular example that uses this get_children() function and
the get_text() function that we used in the earlier part of this thread.

###
def parse_Hsp(hsp_node):
    """Prints out the query-from and query-to of an Hsp node."""
    query_from = get_text(get_children(hsp_node, "Hsp_query-from")[0])
    query_to = get_text(get_children(hsp_node, "Hsp_query-to")[0])
    print(query_from)
    print(query_to)
###

This function only knows how to deal with Hsp elements.  As soon as we
can dive through our DOM tree into an Hsp element, we should be able to
extract the data we need.

Does this definition of parse_Hsp() make sense?  You won't be able to
use it for your real problem yet, but you can try it on a sample subset
of your XML data:

###
import xml.dom.minidom

sampleData = """
<Hsp>
  <Hsp_num>1</Hsp_num>
  <Hsp_bit-score>1164.13</Hsp_bit-score>
  <Hsp_score>587</Hsp_score>
  <Hsp_evalue>0</Hsp_evalue>
  <Hsp_query-from>1</Hsp_query-from>
  <Hsp_query-to>587</Hsp_query-to>
</Hsp>
"""
doc = xml.dom.minidom.parseString(sampleData)
parse_Hsp(doc.firstChild)
###

to see how it works so far.


This example tries to show the power of being able to call helper
functions.  If we were to try to write this all using DOM primitives,
the end result would look too ugly for words.  But let's see it anyway.
*grin*

###
def parse_Hsp(hsp_node):         ## without using helper functions:
    """Prints out the query-from and query-to of an Hsp node."""
    query_from, query_to = "", ""
    for child in hsp_node.childNodes:
        if (child.nodeType == child.ELEMENT_NODE and
            child.tagName == "Hsp_query-from"):
            for n in child.childNodes:
                if n.nodeType == n.TEXT_NODE:
                    query_from += n.data
        if (child.nodeType == child.ELEMENT_NODE and
            child.tagName == "Hsp_query-to"):
            for n in child.childNodes:
                if n.nodeType == n.TEXT_NODE:
                    query_to += n.data
    print(query_from)
    print(query_to)
###

This is exactly the kind of code we want to avoid.  It works, but it's
so fragile and hard to read that I just don't trust it.  It just burns
my eyes.  *grin*


By using "helper" functions, we're extending Python's vocabulary of
commands.  We can then use those functions to help solve our problem
with less silliness.  This is a reason why knowing how to write and use
functions is key to learning how to program: this principle applies
regardless of what particular programming language we're using.


If you have questions on any of this, please feel free to ask.  Good
luck!

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
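
A note on get_text(): parse_Hsp() relies on it, but its definition lives
in an earlier message of this thread and is not repeated here.  The
sketch below is only an assumption about what it does: it joins the text
nodes sitting directly under an element, which is the same work the
no-helper version of parse_Hsp() does by hand.  The name query_range()
is hypothetical too; it is a variant of parse_Hsp() that returns the two
values instead of printing them, so the result is easy to check.

```python
import xml.dom.minidom


def get_text(node):
    """Assumed behavior (not the original definition from the thread):
    join the data of the text nodes directly under node."""
    pieces = []
    for n in node.childNodes:
        if n.nodeType == n.TEXT_NODE:
            pieces.append(n.data)
    return "".join(pieces)


def get_children(node, tagName):
    """Same shallow search as get_children() in the message above."""
    children = []
    for n in node.childNodes:
        if n.nodeType == n.ELEMENT_NODE and n.tagName == tagName:
            children.append(n)
    return children


def query_range(hsp_node):
    """Hypothetical variant of parse_Hsp(): returns (query_from, query_to)
    as strings instead of printing them."""
    query_from = get_text(get_children(hsp_node, "Hsp_query-from")[0])
    query_to = get_text(get_children(hsp_node, "Hsp_query-to")[0])
    return query_from, query_to


# A cut-down Hsp element, just enough to exercise the helpers.
sample = """<Hsp>
  <Hsp_query-from>1</Hsp_query-from>
  <Hsp_query-to>587</Hsp_query-to>
</Hsp>"""

doc = xml.dom.minidom.parseString(sample)
result = query_range(doc.firstChild)
print("%s to %s" % result)
```

Each helper does one small job, and query_range() only has to name the
tags it cares about.  If the real get_text() from the thread behaves
differently (for example, by collecting text from nested elements as
well), swap it in; nothing else needs to change.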