    def startElement(self, name, attrs):
        self._queue.append(name)  # keep the order of processed tags
        handler = str('_start_'+name)
        if hasattr(self, handler):
            self.__class__.__dict__[handler](self, attrs)
Is there a better syntax for self.__class__.__dict__[handler]?
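One common alternative is getattr(), which returns a bound method, so self need not be passed by hand, and which also finds handlers defined on base classes (a lookup in self.__class__.__dict__ does not). The following is only a minimal sketch: the class name DispatchingHandler, the _start_page handler, the titles list and the sample XML are made up for illustration and are not from the original code.

```python
import xml.sax
import xml.sax.handler


class DispatchingHandler(xml.sax.handler.ContentHandler):
    """Hypothetical handler that routes each tag to a _start_<name> method."""

    def __init__(self):
        super().__init__()
        self._queue = []   # order of processed tags, as in the original
        self.titles = []   # illustrative attribute, not from the original

    def startElement(self, name, attrs):
        self._queue.append(name)
        # getattr returns a bound method; the None default replaces
        # the separate hasattr check.
        handler = getattr(self, '_start_' + name, None)
        if handler is not None:
            handler(attrs)

    def _start_page(self, attrs):
        # Illustrative handler for a made-up <page title="..."/> element.
        self.titles.append(attrs.get('title'))


handler = DispatchingHandler()
xml.sax.parseString(b'<doc><page title="one"/><page title="two"/></doc>',
                    handler)
print(handler._queue)
print(handler.titles)
```

Tags without a matching _start_ method (here <doc>) are still recorded in the queue and otherwise ignored.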
And where should the "output" go to?
All examples use print statements in the element handlers.
I wrote those get... methods, but I guess they don't belong in the XML handler; perhaps they belong in the parser or somewhere else.
It works, but I don't think it's good design.
    def getPages(self):
        return self.pages.getSortedArray()

    def getPage(self, no):
        return self.pages[no]
    import xml.sax

    parser = xml.sax.make_parser()
    parser.setFeature(xml.sax.handler.feature_namespaces, 0)
    pxh = MyHandler()
    parser.setContentHandler(pxh)
    parser.parse(dateiname)
    for p in pxh.getPages():
        ...
My style is to create/build a data structure in the parser and have a single get... method that will give me the result.
Your getPage/getPages would be part of the objects in the data structure.
So:
    class MyDoc(...):
        ...
        def getPage(self, no):
            ...
        def getPages(self):
            ...
        ...
    class MyXmlParser(...):
        ...
        def reset(self):
            super(MyXmlParser, self).reset()
            self._result = MyDoc(...)
        ...
        def startElement(self, name, attrs):
            ...
            # add to self._result
            ...
        def getResult(self):
            # usually something more advanced than this
            return self._result
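To make the outline above concrete, here is a small runnable sketch under some assumptions: the handler derives from xml.sax.handler.ContentHandler and creates the result in startDocument() (ContentHandler itself has no reset() method), and the addPage helper, the parse_doc function and the <page no="..." title="..."/> input format are invented for the example.

```python
import xml.sax
import xml.sax.handler


class MyDoc:
    """Result object; it owns the page accessors."""

    def __init__(self):
        self._pages = {}

    def addPage(self, no, title):  # hypothetical helper, not from the thread
        self._pages[no] = title

    def getPage(self, no):
        return self._pages[no]

    def getPages(self):
        # Pages in ascending page-number order.
        return [self._pages[no] for no in sorted(self._pages)]


class MyXmlParser(xml.sax.handler.ContentHandler):
    def startDocument(self):
        # Start from a fresh result for every document parsed.
        self._result = MyDoc()

    def startElement(self, name, attrs):
        # Assumed input format: <page no="1" title="..."/>
        if name == 'page':
            self._result.addPage(int(attrs['no']), attrs.get('title', ''))

    def getResult(self):
        return self._result


def parse_doc(data):
    """Parse XML bytes and hand back only the finished MyDoc."""
    handler = MyXmlParser()
    xml.sax.parseString(data, handler)
    return handler.getResult()


doc = parse_doc(b'<doc><page no="2" title="b"/><page no="1" title="a"/></doc>')
print(doc.getPages())
print(doc.getPage(2))
```

The calling code then only ever sees MyDoc; the handler stays free of print statements and query methods.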
I should ask the last question on the Twisted ML, I guess:
Further, if I'd like to use it in a Twisted-driven asynchronous app, would I let the parser run in a thread? (Or how can I make the parser non-blocking?)
Sorry, dunnoh!
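For what it's worth, one option that avoids a thread is to feed the parser in pieces: the expat-based reader returned by xml.sax.make_parser() implements the IncrementalParser interface, with feed() and close(). The sketch below is untested under Twisted; the TagCounter handler and the hard-coded chunks are made up, and in a real app the chunks would presumably arrive through something like a protocol's dataReceived().

```python
import xml.sax
import xml.sax.handler


class TagCounter(xml.sax.handler.ContentHandler):
    """Hypothetical handler that just counts start tags."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


def make_incremental_parser(handler):
    parser = xml.sax.make_parser()
    parser.setFeature(xml.sax.handler.feature_namespaces, 0)
    parser.setContentHandler(handler)
    return parser


counter = TagCounter()
parser = make_incremental_parser(counter)

# Chunks may split the document anywhere, even in the middle of a tag.
chunks = [b'<doc><pa', b'ge/><page', b'/></doc>']
for chunk in chunks:
    parser.feed(chunk)   # returns once this chunk has been processed
parser.close()           # signals end of input and fires endDocument()

print(counter.count)
```

Each feed() call only does the work for the data it is given, so control returns to the event loop between chunks. If a blocking parser.parse() call is unavoidable, Twisted's twisted.internet.threads.deferToThread is the usual way to push it into a thread.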
--eric
_______________________________________________
Pythonmac-SIG maillist - Pythonmac-SIG@python.org
http://mail.python.org/mailman/listinfo/pythonmac-sig