John Nagle wrote: > Here's some actual code, from "tokenizer.py". This is called once > for each character in an HTML document, when in "data" state (outside > a tag). It's straightforward code, but look at all those > dictionary lookups. > > def dataState(self): > data = self.stream.char() > > # Keep a charbuffer to handle the escapeFlag > if self.contentModelFlag in\ > (contentModelFlags["CDATA"], contentModelFlags["RCDATA"]):
Is the tuple (contentModelFlags["CDATA"], contentModelFlags["RCDATA"]) constant? If that is the case, I'd cut it out into a class member (or module-local variable) first thing in the morning. And I'd definitely keep the result of the "in" test in a local variable for reuse, seeing how many times it's used in the rest of the code. Writing inefficient code is not something to blame the language for. Stefan -- http://mail.python.org/mailman/listinfo/python-list