Re: Ideas to optimize this getitem/eval call?
On Sun, Jan 4, 2009 at 6:46 PM, Tino Wildenhain t...@wildenhain.de wrote: mario wrote: On Jan 3, 7:16 am, Steven D'Aprano st...@remove-this- cybersource.com.au wrote: I was about to make a comment about this being a security hole, Strange that you say this, as you are also implying that *all* the widely-used templating systems for python are security holes... Well, you would be right to say that of course ;-) Infact, evoque is really one of the few (or even the only one?) that was conceived from the start to support restricted evaluation. Thats is definitively not the case. There are at least 2 quite old template systems on top of a quite good restricted environment. Interesting, which ones? I should have probably emphasized *conceived* above... that feature was there in the initial design and not added afterwards. Cheers Tino -- http://mail.python.org/mailman/listinfo/python-list
Re: Ideas to optimize this getitem/eval call?
mario wrote: On Jan 3, 7:16 am, Steven D'Aprano st...@remove-this- cybersource.com.au wrote: I was about to make a comment about this being a security hole, Strange that you say this, as you are also implying that *all* the widely-used templating systems for python are security holes... Well, you would be right to say that of course ;-) Infact, evoque is really one of the few (or even the only one?) that was conceived from the start to support restricted evaluation. Thats is definitively not the case. There are at least 2 quite old template systems on top of a quite good restricted environment. Cheers Tino smime.p7s Description: S/MIME Cryptographic Signature -- http://mail.python.org/mailman/listinfo/python-list
Re: Ideas to optimize this getitem/eval call?
On Jan 3, 7:16 am, Steven D'Aprano st...@remove-this- cybersource.com.au wrote: I was about to make a comment about this being a security hole, Strange that you say this, as you are also implying that *all* the widely-used templating systems for python are security holes... Well, you would be right to say that of course ;-) Infact, evoque is really one of the few (or even the only one?) that was conceived from the start to support restricted evaluation. but I see from here http://evoque.gizmojo.org/usage/restricted/ that you are aware of at least some of the issues. I must say though, your choice of builtins to prohibit seems rather arbitrary. What is dangerous about (e.g.) id() and isinstance()? Preventive, probably. I also feel that temlates should have any business with info such as the memory addressed returnred by id(). For isinstance, becuase it is somewhat related to __subclasses__ that is known to be insecure. Incidentally, I updated the page you point to clarify what is going on. I also added a link to Brett Cannon's inspiration paper on the topic of securing the python interpreter... except: # KeyError, NameError, AttributeError, SyntaxError, # ValueError, TypeError, IOError If you want to capture seven exceptions, then capture seven exceptions, not any exception. Absolutely not. I want to catch ALL evaluation exceptions... it would actually *be* a secuity hole to allow any exception to bubble. hey will however be handled appropriately as per the application policy/ config/deployment. You should write: except (KeyError, NameError, ..., IOError): instead of a bare except clause. That will capture exceptions that represent bugs in your code as well as exceptions that should propbably be allowed to propagate, such as KeyboardInterupt and SystemExit. Again, no. Template presentational logic has no business issuing SystemExits or so. And, of course, there are no bugs in my code ;-) # Special case if a KeyError is coming from the self.codes[name] # lookup (traceback should consist of a single frame only): if sys.exc_info()[2].tb_next is None: if sys.exc_info()[0] is KeyError: self.codes[expr] = compile(expr, 'string', 'eval') return self[expr] That seems awfully complicated for no good reason. Yes, you are probably right. I wanted to be precise in that if the KeyError originates strictly from the codes lookup and not from the actual eval of the expr itself -- in which case the expr should be compiled and added to codes (yes, this is the first-time failure I referred to in the first message). I tested the performance of your 2 variations in context, and there seems to be no noticeable performance gain (something like less than 1% gain). But note the two variations as you code them do not quite do exactly the same test. I have adjusted to use this code now: def __getitem__(self, expr): try: return eval(self.codes[expr], self.globals, self.locals) except: # We want to catch **all** evaluation errors! # KeyError, NameError, AttributeError, SyntaxError, V # alueError, TypeError, IOError, ... # # Special case: # if KeyError is coming from self.codes[expr] lookup, # then we add the compiledentry and try again: if not name in self.codes: self.codes[expr] = compile(name, 'string', 'eval') return self[expr] # handle any other error... This retains the correctness of the check, and has the same marginal perf improvement (that is basically negligible, but at least it is not slower) and has teh advantage that the code is clearer. Thanks for the remarks! mario -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Ideas to optimize this getitem/eval call?
correction: the code posted in previous message should have been: def __getitem__(self, expr): try: return eval(self.codes[expr], self.globals, self.locals) except: # We want to catch **all** evaluation errors! # KeyError, NameError, AttributeError, SyntaxError, V # alueError, TypeError, IOError, ... # # Special case: # if KeyError is coming from self.codes[expr] lookup, # then we add the compiledentry and try again: if not expr in self.codes: self.codes[expr] = compile(expr, 'string', 'eval') return self[expr] # handle any other error... -- http://mail.python.org/mailman/listinfo/python-list
Re: Ideas to optimize this getitem/eval call?
On Sat, 03 Jan 2009 04:14:14 -0800, mario wrote: On Jan 3, 7:16 am, Steven D'Aprano st...@remove-this- cybersource.com.au wrote: [...] I must say though, your choice of builtins to prohibit seems rather arbitrary. What is dangerous about (e.g.) id() and isinstance()? Preventive, probably. I also feel that temlates should have any business with info such as the memory addressed returnred by id(). For isinstance, becuase it is somewhat related to __subclasses__ that is known to be insecure. It's that probably that worries me. In my earlier post, I was going to say your choice of builtins to prohibit seems rather Cargo Cult-ish, but I decided that might be a bit rude, and I decided to give you the benefit of the doubt. But your answer here doesn't really give me any more confidence. What security issues do you think you are preventing by prohibiting id()? It is true that the id returned is a memory address in CPython, but that's a mere implementation detail, and may not be true in other implementations. Even if it is a memory address, so what? You can't do anything with it except treat it as an integer. I don't see how id() decreases security one iota. In the absence of even a theoretical threat, banning a function that returns an int because the int is derived from a memory address (but can't be used as one!) makes you appear to be engaging in Cargo Cult programming: following the ritual, without any understanding of what you are doing. http://catb.org/~esr/jargon/html/C/cargo-cult-programming.html Sorry to be so harsh, especially over such a tiny little thing as the id() function, but that's the impression it gives me, and I'm probably not the only one. It is true that a false positive (needlessly banning a harmless function) has less serious consequences than a false negative (failing to ban a harmful function), but it does reduce confidence that you know what you're doing. Likewise for isinstance() and issubclass(). They may be related to __subclasses__, but they don't give the caller access to __subclassess__. So what is the actual threat you are defending against? I'm not saying that there must be an actual in-the-wild demonstrated vulnerability before you prohibit a built-in, but there should be at least a plausible attack vector. Incidentally, I updated the page you point to clarify what is going on. Speaking of that page, you have an obvious typo (teh instead of the) that should be fixed: In addition to teh above, all subclasses of BaseException... http://evoque.gizmojo.org/usage/restricted/ I also added a link to Brett Cannon's inspiration paper on the topic of securing the python interpreter... except: # KeyError, NameError, AttributeError, SyntaxError, # ValueError, TypeError, IOError If you want to capture seven exceptions, then capture seven exceptions, not any exception. Absolutely not. I want to catch ALL evaluation exceptions... I see. It wasn't clear that this was deliberate, I read it as if there was a fixed, finite list you wanted to catch, and nothing else. You should make sure this is clearly documented in your code. -- Steven. -- http://mail.python.org/mailman/listinfo/python-list
Ideas to optimize this getitem/eval call?
Hi, below is the essence of a an expression evaluator, by means of a getitem lookup. The expression codes are compiled and cached -- the lookup is actually recursive, and the first time around it will always fail. import sys class GetItemEval(object): def __init__(self): self.globals = globals() # some dict (always the same) self.locals = {} # some other dict (may be different between evaluations) self.codes = {} # compiled code expressions cache def __getitem__(self, expr): try: return eval(self.codes[expr], self.globals, self.locals) except: # KeyError, NameError, AttributeError, SyntaxError, ValueError, # TypeError, IOError # # Special case if a KeyError is coming from the self.codes [name] # lookup (traceback should consist of a single frame only): if sys.exc_info()[2].tb_next is None: if sys.exc_info()[0] is KeyError: self.codes[expr] = compile(expr, 'string', 'eval') return self[expr] # otherwise handle eval error in some way... This class could be used in a way as follows: # define some expressions def f(s): return [+s+] exprs = [1+2+3, 2*3*5, f(__name__)] # instantiate one gie = GetItemEval() # use it to lookup/eval each expression for x in exprs: print x, =, gie[x] And, fwiw, some sample timeit code: import timeit print timeit.Timer(for x in exprs: gie[x], from __main__ import gie, exprs).timeit(50) I had done various poking to discover if it could be made to go faster, and in the end I settled on the version above. mario Incidentally this constitutes the lion's share of the evaluation runtime of evoque templating... http://evoque.gizmojo.org/ -- http://mail.python.org/mailman/listinfo/python-list
Re: Ideas to optimize this getitem/eval call?
What do you mean by 'fail'? you have; :: self.codes = {} so :: try: ::return eval(self.codes[expr], self.globals, self.locals) will always return an exception the first time (if this is what you're referring to). -- http://mail.python.org/mailman/listinfo/python-list
Re: Ideas to optimize this getitem/eval call?
On Fri, 02 Jan 2009 17:29:29 -0800, mario wrote: Hi, below is the essence of a an expression evaluator, by means of a getitem lookup. The expression codes are compiled and cached -- the lookup is actually recursive, and the first time around it will always fail. import sys class GetItemEval(object): def __init__(self): self.globals = globals() # some dict (always the same) self.locals = {} # some other dict (may be different between evaluations) self.codes = {} # compiled code expressions cache def __getitem__(self, expr): try: return eval(self.codes[expr], self.globals, self.locals) I was about to make a comment about this being a security hole, but I see from here http://evoque.gizmojo.org/usage/restricted/ that you are aware of at least some of the issues. I must say though, your choice of builtins to prohibit seems rather arbitrary. What is dangerous about (e.g.) id() and isinstance()? except: # KeyError, NameError, AttributeError, SyntaxError, # ValueError, TypeError, IOError If you want to capture seven exceptions, then capture seven exceptions, not any exception. You should write: except (KeyError, NameError, ..., IOError): instead of a bare except clause. That will capture exceptions that represent bugs in your code as well as exceptions that should propbably be allowed to propagate, such as KeyboardInterupt and SystemExit. # Special case if a KeyError is coming from the self.codes[name] # lookup (traceback should consist of a single frame only): if sys.exc_info()[2].tb_next is None: if sys.exc_info()[0] is KeyError: self.codes[expr] = compile(expr, 'string', 'eval') return self[expr] # otherwise handle eval error in some way... That seems awfully complicated for no good reason. If you want to handle KeyError differently, then capture KeyError: try: ... except KeyError: handle_keyerror() except: (NameError, ..., IOError): handle_everythingelse() It also seems significant slower: import sys def catch1(): ... D = {} ... try: ... D['x'] ... except KeyError: ... pass ... def catch2(): ... D = {} ... try: ... D['x'] ... except: ... if sys.exc_info()[2].tb_next is None: ... if sys.exc_info()[0] is KeyError: ... pass ... from timeit import Timer t1 = Timer('catch1()', 'from __main__ import catch1') t2 = Timer('catch2()', 'from __main__ import catch2, sys') min(t1.repeat()) 4.8421750068664551 min(t2.repeat()) 6.2694659233093262 -- Steven -- http://mail.python.org/mailman/listinfo/python-list