Re: Creating a reliable sandboxed Python environment
In a message of Sun, 31 May 2015 09:52:29 +1000, "Steven D'Aprano" writes:
>How many PyPy sandboxes are being used with hostile users motivated to break
>out of the sandbox?
>
>"I wrote a sandbox which I can't break out of" is different from "I wrote a
>sandbox which nobody can break out of". Javascript is sandboxed, but due to
>bugs in implementations, Javascript-based exploits are now heavily used by
>malware. There are possibly even more Javascript-based exploits than buffer
>overflow based exploits these days, as C programmers get better at using
>automated tools that check for buffer overflows.

I don't know, as we don't really have a way of tracking who is using PyPy for anything. We know we have some.

Laura
--
https://mail.python.org/mailman/listinfo/python-list
Re: Creating a reliable sandboxed Python environment
davidf...@gmail.com writes:
> Thanks for the responses folks. I will briefly summarize them:...

I do think you should look at Geordi (the C++ IRC bot) that I linked. It seems to have changed its implementation to use Docker, but either way, lots of the stuff it did was language independent.
Re: Creating a reliable sandboxed Python environment
While this thread is indeed a theoretical discussion of the interpreter, for a practical solution where you control the host environment, one might look into OS level sandboxing like FreeBSD's Jails (not to be confused with a simple chroot environment) along with various resource limiting parameters. You can lock down a 'sandboxed' i.e. jailed environment for arbitrary data and processes, including python, pretty tightly.

-Kurt-
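A minimal sketch of the "resource limiting parameters" half of that suggestion, assuming a Unix host and the standard `resource` and `subprocess` modules. The helper name `run_limited` and the default limits are hypothetical, chosen only for illustration. This is not a jail: it caps CPU time and address space for a child process but does nothing about file system or network access, which is what a jail or container would add.

```python
import resource
import subprocess
import sys


def run_limited(code, cpu_seconds=2, mem_bytes=512 * 1024 * 1024, timeout=10):
    """Run `code` in a child Python process under CPU and memory limits.

    Hypothetical helper for illustration; Unix-only because it relies on
    the `resource` module and on `preexec_fn`.
    """

    def apply_limits():
        # Runs in the child between fork and exec, so only the child is limited.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))

    return subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores user site and env vars
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
        timeout=timeout,  # wall-clock backstop in case the limits are not enforced
    )
```

A well-behaved snippet such as `run_limited("print('ok')")` runs normally, while one that tries to allocate far more than `mem_bytes` should fail inside the child rather than exhausting the host.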
Re: Creating a reliable sandboxed Python environment
On Sat, 30 May 2015 09:24 pm, Laura Creighton wrote:
> In a message of Sat, 30 May 2015 19:00:14 +1000, "Steven D'Aprano" writes:
>>I wouldn't have imagined that the claim "it's easier to secure a small
>>language with a few features than a big language with lots of features"
>>would have been so controversial. I wonder if this claim will be equally
>>as controversial?
>>
>>There is a rough correlation between the number of lines of code in a code
>>base, and the number of potential security holes that need to be guarded
>>against.
>
> Maybe these aren't controversial if you are doing language level
> sandboxing, but you don't have to sandbox like that. Consider, for a
> moment, the sandboxing technique used by PyPy
> discussed at
>
> http://pypy.readthedocs.org/en/latest/sandbox.html
>
> You think it is way cool, but, alas, you want to sandbox some other
> language than Python.

How many PyPy sandboxes are being used with hostile users motivated to break out of the sandbox?

"I wrote a sandbox which I can't break out of" is different from "I wrote a sandbox which nobody can break out of". Javascript is sandboxed, but due to bugs in implementations, Javascript-based exploits are now heavily used by malware. There are possibly even more Javascript-based exploits than buffer overflow based exploits these days, as C programmers get better at using automated tools that check for buffer overflows.

--
Steven
Re: Creating a reliable sandboxed Python environment
Chris Angelico writes:
> Turing completeness isn't the whole story. How do you go about
> sandboxing a Brainf* implementation such that it can be used to
> implement Python, but can't be used to read arbitrary files from
> your file system?

We're talking about sandboxing, so preventing the sandboxed Python interpreter written in embedded BF from accessing arbitrary files is the whole point. If you want to let a sandboxed program access a file, you create some kind of managed handle outside the interpreter, and pass that into the interpreter so the interpreted program can make a constrained set of calls on it. That's how Java applets work, and it's basically the opposite of Python's "consenting adults" approach, which is to let everything access everything.
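The "managed handle" idea can be sketched as follows. This only illustrates the shape of such an interface: the class name `ReadOnlyHandle` and its byte budget are hypothetical, and in plain CPython the wrapper is not a security boundary, since guest code running in the same interpreter could still reach `_fileobj` by introspection. The host opens the file, and the guest only ever sees an object with one constrained call.

```python
import io


class ReadOnlyHandle:
    """Host-side wrapper exposing a single bounded read() to guest code.

    Hypothetical illustration of a capability-style handle. The guest never
    receives a path or open(), only this object.
    """

    def __init__(self, fileobj, max_bytes):
        self._fileobj = fileobj
        self._remaining = max_bytes  # total bytes the guest may ever read

    def read(self, n):
        # Clamp the request to the remaining budget; never a negative size.
        n = max(0, min(n, self._remaining))
        data = self._fileobj.read(n)
        self._remaining -= len(data)
        return data


# The host decides which file and how much of it the guest may see.
handle = ReadOnlyHandle(io.BytesIO(b"hello world"), max_bytes=5)
first = handle.read(3)    # within budget
rest = handle.read(100)   # clamped to what is left of the budget
```

In an embedding like the one described above, the interpreter boundary (not Python attribute privacy) would be what stops the guest from reaching past the handle.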
Re: Creating a reliable sandboxed Python environment
On Sun, May 31, 2015 at 6:00 AM, Paul Rubin wrote:
> Steven D'Aprano writes:
>> I wouldn't have imagined that the claim "it's easier to secure a small
>> language with a few features than a big language with lots of features"
>> would have been so controversial.
>
> Consider that if the small language is Turing-complete, you can use it
> to implement the big language. If the small language is also secure (in
> the sense of not being able to escape a sandbox), the big language
> implemented in it can't escape the sandbox either. Therefore the size
> of the language doesn't inherently affect the sandbox security.

Turing completeness isn't the whole story. How do you go about sandboxing a Brainf* implementation such that it can be used to implement Python, but can't be used to read arbitrary files from your file system? Will you reimplement the Python standard library in Brainf*? Will you implement open(), but nerf it? Will you make sure there's nothing anywhere in the stdlib that can open files?

And if you _don't_ provide a reimplemented standard library, you either need to provide an import mechanism (so you can make use of the existing Python code) or declare that the language as a whole is neutered by a complete lack of all those features that are implemented in Python.

ChrisA
Re: Creating a reliable sandboxed Python environment
In a message of Sat, 30 May 2015 20:42:49 +0200, Stefan Behnel writes:
>So here the cost of security is actually rewriting the entire language
>runtime and potentially also major parts of its ecosystem? Not exactly a
>cheap price either.
>
>Stefan

Well, the runtime is mostly generated, you don't have to write it by hand. But, yes, writing an interpreter is work, no question. I think that the problem of writing an interpreter is a much smaller proposition than playing whack-a-mole with language level sandboxing, but depending on your language, I could be wrong about that.

Laura
Re: Creating a reliable sandboxed Python environment
Steven D'Aprano writes:
> I wouldn't have imagined that the claim "it's easier to secure a small
> language with a few features than a big language with lots of features"
> would have been so controversial.

Consider that if the small language is Turing-complete, you can use it to implement the big language. If the small language is also secure (in the sense of not being able to escape a sandbox), the big language implemented in it can't escape the sandbox either. Therefore the size of the language doesn't inherently affect the sandbox security. Implementing Python in Lua (with LuaJIT) might even have tolerable performance, possibly beating CPython.

> I wonder if this claim will be equally as controversial? There is a
> rough correlation between the number of lines of code in a code base,
> and the number of potential security holes that need to be guarded
> against.

Bigger programs are more likely to have bugs, sure, and Lua might have those already. But that's not the issue Python faces regarding sandboxing, where it's insecure by design.

>> Stuff like bignums and unicode in themselves wouldn't have
>> affected security.
>
> Do you consider a Denial of Service or Memory Exhaustion attack to be a
> security issue?

It's less of an issue on the client side, where you don't mind too much if an attacker DOS's his own machine. Otherwise you have to consider memory allocation and CPU cycles to be controlled system resources, which is not rocket science (every operating system does that). I'm not sure where Lua sits with regard to this.

> If not, try running this in Python:
> 100**100**100

That's not an issue with bignums in themselves, but rather it's an artifact of CPython's implementation. Exponentiation works by repeated squaring, and each squaring step only doubles the size of its input and uses predictable cycles, so a sandboxed implementation could get by with just checking input sizes before every multiplication.

> (Perhaps not a great idea.) How about defeating cryptographic protection
> mechanisms?...
> Or using Unicode to bypass data validation?...
> https://capec.mitre.org/data/definitions/71.html
> Unicode encoding attacks?... ... ...

None of the stuff you listed appears to be an issue inherent in supporting some feature in a language. They are mostly application and library bugs. I got bored enough that I didn't look at all of them, so maybe I missed something.
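The size check described above can be sketched in a few lines. This is an illustration under stated assumptions, not how CPython or any sandbox actually implements it: the name `bounded_pow` and the `max_bits` budget are hypothetical. It relies on the fact that the product of an m-bit and an n-bit integer has at most m + n bits, so the check is slightly conservative.

```python
def bounded_pow(base, exponent, max_bits=1_000_000):
    """Compute base**exponent by repeated squaring, refusing oversized results.

    Hypothetical sketch: every multiplication is preceded by a cheap check
    on the operand sizes, so the work done before giving up stays bounded.
    """
    if exponent < 0:
        raise ValueError("negative exponents are not supported")

    def checked_mul(a, b):
        # An m-bit times an n-bit integer needs at most m + n bits.
        if a.bit_length() + b.bit_length() > max_bits:
            raise OverflowError("result would exceed the size budget")
        return a * b

    result = 1
    while exponent:
        if exponent & 1:
            result = checked_mul(result, base)
        exponent >>= 1
        if exponent:  # skip the final, unused squaring
            base = checked_mul(base, base)
    return result
```

With this, `bounded_pow(100, 100**100)` raises `OverflowError` after a handful of squarings instead of trying to build the full result, while ordinary cases such as `bounded_pow(3, 13)` behave like `3**13`.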
Re: Creating a reliable sandboxed Python environment
Laura Creighton wrote on 30.05.2015 at 13:24:
> As a point of fact, We've _already got_ Topaz, a Ruby interpreter,
> Hippy, a PHP interpreter, a Prolog interpreter, a Smalltalk
> interpreter, and a javascript interpreter. Recently we got Pycket, a
> Racket compiler. There also exist plenty of experimental languages
> written by academic language designers, and other crazy people who
> like such things. But don't ask the PyPy project about how hard it is to
> sandbox one versus the other. From our point of view, they all cost
> the same -- free, as in _already done for you_, same as you get a JIT
> for free, and pluggable garbage collectors for free, etc. etc.

So here the cost of security is actually rewriting the entire language runtime and potentially also major parts of its ecosystem? Not exactly a cheap price either.

Stefan
Re: Creating a reliable sandboxed Python environment
On Sat, May 30, 2015 at 10:06 PM, BartC wrote:
> On 29/05/2015 23:49, Chris Angelico wrote:
>> That's 64-bit integers, not arbitrary-precision, but that's something
>> at least. You do still need to worry about what happens when your
>> numbers get too big; in Python, you simply don't. So it's still not
>> quite there in terms of functionality.
>
> But then the vast majority of integer operations won't require arbitrary
> precision. (Or maybe Python programmers routinely use big integers all over
> the place simply because they can.)

It's true that it won't often be an issue, but it's a matter of never needing to worry about it. Have you ever had to tweak an algorithm to ensure that you never go past some arbitrary boundary? In Python, with pure integer arithmetic, you can guarantee that mathematical truths are going to be maintained. There might still be good reason for doing operations in one order rather than another, but it's to do with performance, not correctness; the simple and naive algorithm can be used to verify correctness of any smarter algorithm. That's an important feature.

> Python seems to have sacrificed some performance. When I questioned why 3.x
> was slower than 2.x, merging int and long int (as I understood it) was one
> of the reasons put forward.

Yes, but that's an API point. A future version of Python could re-separate them as an optimization, as long as script-level code can't tell the difference. That's how Pike works, for instance; its native "int" type handles both machine words and bignums, with no visible distinction except performance. The common case (small numbers) is optimized; but short of probing for timings, there's no way to notice the actual boundary between the two. [1] This kind of change could be done to a future Python without breaking any existing code, and without breaking the expectation that integer arithmetic can go to infinity.

ChrisA

[1] Well, Pike also has a constant Int.NATIVE_MAX, in case a program is curious. But you get the idea.
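To make the "never needing to worry about it" point concrete, here is a small sketch that compares Python's arbitrary-precision integers with a simulated signed 64-bit word. The helper names `wrap_int64`, `factorial` and `factorial_int64` are hypothetical, and the wraparound is simulated in Python, so this is an assumption about how a fixed-width integer type would behave, not a measurement of any particular language.

```python
def wrap_int64(value):
    """Reduce an integer to the range of a signed 64-bit machine word."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


def factorial(n):
    """Naive factorial using Python's arbitrary-precision integers."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def factorial_int64(n):
    """Same algorithm, but every product is wrapped to 64 bits."""
    result = 1
    for k in range(2, n + 1):
        result = wrap_int64(result * k)
    return result


# 20! still fits in a signed 64-bit word; 21! does not, so the two diverge.
agree_at_20 = factorial(20) == factorial_int64(20)
agree_at_21 = factorial(21) == factorial_int64(21)
```

The naive `factorial` stays correct for any `n`, which is what lets it serve as a reference for checking a smarter algorithm; the fixed-width version silently stops being a factorial once the true value passes 2**63 - 1.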
Re: Creating a reliable sandboxed Python environment
On 29/05/2015 23:49, Chris Angelico wrote:
> On Sat, May 30, 2015 at 4:33 AM, Paul Rubin wrote:
>> Chris Angelico writes:
>>> Looks to me as if Lua doesn't have integers at all
>>
>> They fixed that in Lua 5.3:
>>
>> http://www.lua.org/manual/5.3/readme.html#changes
>
> That's 64-bit integers, not arbitrary-precision, but that's something
> at least. You do still need to worry about what happens when your
> numbers get too big; in Python, you simply don't. So it's still not
> quite there in terms of functionality.

But then the vast majority of integer operations won't require arbitrary precision. (Or maybe Python programmers routinely use big integers all over the place simply because they can.)

>>> Likewise, eight-bit strings, not Unicode.
>>
>> Also fixed in 5.3 (basic utf-8 support added, per above).
>
> Do you see what I mean about functionality being sacrificed for
> security? There is no way that this could be called fully functional
> by comparison with Python.

Python seems to have sacrificed some performance. When I questioned why 3.x was slower than 2.x, merging int and long int (as I understood it) was one of the reasons put forward.

(Simplicity seems to work for Lua. The entire distribution (for LuaJIT 2.0), seems to be about 2MB, including C sources, and the JIT interpreter is about 220KB. LuaJIT was also one of the fastest dynamic languages I've tried. But you're right that the language is rather sparse.)

--
Bartc
Re: Creating a reliable sandboxed Python environment
In a message of Sat, 30 May 2015 19:00:14 +1000, "Steven D'Aprano" writes:
>I wouldn't have imagined that the claim "it's easier to secure a small
>language with a few features than a big language with lots of features"
>would have been so controversial. I wonder if this claim will be equally as
>controversial?
>
>There is a rough correlation between the number of lines of code in a code
>base, and the number of potential security holes that need to be guarded
>against.

Maybe these aren't controversial if you are doing language level sandboxing, but you don't have to sandbox like that. Consider, for a moment, the sandboxing technique used by PyPy discussed at

http://pypy.readthedocs.org/en/latest/sandbox.html

You think it is way cool, but, alas, you want to sandbox some other language than Python. What do you do? You write an interpreter for this language in RPython.

Clearly, writing such a thing will be a lot easier for 'the toy language that does hardly anything I invented this morning' as opposed to 'javascript that is expected to operate in the real world' but this has nothing to do with the security aspects of the two languages. You'd have the exact same problem of difficulty even if you never intend to sandbox the thing at all.

The sandboxing aspects will happen, automatically, as soon as you have written a working interpreter. The layer that provides the security doesn't care about your target language, just as long as it is written in RPython.

As a point of fact, we've _already got_ Topaz, a Ruby interpreter, Hippy, a PHP interpreter, a Prolog interpreter, a Smalltalk interpreter, and a javascript interpreter. Recently we got Pycket, a Racket compiler. There also exist plenty of experimental languages written by academic language designers, and other crazy people who like such things. But don't ask the PyPy project about how hard it is to sandbox one versus the other. From our point of view, they all cost the same -- free, as in _already done for you_, same as you get a JIT for free, and pluggable garbage collectors for free, etc. etc.

If you find this stuff interesting, come check it out.

Laura
Re: Creating a reliable sandboxed Python environment
On Sat, 30 May 2015 02:48 pm, Paul Rubin wrote:
> Chris Angelico writes:
>> You can *easily* sandbox something that has very little functionality
>> - all you have to do is provide a minimalist "language" that permits
>> only a very few actions, and you know it's safe. But that security
>> comes at a price.
>
> This is a non-sequitur. The reason they didn't put more features into
> Lua is that it would have made the memory footprint bigger and they
> pitch it as an embeddable extension engine so they want to keep it
> small.

I wouldn't have imagined that the claim "it's easier to secure a small language with a few features than a big language with lots of features" would have been so controversial.

I wonder if this claim will be equally as controversial? There is a rough correlation between the number of lines of code in a code base, and the number of potential security holes that need to be guarded against.

> Stuff like bignums and unicode in themselves wouldn't have
> affected security.

Do you consider a Denial of Service or Memory Exhaustion attack to be a security issue? If not, try running this in Python:

100**100**100

(Perhaps not a great idea.)

How about defeating cryptographic protection mechanisms?

https://www.auscert.org.au/21885

Or using Unicode to bypass data validation?

https://capec.mitre.org/data/definitions/71.html

Unicode encoding attacks?

https://www.owasp.org/index.php/Unicode_Encoding
https://cwe.mitre.org/data/definitions/176.html

Unicode spoofing? Buffer overflows? UTF-8 exploits? IDNA exploits? Code point deletion exploits? Malicious rendering?

http://unicode.org/reports/tr36/
http://unicode.org/faq/security.html

[...]
> Heck, think of Java, which is monstrously more complicated than Python
> and supports applet sandboxing, plus it can run Python programs (under
> Jython). Or Javascript, which has similar complexity to Python and runs
> sandboxes in millions (billions?) of browsers.

Funny you should mention Javascript...

http://securityevaluators.com/knowledge/papers/engineeringheapoverflow.pdf
http://security.stackexchange.com/questions/41966/
https://nakedsecurity.sophos.com/exploring-the-blackhole-exploit-kit-7/
http://resources.infosecinstitute.com/fbi-tor-exploit/
https://www.mozilla.org/en-US/security/advisories/mfsa2013-53/

Yes, I can see why you think Javascript is securely sandboxed... *wink*

--
Steven
Re: Creating a reliable sandboxed Python environment
Chris Angelico writes:
> You can *easily* sandbox something that has very little functionality
> - all you have to do is provide a minimalist "language" that permits
> only a very few actions, and you know it's safe. But that security
> comes at a price.

This is a non-sequitur. The reason they didn't put more features into Lua is that it would have made the memory footprint bigger and they pitch it as an embeddable extension engine so they want to keep it small. Stuff like bignums and unicode in themselves wouldn't have affected security.

There's no obstacle to implementing Python the way Lua is implemented, and for all I know, MicroPython does that. It just didn't happen to be done that way with CPython because of Python's expected mode of use historically. Armin Ronacher's blog entry that I linked says a little more about this.

Heck, think of Java, which is monstrously more complicated than Python and supports applet sandboxing, plus it can run Python programs (under Jython). Or Javascript, which has similar complexity to Python and runs sandboxes in millions (billions?) of browsers.
Re: Creating a reliable sandboxed Python environment
On Sat, May 30, 2015 at 11:28 AM, Paul Rubin wrote:
> Chris Angelico writes:
>> Do you see what I mean about functionality being sacrificed for
>> security?
>
> No I don't. Lua has less functionality because it was designed to have
> a small embedding footprint. Python is much bigger because it was
> mostly designed to run as a standalone interpreter. That has nothing to
> do with security. You haven't shown the slightest connection between
> Lua's lower functionality and its higher sandbox security, because there
> is none. The lower functionality is because of a totally independent
> reason, namely the desire to make the interpreter smaller.

This thread started out as "How can I sandbox Python inside Python?". One of the responses was "You can't, but try sandboxing Lua inside Python instead". This has the cost that Lua, unlike Python, simply lacks features.

You can *easily* sandbox something that has very little functionality - all you have to do is provide a minimalist "language" that permits only a very few actions, and you know it's safe. But that security comes at a price.

ChrisA
Re: Creating a reliable sandboxed Python environment
Chris Angelico writes:
> Do you see what I mean about functionality being sacrificed for
> security?

No I don't. Lua has less functionality because it was designed to have a small embedding footprint. Python is much bigger because it was mostly designed to run as a standalone interpreter. That has nothing to do with security. You haven't shown the slightest connection between Lua's lower functionality and its higher sandbox security, because there is none. The lower functionality is because of a totally independent reason, namely the desire to make the interpreter smaller.
Re: Creating a reliable sandboxed Python environment
On Sat, May 30, 2015 at 4:33 AM, Paul Rubin wrote:
> Chris Angelico writes:
>> Looks to me as if Lua doesn't have integers at all
>
> They fixed that in Lua 5.3:
>
> http://www.lua.org/manual/5.3/readme.html#changes

That's 64-bit integers, not arbitrary-precision, but that's something at least. You do still need to worry about what happens when your numbers get too big; in Python, you simply don't. So it's still not quite there in terms of functionality.

>> Likewise, eight-bit strings, not Unicode.
>
> Also fixed in 5.3 (basic utf-8 support added, per above).

This is definitely NOT what I'm talking about. From what I can see, 5.3 introduces a new library for working with strings of bytes as if they were UTF-8. That's on par with PHP's Unicode support - basically, nothing that isn't explicitly coded. It makes Unicode something that you tack on with resignation, rather than something that's just an automatic part of text handling. So, a long way from being there in functionality.

Do you see what I mean about functionality being sacrificed for security? There is no way that this could be called fully functional by comparison with Python.

ChrisA
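The difference between a native Unicode string type and "strings of bytes treated as UTF-8" can be illustrated from the Python side. This sketch uses only Python's own `str` and `bytes` types as stand-ins; it says nothing about what Lua's library actually does. The helper name `compare_text_and_bytes` is hypothetical.

```python
def compare_text_and_bytes(text):
    """Contrast operations on a Unicode string with the same operations
    applied naively to its UTF-8 encoded bytes."""
    raw = text.encode("utf-8")
    return {
        "code_points": len(text),       # length in characters
        "bytes": len(raw),              # length in bytes; differs for non-ASCII text
        "upper_text": text.upper(),     # Unicode-aware case mapping
        # bytes.upper() only touches ASCII letters, so accented letters survive unchanged
        "upper_bytes": raw.upper().decode("utf-8"),
    }


# "na\u00efve caf\u00e9" contains two non-ASCII letters, each two bytes in UTF-8.
report = compare_text_and_bytes("na\u00efve caf\u00e9")
```

With the text type, length and case mapping follow the characters; with the byte string, the obvious operations quietly give byte-oriented answers unless every call site remembers to go through a UTF-8 aware helper.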
Re: Creating a reliable sandboxed Python environment
Marko Rauhamaa writes:
>> The language features are an orthogonal issue to embeddability.
> I doubt that. Guile is designed for embedding but it is a full-fledged
> Scheme implementation.

Orthogonal means independent, not opposing.

> I have very little experience with Lua. What surprises me is that it is
> not as elementary as it could be: I don't really see the value of
> metatables and weak tables.

It lets you implement various flavors of OOP etc. without different language mechanisms.
Re: Creating a reliable sandboxed Python environment
Paul Rubin:
> The language features are an orthogonal issue to embeddability.

I doubt that. Guile is designed for embedding but it is a full-fledged Scheme implementation.

> Lua is easier to embed securely because its embedding interface was
> designed for that.

I have very little experience with Lua. What surprises me is that it is not as elementary as it could be: I don't really see the value of metatables and weak tables.

Marko
Re: Creating a reliable sandboxed Python environment
Chris Angelico writes:
>> It doesn't add much to your application to embed Lua
> Lua's a much weaker language than Python is, though. Can it handle
> arbitrary-precision integers? Unicode? Dare I even ask,
> arbitrary-precision rationals (fractions.Fraction)? Security comes at
> a price, I guess.

The language features are an orthogonal issue to embeddability. Lua is easier to embed securely because its embedding interface was designed for that. It's also easy to call C functions from Lua, so if you want arbitrary precision integers or rationals, you can link GMP into your application and use it from your Lua scripts.

As another example, Javascript is as powerful as Python (though worse in many ways due to misdesign), but also by now supports reasonably secure embedding (or else it wouldn't be usable in browsers).

See http://lucumr.pocoo.org/2014/8/16/the-python-i-would-like-to-see/#the-damn-interpreter for Armin Ronacher's comments on CPython's not-so-great embedding interface. The interface he contrasts it with is essentially how Lua and Javascript (at least in some implementations) work. I haven't looked at MicroPython or PyPy.
Re: Creating a reliable sandboxed Python environment
Chris Angelico writes:
> Looks to me as if Lua doesn't have integers at all

They fixed that in Lua 5.3:

http://www.lua.org/manual/5.3/readme.html#changes

> Likewise, eight-bit strings, not Unicode.

Also fixed in 5.3 (basic utf-8 support added, per above).
Re: Creating a reliable sandboxed Python environment
In a message of Fri, 29 May 2015 19:38:21 +1000, Chris Angelico writes:
>The point was to sandbox something inside Python. Otherwise, yes, just
>write it in Python. But if you do have to sandbox like this, you lose
>language-level Unicode support, language-level arbitrary precision
>integers, etcetera, etcetera, etcetera. So I stand by my previous
>statement: The price of security is functionality.
>
>ChrisA

You can run a pypy sandbox from inside your CPython app, if that is what you want to do.

http://pypy.readthedocs.org/en/latest/sandbox.html

Just FYI.

Laura
Re: Creating a reliable sandboxed Python environment
On Fri, May 29, 2015 at 7:23 PM, Stefan Behnel wrote:
> Chris Angelico wrote on 29.05.2015 at 09:41:
>> On Fri, May 29, 2015 at 4:18 PM, Stefan Behnel wrote:
>>>> Lua's a much weaker language than Python is, though. Can it handle
>>>> arbitrary-precision integers? Unicode? Dare I even ask,
>>>> arbitrary-precision rationals (fractions.Fraction)?
>>>
>>> All of those and way more, as long as you use it embedded in Python.
>>
>> Okay, so how would you go about using Lua-embedded-in-Python to
>> manipulate Unicode text?
>
> Lua only supports byte strings, so Lupa will encode and decode them for
> you. If that's not enough, you'll have to work with Python Unicode string
> objects through the language interface. (And I just noticed that the
> handling can be improved here by overloading Lua operators with Python
> operators - not currently implemented.)
>
>> Looks to me as if Lua doesn't have integers at all
>
> The standard number type in Lua is a C double float, i.e. the steady
> integer range is somewhere within +/-2^53. That tends to be enough for a
> *lot* of use cases. You could change that type in the Lua C code (e.g. to a
> 64 bit int), but that's usually a bad idea. The same comment as above
> applies: if you need Python object features, use Python objects.

Unicode strings shouldn't involve the hassle of bouncing through an interface layer. Nobody will bother, and the result will be code that's ASCII-only. That happens often enough even in Python 2, where u"foo" is a Unicode string.

> Embedding Lua in Python gives you access to all of Python's objects and
> ecosystem. It may not always be as cool to use as from Python, but in that
> case, why not code it in Python in the first place? You wouldn't use
> Lua/Lupa to write whole applications, just the user defined parts of them.
> The rest can happily remain in Python. And should, for your own sanity.

The point was to sandbox something inside Python. Otherwise, yes, just write it in Python. But if you do have to sandbox like this, you lose language-level Unicode support, language-level arbitrary precision integers, etcetera, etcetera, etcetera. So I stand by my previous statement: The price of security is functionality.

ChrisA
Re: Creating a reliable sandboxed Python environment
Chris Angelico wrote on 29.05.2015 at 09:41:
> On Fri, May 29, 2015 at 4:18 PM, Stefan Behnel wrote:
>>> Lua's a much weaker language than Python is, though. Can it handle
>>> arbitrary-precision integers? Unicode? Dare I even ask,
>>> arbitrary-precision rationals (fractions.Fraction)?
>>
>> All of those and way more, as long as you use it embedded in Python.
>
> Okay, so how would you go about using Lua-embedded-in-Python to
> manipulate Unicode text?

Lua only supports byte strings, so Lupa will encode and decode them for you. If that's not enough, you'll have to work with Python Unicode string objects through the language interface. (And I just noticed that the handling can be improved here by overloading Lua operators with Python operators - not currently implemented.)

> Looks to me as if Lua doesn't have integers at all

The standard number type in Lua is a C double float, i.e. the steady integer range is somewhere within +/-2^53. That tends to be enough for a *lot* of use cases. You could change that type in the Lua C code (e.g. to a 64 bit int), but that's usually a bad idea. The same comment as above applies: if you need Python object features, use Python objects.

Embedding Lua in Python gives you access to all of Python's objects and ecosystem. It may not always be as cool to use as from Python, but in that case, why not code it in Python in the first place? You wouldn't use Lua/Lupa to write whole applications, just the user defined parts of them. The rest can happily remain in Python. And should, for your own sanity.

Stefan
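The +/-2^53 figure can be checked from Python, whose `float` is a C double on the usual platforms. The helper name `is_exact_as_double` is hypothetical; the sketch only shows where integers stop surviving a round trip through a double, which is the limit being described for a double-based number type.

```python
def is_exact_as_double(n):
    """Return True if the integer n survives a round trip through a C double."""
    try:
        return int(float(n)) == n
    except OverflowError:
        # Integers beyond the double range cannot be converted at all.
        return False


limit = 2 ** 53
below = is_exact_as_double(limit - 1)   # every integer up to 2**53 is exact
at = is_exact_as_double(limit)
above = is_exact_as_double(limit + 1)   # first integer that gets rounded
```

Past that point a double can still hold many large integers (powers of two, for instance), but not every one of them, so arithmetic that is exact with Python ints starts rounding silently.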
Re: Creating a reliable sandboxed Python environment
On Fri, May 29, 2015 at 4:18 PM, Stefan Behnel wrote: >> Lua's a much weaker language than Python is, though. Can it handle >> arbitrary-precision integers? Unicode? Dare I even ask, >> arbitrary-precision rationals (fractions.Fraction)? > > All of those and way more, as long as you use it embedded in Python. Okay, so how would you go about using Lua-embedded-in-Python to manipulate Unicode text? I'm not talking about things like the unicodedata module and the name/codepoint lookups, I'm talking about the basics of working with international text and the fundamental need for your native string type to cope with that. Do you have to keep bouncing back and forth between Python and Lua? And if you do arithmetic in the single most obvious way, what happens? http://www.lua.org/pil/2.3.html Looks to me as if Lua doesn't have integers at all, and so the obvious form of arithmetic will be equivalent to doing everything in Python floats. That's not arbitrary precision. There's also a comment that you can use some other type for numbers, but I suspect that's still "some other C type", so you still can't do arbitrary precision. (Plus you get one type for ints and floats, so even if you did select some magic type that bounces through to a Python int, it'd stop you from doing any non-integral arithmetic at all.) http://www.lua.org/pil/2.4.html Likewise, eight-bit strings, not Unicode. While I could accept some sort of inter-language thunk for something uncommon, like fractions.Fraction, forcing people to thunk back and forth for basic arithmetic and string manipulation is way too much hassle - which means they won't do it, which means strings will be eight-bit and numbers will be doubles. And eight-bit strings might be treated as UTF-8, or might be treated as some arbitrary codepage, or might be both at once in different contexts, so you can't really depend on them being anything more than ASCII. Or can you change all this when you embed Lua? 

ChrisA
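
The two concerns in the message above can be made concrete without a Lua install. The following is a minimal Python sketch of what "numbers are doubles" and "strings are eight-bit" cost in practice. It assumes the classic Lua model described in the linked PiL chapters; Lua 5.3 added a 64-bit integer subtype, which is still fixed-width rather than arbitrary precision. All variable names are illustrative.

```python
from fractions import Fraction

big = 2**53  # beyond this, consecutive integers are no longer exact doubles

# Python ints are arbitrary precision: adding 1 always changes the value.
exact_ok = (big + 1 != big)

# A double cannot represent 2**53 + 1, so the increment is silently lost.
double_lost = (float(big) + 1.0 == float(big))

# Rationals stay exact with Fraction; the double approximation does not.
exact_sum = (Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))
float_sum = (0.1 + 0.2 == 0.3)

# One code point can occupy several bytes, so a byte-oriented string type
# reports a different length than a Unicode-aware one.
text = "na\u00efve"  # the word "naive" with a diaeresis on the i
chars = len(text)
utf8_bytes = len(text.encode("utf-8"))

print(exact_ok, double_lost, exact_sum, float_sum, chars, utf8_bytes)
```

Whether this matters depends on what the embedded scripts are expected to do; for scripts that only orchestrate host objects it may never come up.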
Re: Creating a reliable sandboxed Python environment
Chris Angelico wrote on 28.05.2015 at 20:51:
> On Fri, May 29, 2015 at 4:41 AM, Stefan Behnel wrote:
>> davidf...@gmail.com wrote on 26.05.2015 at 04:24:
>>> Has anyone on this list attempted to sandbox Python programs in a
>>> serious fashion? I'd be interested to hear your approach.
>>
>> Not quite sandboxing Python, but I've seen people use my Lupa [1] library
>> for this. They're writing all their code in Python, and then let users
>> embed their own Lua code into it to script their API. The Lua runtime is
>> apparently quite good at sandboxing, and it's really small, just some 600KB
>> or so. Lupa then lets you easily control the access to your Python code at
>> a whitelist level by intercepting all Python attribute lookups.
>>
>> It doesn't add much to your application to embed Lua (or even LuaJIT) in
>> Python, and it gives users a nicely object oriented language to call and
>> orchestrate your Python objects.
>
> Lua's a much weaker language than Python is, though. Can it handle
> arbitrary-precision integers? Unicode? Dare I even ask,
> arbitrary-precision rationals (fractions.Fraction)?

All of those and way more, as long as you use it embedded in Python.

> Security comes at a price, I guess.

Sure, but features aren't the price here.

Stefan
Re: Creating a reliable sandboxed Python environment
On Fri, May 29, 2015 at 4:41 AM, Stefan Behnel wrote:
> davidf...@gmail.com wrote on 26.05.2015 at 04:24:
>> Has anyone on this list attempted to sandbox Python programs in a
>> serious fashion? I'd be interested to hear your approach.
>
> Not quite sandboxing Python, but I've seen people use my Lupa [1] library
> for this. They're writing all their code in Python, and then let users
> embed their own Lua code into it to script their API. The Lua runtime is
> apparently quite good at sandboxing, and it's really small, just some 600KB
> or so. Lupa then lets you easily control the access to your Python code at
> a whitelist level by intercepting all Python attribute lookups.
>
> It doesn't add much to your application to embed Lua (or even LuaJIT) in
> Python, and it gives users a nicely object oriented language to call and
> orchestrate your Python objects.

Lua's a much weaker language than Python is, though. Can it handle
arbitrary-precision integers? Unicode? Dare I even ask,
arbitrary-precision rationals (fractions.Fraction)?

Security comes at a price, I guess.

ChrisA
Re: Creating a reliable sandboxed Python environment
davidf...@gmail.com wrote on 26.05.2015 at 04:24:
> Has anyone on this list attempted to sandbox Python programs in a
> serious fashion? I'd be interested to hear your approach.

Not quite sandboxing Python, but I've seen people use my Lupa [1]
library for this. They're writing all their code in Python, and then
let users embed their own Lua code into it to script their API. The Lua
runtime is apparently quite good at sandboxing, and it's really small,
just some 600KB or so. Lupa then lets you easily control the access to
your Python code at a whitelist level by intercepting all Python
attribute lookups.

It doesn't add much to your application to embed Lua (or even LuaJIT)
in Python, and it gives users a nicely object oriented language to call
and orchestrate your Python objects.

Stefan

[1] https://pypi.python.org/pypi/lupa
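
As a rough illustration of the "whitelist level" control described above, here is a small sketch of an attribute filter. It assumes, from my reading of the Lupa README, that a LuaRuntime accepts an attribute_filter callable taking (obj, attr_name, is_setting), which returns the attribute name to use or raises AttributeError to deny access; check the current Lupa documentation before relying on that. The names ALLOWED_ATTRIBUTES, whitelist_filter and is_allowed are hypothetical.

```python
# Hypothetical API surface that user scripts may read.
ALLOWED_ATTRIBUTES = frozenset({"name", "area"})


def whitelist_filter(obj, attr_name, is_setting):
    """Return the attribute name to use, or raise AttributeError to deny."""
    if is_setting:
        # Default deny: scripts may read approved attributes, never write.
        raise AttributeError("scripts may not modify host objects")
    if isinstance(attr_name, str) and attr_name in ALLOWED_ATTRIBUTES:
        return attr_name
    raise AttributeError("access denied: %r" % (attr_name,))


def is_allowed(attr_name, is_setting=False):
    """Small helper for exercising the filter without a Lua runtime."""
    try:
        whitelist_filter(None, attr_name, is_setting)
        return True
    except AttributeError:
        return False


# Assumed hookup, not executed here (needs the third-party lupa package):
#   lua = lupa.LuaRuntime(attribute_filter=whitelist_filter)
```

The point of the design is that everything not named in the whitelist is refused, including dunder attributes such as __class__, so nothing has to be anticipated and blocked individually.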
Re: Creating a reliable sandboxed Python environment
Thanks for the responses folks. I will briefly summarize them:

> As you say, it is fundamentally not possible to make this work at the
> Python level.

This is pretty effectively demonstrated by "Tav's admirable but failed
attempt to sandbox file IO":
* http://tav.espians.com/a-challenge-to-break-python-security.html

Wow there are some impressive ways to confuse the system. I
particularly like overriding str's equality function to defeat mode
checking code when opening files.

> When we needed this at edX, we wrote CodeJail
> (https://github.com/edx/codejail). It's a wrapper around AppArmor to
> provide OS-level protection of code execution in subprocesses. It has
> Python-specific features, but because it is based on AppArmor, can
> sandbox any process, so long as it's configured properly.

This looks promising. I will take a closer look.

> What about launching the Python process in a Docker container?

This may work in combination with other techniques. Certainly faster
than spinning up a new VM or snapshot-restoring a fixed VM on a
repeated basis. Would need to see whether CPU, Memory, and Disk usage
could be constrained at the level of a container.

- David

On Monday, May 25, 2015 at 7:24:32 PM UTC-7, davi...@gmail.com wrote:
> I am writing a web service that accepts Python programs as input, runs the
> provided program with some profiling hooks, and returns various information
> about the program's runtime behavior. To do this in a safe manner, I need to
> be able to create a sandbox that restricts what the submitted Python program
> can do on the web server.
>
> Almost all discussion about Python sandboxes I have seen on the internet
> involves selectively blacklisting functionality that gives access to system
> resources, such as trying to hide the "open" builtin to restrict access to
> file I/O. All such approaches are doomed to fail because you can always find
> a way around a blacklist.
>
> For my particular sandbox, I wish to allow *only* the following kinds of
> actions (in a whitelist):
> * reading from stdin & writing to stdout;
> * reading from files, within a set of whitelisted directories;
> * pure Python computation.
>
> In particular all other operations available through system calls are banned.
> This includes, but is not limited to:
> * writing to files;
> * manipulating network sockets;
> * communicating with other processes.
>
> I believe it is not possible to limit such operations at the Python level.
> The best you could do is try replacing all the standard library modules, but
> that is again just a blacklist - it won't prevent a determined attacker from
> doing things like constructing their own 'code' object and executing it.
>
> It might be necessary to isolate the Python process at the operating system
> level.
> * A chroot jail on Linux & OS X can limit access to the filesystem. Again
> this is just a blacklist.
> * No obvious way to block socket creation. Again this would be just a
> blacklist.
> * No obvious way to detect unapproved system calls and block them.
>
> In the limit, I could dynamically spin up a virtual machine and execute the
> Python program in the machine. However that's extremely expensive in
> computational time.
>
> Has anyone on this list attempted to sandbox Python programs in a serious
> fashion? I'd be interested to hear your approach.
>
> - David
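
The str-equality trick mentioned in the summary above is easy to show in miniature. This is a simplified sketch of the idea, not the actual exploit from Tav's challenge: a str subclass that claims to be equal to everything slips past a guard that compares the mode with ==. The names AlwaysEqual, naive_mode_check and strict_mode_check are made up for illustration.

```python
class AlwaysEqual(str):
    """A str subclass that claims to be equal to anything."""

    def __eq__(self, other):
        return True

    # Defining __eq__ removes the inherited hash, so restore it.
    __hash__ = str.__hash__


def naive_mode_check(mode):
    # Hypothetical guard: trusts whatever __eq__ the argument brings along.
    return "allowed" if mode == "r" else "denied"


def strict_mode_check(mode):
    # Insist on an exact str first, so no user-defined __eq__ can run.
    return "allowed" if type(mode) is str and mode == "r" else "denied"


hostile = AlwaysEqual("w")
print(naive_mode_check(hostile), strict_mode_check(hostile))
```

The stricter check closes this one hole only, which is rather the point of the thread: each fix of this kind is another blacklist entry.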
Re: Creating a reliable sandboxed Python environment
On Tuesday, May 26, 2015 at 4:24:32 AM UTC+2, davi...@gmail.com wrote:
> I am writing a web service that accepts Python programs as input, runs the
> provided program with some profiling hooks, and returns various information
> about the program's runtime behavior. To do this in a safe manner, I need to
> be able to create a sandbox that restricts what the submitted Python program
> can do on the web server.
>
> Almost all discussion about Python sandboxes I have seen on the internet
> involves selectively blacklisting functionality that gives access to system
> resources, such as trying to hide the "open" builtin to restrict access to
> file I/O. All such approaches are doomed to fail because you can always find
> a way around a blacklist.
>
> For my particular sandbox, I wish to allow *only* the following kinds of
> actions (in a whitelist):
> * reading from stdin & writing to stdout;
> * reading from files, within a set of whitelisted directories;
> * pure Python computation.
>
> In particular all other operations available through system calls are banned.
> This includes, but is not limited to:
> * writing to files;
> * manipulating network sockets;
> * communicating with other processes.
>
> I believe it is not possible to limit such operations at the Python level.
> The best you could do is try replacing all the standard library modules, but
> that is again just a blacklist - it won't prevent a determined attacker from
> doing things like constructing their own 'code' object and executing it.
>
> It might be necessary to isolate the Python process at the operating system
> level.
> * A chroot jail on Linux & OS X can limit access to the filesystem. Again
> this is just a blacklist.
> * No obvious way to block socket creation. Again this would be just a
> blacklist.
> * No obvious way to detect unapproved system calls and block them.
>
> In the limit, I could dynamically spin up a virtual machine and execute the
> Python program in the machine. However that's extremely expensive in
> computational time.
>
> Has anyone on this list attempted to sandbox Python programs in a serious
> fashion? I'd be interested to hear your approach.
>
> - David

What about launching the Python process in a Docker container?
Spinning up a new container is pretty quick and it might provide you
with enough isolation.

Probably not a perfect solution, but I do believe that it would be
easier than trying to sandbox Python itself.

Marco
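
For readers wondering what the container suggestion above might look like in practice, here is a hedged sketch that only builds a "docker run" command line and hands the submitted program to it on stdin. It assumes a current Docker release (the --cpus flag postdates this thread) and the public python:3-slim image; the function names and the default limits are illustrative, not a vetted configuration.

```python
import subprocess


def docker_command(image="python:3-slim", memory="256m", cpus="1", pids=64):
    """Build an argv list for a locked-down, throwaway container."""
    return [
        "docker", "run", "--rm", "-i",
        "--network", "none",         # no network access at all
        "--memory", memory,          # cap RAM
        "--cpus", cpus,              # cap CPU share
        "--pids-limit", str(pids),   # cap process count (fork bombs)
        "--read-only",               # root filesystem mounted read-only
        "--cap-drop", "ALL",         # drop Linux capabilities
        image, "python", "-I", "-",  # isolated mode, program read from stdin
    ]


def run_in_container(source):
    """Run `source` inside the container; requires a working Docker setup."""
    return subprocess.run(docker_command(), input=source,
                          capture_output=True, text=True)
```

A container shares the host kernel, so this narrows the attack surface rather than eliminating it, which matches the "probably not a perfect solution" caveat above.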
Re: Creating a reliable sandboxed Python environment
On Monday, May 25, 2015 at 10:24:32 PM UTC-4, davi...@gmail.com wrote:
> I am writing a web service that accepts Python programs as input, runs the
> provided program with some profiling hooks, and returns various information
> about the program's runtime behavior. To do this in a safe manner, I need to
> be able to create a sandbox that restricts what the submitted Python program
> can do on the web server.
>
> Almost all discussion about Python sandboxes I have seen on the internet
> involves selectively blacklisting functionality that gives access to system
> resources, such as trying to hide the "open" builtin to restrict access to
> file I/O. All such approaches are doomed to fail because you can always find
> a way around a blacklist.
>
> For my particular sandbox, I wish to allow *only* the following kinds of
> actions (in a whitelist):
> * reading from stdin & writing to stdout;
> * reading from files, within a set of whitelisted directories;
> * pure Python computation.
>
> In particular all other operations available through system calls are banned.
> This includes, but is not limited to:
> * writing to files;
> * manipulating network sockets;
> * communicating with other processes.
>
> I believe it is not possible to limit such operations at the Python level.
> The best you could do is try replacing all the standard library modules, but
> that is again just a blacklist - it won't prevent a determined attacker from
> doing things like constructing their own 'code' object and executing it.
>
> It might be necessary to isolate the Python process at the operating system
> level.
> * A chroot jail on Linux & OS X can limit access to the filesystem. Again
> this is just a blacklist.
> * No obvious way to block socket creation. Again this would be just a
> blacklist.
> * No obvious way to detect unapproved system calls and block them.
>
> In the limit, I could dynamically spin up a virtual machine and execute the
> Python program in the machine. However that's extremely expensive in
> computational time.
>
> Has anyone on this list attempted to sandbox Python programs in a serious
> fashion? I'd be interested to hear your approach.
>
> - David

When we needed this at edX, we wrote CodeJail
(https://github.com/edx/codejail). It's a wrapper around AppArmor to
provide OS-level protection of code execution in subprocesses. It has
Python-specific features, but because it is based on AppArmor, can
sandbox any process, so long as it's configured properly.

--Ned.
Re: Creating a reliable sandboxed Python environment
In a message of Tue, 26 May 2015 09:53:56 +0200, Laura Creighton writes:
>In a message of Tue, 26 May 2015 17:10:30 +1000, "Steven D'Aprano" writes:
>>My sense is that the only way to safely sandbox Python is to create your own
>>Python implementation designed with security in mind. You can't get there
>>starting from CPython. Maybe Jython?
>
>You get there starting with PyPy.
>see: http://pypy.readthedocs.org/en/latest/sandbox.html
>
>Note: this is not very heavily used. You may find bugs we don't
>know about yet.
>
>Laura

Also, the place to discuss this is pypy-...@python.org or the #pypy
channel on freenode.

Laura
Re: Creating a reliable sandboxed Python environment
In a message of Tue, 26 May 2015 17:10:30 +1000, "Steven D'Aprano" writes:
>My sense is that the only way to safely sandbox Python is to create your own
>Python implementation designed with security in mind. You can't get there
>starting from CPython. Maybe Jython?

You get there starting with PyPy.
see: http://pypy.readthedocs.org/en/latest/sandbox.html

Note: this is not very heavily used. You may find bugs we don't
know about yet.

Laura
Re: Creating a reliable sandboxed Python environment
On Tuesday 26 May 2015 12:24, davidf...@gmail.com wrote:

> I am writing a web service that accepts Python programs as input, runs the
> provided program with some profiling hooks, and returns various
> information about the program's runtime behavior. To do this in a safe
> manner, I need to be able to create a sandbox that restricts what the
> submitted Python program can do on the web server.
>
> Almost all discussion about Python sandboxes I have seen on the internet
> involves selectively blacklisting functionality that gives access to
> system resources, such as trying to hide the "open" builtin to restrict
> access to file I/O. All such approaches are doomed to fail because you can
> always find a way around a blacklist.

It's not so much that you can find your way around a blacklist, but
that a blacklist only bans things which you have thought of. Perhaps
the attacker has thought of something else.

Ideally, a sandbox will whitelist functions which you know are safe,
with a "default deny" policy. That requires building your own parser
which only allows code that passes your whitelist. Even then, the
problem is that perhaps there is an attack vector you didn't think of:
something you thought was safe, actually is not.

Have you read Tav's admirable but failed attempt to sandbox file IO?

http://tav.espians.com/a-challenge-to-break-python-security.html

http://tav.espians.com/paving-the-way-to-securing-the-python-interpreter.html

http://tav.espians.com/update-on-securing-the-python-interpreter.html

Also google for "Capabilities Python" or CapPython.

My sense is that the only way to safely sandbox Python is to create
your own Python implementation designed with security in mind. You
can't get there starting from CPython. Maybe Jython?

> For my particular sandbox, I wish to allow *only* the following kinds of
> actions (in a whitelist):
> * reading from stdin & writing to stdout;
> * reading from files, within a set of whitelisted directories;
> * pure Python computation.

Pure Python computation can be used to DOS your machine, e.g.
(100**100)**100 will, I think, do it. (I'm not about to try it.)

--
Steve
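
A quick check of the arithmetic in the message above: (100**100)**100 is just 10**20000, a number of about twenty thousand digits, which CPython computes essentially instantly. The general point still stands, because a taller tower such as 10**10**10 needs on the order of gigabytes just to store the result. The sketch below shows one way a guard might estimate the size of a power without computing it; estimated_bits and the budget value are hypothetical.

```python
def estimated_bits(base, exponent):
    """Upper bound on the bit length of base**exponent, without computing it.

    Since base < 2**base.bit_length(), base**exponent is below
    2**(exponent * base.bit_length()).
    """
    return exponent * base.bit_length()


# The example from the message: small enough to compute directly.
n = (100**100)**100
same_as_power_of_ten = (n == 10**20000)
actual_bits = n.bit_length()
bound_bits = estimated_bits(100**100, 100)

# A genuinely dangerous expression, sized but deliberately NOT evaluated.
tower_bits = estimated_bits(10, 10**10)

BIT_BUDGET = 10**7  # arbitrary illustrative limit
would_refuse = tower_bits > BIT_BUDGET
```

An estimate like this only covers one operator; memory and CPU limits imposed from outside the interpreter catch the cases nobody thought to estimate.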
Re: Creating a reliable sandboxed Python environment
davidf...@gmail.com writes:
> Has anyone on this list attempted to sandbox Python programs in a
> serious fashion? I'd be interested to hear your approach.

There is something like that for C++ and it is quite complicated:

https://github.com/Eelis/geordi

I expect that for Python you'd have to do most of the same stuff.
Re: Creating a reliable sandboxed Python environment
On Tue, May 26, 2015 at 12:24 PM, wrote:
> I believe it is not possible to limit such operations at the Python level.
> The best you could do is try replacing all the standard library modules, but
> that is again just a blacklist - it won't prevent a determined attacker from
> doing things like constructing their own 'code' object and executing it.
>
> It might be necessary to isolate the Python process at the operating system
> level.
> * A chroot jail on Linux & OS X can limit access to the filesystem. Again
> this is just a blacklist.
> * No obvious way to block socket creation. Again this would be just a
> blacklist.
> * No obvious way to detect unapproved system calls and block them.
>
> In the limit, I could dynamically spin up a virtual machine and execute the
> Python program in the machine. However that's extremely expensive in
> computational time.
>
> Has anyone on this list attempted to sandbox Python programs in a serious
> fashion? I'd be interested to hear your approach.

Yes, I had a project along similar lines to yours, a few years back. We
wanted to let our end users customize our service using a Python
script. Our conclusions were:

1) As you say, it is fundamentally not possible to make this work at
the Python level.
2) It's extremely difficult to do at any other level, too.
3) Python is a great language, despite my then-boss's dislike of it.
4) Lua isn't as great a language, but it's much easier to sandbox.
5) Unicode is important, even if my then-boss took a lot of convincing
on that one. (Was a big point in Python's favour, and against Lua.)
6) Efficient transfer of complex structured data across a process
boundary is difficult.
7) Letting end users script your system safely is a fundamentally hard
problem.

We ended up abandoning Python altogether and using ECMAScript (with
Google's V8 interpreter) as our scripting language, and even then, we
had to do all sorts of things to make it safe. (And I wouldn't bet my
life on it being safe even now. Not even sure I'd bet my data or uptime
on it being safe, either.)

My recommendation to you: If you absolutely have to run untrusted
Python code, don't concern yourself with *anything* that the Python
code can and can't do. You'll end up making gross and ugly hacks that
stop people from doing legitimate things, in an attempt to prevent
abuses. Instead, *just* guard yourself at the OS level - a chroot jail
to protect what matters, iptables rules to prevent anything going to
the outside world, run as a non-significant user with minimal
permissions, ulimit everything so they can't hurt you. Whatever it
takes, make it so that you could protect C code, because trust me,
it'll be less headaches than trying to sandbox anything at the Python
level. Or, worse, you won't get headaches, you'll just have a flawed
security model that eventually gets exploited.

There are a couple of alternatives. You could go for a really extreme
protection system and actually spin up a virtual machine, where they're
welcome to do whatever they like, and it'll run inside X amount of
memory and Y amount of CPU. Pretty costly (the overhead of a full OS
for every client), but it'll work. Or you could go to the other
extreme, and instead of actually permitting arbitrary Python code, you
instead allow a "Python-like syntax" wherein people can manipulate the
input. You'd need to then create some special hacks to allow file I/O,
so this probably wouldn't work for your scenario, but imagine writing a
sed-like program that accepts Python code. You could do something like
this:

    for line in input:
        print(evaluate_user_code(line), file=output)

where evaluate_user_code() is a protected evaluator, like
ast.literal_eval() but additionally allowing access to one name "line",
which obviously would be the line in question. But for your case, I
think that'd require too many hacks to be useful.

ChrisA
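
One way the protected evaluator described above could look is sketched below. It is a guess at the idea, not the code from that project: the user's expression is parsed with the standard ast module and then interpreted node by node, so only the constructs listed here can run at all. The signature (the user's source plus the current line), the helper _eval and the ALLOWED_METHODS set are all assumptions made for the example.

```python
import ast

# Hypothetical whitelist of zero-argument str methods scripts may call.
ALLOWED_METHODS = frozenset({"upper", "lower", "strip"})


def evaluate_user_code(source, line):
    """Evaluate a tiny expression language whose only name is `line`."""
    tree = ast.parse(source, mode="eval")
    return _eval(tree.body, line)


def _eval(node, line):
    # String and integer literals.
    if isinstance(node, ast.Constant) and isinstance(node.value, (str, int)):
        return node.value
    # The single permitted variable.
    if isinstance(node, ast.Name) and node.id == "line":
        return line
    # Addition / concatenation only; no multiplication or powers.
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        return _eval(node.left, line) + _eval(node.right, line)
    # Whitelisted method calls on strings, with no arguments.
    if (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in ALLOWED_METHODS
            and not node.args and not node.keywords):
        target = _eval(node.func.value, line)
        if isinstance(target, str):
            return getattr(target, node.func.attr)()
    # Default deny: anything not matched above is rejected.
    raise ValueError("expression not allowed: " + ast.dump(node))


print(evaluate_user_code("line.strip().upper() + '!'", "  hi "))
```

Because eval() and exec() are never called, attribute access such as line.__class__ and calls such as open() have no code path to run through. The parser itself can still be stressed by pathological input, so the OS-level limits recommended above remain relevant even here.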
Creating a reliable sandboxed Python environment
I am writing a web service that accepts Python programs as input, runs
the provided program with some profiling hooks, and returns various
information about the program's runtime behavior. To do this in a safe
manner, I need to be able to create a sandbox that restricts what the
submitted Python program can do on the web server.

Almost all discussion about Python sandboxes I have seen on the
internet involves selectively blacklisting functionality that gives
access to system resources, such as trying to hide the "open" builtin
to restrict access to file I/O. All such approaches are doomed to fail
because you can always find a way around a blacklist.

For my particular sandbox, I wish to allow *only* the following kinds
of actions (in a whitelist):
* reading from stdin & writing to stdout;
* reading from files, within a set of whitelisted directories;
* pure Python computation.

In particular all other operations available through system calls are
banned. This includes, but is not limited to:
* writing to files;
* manipulating network sockets;
* communicating with other processes.

I believe it is not possible to limit such operations at the Python
level. The best you could do is try replacing all the standard library
modules, but that is again just a blacklist - it won't prevent a
determined attacker from doing things like constructing their own
'code' object and executing it.

It might be necessary to isolate the Python process at the operating
system level.
* A chroot jail on Linux & OS X can limit access to the filesystem.
  Again this is just a blacklist.
* No obvious way to block socket creation. Again this would be just a
  blacklist.
* No obvious way to detect unapproved system calls and block them.

In the limit, I could dynamically spin up a virtual machine and execute
the Python program in the machine. However that's extremely expensive
in computational time.

Has anyone on this list attempted to sandbox Python programs in a
serious fashion? I'd be interested to hear your approach.

- David
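
One small OS-level building block that needs no extra tooling on Unix is a per-process resource limit. The sketch below runs a submitted program in a child interpreter with a hard cap on CPU seconds, using only the standard resource and subprocess modules. It is a partial measure under stated assumptions (Unix only, single-threaded parent, limit values picked for illustration); it does not restrict files, sockets or system calls, and run_limited and _apply_limits are made-up names.

```python
import resource
import subprocess
import sys


def _apply_limits():
    # Runs in the child between fork() and exec(); Unix only.
    # One second of CPU time, after which the kernel signals the process.
    resource.setrlimit(resource.RLIMIT_CPU, (1, 1))
    # Other candidates, left out to keep the sketch portable:
    # RLIMIT_AS (address space), RLIMIT_FSIZE (file size),
    # RLIMIT_NPROC (process count).


def run_limited(source, timeout=10):
    """Run `source` in a separate, CPU-limited Python process."""
    return subprocess.run(
        [sys.executable, "-I", "-B", "-c", source],
        preexec_fn=_apply_limits,   # not safe if the parent uses threads
        capture_output=True,
        text=True,
        timeout=timeout,            # wall-clock backstop for sleeping code
    )


result = run_limited("print(6 * 7)")
print(result.returncode, result.stdout.strip())
```

A busy loop such as "while True: pass" is terminated by a signal once the CPU budget is spent, which shows up as a negative returncode. Limiting file, network and system-call access needs a separate mechanism at the OS level.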