Re: [Haskell-cafe] Compiling arbitrary Haskell code
For those who are interested (I have already chatted with Chris on IRC): I've implemented a pastebin that can, among other things, run arbitrary Haskell code: http://paste.hskll.org/

I've also developed a 'restricted-workers' library for managing processes that should run in a secured environment. I've described some of my endeavors in a blog post: http://parenz.wordpress.com/2013/07/15/interactive-diagrams-gsoc-progress-report/

Bottom line: proper restrictions are hard, the necessary tools operate at a low level, and there are some caveats too.

On Sat, Oct 12, 2013 at 12:30 AM, Christopher Done wrote:
> Is there a definitive list of things in GHC that are unsafe to
> _compile_ if I were to take an arbitrary module and compile it?
> [...]

--
Sincerely yours,
-- Daniil
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Compiling arbitrary Haskell code
On 12 October 2013 01:19, Johan Tibell wrote:
> Whatever guarantees GHC offers (e.g. using Safe Haskell), I would always run
> things like these in a sandbox. It's much better for security to disallow
> everything and then whitelist some things (e.g. let the sandbox communicate
> with the rest of the world in some limited way) than the other way around.

Yeah, the impression I'm getting is that once you compile pretty much anything other than simple expressions (a la lambdabot), all bets are off.
Re: [Haskell-cafe] Compiling arbitrary Haskell code
Whatever guarantees GHC offers (e.g. using Safe Haskell), I would always run things like these in a sandbox. It's much better for security to disallow everything and then whitelist some things (e.g. let the sandbox communicate with the rest of the world in some limited way) than the other way around. Same goes for running untrusted code.

On Fri, Oct 11, 2013 at 1:30 PM, Christopher Done wrote:
> Is there a definitive list of things in GHC that are unsafe to
> _compile_ if I were to take an arbitrary module and compile it?
> [...]
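The deny-by-default idea can be illustrated on imports. The following is only a minimal sketch: `allowedModules` is a hypothetical policy, and the token-based scan stands in for a real parser (such as HSE, mentioned in the original question). It can be fooled by unusual formatting, so it is an illustration of the policy shape, not a security boundary.

```haskell
import Data.Char (isAlphaNum)

-- | Hypothetical allowlist: any module not named here is denied.
allowedModules :: [String]
allowedModules = ["Prelude", "Data.List", "Data.Char", "Data.Maybe"]

-- | Module names that follow an @import@ token in the source text.
-- Crude by design: it splits on whitespace instead of parsing, so
-- comments or strings containing "import" produce false positives.
importsOf :: String -> [String]
importsOf = go . words
  where
    go ("import" : "qualified" : m : rest) = moduleName m : go rest
    go ("import" : m : rest)               = moduleName m : go rest
    go (_ : rest)                          = go rest
    go []                                  = []
    -- Cut off a trailing import list such as "Data.List(sort)".
    moduleName = takeWhile (\c -> isAlphaNum c || c `elem` "._")

-- | Imports that the allowlist does not cover; an empty result
-- means every import that was found is explicitly allowed.
deniedImports :: String -> [String]
deniedImports src = filter (`notElem` allowedModules) (importsOf src)
```

Loaded in GHCi, `deniedImports` returns the offending module names, so a caller can refuse the paste when the list is non-empty. Unknown import syntax tends to yield a name that is not on the list, which errs on the side of rejecting.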
Re: [Haskell-cafe] Compiling arbitrary Haskell code
On 12.10.2013 00:30, Christopher Done wrote:
> Is there a definitive list of things in GHC that are unsafe to
> _compile_ if I were to take an arbitrary module and compile it?
>
> E.g. off the top of my head, things that might be dangerous:
>
> * TemplateHaskell/QuasiQuotes -- obviously
> * Are rules safe?
> * #includes -- I presume there's some security risk with including any old
>   file?
> * FFI -- speaks for itself
>
> I'm interested in the idea of compiling Haskell code on lpaste.org,
> for core, rule firings, maybe even TH expansion, etc. When sandboxing
> code that I'm running, it's really easy if I whitelist what code is
> available (parsing with HSE, whitelisting imports, extensions). The
> problem of infinite loops or too much allocation is fairly
> straightforwardly solved by similar techniques applied in mueval.

The OPTIONS_GHC pragma. You can use it to set a custom preprocessor, for example bash, and the program is then interpreted as a bash script.

I think sandboxing the compiler is a must. There are just too many handles and hooks to cater to all possible uses. Some of them must be exploitable.
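As a rough illustration of the pragma concern: a submitted module can carry a header such as `{-# OPTIONS_GHC -F -pgmF some-command #-}`, which asks GHC to run an external program as a preprocessor. One possible precaution is to refuse any source that contains such a pragma before it reaches the compiler. The sketch below assumes a purely textual scan; `forbiddenPragmas` is a hypothetical policy, and the scan also fires on pragma-like text inside comments or strings, which errs on the side of rejecting. It complements a sandbox and does not replace one.

```haskell
import Data.Char (isSpace, toUpper)
import Data.List (isPrefixOf, tails)

-- | Upper-cased names of everything that looks like a pragma,
-- i.e. the first word after each "{-#" in the source text.
pragmaNames :: String -> [String]
pragmaNames src =
  [ map toUpper (takeWhile isNameChar (dropWhile isSpace rest))
  | chunk <- tails src
  , "{-#" `isPrefixOf` chunk
  , let rest = drop 3 chunk
  ]
  where
    isNameChar c = not (isSpace c) && c /= '#'

-- | Hypothetical policy: pragmas that may change how GHC is invoked.
forbiddenPragmas :: [String]
forbiddenPragmas = ["OPTIONS_GHC"]

-- | Forbidden pragmas present in the source; empty means none found.
rejectedPragmas :: String -> [String]
rejectedPragmas src = filter (`elem` forbiddenPragmas) (pragmaNames src)
```

Names are compared in upper case so that a lower-case spelling of the pragma is caught as well. Command-line flags passed by the service itself are outside the scope of this check.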
Re: [Haskell-cafe] Compiling arbitrary Haskell code
On Fri, Oct 11, 2013 at 1:30 PM, Christopher Done wrote:
> Is there a definitive list of things in GHC that are unsafe to
> _compile_ if I were to take an arbitrary module and compile it?
>
> E.g. off the top of my head, things that might be dangerous:
>
> * TemplateHaskell/QuasiQuotes -- obviously
> * Are rules safe?
> * #includes -- I presume there's some security risk with including any old
>   file?
> * FFI -- speaks for itself

It really depends on the security properties you want to maintain. That should inform your policy. For example, denial of service vs. leaking information (like a password db) vs. allowing yourself to become part of a botnet.

There are lots of things to consider here. For example, lambdabot has always disallowed IO and thus needs to disallow unsafeCoerce/unsafePerformIO/unsafeInterleaveIO and anything else that introduces a "backdoor" in the type system. I think the list you have above is a good start, but it wouldn't be complete for lambdabot.

> I'm interested in the idea of compiling Haskell code on lpaste.org,
> for core, rule firings, maybe even TH expansion, etc. When sandboxing
> code that I'm running, it's really easy if I whitelist what code is
> available (parsing with HSE, whitelisting imports, extensions). The
> problem of infinite loops or too much allocation is fairly
> straightforwardly solved by similar techniques applied in mueval.

What type of sandboxing do you plan to use and what limitations does it have? For example, chroot jails can be defeated.

> SafeHaskell helps a lot here, but suppose that I want to also allow
> TemplateHaskell, GeneralizedNewtypeDeriving and stuff like that,
> because a lot of real code uses those. They only seem to be restricted
> to prevent cheeky messing with APIs in ways the authors of the APIs
> didn't want -- but that shouldn't necessarily be a security -- in terms
> of my system -- problem, should it? Ideally I'd very strictly whitelist
> which modules are allowed to be used (e.g. a version of TH that
> doesn't have runIO), and extensions, and then compile any code that
> uses them.

GND can be used to cause a segfault. I don't know if it can be used to cause a more serious exploit, but I would be concerned that it can. Then again, if you're already allowing TH or arbitrary IO, then those are probably much easier places to attack, so it may not matter.

> I'd rather not have to set up a VM just to compile Haskell code safely.
> I'm willing to put some time in to investigate it, but if there's
> already previous work done for this, I'd appreciate any links.

I don't know how well it's documented, but lambdabot has a long history of restricting the Haskell it accepts to make it safe.

Other things to look at: Google Native Client (to see how they approach sandboxing) and geordi, the C++ IRC bot. In the Native Client case they do fancy tricks with segment registers (to control where the sandboxed process can write to memory) and intercept system calls in the outer part of the process. They have the case where they do everything in one process in one address space. You could imagine porting the GHC RTS to run in Native Client (didn't someone start on that?) and then using that to sandbox all your Haskell evaluation.

> At the end of the day, there's always just supporting a subset of
> Haskell using SafeHaskell. I'm just curious about the more general
> case, for use-cases similar to my own.

I think SafeHaskell is a reasonable starting place, but I don't think it gives you a really strong guarantee yet. Everything that is inferred safe probably is (I don't know of any exploits with that part of SafeHaskell). In practice, you'll probably also want to use some trusted packages, but that requires that none of the stuff you trust is exploitable.

I hope that helps,
Jason
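To make the Safe Haskell point concrete, here is a minimal sketch of a module compiled in Safe mode. The `median` function is only a hypothetical stand-in for ordinary pure user code. With the `Safe` language pragma, GHC should refuse imports of modules marked unsafe, such as `System.IO.Unsafe`, which covers the kind of type-system "backdoor" mentioned for lambdabot. It says nothing about what trusted packages do internally.

```haskell
{-# LANGUAGE Safe #-}

import Data.List (sort)

-- Uncommenting the next line should make GHC reject this module,
-- because System.IO.Unsafe cannot be imported in Safe mode:
-- import System.IO.Unsafe (unsafePerformIO)

-- | Upper median of a list; Nothing for the empty list.
median :: [Int] -> Maybe Int
median [] = Nothing
median xs = Just (sort xs !! (length xs `div` 2))
```

The pragma only restricts what the module itself may import and use; resource limits and compile-time hooks still need to be handled separately.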
[Haskell-cafe] Compiling arbitrary Haskell code
Is there a definitive list of things in GHC that are unsafe to _compile_ if I were to take an arbitrary module and compile it?

E.g. off the top of my head, things that might be dangerous:

* TemplateHaskell/QuasiQuotes -- obviously
* Are rules safe?
* #includes -- I presume there's some security risk with including any old file?
* FFI -- speaks for itself

I'm interested in the idea of compiling Haskell code on lpaste.org, for core, rule firings, maybe even TH expansion, etc. When sandboxing code that I'm running, it's really easy if I whitelist what code is available (parsing with HSE, whitelisting imports, extensions). The problem of infinite loops or too much allocation is fairly straightforwardly solved by similar techniques applied in mueval.

SafeHaskell helps a lot here, but suppose that I want to also allow TemplateHaskell, GeneralizedNewtypeDeriving and stuff like that, because a lot of real code uses those. They only seem to be restricted to prevent cheeky messing with APIs in ways the authors of the APIs didn't want -- but that shouldn't necessarily be a security problem in terms of my system, should it? Ideally I'd very strictly whitelist which modules are allowed to be used (e.g. a version of TH that doesn't have runIO), and extensions, and then compile any code that uses them.

I'd rather not have to set up a VM just to compile Haskell code safely. I'm willing to put some time in to investigate it, but if there's already previous work done for this, I'd appreciate any links.

At the end of the day, there's always just supporting a subset of Haskell using SafeHaskell. I'm just curious about the more general case, for use-cases similar to my own.
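For the infinite-loop part of the question, a time limit on evaluation can be sketched with `timeout` from base. This is an assumption about the general shape of such a watchdog, not a description of how mueval itself is implemented; `evalWithin` is a hypothetical helper name.

```haskell
import Control.Exception (evaluate)
import System.Timeout (timeout)

-- | Force a value to weak head normal form, giving up after the
-- given number of microseconds. Nothing means the limit was hit.
evalWithin :: Int -> a -> IO (Maybe a)
evalWithin micros x = timeout micros (evaluate x)
```

A call such as `evalWithin 1000000 (sum [1 .. 100 :: Int])` returns the result wrapped in `Just` when evaluation finishes in time. Some caveats apply: only the outermost constructor is forced, so lazy fields can still hide a loop; a loop that never allocates may not be interruptible by the runtime; and nothing here limits memory use, which needs RTS or operating-system limits.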