Re: [Rd] R in sandbox/jail (long question)

2010-05-20 Thread Murray Stokely
On Tue, May 18, 2010 at 7:38 PM, Assaf Gordon <assafgor...@gmail.com> wrote:
> I've found this old thread:
> http://r.789695.n4.nabble.com/R-in-a-sandbox-jail-td921991.html
> But for technical reasons I'd prefer not to set up a chroot jail.


I would also point out that the state of the art in the operating
system community has moved on significantly since 1982 when chroot was
added.  BSD Jails, Solaris Zones/Containers, SELinux, etc. all provide
much more control over the system calls, network connections, and file
and device access granted to applications in different jails/zones.

These operating system capabilities address some of the very problems
you are trying to solve by painstakingly modifying R, but in a more
secure and configurable manner.

 - Murray

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R in sandbox/jail (long question)

2010-05-19 Thread Thomas Lumley



I think you'll find it's a bit more complicated than that.

Firstly, R --sandbox is pretty crippled: as far as I can tell it can't
load packages, because package loading uses gzfile().  This would include the
'stats' package.  If you could load packages, you would need to sanitize all
of them, since they may contain functions that talk directly to the
operating system (for example, the 'foreign' package does).

Also, most functions called by .C() and many called by .Call() can be made to 
overwrite memory they don't own, by passing invalid arguments, so the sandbox 
would only protect you from mistakes by the user and from incompetent attacks, 
but not from competent attacks.

-thomas

On Tue, 18 May 2010, Assaf Gordon wrote:

> Hello,
>
> I have a setup similar to Rweb ( http://www.math.montana.edu/Rweb/ ):
> I get R scripts from users and need to execute them in a safe manner (they
> are executed automatically, without human inspection).
>
> I would like to limit the user's script to reading from STDIN and writing to
> STDOUT/STDERR.
> Specifically, I want to prevent any kind of interaction with the underlying
> operating system (files, sockets, system(), etc.).
>
> I've found this old thread:
> http://r.789695.n4.nabble.com/R-in-a-sandbox-jail-td921991.html
> But for technical reasons I'd prefer not to set up a chroot jail.
>
> I have written a patch that adds a --sandbox parameter.
> When this parameter is used, the user's script can't create any kind of
> connection object or run system().
>
> My plan is to run R like this:
>
> cat INPUT | R --vanilla --slave --sandbox --file=SCRIPT.R > OUTPUT
>
> Where 'INPUT' is my chosen input and 'SCRIPT.R' is the script submitted by
> the user.
> If the script tries to create a connection or run a disabled function, an
> error is printed.
>
> This is the patch:
> http://cancan.cshl.edu/labmembers/gordon/files/R_2.11.0_sandbox.patch
>
> So my questions are:
> 1. Would you be willing to consider this feature for inclusion?
> 2. Are there any other 'dangerous' functions I need to intercept
>    (.Internal perhaps?)
>
> All comments and suggestions are welcome,
> thanks,
>  -gordon




Thomas Lumley              Assoc. Professor, Biostatistics
tlum...@u.washington.edu   University of Washington, Seattle



Re: [Rd] R in sandbox/jail (long question)

2010-05-19 Thread Matt Shotwell
How about some computing on the language, something like this:
  
exprs <- parse("SCRIPT.R")
invalids <- c(".Internal", ".Primitive")
if( any( invalids %in% all.names(exprs) ) )
   stop("sandbox check failed")
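
A minimal sketch of the same name check applied to inline text, so it can be tried without a script file. The helper name `has_invalid_names` and the sample snippets are assumptions made for illustration; they are not part of the patch or of any package. Since `all.names()` reports symbols only, a function that is named through a character string is not seen by this check, which the last line shows.

```r
# Names that a submitted script is not allowed to mention.
invalids <- c(".Internal", ".Primitive")

# Hypothetical helper: TRUE if the code in 'text' mentions a forbidden name.
# The code is only parsed here, never evaluated.
has_invalid_names <- function(text) {
  exprs <- parse(text = text)
  any(invalids %in% all.names(exprs))
}

has_invalid_names("x <- .Internal(date())")   # a direct call is flagged
has_invalid_names("x <- mean(c(1, 2, 3))")    # ordinary code passes
# all.names() lists symbols only, so a name given as a string is missed:
has_invalid_names('f <- get(".Internal")')
```

A name-based check of this kind is therefore best read as a first filter rather than a complete barrier.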


I believe this would prevent evaluating any direct calls to '.Primitive'
and '.Internal'. Of course, you could extend the 'invalids' vector to
include any names. If you want to consider arguments to calls (e.g. the
arguments to 'file' or 'library') or something more sophisticated, check
out the functions in the codetools package, something like this:


library(codetools)

walkerCall <- function(e, w) {
  # 'e' is the call being visited; check it before walking its parts,
  # so that a top-level call is inspected too
  fn <- e[[1]]

  #stop .Internal calls
  if(identical(fn, as.name(".Internal")))
    stop("invalid '.Internal()' call")

  #restrict file to STDIN
  if(identical(fn, as.name("file"))) {
    mc <- match.call(file, e)
    if(!identical(mc$description, "stdin"))
      stop("'file()' only valid with 'description=\"stdin\"'")
  }

  #walk the function position and every argument
  for( ee in as.list(e)) {
    if(!missing(ee))
      walkCode(ee, w)
  }
}

walker <- makeCodeWalker(call=walkerCall, leaf=function(e,w){})
exprs <- parse("SCRIPT.R")
for( expr in exprs )
  walkCode(expr, walker)
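
A sketch of how a walker of this kind might be exercised on inline text rather than a file. The helper names `check_script_text` and `is_allowed` are hypothetical, and the handler here inspects the call it is handed before recursing into its parts. It assumes the codetools package is installed (it ships with R as a recommended package).

```r
library(codetools)

# Hypothetical helper: walk every call in 'text', stop on a forbidden one.
check_script_text <- function(text) {
  walkerCall <- function(e, w) {
    fn <- e[[1]]
    if (identical(fn, as.name(".Internal")))
      stop("invalid '.Internal()' call")
    if (identical(fn, as.name("file"))) {
      # Match the arguments against the formals of base::file().
      mc <- match.call(file, e)
      if (!identical(mc$description, "stdin"))
        stop("'file()' only valid with description=\"stdin\"")
    }
    # Recurse into the function position and every argument.
    for (ee in as.list(e))
      if (!missing(ee)) walkCode(ee, w)
  }
  walker <- makeCodeWalker(call = walkerCall, leaf = function(e, w) NULL)
  for (expr in parse(text = text)) walkCode(expr, walker)
  invisible(TRUE)
}

# Report TRUE/FALSE instead of stopping, for convenience.
is_allowed <- function(text)
  !inherits(try(check_script_text(text), silent = TRUE), "try-error")

is_allowed('x <- readLines(file("stdin"))')        # reading STDIN passes
is_allowed('x <- readLines(file("/etc/passwd"))')  # any other file is rejected
is_allowed("f <- function() .Internal(date())")    # nested call is rejected
```

The walk covers calls that appear literally in the parsed code; default argument values in function definitions and functions reached through strings are not visited in this sketch.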

I'm a little surprised there isn't a 'sandbox' package or something
similar to this. A reverse depends check on the codetools package
indicates there is not. However, I believe there is some demand for it.

Matt Shotwell
http://biostatmatt.com




Re: [Rd] R in sandbox/jail (long question)

2010-05-18 Thread Duncan Murdoch

On 18/05/2010 10:38 PM, Assaf Gordon wrote:

> Hello,
>
> I have a setup similar to Rweb ( http://www.math.montana.edu/Rweb/ ):
> I get R scripts from users and need to execute them in a safe manner (they
> are executed automatically, without human inspection).
>
> I would like to limit the user's script to reading from STDIN and writing to
> STDOUT/STDERR.
> Specifically, I want to prevent any kind of interaction with the underlying
> operating system (files, sockets, system(), etc.).
>
> I've found this old thread:
> http://r.789695.n4.nabble.com/R-in-a-sandbox-jail-td921991.html
> But for technical reasons I'd prefer not to set up a chroot jail.
>
> I have written a patch that adds a --sandbox parameter.
> When this parameter is used, the user's script can't create any kind of
> connection object or run system().
  


That sounds too restrictive.  R uses connections internally in various 
places, with no reference to the file system.  It also uses them when 
reading its own files.  So if you stop a user from creating connections, 
you'll somehow need to distinguish between user-created ones and 
internally necessary ones:  not easy.
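
A small illustration of connections that involve no file on disk. This is only a sketch using base R functions; the variable names are arbitrary.

```r
# The standard streams are themselves connection objects.
inherits(stdin(), "connection")
inherits(stdout(), "connection")

# A text connection reads from a character vector held in memory.
tc <- textConnection(c("alpha", "beta"))
lines <- readLines(tc)
close(tc)

# capture.output() collects printed output through a text connection
# when no file is given.
out <- capture.output(print(1:3))
```

A blanket ban on creating connections would also affect helpers like these, which is one reason the user-created versus internal distinction matters.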



> My plan is to run R like this:
>
> cat INPUT | R --vanilla --slave --sandbox --file=SCRIPT.R > OUTPUT
>
> Where 'INPUT' is my chosen input and 'SCRIPT.R' is the script submitted by
> the user.
> If the script tries to create a connection or run a disabled function, an
> error is printed.
>
> This is the patch:
> http://cancan.cshl.edu/labmembers/gordon/files/R_2.11.0_sandbox.patch
>
> So my questions are:
> 1. Would you be willing to consider this feature for inclusion?
> 2. Are there any other 'dangerous' functions I need to intercept
>    (.Internal perhaps?)
  


.Internal is needed by tons of base functions.  So again, you'll need to 
distinguish where the call is coming from, and that's not easy.
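
One rough way to see how widespread this is: scan the bodies of the functions in the base environment for the symbol `.Internal`. This is a sketch only; it looks at function bodies, not default arguments, and the variable names are arbitrary.

```r
# All objects in the base environment, keeping only the functions.
base_objs <- mget(ls(baseenv(), all.names = TRUE), envir = baseenv())
base_funs <- Filter(is.function, base_objs)

# Does the body of a closure mention .Internal? Primitives have no R body.
uses_internal <- vapply(
  base_funs,
  function(f) !is.primitive(f) && ".Internal" %in% all.names(body(f)),
  logical(1)
)

sum(uses_internal)                 # how many base functions call .Internal
head(names(which(uses_internal)))  # a few of them
uses_internal[["paste"]]           # paste() is one example
```

Any sandbox that disables `.Internal` outright would break every function counted here, so the check would have to know whether the call originates in user code.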


Duncan Murdoch

> All comments and suggestions are welcomed,
> thanks,
>    -gordon
