[pypy-commit] pypy sandbox-lib: expand the interface, still only theoretical
Author: Armin Rigo
Branch: sandbox-lib
Changeset: r89283:be4412e6ecf2
Date: 2016-12-29 19:02 +0100
http://bitbucket.org/pypy/pypy/changeset/be4412e6ecf2/
Log:expand the interface, still only theoretical
diff --git a/rpython/translator/rsandbox/src/part.h b/rpython/translator/rsandbox/src/part.h
--- a/rpython/translator/rsandbox/src/part.h
+++ b/rpython/translator/rsandbox/src/part.h
@@ -1,40 +1,46 @@
-/*** rpython/translator/rsandbox/src/part.h ***/
-
#ifndef _RSANDBOX_H_
#define _RSANDBOX_H_
#ifndef RPY_SANDBOX_EXPORTED
-/* Common definitions when including this file from an external C project */
-
-#include
-#include
-
-#define RPY_SANDBOX_EXPORTED extern
-
-typedef long Signed;
-typedef unsigned long Unsigned;
-
+# define RPY_SANDBOX_EXPORTED extern
#endif
/* ***
- WARNING: Python is not meant to be a safe language. For example,
- think about making a custom code object with a random byte string and
- trying to interpret that. A sandboxed PyPy contains extra safety
- checks that can detect such invalid operations before they cause
- problems. When such a case is detected, THE WHOLE PROCESS IS
+ A direct interface for safely embedding Python inside a larger
+ application written in C (or any other language which can access C
+ libraries).
+
+ For now, there is little support for more complex cases. Notably,
+ any call to functions like open() or any attempt to do 'import' of
+ any non-builtin module will fail. This interface is not meant to
+ "drop in" a large amount of existing Python code. If you are looking
+ for this and are not concerned about security, look at CFFI
+ embedding: http://cffi.readthedocs.org/en/latest/embedding.html .
+ Instead, this interface is meant to run small amounts of untrusted
+ Python code from third-party sources. (It is possible to rebuild a
+ module system on top of this interface, by writing a custom
+ __import__ hook in Python. Similarly, you cannot return arbitrary
+ Python objects to C code, but you can make a Python-side data
+ structure like a list or a dict, and pass integer indices to C.)
+
+ WARNING: Python is originally not meant to be a safe language. For
+ example, think about making a custom code object with a random byte
+ string and trying to interpret that. A sandboxed PyPy contains extra
+ safety checks that can detect such invalid operations before they
+ cause problems. When such a case is detected, THE WHOLE PROCESS IS
ABORTED right now. In the future, there should be a setjmp/longjmp
alternative to this, but the details need a bit of care (e.g. it
would still create memory leaks).
- For now, you have to accept that the process can be aborted if
- given malicious code. Also, running several Python sources from
- different sources in the same process is not recommended---there is
- only one global state: malicious code can easily mangle the state
- of the Python interpreter, influencing subsequent runs. Unless you
- are fine with both issues, you MUST run Python from subprocesses,
- not from your main program.
+ For now, you have to accept that the process can be aborted if given
+ malicious code. Also, running several Python codes from different
+ untrusted sources in the same process is not recommended---there is
+ only one global state: malicious code can easily mangle the state of
+ the PyPy interpreter, influencing subsequent runs. Unless you are
+ fine with both issues, you MUST run Python from subprocesses, not
+ from your main program.
Multi-threading issues: DO NOT USE FROM SEVERAL THREADS AT THE SAME
TIME! You need a lock. If you use subprocesses, they will likely
@@ -150,6 +156,14 @@
*/
RPY_SANDBOX_EXPORTED void rsandbox_result_bytes(char *buf, size_t bufsize);
+/* If the called function returns a tuple of values, then the above
+ 'result' functions work on individual items in the tuple, initially
+ the 0th one. This function changes the current item to
+ 'current_item' if that is within bounds. Returns the total length of
+ the tuple, or -1 if not a tuple.
+*/
+RPY_SANDBOX_EXPORTED int rsandbox_result_tuple_item(int current_item);
+
/* When an exception occurred in rsandbox_open() or rsandbox_call(),
return more information as a 'char *' string. Same rules as
rsandbox_result_bytes(). (Careful, you MUST NOT assume that the
@@ -163,14 +177,38 @@
RPY_SANDBOX_EXPORTED void rsandbox_last_exception(char *buf, size_t bufsize,
int traceback_limit);
+/* Installs a callback inside the module 'mod' under the name 'fnname'.
+ The Python code then sees a function 'fnname()' which invokes back
+ the C function given as the 'callback' parameter. The 'callback' is
+ called with 'data' as sole argument (use NULL if you don't need
+ this).
+
+ When the Python 'fnname()' is
[pypy-commit] pypy sandbox-lib: string => bytes
Author: Armin Rigo
Branch: sandbox-lib
Changeset: r89282:3d02cf9459c7
Date: 2016-12-28 17:59 +0100
http://bitbucket.org/pypy/pypy/changeset/3d02cf9459c7/
Log:string => bytes
diff --git a/rpython/translator/rsandbox/src/part.h b/rpython/translator/rsandbox/src/part.h
--- a/rpython/translator/rsandbox/src/part.h
+++ b/rpython/translator/rsandbox/src/part.h
@@ -20,7 +20,7 @@
/* ***
WARNING: Python is not meant to be a safe language. For example,
- think about making a custom code object with a random string and
+ think about making a custom code object with a random byte string and
trying to interpret that. A sandboxed PyPy contains extra safety
checks that can detect such invalid operations before they cause
problems. When such a case is detected, THE WHOLE PROCESS IS
@@ -72,7 +72,7 @@
rsandbox_module_t *compile_expression(const char *expression)
{
- rsandbox_push_string(expression); // 'expression' is untrusted
+ rsandbox_push_bytes(expression); // 'expression' is untrusted
return rsandbox_open(
"code = compile(args[0], '', 'eval')\n"
"def evaluate(n):\n"
@@ -102,8 +102,8 @@
*/
RPY_SANDBOX_EXPORTED void rsandbox_push_long(long);
RPY_SANDBOX_EXPORTED void rsandbox_push_double(double);
-RPY_SANDBOX_EXPORTED void rsandbox_push_string(const char *);
-RPY_SANDBOX_EXPORTED void rsandbox_push_string_and_size(const char *, size_t);
+RPY_SANDBOX_EXPORTED void rsandbox_push_bytes(const char *);
+RPY_SANDBOX_EXPORTED void rsandbox_push_bytes_and_size(const char *, size_t);
RPY_SANDBOX_EXPORTED void rsandbox_push_none(void);
RPY_SANDBOX_EXPORTED void rsandbox_push_rw_buffer(char *, size_t);
@@ -122,24 +122,25 @@
malicious code returning results like inf, nan, or 1e-323.) */
RPY_SANDBOX_EXPORTED double rsandbox_result_double(void);
-/* Returns the length of the string returned in the previous
- rsandbox_call(). If it was not a string, returns 0. */
-RPY_SANDBOX_EXPORTED size_t rsandbox_result_string_length(void);
+/* Returns the length of the byte string returned in the previous
+ rsandbox_call(). If it was not a byte string, returns 0. */
+RPY_SANDBOX_EXPORTED size_t rsandbox_result_bytes_length(void);
-/* Returns the data in the string. This function always writes an
- additional '\0'. If the string is longer than 'bufsize-1', it is
+/* Returns the data in the byte string. This function always writes an
+ additional '\0'. If the byte string is longer than 'bufsize-1', it is
truncated to 'bufsize-1' characters.
For small human-readable strings you can call
- rsandbox_result_string() with some fixed maximum size. You get a
+ rsandbox_result_bytes() with some fixed maximum size. You get a
regular null-terminated 'char *' string. (If it contains embedded
'\0', it will appear truncated; if the Python function did not
- return a string at all, it will be completely empty; but anyway
+ return a byte string at all, it will be completely empty; but anyway
you MUST be ready to handle any malformed string at all.)
For strings of larger sizes or strings that can meaningfully
- contain embedded '\0', you should allocate a 'buf' of size
- 'rsandbox_result_string_length() + 1'.
+ contain embedded '\0', you should compute 'bufsize =
+ rsandbox_result_bytes_length() + 1' and allocate a buffer of this
+ length.
To repeat: Be careful when reading strings from Python! They can
contain any character, so be sure to escape them correctly (or
@@ -147,17 +148,20 @@
further. Malicious code can return any string. Your code must be
ready for anything. Err on the side of caution.
*/
-RPY_SANDBOX_EXPORTED void rsandbox_result_string(char *buf, size_t bufsize);
+RPY_SANDBOX_EXPORTED void rsandbox_result_bytes(char *buf, size_t bufsize);
/* When an exception occurred in rsandbox_open() or rsandbox_call(),
- return more information as a string. Same rules as
- rsandbox_result_string(). (Careful, you MUST NOT assume that the
+ return more information as a 'char *' string. Same rules as
+ rsandbox_result_bytes(). (Careful, you MUST NOT assume that the
string is well-formed: malicious code can make it contain anything.
If you are copying it to a web page, for example, then a good idea
is to replace any character not in a whitelist with '?'.)
+
+ If 'traceback_limit' is greater than zero, the output is a multiline
+ traceback like in standard Python, with up to 'traceback_limit' levels.
*/
RPY_SANDBOX_EXPORTED void rsandbox_last_exception(char *buf, size_t bufsize,
- int include_traceback);
+ int traceback_limit);
//
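The header comments above ask the caller to treat every byte string coming back from the sandbox as hostile: copies are truncated to 'bufsize-1' bytes with a trailing '\0', and text shown to users should have every character outside a whitelist replaced with '?'. The following is a minimal sketch of those two rules in plain C. It does not call any rsandbox function (the interface is still only theoretical); 'copy_truncated' and 'sanitize_whitelist' are hypothetical helper names, and the whitelist itself is an assumption chosen for illustration.

```c
#include <string.h>

/* Hypothetical helper: copy at most bufsize-1 bytes from 'src' and
   always write a terminating '\0', mirroring the truncation rule that
   the header comment describes for rsandbox_result_bytes(). */
static void copy_truncated(char *buf, size_t bufsize,
                           const char *src, size_t srclen)
{
    size_t n;
    if (bufsize == 0)
        return;                     /* no room even for the '\0' */
    n = srclen < bufsize - 1 ? srclen : bufsize - 1;
    memcpy(buf, src, n);            /* embedded '\0' bytes are copied too */
    buf[n] = '\0';
}

/* Hypothetical helper: replace every character that is not in a
   conservative whitelist with '?'.  The whitelist below is only an
   example; pick one that suits the place where the text ends up. */
static void sanitize_whitelist(char *s)
{
    static const char allowed[] =
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789 .,:;_-()[]'\n";
    for (; *s != '\0'; s++) {
        if (strchr(allowed, *s) == NULL)
            *s = '?';
    }
}
```

A caller would typically copy the untrusted bytes into a local buffer with 'copy_truncated(buf, sizeof buf, data, len)' and then run 'sanitize_whitelist(buf)' before logging the text or embedding it in a web page. For data that may legitimately contain '\0', the buffer should instead be sized from the reported length plus one, as the comment above recommends.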
___
pypy-commit mailing list
[email protected]
https
[pypy-commit] pypy default: document branch
Author: Armin Rigo
Branch:
Changeset: r89286:999ff3b3f9a4
Date: 2017-01-01 11:31 +0100
http://bitbucket.org/pypy/pypy/changeset/999ff3b3f9a4/
Log:document branch
diff --git a/pypy/doc/whatsnew-head.rst b/pypy/doc/whatsnew-head.rst
--- a/pypy/doc/whatsnew-head.rst
+++ b/pypy/doc/whatsnew-head.rst
@@ -76,3 +76,8 @@
PyMemoryViewObject with a PyBuffer attached so that the call to
``PyMemoryView_GET_BUFFER`` does not leak a PyBuffer-sized piece of memory.
Properly call ``bf_releasebuffer`` when not ``NULL``.
+
+.. branch: boehm-rawrefcount
+
+Support translations of cpyext with the Boehm GC (for special cases like
+revdb).
[pypy-commit] pypy default: hg merge boehm-rawrefcount
Author: Armin Rigo
Branch:
Changeset: r89284:a3aedbe6023d
Date: 2017-01-01 11:27 +0100
http://bitbucket.org/pypy/pypy/changeset/a3aedbe6023d/
Log:hg merge boehm-rawrefcount
A branch to add minimal support for rawrefcount in Boehm
translations. This is needed by revdb.
diff --git a/pypy/module/cpyext/state.py b/pypy/module/cpyext/state.py
--- a/pypy/module/cpyext/state.py
+++ b/pypy/module/cpyext/state.py
@@ -1,7 +1,7 @@
from rpython.rlib.objectmodel import we_are_translated
from rpython.rtyper.lltypesystem import rffi, lltype
from pypy.interpreter.error import OperationError, oefmt
-from pypy.interpreter.executioncontext import AsyncAction
+from pypy.interpreter import executioncontext
from rpython.rtyper.lltypesystem import lltype
from rpython.rtyper.annlowlevel import llhelper
from rpython.rlib.rdynload import DLLHANDLE
@@ -14,8 +14,9 @@
         self.reset()
         self.programname = lltype.nullptr(rffi.CCHARP.TO)
         self.version = lltype.nullptr(rffi.CCHARP.TO)
-        pyobj_dealloc_action = PyObjDeallocAction(space)
-        self.dealloc_trigger = lambda: pyobj_dealloc_action.fire()
+        if space.config.translation.gc != "boehm":
+            pyobj_dealloc_action = PyObjDeallocAction(space)
+            self.dealloc_trigger = lambda: pyobj_dealloc_action.fire()
     def reset(self):
         from pypy.module.cpyext.modsupport import PyMethodDef
@@ -67,6 +68,11 @@
             state.api_lib = str(api.build_bridge(self.space))
         else:
             api.setup_library(self.space)
+            #
+            if self.space.config.translation.gc == "boehm":
+                action = BoehmPyObjDeallocAction(self.space)
+                self.space.actionflag.register_periodic_action(action,
+                    use_bytecode_counter=True)
     def install_dll(self, eci):
         """NOT_RPYTHON
@@ -84,8 +90,10 @@
         from pypy.module.cpyext.api import init_static_data_translated
         if we_are_translated():
-            rawrefcount.init(llhelper(rawrefcount.RAWREFCOUNT_DEALLOC_TRIGGER,
-                                      self.dealloc_trigger))
+            if space.config.translation.gc != "boehm":
+                rawrefcount.init(
+                    llhelper(rawrefcount.RAWREFCOUNT_DEALLOC_TRIGGER,
+                    self.dealloc_trigger))
             init_static_data_translated(space)
         setup_new_method_def(space)
@@ -143,15 +151,23 @@
         self.extensions[path] = w_copy
-class PyObjDeallocAction(AsyncAction):
+def _rawrefcount_perform(space):
+    from pypy.module.cpyext.pyobject import PyObject, decref
+    while True:
+        py_obj = rawrefcount.next_dead(PyObject)
+        if not py_obj:
+            break
+        decref(space, py_obj)
+
+class PyObjDeallocAction(executioncontext.AsyncAction):
     """An action that invokes _Py_Dealloc() on the dying PyObjects.
     """
+    def perform(self, executioncontext, frame):
+        _rawrefcount_perform(self.space)
+class BoehmPyObjDeallocAction(executioncontext.PeriodicAsyncAction):
+    # This variant is used with Boehm, which doesn't have the explicit
+    # callback. Instead we must periodically check ourselves.
     def perform(self, executioncontext, frame):
-        from pypy.module.cpyext.pyobject import PyObject, decref
-
-        while True:
-            py_obj = rawrefcount.next_dead(PyObject)
-            if not py_obj:
-                break
-            decref(self.space, py_obj)
+        if we_are_translated():
+            _rawrefcount_perform(self.space)
diff --git a/rpython/rlib/rawrefcount.py b/rpython/rlib/rawrefcount.py
--- a/rpython/rlib/rawrefcount.py
+++ b/rpython/rlib/rawrefcount.py
@@ -4,10 +4,11 @@
# This is meant for pypy's cpyext module, but is a generally
# useful interface over our GC. XXX "pypy" should be removed here
#
-import sys, weakref
-from rpython.rtyper.lltypesystem import lltype, llmemory
+import sys, weakref, py
+from rpython.rtyper.lltypesystem import lltype, llmemory, rffi
from rpython.rlib.objectmodel import we_are_translated, specialize, not_rpython
from rpython.rtyper.extregistry import ExtRegistryEntry
+from rpython.translator.tool.cbuild import ExternalCompilationInfo
from rpython.rlib import rgc
@@ -245,6 +246,11 @@
         v_p, v_ob = hop.inputargs(*hop.args_r)
         hop.exception_cannot_occur()
         hop.genop(name, [_unspec_p(hop, v_p), _unspec_ob(hop, v_ob)])
+        #
+        if hop.rtyper.annotator.translator.config.translation.gc == "boehm":
+            c_func = hop.inputconst(lltype.typeOf(func_boehm_eci),
+                                    func_boehm_eci)
+            hop.genop('direct_call', [c_func])
class Entry(ExtRegistryEntry):
@@ -297,3 +303,10 @@
v_ob = hop.genop('gc_rawrefcount_next_dead', [],
resulttype = llmemory.Address)
return _spec_ob(hop, v_ob)
+
+src_dir = py.path.local(__file__).dirpath() / 'src'
+boehm_eci = ExternalCo
[pypy-commit] pypy default: Allow --gc=boehm with the cpyext module.
Author: Armin Rigo
Branch:
Changeset: r89285:257848776fca
Date: 2017-01-01 11:30 +0100
http://bitbucket.org/pypy/pypy/changeset/257848776fca/
Log:Allow --gc=boehm with the cpyext module.
diff --git a/pypy/goal/targetpypystandalone.py b/pypy/goal/targetpypystandalone.py
--- a/pypy/goal/targetpypystandalone.py
+++ b/pypy/goal/targetpypystandalone.py
@@ -305,9 +305,9 @@
         config.objspace.lonepycfiles = False
         if config.objspace.usemodules.cpyext:
-            if config.translation.gc != 'incminimark':
+            if config.translation.gc not in ('incminimark', 'boehm'):
                 raise Exception("The 'cpyext' module requires the 'incminimark'"
-                                " GC. You need either 'targetpypystandalone.py"
+                                " 'boehm' GC. You need either 'targetpypystandalone.py"
                                 " --withoutmod-cpyext' or '--gc=incminimark'")
         config.translating = True
