On 27 June 2018 at 17:04, Fiedler Roman <roman.fied...@ait.ac.at> wrote:
> Hello List,
>
> Context: we are conducting machine learning experiments that generate a
> kind of nested decision tree. As the tree includes specific decision
> elements (which require custom code to evaluate), we decided to store the
> decision tree (the result of the analysis) as generated Python code. The
> decision tree can thus be transferred to sensor nodes (detectors), which
> then filter data according to the decision tree when executing the given
> code.
>
> Tracking down a crash when executing that generated code, we arrived at the
> following simplified reproducer, which crashes the interpreter (on both
> Python 2 and 3) while loading the code, before execution even starts:
>
> #!/usr/bin/python2 -BEsStt
> A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(None)])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])
>
> The error message is:
>
> s_push: parser stack overflow
> MemoryError
>
> Despite the machine having 16 GB of RAM, the code cannot be loaded.
> Splitting the expression into two lines using an intermediate variable is
> the current workaround to get it running again after manual adaptation.
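To make the intermediate-variable workaround mentioned at the end of the
quoted report concrete, here is a minimal sketch; the trivial A class and the
depth of 50 are illustrative stand-ins rather than the real generated code:

# Build the nested structure one shallow statement at a time, via an
# intermediate variable, so the parser never sees the whole tree as a
# single deeply nested expression.
class A:
    def __init__(self, arg):
        self.arg = arg

node = A(None)
for _ in range(50):          # depth chosen for illustration only
    node = A([node])

Each assignment nests only one call deep, so the depth of the finished tree
no longer affects the parser.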
This seems like it may indicate a potential problem in the pgen2 parser
generator, since the compilation is failing at the original parse step, but
checking the largest version of this that CPython can parse on my machine
gives a syntax tree of only ~77kB:

>>> tree = parser.expr("A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(None)])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])")
>>> sys.getsizeof(tree)
77965

Attempting to print that hints more closely at the potential problem:

>>> tree.tolist()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RecursionError: maximum recursion depth exceeded while getting the repr of an object

As far as I'm aware, the CPython parser is using the actual C stack for
recursion, and is hence throwing MemoryError because it ran out of stack
space to recurse into, not because it ran out of memory in general
(RecursionError would be a more accurate exception).

Trying your original example in PyPy (which uses a different parser
implementation) suggests you may want to try using that as your execution
target before resorting to switching languages entirely:

>>>> tree2 = parser.expr("A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A([A(None)])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])")
>>>> len(tree2.tolist())
5

Alternatively, you could explore mimicking the way that scikit-learn saves
its trained models (which I believe is a variation on "use pickle", but I've
never actually gone and checked for sure).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
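As a rough sketch of the pickle-based alternative suggested above, assuming
the decision tree is an ordinary picklable Python object (the A class, the
depth of 100, and the file name tree.pickle are illustrative assumptions,
not part of the original report):

# Serialize the tree object itself instead of emitting it as source code,
# so loading it never involves the parser at all. Names and depth are
# illustrative; a nesting this deep would overflow the parser stack
# described above if written as a single source expression.
import pickle

class A:
    def __init__(self, arg):
        self.arg = arg

tree = A(None)
for _ in range(100):
    tree = A([tree])

with open("tree.pickle", "wb") as f:
    pickle.dump(tree, f)

with open("tree.pickle", "rb") as f:
    restored = pickle.load(f)   # rebuilt without going through the parser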