On 12/9/19 2:54 PM, Laurent Gautier wrote:
>
> On Mon, Dec 9, 2019 at 05:43, Tomas Kalibera
> <tomas.kalib...@gmail.com> wrote:
>
>     On 12/7/19 10:32 PM, Laurent Gautier wrote:
>>     Thanks for the quick response Tomas.
>>
>>     The same error is indeed happening when trying to have a
>>     zero-length variable name in an environment. The surprising bit
>>     is then "why is this happening during parsing" (that is, why are
>>     variables assigned to an environment)?
>
>     The emitted R error (in the R console) is not a parse (syntax)
>     error, but an error emitted during parsing when the parser tries
>     to intern a name (look it up in a symbol table). The empty string
>     is not allowed as a symbol name, hence the error. In the call
>     "list(''=1)", the empty name is what could eventually become the
>     name of a local variable inside list(), even though not yet
>     during parsing.
>
>
> Thanks Tomas.
>
> I guess this has to do with R expressions being lazily evaluated, and
> names of arguments in a call are also part of the expression. Now the
> puzzling part is why that is part of the parsing at all: I would have
> expected R_ParseVector() to be restricted to parsing... Now it feels
> like R_ParseVector() is performing parsing, and a first level of
> evaluation for expressions that "should never work" (the empty name).

Think of it as an exception in, say, Python. Some failures during
parsing result in an exception (called an error in R and implemented
using a long jump). Any time you are calling into R you can get an
error; out of memory is also signalled as an R error.

>     There is probably some error in how the external code is handling
>     R errors (Fatal error: unable to initialize the JIT, stack
>     smashing, etc.) and possibly also in how R is initialized before
>     calling ParseVector. Probably you would get the same problem when
>     running, say, "stop('myerror')".
>     Please note R errors are implemented as long-jumps, so care has
>     to be taken when calling into R; Writing R Extensions has more
>     details (and section 8 specifically about embedding R). This is
>     unlike parse (syntax) errors, which are signaled via return value
>     to ParseVector().
>
>
> The issue is that the segfault (because of stack smashing, therefore
> because of what I also suspect to be an uncontrolled jump) is
> happening within the execution of R_ParseVector(). I would think that
> an issue with the initialization of R is less likely because the
> project is otherwise used a fair bit and is well covered by automated
> continuous tests.
>
> After looking more into R's gram.c I suspect that an execution context
> is required for R_ParseVector() to work properly (to know where to
> jump in case of error) when the parsing code decides to fail outside
> what it thinks is a syntax error. If that is the case, this would make
> R_ParseVector() function well when called from, say, a C extension to
> an R package, but fail the way I am seeing it fail when called from an
> embedded R.

Yes, contexts are used internally to handle errors. For external use
please see Writing R Extensions, section 6.12.

Best
Tomas

> Best,
>
> Laurent
>
>     Best,
>     Tomas
>
>>
>> We are otherwise aware that the error is not occurring in the R
>> console, but can be traced to a call to R_ParseVector() in R's C
>> API
>> (https://github.com/rpy2/rpy2/blob/master/rpy2/rinterface_lib/_rinterface_capi.py#L509).
>>
>> Our specific setup is calling an embedded R from Python, using
>> the cffi library. An error on our end was the first possibility
>> considered, but the puzzling specificity of the error (as shown
>> below, other parsing errors are handled properly) and the
>> difficulty tracing what is happening in R_ParseVector() made
>> me ask whether someone on this list had a suggestion about the
>> possible issue.
>>
>> ```
>> >>> import rpy2.rinterface as ri
>> >>> ri.initr()
>> >>> e = ri.parse("list(''=1+")
>> ---------------------------------------------------------------------------
>> RParsingError                             Traceback (most recent call last)
>> >>> e = ri.parse("list(''=123")
>> R[write to console]: Error: attempt to use zero-length variable name
>> R[write to console]: Fatal error: unable to initialize the JIT
>> *** stack smashing detected ***: <unknown> terminated
>> ```
>>
>> On Mon, Dec 2, 2019 at 06:37, Tomas Kalibera
>> <tomas.kalib...@gmail.com> wrote:
>>
>>     Dear Laurent,
>>
>>     could you please provide a complete reproducible example where
>>     parsing results in a crash of R? Calling
>>     parse(text="list(''=123") from R works fine for me (gives
>>     Error: attempt to use zero-length variable name).
>>
>>     I don't think the problem you observed could be related to the
>>     memory leak. The leak is on the heap, not the stack.
>>
>>     Zero-length names of elements in a list are allowed. They are
>>     not the same thing as zero-length variables in an environment.
>>     If you try to convert "lst" from your example to an environment,
>>     you would get the error (attempt to use zero-length variable
>>     name).
>>
>>     Best
>>     Tomas
>>
>>
>>     On 11/30/19 11:55 PM, Laurent Gautier wrote:
>>     > Hi again,
>>     >
>>     > Besides R_ParseVector()'s possibly inconsistent behavior, R's
>>     > handling of zero-length named elements does not seem
>>     > consistent either:
>>     >
>>     > ```
>>     > > lst <- list()
>>     > > lst[[""]] <- 1
>>     > > names(lst)
>>     > [1] ""
>>     > > list("" = 1)
>>     > Error: attempt to use zero-length variable name
>>     > ```
>>     >
>>     > Should the parser be made to accept as valid what is otherwise
>>     > possible when using `[[<-`?
>>     >
>>     >
>>     > Best,
>>     >
>>     > Laurent
>>     >
>>     >
>>     >
>>     > On Sat, Nov 30, 2019 at 17:33, Laurent Gautier
>>     > <lgaut...@gmail.com> wrote:
>>     >
>>     >> I found the following code comment in `src/main/gram.c`:
>>     >>
>>     >> ```
>>     >>
>>     >> /* Memory leak
>>     >>
>>     >> yyparse(), as generated by bison, allocates extra space for the parser
>>     >> stack using malloc(). Unfortunately this means that there is a memory
>>     >> leak in case of an R error (long-jump). In principle, we could define
>>     >> yyoverflow() to relocate the parser stacks for bison and allocate say on
>>     >> the R heap, but yyoverflow() is undocumented and somewhat complicated
>>     >> (we would have to replicate some macros from the generated parser here).
>>     >> The same problem exists at least in the Rd and LaTeX parsers in tools.
>>     >> */
>>     >>
>>     >> ```
>>     >>
>>     >> Could this be related to the issue?
>>     >>
>>     >> On Sat, Nov 30, 2019 at 14:04, Laurent Gautier
>>     >> <lgaut...@gmail.com> wrote:
>>     >>
>>     >>> Hi,
>>     >>>
>>     >>> The behavior of
>>     >>> ```
>>     >>> SEXP R_ParseVector(SEXP, int, ParseStatus *, SEXP);
>>     >>> ```
>>     >>> defined in `src/include/R_ext/Parse.h` appears to be
>>     >>> inconsistent depending on the string to be parsed.
>>     >>>
>>     >>> Trying to parse a string such as `"list(''=1+"` sets the
>>     >>> `ParseStatus` to an incomplete-parse error, but trying to
>>     >>> parse `"list(''=123"` will result in R sending a message to
>>     >>> the console (followed by a crash):
>>     >>>
>>     >>> ```
>>     >>> R[write to console]: Error: attempt to use zero-length variable name
>>     >>> R[write to console]: Fatal error: unable to initialize the JIT
>>     >>> *** stack smashing detected ***: <unknown> terminated
>>     >>> ```
>>     >>>
>>     >>> Is there a reason for the difference in behavior, and is
>>     >>> there a workaround?
>>     >>>
>>     >>> Thanks,
>>     >>>
>>     >>>
>>     >>> Laurent

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
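
Tomas's comparison with Python exceptions in this thread can be made concrete with a small toy model. The sketch below is not R's parser and does not call rpy2 or R_ParseVector(); `toy_parse`, `safe_parse`, `RError`, and this `ParseStatus` enum are hypothetical names invented for illustration. It only mimics the two failure channels discussed above: a status value for incomplete input, and an error raised while "interning" an empty name. The order of the checks is an assumption chosen to reproduce the two inputs reported in the thread.

```python
from enum import Enum


class ParseStatus(Enum):
    # Hypothetical stand-in for the status that R_ParseVector() reports.
    OK = "ok"
    INCOMPLETE = "incomplete"


class RError(Exception):
    """Toy stand-in for an R error (a long jump in real R)."""


def toy_parse(text):
    """Toy model only; it does not reflect how R's grammar works."""
    stripped = text.strip()
    # Channel 1: a dangling operator is reported through the status value.
    if stripped.endswith(("+", "-", "*", "/")):
        return ParseStatus.INCOMPLETE
    # Channel 2: an empty argument name raises instead of returning.
    if "''=" in stripped.replace(" ", ""):
        raise RError("attempt to use zero-length variable name")
    # Unbalanced parentheses also go through the status value.
    if stripped.count("(") != stripped.count(")"):
        return ParseStatus.INCOMPLETE
    return ParseStatus.OK


def safe_parse(text):
    """Caller-side guard: establish a handler before calling the parser."""
    try:
        return toy_parse(text), None
    except RError as err:
        return None, str(err)


if __name__ == "__main__":
    for source in ("list(''=1+", "list(''=123", "list(a=1)"):
        print(repr(source), safe_parse(source))
```

The point of `safe_parse` is that the caller has to prepare for both channels. In Python an unhandled exception still unwinds cleanly; with a C long jump and no context established by the caller, there is nowhere valid to jump to, which matches the crash described above. For the real C API, the thread points to Writing R Extensions, section 6.12.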