On Mon, Dec 9, 2019 at 05:43, Tomas Kalibera <tomas.kalib...@gmail.com> wrote:
> On 12/7/19 10:32 PM, Laurent Gautier wrote:
> > Thanks for the quick response Tomas.
> >
> > The same error is indeed happening when trying to have a zero-length
> > variable name in an environment. The surprising bit is then "why is this
> > happening during parsing" (that is, why are variables assigned to an
> > environment)?
>
> The emitted R error (in the R console) is not a parse (syntax) error, but
> an error emitted during parsing when the parser tries to intern a name,
> that is, look it up in a symbol table. The empty string is not allowed as
> a symbol name, hence the error. In the call "list(''=1)", the empty name
> is what could eventually become the name of a local variable inside
> list(), even though not yet during parsing.

Thanks Tomas. I guess this has to do with R expressions being lazily
evaluated, and names of arguments in a call are also part of the
expression. Now the puzzling part is why that is part of the parsing at
all: I would have expected R_ParseVector() to be restricted to parsing...
Now it feels like R_ParseVector() is performing parsing, plus a first level
of evaluation for expressions that "should never work" (the empty name).

> There is probably some error in how the external code is handling R errors
> (Fatal error: unable to initialize the JIT, stack smashing, etc) and
> possibly also how R is initialized before calling ParseVector. Probably you
> would get the same problem when running say "stop('myerror')". Please note
> R errors are implemented as long-jumps, so care has to be taken when
> calling into R. Writing R Extensions has more details (and section 8
> specifically about embedding R). This is unlike parse (syntax) errors
> signaled via return value to ParseVector().

The issue is that the segfault (because of stack smashing, and therefore
because of what I also suspect to be an uncontrolled jump) is happening
within the execution of R_ParseVector().

I would think that an issue with the initialization of R is less likely
because the project is otherwise used a fair bit and is well covered by
automated continuous tests. After looking more into R's gram.c, I suspect
that an execution context is required for R_ParseVector() to work properly
(to know where to jump in case of error) when the parsing code decides to
fail outside what it thinks is a syntax error. If that is the case, this
would make R_ParseVector() function well when called from, say, a
C extension to an R package, but fail the way I am seeing it fail when
called from an embedded R.

Best,

Laurent

> Best,
> Tomas
>
> > We are otherwise aware that the error is not occurring in the R console,
> > but can be traced to a call to R_ParseVector() in R's C API
> > (https://github.com/rpy2/rpy2/blob/master/rpy2/rinterface_lib/_rinterface_capi.py#L509).
> >
> > Our specific setup is calling an embedded R from Python, using the cffi
> > library. An error on our end was the first possibility considered, but
> > the puzzling specificity of the error (as shown below, other parsing
> > errors are handled properly) and the difficulty tracing what is
> > happening in R_ParseVector() made me ask whether someone on this list
> > had a suggestion about the possible issue.
> >
> > ```
> > >>> import rpy2.rinterface as ri
> > >>> ri.initr()
> > >>> e = ri.parse("list(''=1+")
> > ---------------------------------------------------------------------------
> > RParsingError                             Traceback (most recent call last)
> > >>> e = ri.parse("list(''=123")
> > R[write to console]: Error: attempt to use zero-length variable name
> > R[write to console]: Fatal error: unable to initialize the JIT
> >
> > *** stack smashing detected ***: <unknown> terminated
> > ```
> >
> > On Mon, Dec 2, 2019 at 06:37, Tomas Kalibera <tomas.kalib...@gmail.com> wrote:
> >
> >> Dear Laurent,
> >>
> >> could you please provide a complete reproducible example where parsing
> >> results in a crash of R?
> >> Calling parse(text="list(''=123") from R works fine for me (gives
> >> Error: attempt to use zero-length variable name).
> >>
> >> I don't think the problem you observed could be related to the memory
> >> leak. The leak is on the heap, not the stack.
> >>
> >> Zero-length names of elements in a list are allowed. They are not the
> >> same thing as zero-length variables in an environment. If you try to
> >> convert "lst" from your example to an environment, you would get the
> >> error (attempt to use zero-length variable name).
> >>
> >> Best
> >> Tomas
> >>
> >> On 11/30/19 11:55 PM, Laurent Gautier wrote:
> >> > Hi again,
> >> >
> >> > Besides R_ParseVector()'s possibly inconsistent behavior, R's handling
> >> > of zero-length named elements does not seem consistent either:
> >> >
> >> > ```
> >> > > lst <- list()
> >> > > lst[[""]] <- 1
> >> > > names(lst)
> >> > [1] ""
> >> > > list("" = 1)
> >> > Error: attempt to use zero-length variable name
> >> > ```
> >> >
> >> > Should the parser be made to accept as valid what is otherwise possible
> >> > when using `[[<-`?
> >> >
> >> > Best,
> >> >
> >> > Laurent
> >> >
> >> > On Sat, Nov 30, 2019 at 17:33, Laurent Gautier <lgaut...@gmail.com> wrote:
> >> >
> >> >> I found the following code comment in `src/main/gram.c`:
> >> >>
> >> >> ```
> >> >> /* Memory leak
> >> >>
> >> >> yyparse(), as generated by bison, allocates extra space for the parser
> >> >> stack using malloc(). Unfortunately this means that there is a memory
> >> >> leak in case of an R error (long-jump). In principle, we could define
> >> >> yyoverflow() to relocate the parser stacks for bison and allocate say on
> >> >> the R heap, but yyoverflow() is undocumented and somewhat complicated
> >> >> (we would have to replicate some macros from the generated parser here).
> >> >> The same problem exists at least in the Rd and LaTeX parsers in tools.
> >> >> */
> >> >> ```
> >> >>
> >> >> Could this be related to the issue?
> >> >>
> >> >> On Sat, Nov 30, 2019 at 14:04, Laurent Gautier <lgaut...@gmail.com> wrote:
> >> >>
> >> >>> Hi,
> >> >>>
> >> >>> The behavior of
> >> >>> ```
> >> >>> SEXP R_ParseVector(SEXP, int, ParseStatus *, SEXP);
> >> >>> ```
> >> >>> defined in `src/include/R_ext/Parse.h` appears to be inconsistent
> >> >>> depending on the string to be parsed.
> >> >>>
> >> >>> Trying to parse a string such as `"list(''=1+"` sets the
> >> >>> `ParseStatus` to incomplete parsing error, but trying to parse
> >> >>> `"list(''=123"` results in R sending a message to the console
> >> >>> (followed by a crash):
> >> >>>
> >> >>> ```
> >> >>> R[write to console]: Error: attempt to use zero-length variable name
> >> >>> R[write to console]: Fatal error: unable to initialize the JIT
> >> >>> *** stack smashing detected ***: <unknown> terminated
> >> >>> ```
> >> >>>
> >> >>> Is there a reason for the difference in behavior, and is there a
> >> >>> workaround?
> >> >>>
> >> >>> Thanks,
> >> >>>
> >> >>> Laurent
> >> >
> >> > [[alternative HTML version deleted]]
> >> >
> >> > ______________________________________________
> >> > R-devel@r-project.org mailing list
> >> > https://stat.ethz.ch/mailman/listinfo/r-devel
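
The point made in the thread that R errors are implemented as long-jumps,
and the suspicion that R_ParseVector() needs a context that knows where to
jump, can be illustrated with a minimal sketch in plain C. Nothing below is
R's API: toy_parse(), toy_raise(), toy_parse_guarded() and the toy_status
values are hypothetical names, and the single global jmp_buf is only an
assumed stand-in for an execution context. The toy parser reports
unbalanced parentheses through its return value but raises on an empty
quoted name with longjmp(), so the caller has to call setjmp() first.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, not R's API: a status enum and one global
   jump target playing the role of an execution context. */
typedef enum { TOY_OK, TOY_INCOMPLETE, TOY_RAISED } toy_status;

static jmp_buf toy_context;          /* where a raised error jumps to   */
static int toy_context_active = 0;   /* has a caller set up the target? */

/* Raise an error the way a long-jump based runtime would: control does
   not return to the caller of toy_raise(). */
static void toy_raise(void) {
    if (toy_context_active)
        longjmp(toy_context, 1);
    /* Jumping without a valid target is undefined behaviour, so the
       sketch aborts instead of demonstrating a crash. */
    abort();
}

/* Toy "parser": unbalanced parentheses are reported through the return
   value, while an empty quoted name before '=' is raised via longjmp. */
static toy_status toy_parse(const char *text) {
    int depth = 0;
    assert(text != NULL);
    if (strstr(text, "''=") != NULL)
        toy_raise();                 /* does not return */
    for (const char *p = text; *p != '\0'; ++p) {
        if (*p == '(') ++depth;
        else if (*p == ')') --depth;
    }
    return depth > 0 ? TOY_INCOMPLETE : TOY_OK;
}

/* Establish the jump target before calling into the parser, so that a
   raised error lands here and becomes an ordinary status code. */
toy_status toy_parse_guarded(const char *text) {
    toy_status status;
    if (setjmp(toy_context) != 0) {
        toy_context_active = 0;
        return TOY_RAISED;           /* arrived here via longjmp */
    }
    toy_context_active = 1;
    status = toy_parse(text);
    toy_context_active = 0;
    return status;
}
```

With this guard, toy_parse_guarded("f(x") returns TOY_INCOMPLETE and
toy_parse_guarded("f(''=1)") returns TOY_RAISED instead of jumping to an
undefined location. How an embedding application should guard real calls
into R is covered by Writing R Extensions, not by this sketch.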