I was trying to set up an interactive R prompt that shows the current working directory. I reviewed the R sources 'main.c' and 'options.c', and saw that a 20-character buffer is used when in Browse debugging mode, but that no other validation is done on the length of the prompt option.
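To make concrete the kind of validation that seems to be missing, here is a minimal C sketch of a bounded length check. It is not taken from the R sources: MAX_PROMPT_LEN, bounded_strlen, prompt_length_ok and safe_prompt are hypothetical names, and the limit of 256 bytes is an arbitrary value chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical limit, in bytes; not a value taken from the R sources. */
#define MAX_PROMPT_LEN 256

/* Count bytes up to the terminating NUL, but never look at more than
 * `max` bytes, so the cost stays bounded for a very long string. */
static size_t bounded_strlen(const char *s, size_t max)
{
    size_t n = 0;
    while (n < max && s[n] != '\0')
        n++;
    return n;
}

/* Return 1 if `prompt` is non-NULL and at most MAX_PROMPT_LEN bytes
 * long, 0 otherwise. */
static int prompt_length_ok(const char *prompt)
{
    if (prompt == NULL)
        return 0;
    return bounded_strlen(prompt, MAX_PROMPT_LEN + 1) <= MAX_PROMPT_LEN;
}

/* Return `prompt` if it passes the check, otherwise `fallback`. */
static const char *safe_prompt(const char *prompt, const char *fallback)
{
    return prompt_length_ok(prompt) ? prompt : fallback;
}
```

A caller could hand the result of safe_prompt(value, "> ") to the line reader instead of the raw option value. Whether an oversized prompt should be replaced silently, truncated, or rejected with an error when the option is set is a separate design choice.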
The following hangs R, or takes an extremely long time to return:

    # R --vanilla
    big <- paste(sample(LETTERS, size = 1e7, replace = TRUE), collapse = "")
    options(prompt = big)

Running R under gdb and interrupting to get backtraces shows that 'pushReadLine' in 'unix/sys-std.c' results in a chain of libreadline calls, including, in my case at least, UTF-8 handling and a lot of __strlen_avx2 activity.

'R_PromptString' in 'main.c' should check that the prompt is a reasonable length, and there should also be a check when the prompt is set in 'options.c'. This may be a readline bug, too.

I watched it do nothing for a while. Judging by 'top', it did not seem to accumulate much, if any, new memory, but it did max out one CPU core.

> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 19.04

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.5.3
>

I've searched R-devel and see minimal discussion of security threats in R. Has anybody fuzzed R with data or source files? As R grows in popularity, I hope there is some proactive security work going on, which I understand may not always best be done on a public mailing list.

Jack Wasey

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel