Re: [Rd] [EXTERNAL] Re: NOTE: multiple local function definitions for ?fun? with different formal arguments
I went looking and found this in codetools, where it's been for 20 years https://gitlab.com/luke-tierney/codetools/-/blame/master/R/codetools.R?ref_type=heads#L951 I think the call stack in codetools is checkUsagePackage -> checkUsageEnv -> checkUsage, and these are similarly established. The call from the tools package https://github.com/wch/r-source/blame/95146f0f366a36899e4277a6a722964a51b93603/src/library/tools/R/QC.R#L4585 is also quite old. I'm not sure this had been said explicitly, but perhaps the original intent was to protect against accidentally redefining a local function. Obviously one could do this with a local variable too, though that might less often be an error… toto <- function(mode) { tata <- function(a, b) a * b # intended tata <- function(a, b) a / b # oops … } Another workaround is to actually name the local functions toto <- function(mode) { tata <- function(a, b) a * b titi <- function(u, v, w) (u + v) / w if (mode == 1) tata else titi } … or to use a switch statement toto <- function(mode) { ## fun <- switch(…) for use of `fun()` in toto switch( mode, tata = function(a, b) a * b, titi = function(u, v, w) (u + v) / w, stop("unknown `mode = '", mode, "'`") ) } … or similarly to write `fun <- if … else …`, assigning the result of the `if` to `fun`. I guess this last formulation points to the fact that a more careful analysis of Hervé's original code means that `fun` can only take one value (only one branch of the `if` can be taken) so there can only be one version of `fun` in any invocation of `toto()`. Perhaps the local names (and string-valued 'mode') are suggestive of special case, so serve as implicit documentation? Adding `…` to `tata` doesn't seem like a good idea; toto(1)(3, 5, 7) no longer signals an error. There seems to be a lot in common with S3 and S4 methods, where `toto` corresponds to the generic, `tata` and `titi` to methods. This 'dispatch' is brought out by using `switch()`. There is plenty of opportunity for thinking that you're invoking one method but actually you're invoking the other. For instance with dplyr, I like that I can tbl |> print(n = 2) so much that I find myself doing this with data.frame df |> print(n = 2), which is an error (`n` partially matches `na.print`, and 2 is not a valid value); both methods silently ignore the typo print(m = 2). Martin Morgan From: R-devel on behalf of Henrik Bengtsson Date: Tuesday, February 6, 2024 at 4:34 PM To: Izmirlian, Grant (NIH/NCI) [E] Cc: r-devel@r-project.org Subject: Re: [Rd] [EXTERNAL] Re: NOTE: multiple local function definitions for ?fun? with different formal arguments Here's a dummy example that I think illustrates the problem: toto <- function() { if (runif(1) < 0.5) function(a) a else function(a,b) a+b } > fcn <- toto() > fcn(1,2) [1] 3 > fcn <- toto() > fcn(1,2) [1] 3 > fcn <- toto() > fcn(1,2) Error in fcn(1, 2) : unused argument (2) How can you use the returned function, if you get different arguments? In your example, you cannot use the returned function without knowing 'mode', or by inspecting the returned function. So, the warning is there to alert you to a potential bug. Anecdotally, I'm pretty sure this R CMD check NOTE has caught at least one such bug in one of my/our packages. If you want to keep the current design pattern, one approach could be to add ... to your function definitions: toto <- function(mode) { if (mode == 1) fun <- function(a, b, ...) a*b else fun <- function(u, v, w) (u + v) / w fun } to make sure that toto() returns functions that accept the same minimal number of arguments. /Henrik On Tue, Feb 6, 2024 at 1:15 PM Izmirlian, Grant (NIH/NCI) [E] via R-devel wrote: > > Because functions get called and therefore, the calling sequence matters. > It’s just protecting you from yourself, but as someone pointed out, there’s a > way to silence such notes. > G > > > From: Hervé Pagès > Sent: Tuesday, February 6, 2024 2:40 PM > To: Izmirlian, Grant (NIH/NCI) [E] ; Duncan Murdoch > ; r-devel@r-project.org > Subject: Re: [EXTERNAL] Re: [Rd] NOTE: multiple local function definitions > for ?fun? with different formal arguments > > > On 2/6/24 11:19, Izmirlian, Grant (NIH/NCI) [E] wrote: > The note refers to the fact that the function named ‘fun’ appears to be > defined in two different ways. > > Sure I get that. But how is that any different from a variable being defined > in two different ways like in > > if (mode == 1) > x <- -8 > else > x <- 55 > > This is such a common and perfectly fine pattern. Why would
Re: [Rd] Building R from source always fails on tools:::sysdata2LazyLoadDB
Thank you, especially for the R-admin link and link to the underlying issue. This is macOS Monterey version 12.6.5, so I am stuck with rebuilding -- empirically it seems like make distclean && && make all is needed. This doesn't seem to make sense (shouldn't `make clean` remove all the compiled libraries?) so I'll investigate a bit more; maybe it is because I do not `make install`? I know there is no value in papering over upstream issues, but wonder if I were to unlink all (or some?) *so / *dylib before `make all` would be effective? After the fact I realized that the R-SIG-mac mailing list might have been more appropriate, but I did not see the issue discussed there. Martin From: Prof Brian Ripley Date: Wednesday, May 31, 2023 at 3:46 AM To: Martin Morgan Cc: Tomas Kalibera , R-devel Subject: Re: [Rd] Building R from source always fails on tools:::sysdata2LazyLoadDB On 30/05/2023 22:57, Martin Morgan wrote: > Thanks Ivan & Tomas > > A simpler way to trigger the problem is library(tools) or > library.dynam("tools", "tools", ".") so I guess it is loading src/tools.so > > Ivan, adding -d lldb I need to tell lldb where to find the R library > > (lldb) process launch --environment > DYLD_LIBRARY_PATH=/Users/ma38727/bin/R-devel/lib > > And then `library(tools)` works. To run lldb I needed to grant Xcode > permissions using my local administrator account. > > @Thomas I can't see anything in the Console app logs, but this might be > partly my ineptitude. Which version of macOS is this? - Prior to Ventura it is the known behaviour. - With Ventura the same happens if any R process from the current build is in use, even a crashed one. (The latter happened to me this morning: there were reports under Crash Reports in the Console App.) As the R-admin manual says "Updating an �arm64� build may fail because of the bug described at https://openradar.appspot.com/FB8914243 but ab initio builds work. This has been far rarer since macOS 13." Once it happens, you need to rebuild (make clean;make all should suffice). > > Martin > > From: Tomas Kalibera > Date: Tuesday, May 30, 2023 at 4:54 PM > To: Martin Morgan , R-devel > Subject: Re: [Rd] Building R from source always fails on > tools:::sysdata2LazyLoadDB > > On 5/30/23 22:09, Martin Morgan wrote: >> I build my own R from source on an M1 mac. I have a clean svn checkout in >> one directory ~/src/R-devel. I switch to ~/bin/R-devel and the first time run >> >> cd ~/bin/R-devel >> ~/src/R-devel/configure --enable-R-shlib 'CFLAGS=-g -O0' >> CPPFLAGS=-I/opt/R/arm64/include 'CXXFLAGS=-g -O0' >> make -j >> >> At some point in the future I svn update src/R-devel, then >> >> cd ~/bin/R-devel >> make -j >> >> This always ends with >> >> installing 'sysdata.rda' >> /bin/sh: line 1: 99497 Doneecho >> "tools:::sysdata2LazyLoadDB(\"/Users/XXX/src/R-devel/src/library/utils/R/sysdata.rda\",\"../../../library/utils/R\")" >>99498 Killed: 9 | R_DEFAULT_PACKAGES=NULL LC_ALL=C >> ../../../bin/R --vanilla --no-echo >> make[4]: *** [sysdata] Error 137 >> make[3]: *** [all] Error 2 >> make[2]: *** [R] Error 1 >> make[1]: *** [R] Error 1 >> make: *** [R] Error 1 >> >> what am I doing wrong? Is there a graceful way to fix this (my current >> solution is basically to start over, with `make distclean`)? If I cd into >> ~/bin/R-devel/src/library/utils I can start an interactive session and >> reproduce the error >> >> ~/bin/R-devel/src/library/utils $ R_DEFAULT_PACKAGES=NULL ../../../bin/R >> --vanilla >>> tools:::sysdata2LazyLoadDB("/Users/ma38727/src/R-devel/src/library/utils/R/sysdata.rda","../../../library/utils/R") >> zsh: killed R_DEFAULT_PACKAGES=NULL ../../../bin/R --vanilla >> >> or simply >> >>> tools:::sysdata2LazyLoadDB >> zsh: killed R_DEFAULT_PACKAGES=NULL LC_ALL=C R_ENABLE_JIT=0 TZ=UTC >> ../../../bin/R > > If it is macOS, it might be worth checking the system logs (Console > app). It may be some system security feature. > > Tomas -- Brian D. Ripley, rip...@stats.ox.ac.uk Emeritus Professor of Applied Statistics, University of Oxford [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Building R from source always fails on tools:::sysdata2LazyLoadDB
Thanks Ivan & Tomas A simpler way to trigger the problem is library(tools) or library.dynam("tools", "tools", ".") so I guess it is loading src/tools.so Ivan, adding -d lldb I need to tell lldb where to find the R library (lldb) process launch --environment DYLD_LIBRARY_PATH=/Users/ma38727/bin/R-devel/lib And then `library(tools)` works. To run lldb I needed to grant Xcode permissions using my local administrator account. @Thomas I can't see anything in the Console app logs, but this might be partly my ineptitude. Martin From: Tomas Kalibera Date: Tuesday, May 30, 2023 at 4:54 PM To: Martin Morgan , R-devel Subject: Re: [Rd] Building R from source always fails on tools:::sysdata2LazyLoadDB On 5/30/23 22:09, Martin Morgan wrote: > I build my own R from source on an M1 mac. I have a clean svn checkout in one > directory ~/src/R-devel. I switch to ~/bin/R-devel and the first time run > > cd ~/bin/R-devel > ~/src/R-devel/configure --enable-R-shlib 'CFLAGS=-g -O0' > CPPFLAGS=-I/opt/R/arm64/include 'CXXFLAGS=-g -O0' > make -j > > At some point in the future I svn update src/R-devel, then > > cd ~/bin/R-devel > make -j > > This always ends with > > installing 'sysdata.rda' > /bin/sh: line 1: 99497 Doneecho > "tools:::sysdata2LazyLoadDB(\"/Users/XXX/src/R-devel/src/library/utils/R/sysdata.rda\",\"../../../library/utils/R\")" > 99498 Killed: 9 | R_DEFAULT_PACKAGES=NULL LC_ALL=C > ../../../bin/R --vanilla --no-echo > make[4]: *** [sysdata] Error 137 > make[3]: *** [all] Error 2 > make[2]: *** [R] Error 1 > make[1]: *** [R] Error 1 > make: *** [R] Error 1 > > what am I doing wrong? Is there a graceful way to fix this (my current > solution is basically to start over, with `make distclean`)? If I cd into > ~/bin/R-devel/src/library/utils I can start an interactive session and > reproduce the error > > ~/bin/R-devel/src/library/utils $ R_DEFAULT_PACKAGES=NULL ../../../bin/R > --vanilla >> tools:::sysdata2LazyLoadDB("/Users/ma38727/src/R-devel/src/library/utils/R/sysdata.rda","../../../library/utils/R") > zsh: killed R_DEFAULT_PACKAGES=NULL ../../../bin/R --vanilla > > or simply > >> tools:::sysdata2LazyLoadDB > zsh: killed R_DEFAULT_PACKAGES=NULL LC_ALL=C R_ENABLE_JIT=0 TZ=UTC > ../../../bin/R If it is macOS, it might be worth checking the system logs (Console app). It may be some system security feature. Tomas > >[[alternative HTML version deleted]] > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Building R from source always fails on tools:::sysdata2LazyLoadDB
I build my own R from source on an M1 mac. I have a clean svn checkout in one directory ~/src/R-devel. I switch to ~/bin/R-devel and the first time run cd ~/bin/R-devel ~/src/R-devel/configure --enable-R-shlib 'CFLAGS=-g -O0' CPPFLAGS=-I/opt/R/arm64/include 'CXXFLAGS=-g -O0' make -j At some point in the future I svn update src/R-devel, then cd ~/bin/R-devel make -j This always ends with installing 'sysdata.rda' /bin/sh: line 1: 99497 Doneecho "tools:::sysdata2LazyLoadDB(\"/Users/XXX/src/R-devel/src/library/utils/R/sysdata.rda\",\"../../../library/utils/R\")" 99498 Killed: 9 | R_DEFAULT_PACKAGES=NULL LC_ALL=C ../../../bin/R --vanilla --no-echo make[4]: *** [sysdata] Error 137 make[3]: *** [all] Error 2 make[2]: *** [R] Error 1 make[1]: *** [R] Error 1 make: *** [R] Error 1 what am I doing wrong? Is there a graceful way to fix this (my current solution is basically to start over, with `make distclean`)? If I cd into ~/bin/R-devel/src/library/utils I can start an interactive session and reproduce the error ~/bin/R-devel/src/library/utils $ R_DEFAULT_PACKAGES=NULL ../../../bin/R --vanilla > tools:::sysdata2LazyLoadDB("/Users/ma38727/src/R-devel/src/library/utils/R/sysdata.rda","../../../library/utils/R") zsh: killed R_DEFAULT_PACKAGES=NULL ../../../bin/R --vanilla or simply > tools:::sysdata2LazyLoadDB zsh: killed R_DEFAULT_PACKAGES=NULL LC_ALL=C R_ENABLE_JIT=0 TZ=UTC ../../../bin/R [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] gsub() hex character range problems in R-devel?
Thanks Tomas and 'Brodie' for your expert explanation; it provides great help in understanding and solving my immediate problem. Thomas' observation to 'do something like e.g. "only keep ASCII digits, ASCII space, ASCII underscore, but remove all other characters"' points to a basic weakness in the code I'm looking at. E.g., removing non-breaking space is probably not appropriate ('foo\ua0bar' is probably cleaned to 'foo bar' and not 'foobar'). And more generally other non-ASCII characters ('fancy' quotes, em-dashes, ...) would require special treatment. It seems like the right thing to do is to handle the raw data in its original encoding, rather than to try to clean it to ASCII. Martin On 1/5/22, 4:17 AM, "Tomas Kalibera" wrote: Hi Martin, I'd add few comments to the excellent analysis of Brodie. - \xhh is allowed and defined in Perl regular expressions, see ?regex (would need perl=TRUE), but to enter that in an R string, you need to escape the backslash. - \xhh is not defined by POSIX for extended regular expressions, neither it is documented in ?regex for those; TRE supports it, but still portable programs should not rely on that - literal \xhh in an R string is turned to the byte by R, but I would say this should not be used at all by users, because the result is encoding specific - use of \u and \U in an R string is fine, it has well defined semantics and the corresponding string will then be flagged UTF-8 in R (so e.g. \ua0 is fine to represent the Unicode no-break space) - see caveats of using character ranges with POSIX extended regular expressions in ?regex re encodings, using Perl regular expressions in UTF-8 mode is more reliable for those So, a variant of your example might be: > gsub("[\\x7f-\\xff]", "", "fo\ua0o", perl=TRUE) [1] "foo" (note that the \ua0 ensures that the text is UTF-8, and hence the UTF-8 mode for regular expressions is used, ?regex has more) However, I think it is better to formulate regular expressions to cover all of Unicode, so do something like e.g. "only keep ASCII digits, ASCII space, ASCII underscore, but remove all other characters". Best Tomas On 1/4/22 8:35 PM, Martin Morgan wrote: > I'm not very good at character encoding / etc so this might be user error. The following code is meant to replace extended ASCII characters, in particular a non-breaking space, with "", and it works in R-4-1-branch > >> R.version.string > [1] "R version 4.1.2 Patched (2022-01-04 r81445)" >> gsub("[\x7f-\xff]", "", "fo\xa0o") > [1] "foo" > > but fails in R-devel > >> R.version.string > [1] "R Under development (unstable) (2022-01-04 r81445)" >> gsub("[\x7f-\xff]", "", "fo\xa0o") > Error in gsub("[\177-\xff]", "", "fo\xa0o") : invalid regular expression '[-�]', reason 'Invalid character range' > In addition: Warning message: > In gsub("[\177-\xff]", "", "fo\xa0o") : >TRE pattern compilation error 'Invalid character range' > > There are other oddities, too, like > >> gsub("[[:alnum:]]", "", "fo\xa0o") # R-4-1-branch > [1] "\xfc\xbe\x8c\x86\x84\xbc" > >> gsub("[[:alnum:]]", "", "fo\xa0o") # R-devel > [1] "<>" > > The R-devel sessionInfo is > >> sessionInfo() > R Under development (unstable) (2022-01-04 r81445) > Platform: x86_64-apple-darwin19.6.0 (64-bit) > Running under: macOS Catalina 10.15.7 > > Matrix products: default > BLAS: /Users/ma38727/bin/R-devel/lib/libRblas.dylib > LAPACK: /Users/ma38727/bin/R-devel/lib/libRlapack.dylib > > locale: > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > loaded via a namespace (and not attached): > [1] compiler_4.2.0 > > (I have built my own R on macOS; similar behavior is observed on a Linux machine) > > Any hints welcome, > > Martin Morgan > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] gsub() hex character range problems in R-devel?
I'm not very good at character encoding / etc so this might be user error. The following code is meant to replace extended ASCII characters, in particular a non-breaking space, with "", and it works in R-4-1-branch > R.version.string [1] "R version 4.1.2 Patched (2022-01-04 r81445)" > gsub("[\x7f-\xff]", "", "fo\xa0o") [1] "foo" but fails in R-devel > R.version.string [1] "R Under development (unstable) (2022-01-04 r81445)" > gsub("[\x7f-\xff]", "", "fo\xa0o") Error in gsub("[\177-\xff]", "", "fo\xa0o") : invalid regular expression '[-�]', reason 'Invalid character range' In addition: Warning message: In gsub("[\177-\xff]", "", "fo\xa0o") : TRE pattern compilation error 'Invalid character range' There are other oddities, too, like > gsub("[[:alnum:]]", "", "fo\xa0o") # R-4-1-branch [1] "\xfc\xbe\x8c\x86\x84\xbc" > gsub("[[:alnum:]]", "", "fo\xa0o") # R-devel [1] "<>" The R-devel sessionInfo is > sessionInfo() R Under development (unstable) (2022-01-04 r81445) Platform: x86_64-apple-darwin19.6.0 (64-bit) Running under: macOS Catalina 10.15.7 Matrix products: default BLAS: /Users/ma38727/bin/R-devel/lib/libRblas.dylib LAPACK: /Users/ma38727/bin/R-devel/lib/libRlapack.dylib locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] compiler_4.2.0 (I have built my own R on macOS; similar behavior is observed on a Linux machine) Any hints welcome, Martin Morgan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] unicode in R documentation
I have options(useFancyQuotes = FALSE) in my ~/.Rprofile. Martin Morgan On 7/13/21, 11:37 AM, "R-devel on behalf of Frederick Eaton" wrote: Dear R Team, I am running R from the terminal command line (not RStudio). I've noticed that R has been using Unicode quotes in its documentation for some time, maybe since before I started using it. I am wondering if it is possible to compile the documentation to use normal quotes instead. I find it useful to be able to search documentation for strings with quotes, for example when reading "?options" I might search for "'dev" to find an option starting with the letters "dev". Without the single-quote at the front, there would be a lot of matches that I'm not interested in, but the single-quote at the front helps narrow it down to the parameters that are being indexed in the documentation. However, I can't actually search for "'dev" in "?options" because it is written with curly quotes "‘device’" and "'" does not match "‘" on my machine. Similarly, when I read manual pages for commands on Linux, I sometimes search for "-r" instead of "r" because "-r" is likely to find documentation for the option "-r", while searching for "r" will match almost every line. I'm wondering what other people do when reading through documentation. Do you search for things at all or just read it straight through? Is there a hyperlinked version that just lets you jump to the "device" entry in "?options" or do you have to type out a search string? What search string do you use? Do you have a way to enter Unicode quotes when doing this, or does your pager provide a special regular expression syntax which makes it easier to match them? Thanks, Frederick __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] How to get utf8 string using R externals
On Wed, 2 Jun 2021, 22:31 Duncan Murdoch, wrote: > On 02/06/2021 4:33 p.m., xiaoyan yu wrote: > > I have a R Script Predict.R: > > set.seed(42) > > C <- seq(1:1000) > > A <- rep(seq(1:200),5) > > E <- (seq(1:1000) * (0.8 + (0.4*runif(50, 0, 1 > > L <- ifelse(runif(1000)>.5,1,0) > > df <- data.frame(cbind(C, A, E, L)) > > load("C:/Temp/tree.RData")# load the model for scoring > > > >P <- as.character(predict(tree_model_1,df,type='class')) > > > > Then in a C++ program > > I call eval to evaluate the script and then findVar the P variable. > > After get each class label from P using string_elt and then > > Rf_translateChar, the characters are unicodes () instead > of > > utf8 encoding of the korean characters 부실. > > Can I know how to get UTF8 by using R externals? > > > > I also found the same script giving utf8 characters in RGui but unicode > in > > Rterm. > > I tried to attach a screenshot but got message "The message's content > type > > was not explicitly allowed" > > In RGui, I saw the output 부실, while in Rterm, . > > Sounds like you're using Windows. Stop doing that. > > Duncan Murdoch > Could as well say: "Sounds like you are using R. Stop doing that." Start using Julia. ;-) > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R Console Bug?
Hi Simon, Thank you for the feedback. It is really strange that you have a different output. I have attached a picture of my R console. I am just trying to port some pure C code that prints progress bars to R but it does not seem to be printing properly. It seems I am doing something wrong with REprintf and R_FlushConsole. Best regards, Morgan On Sat, Apr 17, 2021 at 12:36 AM Simon Urbanek wrote: > Sorry, unable to reproduce on macOS, in R console: > > > dyn.load("test.so") > > .Call("printtest",1e4L) > >Processing data chunk 1 of 3 > [==] 100% > >Processing data chunk 2 of 3 > [==] 100% > >Processing data chunk 3 of 3 > [==] 100% > NULL > > But honestly I'm not sure sure I understand the report. R_FlushConsole is > a no-op for terminal console and your code just prints on stderr anyway > (which is not buffered). All this does is just a lot of \r output (which is > highly inefficient anywhere but in Terminal by definition). Can you clarify > what the code tries to trigger? > > Cheers, > Simon > > > > On Apr 16, 2021, at 23:11, Morgan Morgan > wrote: > > > > Hi, > > > > I am getting a really weird behaviour with the R console. > > Here is the code to reproduce it. > > > > 1/ C code: --- > > > > SEXP printtest(SEXP x) { > > const int PBWIDTH = 30, loop = INTEGER(x)[0]; > > int val, lpad; > > double perc; > > char PBSTR[PBWIDTH], PBOUT[PBWIDTH]; > > memset(PBSTR,'=', sizeof(PBSTR)); > > memset(PBOUT,'-', sizeof(PBOUT)); > > for (int k = 0; k < 3; ++k) { > >REprintf("\n Processing data chunk %d of 3\n",k+1); > >for (int i = 0; i < loop; ++i) { > > perc = (double) i/(loop-1); > > val = (int) (perc * 100); > > lpad = (int) (perc * PBWIDTH); > > REprintf("\r [%.*s%.*s] %3d%%", lpad, PBSTR, PBWIDTH - lpad, PBOUT, > > val); > > R_FlushConsole(); > >} > >REprintf("\n"); > > } > > return R_NilValue; > > } > > > > 2/ Build so/dll: --- > > > > R CMD SHLIB > > > > 3/ Run code : --- > > > > dyn.load("test.so") > > .Call("printtest",1e4L) > > dyn.unload("test.so") > > > > 4/ Issue: --- > > If you run the above code in RStudio, it works well both on Mac and > Windows. > > If you run it in Windows cmd, it is slow. > > If you run it in Windows RGui, it is slow but also all texts are flushed. > > If you run it in Mac terminal, it runs perfectly. > > If you run it in Mac R Console, it prints something like : > >> .Call("printtest",1e4L) > > [==] 100%NULL] > 0% > > > > I am using R 4.0.4 (Mac) / 4.0.5 (Windows) > > > > Is that a bug or am I doing something wrong? > > > > Thank you > > Best regards, > > Morgan > > > > [[alternative HTML version deleted]] > > > > __ > > R-devel@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel > > > > __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] R Console Bug?
Hi, I am getting a really weird behaviour with the R console. Here is the code to reproduce it. 1/ C code: --- SEXP printtest(SEXP x) { const int PBWIDTH = 30, loop = INTEGER(x)[0]; int val, lpad; double perc; char PBSTR[PBWIDTH], PBOUT[PBWIDTH]; memset(PBSTR,'=', sizeof(PBSTR)); memset(PBOUT,'-', sizeof(PBOUT)); for (int k = 0; k < 3; ++k) { REprintf("\n Processing data chunk %d of 3\n",k+1); for (int i = 0; i < loop; ++i) { perc = (double) i/(loop-1); val = (int) (perc * 100); lpad = (int) (perc * PBWIDTH); REprintf("\r [%.*s%.*s] %3d%%", lpad, PBSTR, PBWIDTH - lpad, PBOUT, val); R_FlushConsole(); } REprintf("\n"); } return R_NilValue; } 2/ Build so/dll: --- R CMD SHLIB 3/ Run code : --- dyn.load("test.so") .Call("printtest",1e4L) dyn.unload("test.so") 4/ Issue: --- If you run the above code in RStudio, it works well both on Mac and Windows. If you run it in Windows cmd, it is slow. If you run it in Windows RGui, it is slow but also all texts are flushed. If you run it in Mac terminal, it runs perfectly. If you run it in Mac R Console, it prints something like : > .Call("printtest",1e4L) [==] 100%NULL] 0% I am using R 4.0.4 (Mac) / 4.0.5 (Windows) Is that a bug or am I doing something wrong? Thank you Best regards, Morgan [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Faster sorting algorithm...
My apologies to Professor Neal. Thank you for correcting me. Best regards Morgan On Mon, 22 Mar 2021, 05:05 , wrote: > I think it is "Professor Neal" :) > > I also appreciate the pqR comparisons. > > On Wed, Mar 17, 2021 at 09:23:15AM +, Morgan Morgan wrote: > >Thank you Neal. This is interesting. I will have a look at pqR. > >Indeed radix only does C collation, I believe that is why it is not the > >default choice for character ordering and sorting. > >Not sure but I believe it can help address the following bugzilla item: > >https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17400 > > > >On the same topic of collation, there is an experimental sorting function > >"psort" in package kit that might help address this issue. > > > >> library(kit) > >Attaching kit 0.0.7 (OPENMP enabled using 1 thread) > >> x <- c("b","A","B","a","\xe4") > >> Encoding(x) <- "latin1" > >> identical(psort(x, c.locale=FALSE), sort(x)) > >[1] TRUE > >> identical(psort(x, c.locale=TRUE), sort(x, method="radix")) > >[1] TRUE > > > >Coming back to the topic of fsort, I have just finished the implementation > >for double, integer, factor and logical. > >The implementation takes into account NA, Inf.. values. Values can be > >sorted in a decreasing order or increasing order. > >Comparing benchmark with the current implementation in data.table, it is > >currently over 30% faster. > >There might bugs but I am sure performance can be further improved as I > did > >not really try hard. > >If there is interest in both the implementation and cross community > >sharing, please let know > > > >Best regards, > >Morgan > > > >On Wed, 17 Mar 2021, 00:37 Radford Neal, wrote: > > > >> Those interested in faster sorting may want to look at the merge sort > >> implemented in pqR (see pqR-project.org). It's often used as the > >> default, because it is stable, and does different collations, while > >> being faster than shell sort (except for small vectors). > >> > >> Here are examples, with timings, for pqR-2020-07-23 and R-4.0.2, > >> compiled identically: > >> > >> - > >> pqR-2020-07-23 in C locale: > >> > >> > set.seed(1) > >> > N <- 100 > >> > x <- as.character (sample(N,N,replace=TRUE)) > >> > print(system.time (os <- order(x,method="shell"))) > >>user system elapsed > >> 1.332 0.000 1.334 > >> > print(system.time (or <- order(x,method="radix"))) > >>user system elapsed > >> 0.092 0.004 0.096 > >> > print(system.time (om <- order(x,method="merge"))) > >>user system elapsed > >> 0.363 0.000 0.363 > >> > print(identical(os,or)) > >> [1] TRUE > >> > print(identical(os,om)) > >> [1] TRUE > >> > > >> > x <- c("a","~") > >> > print(order(x,method="shell")) > >> [1] 1 2 > >> > print(order(x,method="radix")) > >> [1] 1 2 > >> > print(order(x,method="merge")) > >> [1] 1 2 > >> > >> - > >> R-4.0.2 in C locale: > >> > >> > set.seed(1) > >> > N <- 100 > >> > x <- as.character (sample(N,N,replace=TRUE)) > >> > print(system.time (os <- order(x,method="shell"))) > >>user system elapsed > >> 2.381 0.004 2.387 > >> > print(system.time (or <- order(x,method="radix"))) > >>user system elapsed > >> 0.138 0.000 0.137 > >> > #print(system.time (om <- order(x,method="merge"))) > >> > print(identical(os,or)) > >> [1] TRUE > >> > #print(identical(os,om)) > >> > > >> > x <- c("a","~") > >> > print(order(x,method="shell")) > >> [1] 1 2 > >> > print(order(x,method="radix")) > >> [1] 1 2 > >> > #print(order(x,method="merge")) > >> > >> > >> pqR-2020-07-23 in fr_CA.utf8 locale: > >> > >> > set.seed(1) > >> > N <- 100 > >> > x <- as.character (sample(N,N,replace=TRUE)) > >> > print(system.t
Re: [Rd] Faster sorting algorithm...
Thank you Neal. This is interesting. I will have a look at pqR. Indeed radix only does C collation, I believe that is why it is not the default choice for character ordering and sorting. Not sure but I believe it can help address the following bugzilla item: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17400 On the same topic of collation, there is an experimental sorting function "psort" in package kit that might help address this issue. > library(kit) Attaching kit 0.0.7 (OPENMP enabled using 1 thread) > x <- c("b","A","B","a","\xe4") > Encoding(x) <- "latin1" > identical(psort(x, c.locale=FALSE), sort(x)) [1] TRUE > identical(psort(x, c.locale=TRUE), sort(x, method="radix")) [1] TRUE Coming back to the topic of fsort, I have just finished the implementation for double, integer, factor and logical. The implementation takes into account NA, Inf.. values. Values can be sorted in a decreasing order or increasing order. Comparing benchmark with the current implementation in data.table, it is currently over 30% faster. There might bugs but I am sure performance can be further improved as I did not really try hard. If there is interest in both the implementation and cross community sharing, please let know Best regards, Morgan On Wed, 17 Mar 2021, 00:37 Radford Neal, wrote: > Those interested in faster sorting may want to look at the merge sort > implemented in pqR (see pqR-project.org). It's often used as the > default, because it is stable, and does different collations, while > being faster than shell sort (except for small vectors). > > Here are examples, with timings, for pqR-2020-07-23 and R-4.0.2, > compiled identically: > > - > pqR-2020-07-23 in C locale: > > > set.seed(1) > > N <- 100 > > x <- as.character (sample(N,N,replace=TRUE)) > > print(system.time (os <- order(x,method="shell"))) >user system elapsed > 1.332 0.000 1.334 > > print(system.time (or <- order(x,method="radix"))) >user system elapsed > 0.092 0.004 0.096 > > print(system.time (om <- order(x,method="merge"))) >user system elapsed > 0.363 0.000 0.363 > > print(identical(os,or)) > [1] TRUE > > print(identical(os,om)) > [1] TRUE > > > > x <- c("a","~") > > print(order(x,method="shell")) > [1] 1 2 > > print(order(x,method="radix")) > [1] 1 2 > > print(order(x,method="merge")) > [1] 1 2 > > - > R-4.0.2 in C locale: > > > set.seed(1) > > N <- 100 > > x <- as.character (sample(N,N,replace=TRUE)) > > print(system.time (os <- order(x,method="shell"))) >user system elapsed > 2.381 0.004 2.387 > > print(system.time (or <- order(x,method="radix"))) >user system elapsed > 0.138 0.000 0.137 > > #print(system.time (om <- order(x,method="merge"))) > > print(identical(os,or)) > [1] TRUE > > #print(identical(os,om)) > > > > x <- c("a","~") > > print(order(x,method="shell")) > [1] 1 2 > > print(order(x,method="radix")) > [1] 1 2 > > #print(order(x,method="merge")) > > > pqR-2020-07-23 in fr_CA.utf8 locale: > > > set.seed(1) > > N <- 100 > > x <- as.character (sample(N,N,replace=TRUE)) > > print(system.time (os <- order(x,method="shell"))) > utilisateur système écoulé > 2.960 0.000 2.962 > > print(system.time (or <- order(x,method="radix"))) > utilisateur système écoulé > 0.083 0.008 0.092 > > print(system.time (om <- order(x,method="merge"))) > utilisateur système écoulé > 1.143 0.000 1.142 > > print(identical(os,or)) > [1] TRUE > > print(identical(os,om)) > [1] TRUE > > > > x <- c("a","~") > > print(order(x,method="shell")) > [1] 2 1 > > print(order(x,method="radix")) > [1] 1 2 > > print(order(x,method="merge")) > [1] 2 1 > > > R-4.0.2 in fr_CA.utf8 locale: > > > set.seed(1) > > N <- 100 > > x <- as.character (sample(N,N,replace=TRUE)) > > print(system.time (os <- order(x,method="shell"))) > utilisateur système écoulé > 4.222 0.016 4.239 > > print(system.time (or <- order(x,method="radix"
Re: [Rd] Faster sorting algorithm...
Default method for sort is not radix(especially for character vector). You might want to read the documentation of sort. For your second question, I invite you to look at the code of fsort. It is implemented only for positive finite double, and default to data.table:::forder ... when the types are different than positive double... Please read the pdf link I sent, everything is explained in it. Thank you Morgan On Mon, 15 Mar 2021, 16:52 Avraham Adler, wrote: > Isn’t the default method now “radix” which is the data.table sort, and > isn’t that already parallel using openmp where available? > > Avi > > On Mon, Mar 15, 2021 at 12:26 PM Morgan Morgan > wrote: > >> Hi, >> I am not sure if this is the right mailing list, so apologies in advance >> if >> it is not. >> >> I found the following link/presentation: >> https://www.r-project.org/dsc/2016/slides/ParallelSort.pdf >> >> The implementation of fsort is interesting but incomplete (not sure why?) >> and can be improved or made faster (at least 25% I believe). I might be >> wrong but there are maybe a couple of bugs as well. >> >> My questions are: >> >> 1/ Is the R Core team interested in a faster sorting algo? (Multithread or >> even single threaded) >> >> 2/ I see an issue with the license, which is MPL-2.0, and hence not >> compatible with base R, Python and Julia. Is there an interest to change >> the license of fsort so all 3 languages (and all the people using these >> languages) can benefit from it? (Like suggested on the first page) >> >> Please let me know if there is an interest to address the above points, I >> would be happy to look into it (free of charge of course!). >> >> Thank you >> Best regards >> Morgan >> >> [[alternative HTML version deleted]] >> >> __ >> R-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >> > -- > Sent from Gmail Mobile > [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Faster sorting algorithm...
Hi, I am not sure if this is the right mailing list, so apologies in advance if it is not. I found the following link/presentation: https://www.r-project.org/dsc/2016/slides/ParallelSort.pdf The implementation of fsort is interesting but incomplete (not sure why?) and can be improved or made faster (at least 25% I believe). I might be wrong but there are maybe a couple of bugs as well. My questions are: 1/ Is the R Core team interested in a faster sorting algo? (Multithread or even single threaded) 2/ I see an issue with the license, which is MPL-2.0, and hence not compatible with base R, Python and Julia. Is there an interest to change the license of fsort so all 3 languages (and all the people using these languages) can benefit from it? (Like suggested on the first page) Please let me know if there is an interest to address the above points, I would be happy to look into it (free of charge of course!). Thank you Best regards Morgan [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] [External] Something is wrong with the unserialize function
This Index: src/main/altrep.c === --- src/main/altrep.c (revision 79385) +++ src/main/altrep.c (working copy) @@ -275,10 +275,11 @@ SEXP psym = ALTREP_SERIALIZED_CLASS_PKGSYM(info); SEXP class = LookupClass(csym, psym); if (class == NULL) { - SEXP pname = ScalarString(PRINTNAME(psym)); + SEXP pname = PROTECT(ScalarString(PRINTNAME(psym))); R_tryCatchError(find_namespace, pname, handle_namespace_error, NULL); class = LookupClass(csym, psym); + UNPROTECT(1); } return class; } seems to remove the warning; I'm guessing that the other SEXP already exist so don't need protecting? Martin Morgan On 10/29/20, 12:47 PM, "R-devel on behalf of luke-tier...@uiowa.edu" wrote: Thanks for the report. Will look into it when I get a chance unless someone else gets there first. A simpler reprex: ## create and serialize a memmory-mapped file object filePath <- "x.dat" con <- file(filePath, "wrb") writeBin(rep(0.0,10),con) close(con) library(simplemmap) x <- mmap(filePath, "double") saveRDS(x, file = "x.Rds") ## in a separate R process: gctorture() readRDS("x.Rds") Looks like a missing PROTECT somewhere. Best, luke On Thu, 29 Oct 2020, Jiefei Wang wrote: > Hi all, > > I am not able to export an ALTREP object when `gctorture` is on in the > worker. The package simplemmap can be used to reproduce the problem. See > the example below > ``` > ## Create a temporary file > filePath <- tempfile() > con <- file(filePath, "wrb") > writeBin(rep(0.0,10),con) > close(con) > > library(simplemmap) > library(parallel) > cl <- makeCluster(1) > x <- mmap(filePath, "double") > ## Turn gctorture on > clusterEvalQ(cl, gctorture()) > clusterExport(cl, "x") > ## x is an 0-length vector on the worker > clusterEvalQ(cl, x) > stopCluster(cl) > ``` > > you can find more info on the problem if you manually build a connection > between two R processes and export the ALTREP object. See output below > ``` >> con <- socketConnection(port = 1234,server = FALSE) >> gctorture() >> x <- unserialize(con) > Warning message: > In unserialize(con) : > cannot unserialize ALTVEC object of class 'mmap_real' from package > 'simplemmap'; returning length zero vector > ``` > It seems like simplemmap did not get loaded correctly on the worker. If > you run `library( simplemmap)` before unserializing the ALTREP, there will > be no problem. But I suppose we should be able to unserialize objects > without preloading the library? > > This issue can be reproduced on Ubuntu with R version 4.0.2 (2020-06-22) > and Windows with R Under development (unstable) (2020-09-03 r79126). > > Here is the link to simplemmap: > https://github.com/ALTREP-examples/Rpkg-simplemmap > > Best, > Jiefei > > [[alternative HTML version deleted]] > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > -- Luke Tierney Ralph E. Wareham Professor of Mathematical Sciences University of Iowa Phone: 319-335-3386 Department of Statistics andFax: 319-335-3017 Actuarial Science 241 Schaeffer Hall email: luke-tier...@uiowa.edu Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Is it possible to simply the use of NULL slots (or at least improve the help files)?
Answering to convey the 'rules' as I know them, rather than to address the underlying issues that I guess you are really after... The S4 practice is to use setOldClass() to explicitly treat an S3 character() vector of classes as an assertion of linear inheritance > x <- structure (sqrt (37), class = c ("sqrt.prime", "numeric") ) > is(x, "maybeNumber") [1] FALSE > setOldClass(class(x)) > is(x, "maybeNumber") [1] TRUE There are some quite amusing things that can go on with S3 classes, since the class attribute is just a character vector. So > x <- structure ("September", class = c ("sqrt.prime", "numeric") ) > is(x, "numeric") ## similarly, inherits() [1] TRUE > x <- structure (1, class = c ("numeric", "character")) > is(x, "numeric") [1] TRUE > is(x, "character") [1] TRUE Perhaps the looseness of the S3 system motivated the use of setOldClass() for anything more than assertion of simple relationships? At least in this context setOldClass() provides some type checking sanity > setOldClass(c("character", "numeric")) Error in setOldClass(c("character", "numeric")) : inconsistent old-style class information for "character"; the class is defined but does not extend "numeric" and is not valid as the data part In addition: Warning message: In .validDataPartClass(cl, where, dataPartClass) : more than one possible class for the data part: using "numeric" rather than "character" Martin Morgan On 9/24/20, 4:51 PM, "Abby Spurdle" wrote: Hi Martin, Thankyou for your response. I suspect that we're not going to agree on the main point. Making it trivially simple (as say Java) to set slots to NULL. So, I'll move on to the other points here. ***Note that cited text uses excerpts only.*** > setClassUnion("character_OR_NULL", c("character", "NULL")) > A = setClass("A", slots = c(x = "character_OR_NULL")) I think the above construct needs to be documented much more clearly. i.e. In the introductory and details pages for S4 classes. This is something that many people will want to do. And BasicClasses or NULL-class, are not the most obvious place to start looking, either. Also, I'd recommend the S4 authors, go one step further. Include character_OR_NULL, numeric_OR_NULL, etc, or something similar, in S4's predefined basic classes. Otherwise, contributed packages will (eventually) end up with hundreds of copies of these. > setClassUnion("maybeNumber", c("numeric", "logical")) > every instance of numeric _is_ a maybeNumber, e.g., > > is(1, "maybeNumber") > [1] TRUE > which I think is consistent with the use of 'superclass' Not quite. x <- structure (sqrt (37), class = c ("sqrt.prime", "numeric") ) is (x, "numeric") #TRUE is (x, "maybeNumber") #FALSE So now, an object x, is a numeric but not a maybeNumber. Perhaps a class union should be described as a partial imitation of a superclass, for the purpose of making slots more flexible. B. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Is it possible to simply the use of NULL slots (or at least improve the help files)?
I did ?"NULL at the command line and was lead to ?"NULL-class" and the BasicClasses help page in the methods package. getClass("NULL"), getClass("character") show that these objects are unrelated, so a class union is the way to define a class that is the union of these. The essence of the behavior you would like is setClassUnion("character_OR_NULL", c("character", "NULL")) .A = setClass("A", slots = c(x = "character_OR_NULL")) with > .A(x = NULL) An object of class "A" Slot "x": NULL > .A(x = month.abb) An object of class "A" Slot "x": [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec" > .A(x = 1:5) Error in validObject(.Object) : invalid class "A" object: invalid object for slot "x" in class "A": got class "integer", should be or extend class "character_OR_NULL" I understand there are situations where NULL is desired, perhaps to indicate 'not yet initialized' and distinct from character(0) or NA_character_, but want to mention those often appropriate alternatives. With setClassUnion("maybeNumber", c("numeric", "logical")) every instance of numeric _is_ a maybeNumber, e.g., > is(1, "maybeNumber") [1] TRUE > is(1L, "maybeNumber") [1] TRUE > is(numeric(), "maybeNumber") [1] TRUE > is(NA_integer_, "maybeNumber") [1] TRUE which I think is consistent with the use of 'superclass' on the setClassUnion help page. Martin Morgan On 9/23/20, 5:20 PM, "R-devel on behalf of Abby Spurdle" wrote: As far as I can tell, there's no trivial way to set arbitrary S4 slots to NULL. Most of the online examples I can find, use setClassUnion and are about 10 years old. Which, in my opinion, is defective. There's nothing "robust" about making something that should be trivially simple, really complicated. Maybe there is a simpler way, and I just haven't worked it out, yet. But either way, could the documentation for the methods package be improved? I can find any obvious info on NULL slots: Introduction Classes Classes_Details setClass slot Again, maybe I missed it. Even setClassUnion, which is what's used in the online examples, doesn't contain a NULL slot example. One more thing: The help file for setClassUnion, uses the term "superclass", incorrectly. Its examples include the following: setClassUnion("maybeNumber", c("numeric", "logical")) If maybeNumber was the superclass of numeric, then every instance of numeric would also be an instance of maybeNumber... __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Garbage collection of seemingly PROTECTed pairlist
I put your code into a file tmp.R and eliminated the need for a package by compiling this to a shared object R CMD SHLIB tmp.c I'm then able to use a simple script 'tmp.R' dyn.load("/tmp/tmp.so") fullocate <- function(int_mat) .Call("C_fullocate", int_mat) int_mat <- rbind(c(5L, 6L), c(7L, 10L), c(20L, 30L)) while(TRUE) res <- fullocate(int_mat) to generate a segfault. Looking at your code, it seemed like I could get towards a simpler reproducible example by eliminating most of the 'while' loop and then functions and code branches that are not used #include SEXP C_int_mat_nth_row_nrnc(int *int_mat_int, int nr, int nc, int n) { SEXP out = PROTECT(Rf_allocVector(INTSXP, nc)); int *out_int = INTEGER(out); for (int i = 0; i != nr; ++i) { out_int[i] = int_mat_int[n - 1 + i * nr]; } UNPROTECT(1); return out;} SEXP C_fullocate(SEXP int_mat) { int nr = Rf_nrows(int_mat), *int_mat_int = INTEGER(int_mat); int row_num = 2; // row_num will be 1-indexed SEXP prlst0cdr = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, 1)); SEXP prlst = PROTECT(Rf_list1(prlst0cdr)); SEXP row = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, row_num)); Rf_PrintValue(prlst); // This is where the error occurs UNPROTECT(3); return R_NilValue; } my script still gives an error, but not a segfault, and the values printed sometimes differ between calls ... [[1]] [1] 5 6 . [[1]] NULL ... Error in FUN(X[[i]], ...) : cannot coerce type 'NULL' to vector of type 'character' Calls: message -> .makeMessage -> lapply Execution halted The differing values in particular, and the limited PROTECTion in the call and small allocations (hence limited need / opportunity for garbage collection), suggest that you're corrupting memory, rather than having a problem with garbage collection. Indeed, SEXP prlst0cdr = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, 1)); allocates a vector of length 2 at SEXP out = PROTECT(Rf_allocVector(INTSXP, nc)); but writes three elements (the 0th, 1st, and 2nd) at for (int i = 0; i != nr; ++i) { out_int[i] = int_mat_int[n - 1 + i * nr]; } Martin Morgan On 9/11/20, 9:30 PM, "R-devel on behalf of Rory Nolan" wrote: I want to write an R function using R's C interface that takes a 2-column matrix of increasing, non-overlapping integer intervals and returns a list with those intervals plus some added intervals, such that there are no gaps. For example, it should take the matrix rbind(c(5L, 6L), c(7L, 10L), c(20L, 30L)) and return list(c(5L, 6L), c(7L, 10L), c(11L, 19L), c(20L, 30L)). Because the output is of variable length, I use a pairlist (because it is growable) and then I call Rf_PairToVectorList() at the end to make it into a regular list. I'm getting a strange garbage collection error. My PROTECTed pairlist prlst gets garbage collected away and causes a memory leak error when I try to access it. Here's my code. #include SEXP C_int_mat_nth_row_nrnc(int *int_mat_int, int nr, int nc, int n) { SEXP out = PROTECT(Rf_allocVector(INTSXP, nc)); int *out_int = INTEGER(out); if (n <= 0 | n > nr) { for (int i = 0; i != nc; ++i) { out_int[i] = NA_INTEGER; } } else { for (int i = 0; i != nr; ++i) { out_int[i] = int_mat_int[n - 1 + i * nr]; } } UNPROTECT(1); return out;} SEXP C_make_len2_int_vec(int first, int second) { SEXP out = PROTECT(Rf_allocVector(INTSXP, 2)); int *out_int = INTEGER(out); out_int[0] = first; out_int[1] = second; UNPROTECT(1); return out;} SEXP C_fullocate(SEXP int_mat) { int nr = Rf_nrows(int_mat), *int_mat_int = INTEGER(int_mat); int last, row_num; // row_num will be 1-indexed SEXP prlst0cdr = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, 1)); SEXP prlst = PROTECT(Rf_list1(prlst0cdr)); SEXP prlst_tail = prlst; last = INTEGER(prlst0cdr)[1]; row_num = 2; while (row_num <= nr) { Rprintf("row_num: %i\n", row_num); SEXP row = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, row_num)); Rf_PrintValue(prlst); // This is where the error occurs int *row_int = INTEGER(row); if (row_int[0] == last + 1) { Rprintf("here1"); SEXP next = PROTECT(Rf_list1(row)); prlst_tail = SETCDR(prlst_tail, next); last = row_int[1]; UNPROTECT(1); ++row_num; } else { Rprintf("here2"); SEXP next_car = PROTECT(C_make_len2_int_vec(last + 1, row_int[0] - 1)); SEXP next = PROTECT(Rf_list
Re: [Rd] lapply and vapply Primitive Documentation
Was hoping for an almost record old bug fix (older than some R users!), but apparently the documentation bug is only a decade old (maybe only older than some precious R users) https://github.com/wch/r-source/blame/2118f1d0ff70c1ebd06148b6cb7659efe5ff4d99/src/library/base/man/lapply.Rd#L116 (I don't see lapply / vapply referenced as primitive in the original text changed by the commit). Martin Morgan On 7/10/20, 3:52 AM, "R-devel on behalf of Martin Maechler" wrote: >>>>> Cole Miller >>>>> on Thu, 9 Jul 2020 20:38:10 -0400 writes: > The documentation of ?lapply includes: >> lapply and vapply are primitive functions. > However, both evaluate to FALSE in `is.primitive()`: > is.primitive(vapply) #FALSE > is.primitive(lapply) #FALSE > It appears that they are not primitives and that the > documentation might be outdated. Thank you for your time > and work. Thank you, Cole. Indeed, they were primitive originally (but e.g. lapply() seems to have become .Internal with r7885 | ripley | 2000-01-31 08:58:59 +0100 i.e. about 4 weeks *before* release of R 1.0.0 Changes made to both 'R-devel' and 'R-patched'. Martin > Cole Miller > P.S. During research, my favorite `help()` is > `?.Internal()`: "Only true R wizards should even consider > using this function..." Thanks again! ;-) __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Build a R call at C level
Sorry Dirk, I don't remember discussing this topic or alternatives with you at all. Have a nice day. On Tue, 30 Jun 2020, 14:42 Morgan Morgan, wrote: > Thanks Jan and Tomas for the feedback. > Answer from Jan is what I am looking for. > Maybe I am not looking in the right place buy it is not easy to understand > how these LCONS, CONS, SETCDR...etc works. > > Thank you > Best regards > Morgan > > > > On Tue, 30 Jun 2020, 12:36 Tomas Kalibera, > wrote: > >> On 6/30/20 1:06 PM, Jan Gorecki wrote: >> > It is quite known that R documentation on R C api could be improved... >> >> Please see "5.11 Evaluating R expressions from C" from "Writing R >> Extensions" >> >> Best >> Tomas >> >> > Still R-package-devel mailing list should be preferred for this kind >> > of questions. >> > Not sure if that is the best way, but works. >> > >> > call_to_sum <- inline::cfunction( >> >language = "C", >> >sig = c(x = "SEXP"), body = " >> > >> > SEXP e = PROTECT(lang2(install(\"sum\"), x)); >> > SEXP r_true = PROTECT(CONS(ScalarLogical(1), R_NilValue)); >> > SETCDR(CDR(e), r_true); >> > SET_TAG(CDDR(e), install(\"na.rm\")); >> > Rf_PrintValue(e); >> > SEXP ans = PROTECT(eval(e, R_GlobalEnv)); >> > UNPROTECT(3); >> > return ans; >> > >> > ") >> > >> > call_to_sum(c(1L,NA,3L)) >> > >> > On Tue, Jun 30, 2020 at 10:08 AM Morgan Morgan >> > wrote: >> >> Hi All, >> >> >> >> I was reading the R extension manual section 5.11 ( Evaluating R >> expression >> >> from C) and I tried to build a simple call to the sum function. Please >> see >> >> below. >> >> >> >> call_to_sum <- inline::cfunction( >> >>language = "C", >> >>sig = c(x = "SEXP"), body = " >> >> >> >> SEXP e = PROTECT(lang2(install(\"sum\"), x)); >> >> SEXP ans = PROTECT(eval(e, R_GlobalEnv)); >> >> UNPROTECT(2); >> >> return ans; >> >> >> >> ") >> >> >> >> call_to_sum(1:3) >> >> >> >> The above works. My question is how do I add the argument "na.rm=TRUE" >> at C >> >> level to the above call? I have tried various things based on what is >> in >> >> section 5.11 but I did not manage to get it to work. >> >> >> >> Thank you >> >> Best regards >> >> >> >> [[alternative HTML version deleted]] >> >> >> >> __ >> >> R-devel@r-project.org mailing list >> >> https://stat.ethz.ch/mailman/listinfo/r-devel >> > __ >> > R-devel@r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-devel >> >> >> [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Build a R call at C level
Thanks Jan and Tomas for the feedback. Answer from Jan is what I am looking for. Maybe I am not looking in the right place buy it is not easy to understand how these LCONS, CONS, SETCDR...etc works. Thank you Best regards Morgan On Tue, 30 Jun 2020, 12:36 Tomas Kalibera, wrote: > On 6/30/20 1:06 PM, Jan Gorecki wrote: > > It is quite known that R documentation on R C api could be improved... > > Please see "5.11 Evaluating R expressions from C" from "Writing R > Extensions" > > Best > Tomas > > > Still R-package-devel mailing list should be preferred for this kind > > of questions. > > Not sure if that is the best way, but works. > > > > call_to_sum <- inline::cfunction( > >language = "C", > >sig = c(x = "SEXP"), body = " > > > > SEXP e = PROTECT(lang2(install(\"sum\"), x)); > > SEXP r_true = PROTECT(CONS(ScalarLogical(1), R_NilValue)); > > SETCDR(CDR(e), r_true); > > SET_TAG(CDDR(e), install(\"na.rm\")); > > Rf_PrintValue(e); > > SEXP ans = PROTECT(eval(e, R_GlobalEnv)); > > UNPROTECT(3); > > return ans; > > > > ") > > > > call_to_sum(c(1L,NA,3L)) > > > > On Tue, Jun 30, 2020 at 10:08 AM Morgan Morgan > > wrote: > >> Hi All, > >> > >> I was reading the R extension manual section 5.11 ( Evaluating R > expression > >> from C) and I tried to build a simple call to the sum function. Please > see > >> below. > >> > >> call_to_sum <- inline::cfunction( > >>language = "C", > >>sig = c(x = "SEXP"), body = " > >> > >> SEXP e = PROTECT(lang2(install(\"sum\"), x)); > >> SEXP ans = PROTECT(eval(e, R_GlobalEnv)); > >> UNPROTECT(2); > >> return ans; > >> > >> ") > >> > >> call_to_sum(1:3) > >> > >> The above works. My question is how do I add the argument "na.rm=TRUE" > at C > >> level to the above call? I have tried various things based on what is in > >> section 5.11 but I did not manage to get it to work. > >> > >> Thank you > >> Best regards > >> > >> [[alternative HTML version deleted]] > >> > >> __ > >> R-devel@r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-devel > > __ > > R-devel@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel > > > [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Build a R call at C level
Hi All, I was reading the R extension manual section 5.11 ( Evaluating R expression from C) and I tried to build a simple call to the sum function. Please see below. call_to_sum <- inline::cfunction( language = "C", sig = c(x = "SEXP"), body = " SEXP e = PROTECT(lang2(install(\"sum\"), x)); SEXP ans = PROTECT(eval(e, R_GlobalEnv)); UNPROTECT(2); return ans; ") call_to_sum(1:3) The above works. My question is how do I add the argument "na.rm=TRUE" at C level to the above call? I have tried various things based on what is in section 5.11 but I did not manage to get it to work. Thank you Best regards [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] subset data.frame at C level
Thank you Jim for the feedback. I actually implemented it the way I describe it in my first email and it seems fast enough for me. Just to give a bit of context I will need it at some point in package kit. I also implemented subset by row which I actually need more as I am working on a faster version of the unique and duplicated function. The function unique is particularly slow for data.frame. So far I got a 100x speedup. Best regards Morgan On Tue, 23 Jun 2020, 21:11 Jim Hester, wrote: > It looks to me like internally .subset2 uses `get1index()`, but this > function is declared in Defn.h, which AFAIK is not part of the exported R > API. > > Looking at the code for `get1index()` it looks like it just loops over > the (translated) names, so I guess I just do that [0]. > > [0]: > https://github.com/r-devel/r-svn/blob/1ff1d4197495a6ee1e1d88348a03ff841fd27608/src/main/subscript.c#L226-L235 > > On Wed, Jun 17, 2020 at 6:11 AM Morgan Morgan > wrote: > >> Hi, >> >> Hope you are well. >> >> I was wondering if there is a function at C level that is equivalent to >> mtcars$carb or .subset2(mtcars, "carb"). >> >> If I have the index of the column then the answer would be VECTOR_ELT(df, >> asInteger(idx)) but I was wondering if there is a way to do it directly >> from the name of the column without having to loop over columns names to >> find the index? >> >> Thank you >> Best regards >> Morgan >> >> [[alternative HTML version deleted]] >> >> __ >> R-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >> > [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] subset data.frame at C level
Hi, Hope you are well. I was wondering if there is a function at C level that is equivalent to mtcars$carb or .subset2(mtcars, "carb"). If I have the index of the column then the answer would be VECTOR_ELT(df, asInteger(idx)) but I was wondering if there is a way to do it directly from the name of the column without having to loop over columns names to find the index? Thank you Best regards Morgan [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Precision of function mean,bug?
Sorry, posting back to the list. Thank you all. Morgan On Thu, 21 May 2020, 16:33 Henrik Bengtsson, wrote: > Hi. > > Good point and a good example. Feel free to post to the list. The purpose > of my reply wasn't to take away Peter's point but to emphasize that > base::mean() does a two-pass scan over the elements too lower the impact of > addition of values with widely different values (classical problem in > numerical analysis). But I can see how it may look like that. > > Cheers, > > Henrik > > > On Thu, May 21, 2020, 03:21 Morgan Morgan > wrote: > >> Thank you Henrik for the feedback. >> Note that for idx=4 and refine = TRUE, your equality b==c is FALSE. I >> think that as Peter said == can't be trusted with FP. >> His example is good. Here is an even more shocking one. >> a=0.786546798 >> b=a+ 1e6 -1e6 >> a==b >> # [1] FALSE >> >> Best regards >> Morgan Jacob >> >> On Wed, 20 May 2020, 20:18 Henrik Bengtsson, >> wrote: >> >>> On Wed, May 20, 2020 at 11:10 AM brodie gaslam via R-devel >>> wrote: >>> > >>> > > On Wednesday, May 20, 2020, 7:00:09 AM EDT, peter dalgaard < >>> pda...@gmail.com> wrote: >>> > > >>> > > Expected, see FAQ 7.31. >>> > > >>> > > You just can't trust == on FP operations. Notice also >>> > >>> > Additionally, since you're implementing a "mean" function you are >>> testing >>> > against R's mean, you might want to consider that R uses a two-pass >>> > calculation[1] to reduce floating point precision error. >>> >>> This one is important. >>> >>> FWIW, matrixStats::mean2() provides argument refine=TRUE/FALSE to >>> calculate mean with and without this two-pass calculation; >>> >>> > a <- c(x[idx],y[idx],z[idx]) / 3 >>> > b <- mean(c(x[idx],y[idx],z[idx])) >>> > b == a >>> [1] FALSE >>> > b - a >>> [1] 2.220446e-16 >>> >>> > c <- matrixStats::mean2(c(x[idx],y[idx],z[idx])) ## default to >>> refine=TRUE >>> > b == c >>> [1] TRUE >>> > b - c >>> [1] 0 >>> >>> > d <- matrixStats::mean2(c(x[idx],y[idx],z[idx]), refine=FALSE) >>> > a == d >>> [1] TRUE >>> > a - d >>> [1] 0 >>> > c == d >>> [1] FALSE >>> > c - d >>> [1] 2.220446e-16 >>> >>> Not surprisingly, the two-pass higher-precision version (refine=TRUE) >>> takes roughly twice as long as the one-pass quick version >>> (refine=FALSE). >>> >>> /Henrik >>> >>> > >>> > Best, >>> > >>> > Brodie. >>> > >>> > [1] >>> https://github.com/wch/r-source/blob/tags/R-4-0-0/src/main/summary.c#L482 >>> > >>> > > > a2=(z[idx]+x[idx]+y[idx])/3 >>> > > > a2==a >>> > > [1] FALSE >>> > > > a2==b >>> > > [1] TRUE >>> > > >>> > > -pd >>> > > >>> > > > On 20 May 2020, at 12:40 , Morgan Morgan < >>> morgan.email...@gmail.com> wrote: >>> > > > >>> > > > Hello R-dev, >>> > > > >>> > > > Yesterday, while I was testing the newly implemented function >>> pmean in >>> > > > package kit, I noticed a mismatch in the output of the below R >>> expressions. >>> > > > >>> > > > set.seed(123) >>> > > > n=1e3L >>> > > > idx=5 >>> > > > x=rnorm(n) >>> > > > y=rnorm(n) >>> > > > z=rnorm(n) >>> > > > a=(x[idx]+y[idx]+z[idx])/3 >>> > > > b=mean(c(x[idx],y[idx],z[idx])) >>> > > > a==b >>> > > > # [1] FALSE >>> > > > >>> > > > For idx= 1, 2, 3, 4 the last line is equal to TRUE. For 5, 6 and >>> many >>> > > > others the difference is small but still. >>> > > > Is that expected or is it a bug? >>> > >>> > __ >>> > R-devel@r-project.org mailing list >>> > https://stat.ethz.ch/mailman/listinfo/r-devel >>> >> [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Precision of function mean,bug?
Hello R-dev, Yesterday, while I was testing the newly implemented function pmean in package kit, I noticed a mismatch in the output of the below R expressions. set.seed(123) n=1e3L idx=5 x=rnorm(n) y=rnorm(n) z=rnorm(n) a=(x[idx]+y[idx]+z[idx])/3 b=mean(c(x[idx],y[idx],z[idx])) a==b # [1] FALSE For idx= 1, 2, 3, 4 the last line is equal to TRUE. For 5, 6 and many others the difference is small but still. Is that expected or is it a bug? Thank you Best Regards Morgan Jacob [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] psum/pprod
Good morning All, Just wanted to do quick follow-up on this thread: https://r.789695.n4.nabble.com/There-is-pmin-and-pmax-each-taking-na-rm-how-about-psum-td4647841.html For those (including the R-core team) of you who are interested in a C implementation of psum and pprod there is one in the "kit" package (I am the author) on CRAN. I will continue working on the package in my spare time if I see that users are missing basic functionalities not implemented in base R. Have a great weekend. Kind regards Morgan Jacob [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] defining r audio connections
yep, you're right, after some initial clean-up and running with or without --as-cran R CMD check gives a NOTE * checking compiled code File ‘socketeer/libs/socketeer.so’: Found non-API calls to R: ‘R_GetConnection’, ‘R_new_custom_connection’ Compiled code should not call non-API entry points in R. See 'Writing portable packages' in the 'Writing R Extensions' manual. Connections in general seem more useful than ad-hoc functions, though perhaps for Frederick's use case Duncan's suggestion is sufficient. For non-CRAN packages I personally would implement a connection. (I mistakenly thought this was a more specialized mailing list; I wouldn't have posted to R-devel on this topic otherwise) Martin Morgan On 5/6/20, 4:12 PM, "Gábor Csárdi" wrote: AFAIK that API is not allowed on CRAN. It triggers a NOTE or a WARNING, and your package will not be published. Gabor On Wed, May 6, 2020 at 9:04 PM Martin Morgan wrote: > > The public connection API is defined in > > https://github.com/wch/r-source/blob/trunk/src/include/R_ext/Connections.h > > I'm not sure of a good pedagogic example; people who want to write their own connections usually want to do so for complicated reasons! > > This is my own abandoned attempt https://github.com/mtmorgan/socketeer/blob/b0a1448191fe5f79a3f09d1f939e1e235a22cf11/src/connection.c#L169-L192 where connection_local_client() is called from R and _connection_local() creates and populates the appropriate structure. Probably I have done things totally wrong (e.g., by not checking the version of the API, as advised in the header file!) > > Martin Morgan > > On 5/6/20, 2:26 PM, "R-devel on behalf of Duncan Murdoch" wrote: > > On 06/05/2020 1:09 p.m., frede...@ofb.net wrote: > > Dear R Devel, > > > > Since Linux moved away from using a file-system interface for audio, I think it is necessary to write special libraries to interface with audio hardware from various languages on Linux. > > > > In R, it seems like the appropriate datatype for a `snd_pcm_t` handle pointing to an open ALSA source or sink would be a "connection". Connection types are already defined in R for "file", "url", "pipe", "fifo", "socketConnection", etc. > > > > Is there a tutorial or an example package where a new type of connection is defined, so that I can see how to do this properly in a package? > > > > I can see from the R source that, for example, `do_gzfile` is defined in `connections.c` and referenced in `names.c`. However, I thought I should ask here first in case there is a better place to start, than trying to copy this code. > > > > I only want an object that I can use `readBin` and `writeBin` on, to read and write audio data using e.g. `snd_pcm_writei` which is part of the `alsa-lib` package. > > I don't think R supports user-defined connections, but probably writing > readBin and writeBin equivalents specific to your library wouldn't be > any harder than creating a connection. For those, you will probably > want to work with an "external pointer" (see Writing R Extensions). > Rcpp probably has support for these if you're working in C++. > > Duncan Murdoch > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] defining r audio connections
The public connection API is defined in https://github.com/wch/r-source/blob/trunk/src/include/R_ext/Connections.h I'm not sure of a good pedagogic example; people who want to write their own connections usually want to do so for complicated reasons! This is my own abandoned attempt https://github.com/mtmorgan/socketeer/blob/b0a1448191fe5f79a3f09d1f939e1e235a22cf11/src/connection.c#L169-L192 where connection_local_client() is called from R and _connection_local() creates and populates the appropriate structure. Probably I have done things totally wrong (e.g., by not checking the version of the API, as advised in the header file!) Martin Morgan On 5/6/20, 2:26 PM, "R-devel on behalf of Duncan Murdoch" wrote: On 06/05/2020 1:09 p.m., frede...@ofb.net wrote: > Dear R Devel, > > Since Linux moved away from using a file-system interface for audio, I think it is necessary to write special libraries to interface with audio hardware from various languages on Linux. > > In R, it seems like the appropriate datatype for a `snd_pcm_t` handle pointing to an open ALSA source or sink would be a "connection". Connection types are already defined in R for "file", "url", "pipe", "fifo", "socketConnection", etc. > > Is there a tutorial or an example package where a new type of connection is defined, so that I can see how to do this properly in a package? > > I can see from the R source that, for example, `do_gzfile` is defined in `connections.c` and referenced in `names.c`. However, I thought I should ask here first in case there is a better place to start, than trying to copy this code. > > I only want an object that I can use `readBin` and `writeBin` on, to read and write audio data using e.g. `snd_pcm_writei` which is part of the `alsa-lib` package. I don't think R supports user-defined connections, but probably writing readBin and writeBin equivalents specific to your library wouldn't be any harder than creating a connection. For those, you will probably want to work with an "external pointer" (see Writing R Extensions). Rcpp probably has support for these if you're working in C++. Duncan Murdoch __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Hash functions at C level
Dear R-dev, Hope you are all well. I would like to know if there is a hash function available for the R C API? I noticed that there are hash structures and functions defined in the file "unique.c". These would definitly suit my needs, however is there a way to access them at C level? Thank you for your time. Best regards Morgan [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Long vector support in data.frame
Hi All, Happy New Year! I was wondering if there is a plan at some point to support long vectors in data.frames? I understand that it would need some internal changes to lift the current limit. If there is a plan what is currently preventing it from happening? Is it time, resources? If so is there a way for people willing to help to contribute or help the R-dev team? How? I noticed that an increasing number of function are supporting long vectors in base R. Is there more functions that need to support long vectors before having long vectors support in data.frames? Thank you Best regards Morgan [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Aggregate function FR
Hi, I was wondering if it would be possible to add an argument to the aggreagte function to retain NA by categories?(default can not to in order to avoid breaking code) Please see below example: df = iris df$Species[5] = NA aggregate(`Petal.Width` ~ Species, df, sum) # does not include NA aggregate(`Petal.Width` ~ addNA(Species), df, sum) # include NA data.table and dplyr include NA by default. Python pandas has an aggreagate function inspired by base R aggregate. An option has been added to include NA. Thank you Best regards Morgan [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Questions on the R C API
Thank you for your reply Jiefei. I think in theory your solution should work. I'll have to give them a try. On Mon, 4 Nov 2019 23:41 Wang Jiefei, wrote: > Hi Morgan, > > My solutions might not be the best one(I believe it's not), but it should > work for your question. > > 1. Have you considered Rf_duplicate function? If you want to change the > value of `a` and reset it later, you have to have a duplication somewhere > for resetting it. Instead of changing the value of `a` directly, why not > changing the value of a duplicated `a`? So you do not have to reset it. > > 2. I think a pairlist behaves like a linked list(I might be wrong here and > please correct me if so). Therefore, there is no simple way to locate an > element in a pairlist. As for as I know, R defines a set of > convenient functions for you to access a limited number of elements. See > below > > ``` > #define CAR(e) ((e)->u.listsxp.carval) > #define CDR(e) ((e)->u.listsxp.cdrval) > #define CAAR(e) CAR(CAR(e)) > #define CDAR(e) CDR(CAR(e)) > #define CADR(e) CAR(CDR(e)) > #define CDDR(e) CDR(CDR(e)) > #define CDDDR(e) CDR(CDR(CDR(e))) > #define CADDR(e) CAR(CDR(CDR(e))) > #define CADDDR(e) CAR(CDR(CDR(CDR(e > #define CAD4R(e) CAR(CDR(CDR(CDR(CDR(e) > ``` > > You can use them to get first a few arguments from a pairlist. Another > solution would be converting the pairlist into a list so that you can use > the methods defined for a list to access any element. I do not know which C > function can achieve that but `as.list` at R level should be able to do > this job, you can evaluate an R function at C level and get the list > result( By calling `Rf_eval`). I think this operation is relatively low > cost because the list should only contain a set of pointers pointing to > each element. There is no object duplication(Again I might be wrong here). > So there is no way to reset a pairlist to its first element? > 3. You can get unevaluated expression at the R level before you call the C > function and pass it to your C function( by calling `substitute` function). > However, from my vague memory, the expression would be eventually evaluated > at the C level even you pass the expression to it. Therefore, I think you > can create a list of unevaluated arguments before you enter the C function, > so your C function can expect a list rather than a pairlist as its > argument. This can solve both your second and third questions. > Correct me if I am wrong but does it mean that I will have to change "..." to "list(...)" and use .Call instead of .External? Also does it mean that to avoid expression to be evaluated at the R level, I have to use "list" or "substitute"? The function "switch" in R does not use them but manage to achieve that. switch(1, "a", stop("a")) #[1] "a" It is a primitive but I don't understand how it manage to do that. Best, Morgan > Best, > Jiefei > > > On Mon, Nov 4, 2019 at 2:41 PM Morgan Morgan > wrote: > >> Hi All, >> >> I have some questions regarding the R C API. >> >> Let's assume I have a function which is defined as follows: >> >> R file: >> >> myfunc <- function(a, b, ...) .External(Cfun, a, b, ...) >> >> C file: >> >> SEXP Cfun(SEXP args) { >> args = CDR(args); >> SEXP a = CAR(args); args = CDR(args); >> SEXP b = CAR(args); args = CDR(args); >> /* continue to do something with remaining arguments in "..." using the >> same logic as above*/ >> >> return R_NilValue; >> } >> >> 1/ Let's suppose that in my c function I change the value of a inside the >> function but I want to reset it to what it was when I did SEXP a = >> CAR(args); . How can I do that? >> >> 2/Is there a method to set "args" at a specific position so I can access a >> specific value of my choice? If yes, do you have an simple example? >> >> 3/ Let's suppose now, I call the function in R. Is there a way to avoid >> the >> function to evaluate its arguments before going to the C call? Do I have >> to >> do it at the R level or can it be done at the C level? >> >> Thank you very much in advance. >> Best regards >> Morgan >> >> [[alternative HTML version deleted]] >> >> __ >> R-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >> > [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Questions on the R C API
Hi All, I have some questions regarding the R C API. Let's assume I have a function which is defined as follows: R file: myfunc <- function(a, b, ...) .External(Cfun, a, b, ...) C file: SEXP Cfun(SEXP args) { args = CDR(args); SEXP a = CAR(args); args = CDR(args); SEXP b = CAR(args); args = CDR(args); /* continue to do something with remaining arguments in "..." using the same logic as above*/ return R_NilValue; } 1/ Let's suppose that in my c function I change the value of a inside the function but I want to reset it to what it was when I did SEXP a = CAR(args); . How can I do that? 2/Is there a method to set "args" at a specific position so I can access a specific value of my choice? If yes, do you have an simple example? 3/ Let's suppose now, I call the function in R. Is there a way to avoid the function to evaluate its arguments before going to the C call? Do I have to do it at the R level or can it be done at the C level? Thank you very much in advance. Best regards Morgan [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] New matrix function
Basically the problem is to find the position of a submatrix inside a larger matrix. Here are some links describing the problem: https://stackoverflow.com/questions/10529278/fastest-way-to-find-a-m-x-n-submatrix-in-m-x-n-matrix https://stackoverflow.com/questions/16750739/find-a-matrix-in-a-big-matrix Best Morgan On Fri, 11 Oct 2019 23:36 Gabor Grothendieck, wrote: > The link you posted used the same inputs as in my example. If that is > not what you meant maybe > a different example is needed. > Regards. > > On Fri, Oct 11, 2019 at 2:39 PM Pages, Herve wrote: > > > > Has someone looked into the image processing area for this? That sounds > > a little bit too high-level for base R to me (and I would be surprised > > if any mainstream programming language had this kind of functionality > > built-in). > > > > H. > > > > On 10/11/19 03:44, Morgan Morgan wrote: > > > Hi All, > > > > > > I was looking for a function to find a small matrix inside a larger > matrix > > > in R similar to the one described in the following link: > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.mathworks.com_matlabcentral_answers_194708-2Dindex-2Da-2Dsmall-2Dmatrix-2Din-2Da-2Dlarger-2Dmatrix&d=DwICAg&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=v96tqHMO3CLNBS7KTmdshM371i6W_v8_2H5bdVy_KHo&s=9Eu0WySIEzrWuYXFhwhHETpZQzi6hHLd84DZsbZsXYY&e= > > > > > > I couldn't find anything. > > > > > > The above function can be seen as a "generalisation" of the "which" > > > function as well as the function described in the following post: > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__coolbutuseless.github.io_2018_04_03_finding-2Da-2Dlength-2Dn-2Dneedle-2Din-2Da-2Dhaystack_&d=DwICAg&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=v96tqHMO3CLNBS7KTmdshM371i6W_v8_2H5bdVy_KHo&s=qZ3SJ8t8zEDA-em4WT7gBmN66qvvCKKKXRJunoF6P3k&e= > > > > > > Would be possible to add such a function to base R? > > > > > > I am happy to work with someone from the R core team (if you wish) and > > > suggest an implementation in C. > > > > > > Thank you > > > Best regards, > > > Morgan > > > > > > [[alternative HTML version deleted]] > > > > > > __ > > > R-devel@r-project.org mailing list > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_r-2Ddevel&d=DwICAg&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=v96tqHMO3CLNBS7KTmdshM371i6W_v8_2H5bdVy_KHo&s=tyVSs9EYVBd_dmVm1LSC23GhUzbBv8ULvtsveo-COoU&e= > > > > > > > -- > > Hervé Pagès > > > > Program in Computational Biology > > Division of Public Health Sciences > > Fred Hutchinson Cancer Research Center > > 1100 Fairview Ave. N, M1-B514 > > P.O. Box 19024 > > Seattle, WA 98109-1024 > > > > E-mail: hpa...@fredhutch.org > > Phone: (206) 667-5791 > > Fax:(206) 667-1319 > > __ > > R-devel@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel > > > > -- > Statistics & Software Consulting > GKX Group, GKX Associates Inc. > tel: 1-877-GKX-GROUP > email: ggrothendieck at gmail.com > [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] New matrix function
Your answer makes much more sense to me. I will probably end up adding the function to a package. Some processes and decisions on how R is developed seems to be obscure to me. Thank you Morgan On Fri, 11 Oct 2019 15:30 Avraham Adler, wrote: > It’s rather difficult. For example, the base R Kendall tau is written with > the naive O(n^2). The much faster O(n log n) implementation was programmed > and is in the pcaPP package. When I say much faster, I mean that my > implementation in Excel VBA was faster than R for 10,000 or so pairs. > R-Core decided not to implement that code, and instead made a note about > the faster implementation living in pcaPP in the help for “cor”. See [1] > for the 2012 discussion. My point is it’s really really difficult to get > something in Base R. Develop it well, put it in a package, and you have > basically the same result. > > Avi > > [1] https://stat.ethz.ch/pipermail/r-devel/2012-June/064351.html > > On Fri, Oct 11, 2019 at 9:55 AM Morgan Morgan > wrote: > >> How do you prove usefulness of a feature? >> Do you have an example of a feature that has been added after proving to >> be >> useful in the package space first? >> >> Thank you, >> Morgan >> >> On Fri, 11 Oct 2019 13:53 Michael Lawrence, >> wrote: >> >> > Thanks for this interesting suggestion, Morgan. While there is no strict >> > criteria for base R inclusion, one criterion relevant in this case is >> that >> > the usefulness of a feature be proven in the package space first. >> > >> > Michael >> > >> > >> > On Fri, Oct 11, 2019 at 5:19 AM Morgan Morgan < >> morgan.email...@gmail.com> >> > wrote: >> > >> >> On Fri, 11 Oct 2019 10:45 Duncan Murdoch, >> >> wrote: >> >> >> >> > On 11/10/2019 6:44 a.m., Morgan Morgan wrote: >> >> > > Hi All, >> >> > > >> >> > > I was looking for a function to find a small matrix inside a larger >> >> > matrix >> >> > > in R similar to the one described in the following link: >> >> > > >> >> > > >> >> > >> >> >> https://www.mathworks.com/matlabcentral/answers/194708-index-a-small-matrix-in-a-larger-matrix >> >> > > >> >> > > I couldn't find anything. >> >> > > >> >> > > The above function can be seen as a "generalisation" of the "which" >> >> > > function as well as the function described in the following post: >> >> > > >> >> > > >> >> > >> >> >> https://coolbutuseless.github.io/2018/04/03/finding-a-length-n-needle-in-a-haystack/ >> >> > > >> >> > > Would be possible to add such a function to base R? >> >> > > >> >> > > I am happy to work with someone from the R core team (if you wish) >> and >> >> > > suggest an implementation in C. >> >> > >> >> > That seems like it would sometimes be a useful function, and maybe >> >> > someone will point out a package that already contains it. But if >> not, >> >> > why would it belong in base R? >> >> > >> >> >> >> If someone already implemented it, that would great indeed. I think it >> is >> >> a >> >> very general and basic function, hence base R could be a good place for >> >> it? >> >> >> >> But this is probably not a good reason; maybe someone from the R core >> team >> >> can shed some light on how they decide whether or not to include a >> >> function >> >> in base R? >> >> >> >> >> >> > Duncan Murdoch >> >> > >> >> >> >> [[alternative HTML version deleted]] >> >> >> >> __ >> >> R-devel@r-project.org mailing list >> >> https://stat.ethz.ch/mailman/listinfo/r-devel >> >> >> > >> > >> > -- >> > Michael Lawrence >> > Scientist, Bioinformatics and Computational Biology >> > Genentech, A Member of the Roche Group >> > Office +1 (650) 225-7760 >> > micha...@gene.com >> > >> > Join Genentech on LinkedIn | Twitter | Facebook | Instagram | YouTube >> > >> >> [[alternative HTML version deleted]] >> >> __ >> R-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >> > -- > Sent from Gmail Mobile > [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] New matrix function
I think you are confusing package and function here. Plus some of the R Core packages, that you mention, contain functions that should probably be replaced by functions with better implementation from packages on CRAN. Best regards Morgan On Fri, 11 Oct 2019 15:22 Joris Meys, wrote: > > > On Fri, Oct 11, 2019 at 3:55 PM Morgan Morgan > wrote: > >> How do you prove usefulness of a feature? >> Do you have an example of a feature that has been added after proving to >> be >> useful in the package space first? >> >> Thank you, >> Morgan >> > > The parallel package (a base package like utils, stats, ...) was added as > a drop-in replacement of the packages snow and multicore for parallel > computing. That's one example, but sure there's more. > > Kind regards > Joris > > -- > Joris Meys > Statistical consultant > > Department of Data Analysis and Mathematical Modelling > Ghent University > Coupure Links 653, B-9000 Gent (Belgium) > > <https://maps.google.com/?q=Coupure+links+653,%C2%A0B-9000+Gent,%C2%A0Belgium&entry=gmail&source=g> > > --- > Biowiskundedagen 2018-2019 > http://www.biowiskundedagen.ugent.be/ > > --- > Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php > [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] New matrix function
How do you prove usefulness of a feature? Do you have an example of a feature that has been added after proving to be useful in the package space first? Thank you, Morgan On Fri, 11 Oct 2019 13:53 Michael Lawrence, wrote: > Thanks for this interesting suggestion, Morgan. While there is no strict > criteria for base R inclusion, one criterion relevant in this case is that > the usefulness of a feature be proven in the package space first. > > Michael > > > On Fri, Oct 11, 2019 at 5:19 AM Morgan Morgan > wrote: > >> On Fri, 11 Oct 2019 10:45 Duncan Murdoch, >> wrote: >> >> > On 11/10/2019 6:44 a.m., Morgan Morgan wrote: >> > > Hi All, >> > > >> > > I was looking for a function to find a small matrix inside a larger >> > matrix >> > > in R similar to the one described in the following link: >> > > >> > > >> > >> https://www.mathworks.com/matlabcentral/answers/194708-index-a-small-matrix-in-a-larger-matrix >> > > >> > > I couldn't find anything. >> > > >> > > The above function can be seen as a "generalisation" of the "which" >> > > function as well as the function described in the following post: >> > > >> > > >> > >> https://coolbutuseless.github.io/2018/04/03/finding-a-length-n-needle-in-a-haystack/ >> > > >> > > Would be possible to add such a function to base R? >> > > >> > > I am happy to work with someone from the R core team (if you wish) and >> > > suggest an implementation in C. >> > >> > That seems like it would sometimes be a useful function, and maybe >> > someone will point out a package that already contains it. But if not, >> > why would it belong in base R? >> > >> >> If someone already implemented it, that would great indeed. I think it is >> a >> very general and basic function, hence base R could be a good place for >> it? >> >> But this is probably not a good reason; maybe someone from the R core team >> can shed some light on how they decide whether or not to include a >> function >> in base R? >> >> >> > Duncan Murdoch >> > >> >> [[alternative HTML version deleted]] >> >> __ >> R-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >> > > > -- > Michael Lawrence > Scientist, Bioinformatics and Computational Biology > Genentech, A Member of the Roche Group > Office +1 (650) 225-7760 > micha...@gene.com > > Join Genentech on LinkedIn | Twitter | Facebook | Instagram | YouTube > [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] New matrix function
On Fri, 11 Oct 2019 10:45 Duncan Murdoch, wrote: > On 11/10/2019 6:44 a.m., Morgan Morgan wrote: > > Hi All, > > > > I was looking for a function to find a small matrix inside a larger > matrix > > in R similar to the one described in the following link: > > > > > https://www.mathworks.com/matlabcentral/answers/194708-index-a-small-matrix-in-a-larger-matrix > > > > I couldn't find anything. > > > > The above function can be seen as a "generalisation" of the "which" > > function as well as the function described in the following post: > > > > > https://coolbutuseless.github.io/2018/04/03/finding-a-length-n-needle-in-a-haystack/ > > > > Would be possible to add such a function to base R? > > > > I am happy to work with someone from the R core team (if you wish) and > > suggest an implementation in C. > > That seems like it would sometimes be a useful function, and maybe > someone will point out a package that already contains it. But if not, > why would it belong in base R? > If someone already implemented it, that would great indeed. I think it is a very general and basic function, hence base R could be a good place for it? But this is probably not a good reason; maybe someone from the R core team can shed some light on how they decide whether or not to include a function in base R? > Duncan Murdoch > [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] New matrix function
Hi All, I was looking for a function to find a small matrix inside a larger matrix in R similar to the one described in the following link: https://www.mathworks.com/matlabcentral/answers/194708-index-a-small-matrix-in-a-larger-matrix I couldn't find anything. The above function can be seen as a "generalisation" of the "which" function as well as the function described in the following post: https://coolbutuseless.github.io/2018/04/03/finding-a-length-n-needle-in-a-haystack/ Would be possible to add such a function to base R? I am happy to work with someone from the R core team (if you wish) and suggest an implementation in C. Thank you Best regards, Morgan [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Evaluate part of an expression at C level
Hi, I am wondering if the below is possible? Let's assume I have the following expression: 1:10 < 5 Is there a way at the R C API level to only evaluate the 5th element (i.e 5 < 5) instead of evaluating the whole expression and then select the 5th element in the logical vector? Thank you Best regards Morgan [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Convert STRSXP or INTSXP to factor
Hi, Using the R C PAI, is there a way to convert to convert STRSXP or INTSXP to factor. The idea would be to do in C something similar to the "factor" function (example below): > letters[1:5] # [1] "a" "b" "c" "d" "e" > factor(letters[1:5]) # [1] a b c d e # Levels: a b c d e There is the function setAttrib the levels of a SXP however when returned to R the object is of type character not factor. Ideally what i would like to return from the C function is the same output as above when the input is of type character. Please let me if you need more informations. Thank you Best regards Morgan [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] R C API resize matrix
Hi, Is there a way to resize a matrix defined as follows: SEXP a = PROTECT(allocMatrix(INTSXP, 10, 2)); int *pa = INTEGER(a) To row = 5 and col = 1 or do I have to allocate a second matrix "b" with pointer *pb and do a "for" loop to transfer the value of a to b? Thank you Best regards Morgan [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Package inclusion in R core implementation
Hi, It sometimes happens that some packages get included to R like for example the parallel package. I was wondering if there is a process to decide whether or not to include a package in the core implementation of R? For example, why not include the Rcpp package, which became for a lot of user the main tool to extend R? What is our view on the (not so well known) dotCall64 package which is an interesting alternative for extending R? Thank you Best regards, Morgan [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Compiler + stopifnot bug
For what it's worth this also introduced > df = data.frame(v = package_version("1.2")) > rbind(df, df)$v [[1]] [1] 1 2 [[2]] [1] 1 2 instead of > rbind(df, df)$v [1] '1.2' '1.2' which shows up in Travis builds of Bioconductor packages https://stat.ethz.ch/pipermail/bioc-devel/2019-January/014506.html and elsewhere Martin Morgan On 1/3/19, 7:05 PM, "R-devel on behalf of Duncan Murdoch" wrote: On 03/01/2019 3:37 p.m., Duncan Murdoch wrote: > I see this too; by bisection, it seems to have first appeared in r72943. Sorry, that was a typo. I meant r75943. Duncan Murdoch > > Duncan Murdoch > > On 03/01/2019 2:18 p.m., Iñaki Ucar wrote: >> Hi, >> >> I found the following issue in r-devel (2019-01-02 r75945): >> >> `foo<-` <- function(x, value) { >> bar(x) <- value * x >> x >> } >> >> `bar<-` <- function(x, value) { >> stopifnot(all(value / x == 1)) >> x + value >> } >> >> `foo<-` <- compiler::cmpfun(`foo<-`) >> `bar<-` <- compiler::cmpfun(`bar<-`) >> >> x <- c(2, 2) >> foo(x) <- 1 >> x # should be c(4, 4) >> #> [1] 3 3 >> >> If the functions are not compiled or the stopifnot call is removed, >> the snippet works correctly. So it seems that something is messing >> around with the references to "value" when the call to stopifnot gets >> compiled, and the wrong "value" is modified. Note also that if "x <- >> 2", then the result is correct, 4. >> >> Regards, >> > __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] length of `...`
nargs() provides the number of arguments without evaluating them > f = function(x, ..., y) nargs() > f() [1] 0 > f(a=1, b=2) [1] 2 > f(1, a=1, b=2) [1] 3 > f(x=1, a=1, b=2) [1] 3 > f(stop()) [1] 1 On 05/03/2018 11:01 AM, William Dunlap via R-devel wrote: In R-3.5.0 you can use ...length(): > f <- function(..., n) ...length() > f(stop("one"), stop("two"), stop("three"), n=7) [1] 3 Prior to that substitute() is the way to go > g <- function(..., n) length(substitute(...())) > g(stop("one"), stop("two"), stop("three"), n=7) [1] 3 R-3.5.0 also has the ...elt(n) function, which returns the evaluated n'th entry in ... , without evaluating the other ... entries. > fn <- function(..., n) ...elt(n) > fn(stop("one"), 3*5, stop("three"), n=2) [1] 15 Prior to 3.5.0, eval the appropriate component of the output of substitute() in the appropriate environment: > gn <- function(..., n) { + nthExpr <- substitute(...())[[n]] + eval(nthExpr, envir=parent.frame()) + } > gn(stop("one"), environment(), stop("two"), n=2) Bill Dunlap TIBCO Software wdunlap tibco.com On Thu, May 3, 2018 at 7:29 AM, Dénes Tóth wrote: Hi, In some cases the number of arguments passed as ... must be determined inside a function, without evaluating the arguments themselves. I use the following construct: dotlength <- function(...) length(substitute(expression(...))) - 1L # Usage (returns 3): dotlength(1, 4, something = undefined) How can I define a method for length() which could be called directly on `...`? Or is it an intention to extend the base length() function to accept ellipses? Regards, Denes __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel This email message may contain legally privileged and/or...{{dropped:2}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] download.file does not process gz files correctly (truncates them?)
On 05/03/2018 05:48 AM, Joris Meys wrote: Dear all, I've been diving a bit deeper into this per request of Tomas Kalibra, and found the following : - the lock on the file is only after trying to read it using oligo, so that's not a R problem in itself. The problem is independent of extrenal packages. - using Windows' fc utility and cygwin's cmp utility I found out that every so often the download.file() function inserts an extra byte. There's no real obvious pattern in how these bytes are added, but the file downloaded using download.file() is actually larger (in this case by about 8 kb). The file xxx_inR.CEL.gz is read in using: I believe the difference in mode = "w" vs "wb", and the reason this is restricted to Windows downloads, is due to the difference in text file line endings, where with mode="w", download.file (and many other utilities outside R) recognize the "foo\n" as "foo\r\n". Obviously this messes up binary files. I guess in the CEL.gz file there are about 8k "\n" characters. Henrik's suggestion (default = "wb") would introduce the complementary problem -- text files would have incorrect line endings. Martin setwd("E:/Temp/genexpr/Compare") id <- "GSM907854" flink <- paste0(" https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSM907854&format=file&file=GSM907854%2ECEL%2Egz ") fname <- paste0(id,"_inR.CEL.gz") download.file(flink, destfile = fname) The file xxx_direct.CEL.gz is downloaded from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907854 (download link at the bottom of the page). Output of dir in CMD: 05/03/2018 11:02 AM 4,529,547 GSM907854_direct.CEL.gz 05/03/2018 11:17 AM 4,537,668 GSM907854_inR.CEL.gz or from R : diff(file.size(dir())) # contains both CEL files. [1] 8121 Strangely enough I get the following message from download.file() : Content type 'application/octet-stream' length 4529547 bytes (4.3 MB) downloaded 4.3 MB So the reported length is exactly the same as if I would download the file directly, but the file on disk itself is larger. So it seems download.file() is adding bytes when saving the data on disk. This behaviour is independent of antivirus and/or firewalls turned on or off. Also keep in mind that these are NOT standard gzipped files. These files are a specific format for Affymetrix Human Gene 1.0 ST Arrays. If I need to run other tests, please let me know. Kind regards Joris On Wed, May 2, 2018 at 9:21 PM, Joris Meys wrote: Dear all, I've noticed by trying to download gz files from here : https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907811 At the bottom one can download GSM907811.CEL.gz . If I download this manually and try oligo::read.celfiles("GSM907811.CEL.gz") everything works fine. (oligo is a bioConductor package) However, if I download using download.file("https://www.ncbi.nlm.nih.gov/geo/download/ ?acc=GSM907811&format=file&file=GSM907811%2ECEL%2Egz", destfile = "GSM907811.CEL.gz") The file is downloaded, but oligo::read.celfiles() returns the following error: Error in checkChipTypes(filenames, verbose, "affymetrix", TRUE) : End of gz file reached unexpectedly. Perhaps this file is truncated. Moreover, if I try to delete it after using download.file(), I get a warning that permission is denied. I can only remove it using Windows file explorer after I closed the R session, indicating that the connection is still open. Yet, showConnections() doesn't show any open connections either. Session info below. Note that I started from a completely fresh R session. oligo is needed due to the specific file format of these gz files. They're not standard tarred files. Cheers Joris Session Info - R version 3.5.0 (2018-04-23) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows >= 8 x64 (build 9200) Matrix products: default locale: [1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C [5] LC_TIME=English_United Kingdom.1252 attached base packages: [1] stats4parallel stats graphics grDevices utils datasets methods [9] base other attached packages: [1] pd.hugene.1.0.st.v1_3.14.1 DBI_0.8 oligo_1.44.0 [4] Biobase_2.39.2 oligoClasses_1.42.0 RSQLite_2.1.0 [7] Biostrings_2.48.0 XVector_0.19.9 IRanges_2.13.28 [10] S4Vectors_0.17.42 BiocGenerics_0.25.3 loaded via a namespace (and not attached): [1] Rcpp_0.12.16compiler_3.5.0 [3] BiocInstaller_1.30.0GenomeInfoDb_1.15.5 [5] bitops_1.0-6iterators_1.0.9 [7] tools_3.5.0 zlibbioc_1.25.0 [9] digest_0.6.15 bit_1.1-12 [11] memoise_1.1.0 preprocessCore_1.41.0 [13] lattice_0.20-35 ff_2.2-13 [15] pkgconfig_2.0.1 Matrix_1.2-14 [17] foreach_1.4.4 DelayedArray_0.5.31
Re: [Rd] download.file does not process gz files correctly (truncates them?)
On 05/02/2018 03:21 PM, Joris Meys wrote: Dear all, I've noticed by trying to download gz files from here : https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907811 At the bottom one can download GSM907811.CEL.gz . If I download this manually and try oligo::read.celfiles("GSM907811.CEL.gz") everything works fine. (oligo is a bioConductor package) However, if I download using download.file(" https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSM907811&format=file&file=GSM907811%2ECEL%2Egz ", destfile = "GSM907811.CEL.gz") On windows, the 'mode' argument to download.file() needs to be "wb" (write binary) for binary files. Martin The file is downloaded, but oligo::read.celfiles() returns the following error: Error in checkChipTypes(filenames, verbose, "affymetrix", TRUE) : End of gz file reached unexpectedly. Perhaps this file is truncated. Moreover, if I try to delete it after using download.file(), I get a warning that permission is denied. I can only remove it using Windows file explorer after I closed the R session, indicating that the connection is still open. Yet, showConnections() doesn't show any open connections either. Session info below. Note that I started from a completely fresh R session. oligo is needed due to the specific file format of these gz files. They're not standard tarred files. Cheers Joris Session Info - R version 3.5.0 (2018-04-23) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows >= 8 x64 (build 9200) Matrix products: default locale: [1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C [5] LC_TIME=English_United Kingdom.1252 attached base packages: [1] stats4parallel stats graphics grDevices utils datasets methods [9] base other attached packages: [1] pd.hugene.1.0.st.v1_3.14.1 DBI_0.8 oligo_1.44.0 [4] Biobase_2.39.2 oligoClasses_1.42.0 RSQLite_2.1.0 [7] Biostrings_2.48.0 XVector_0.19.9 IRanges_2.13.28 [10] S4Vectors_0.17.42 BiocGenerics_0.25.3 loaded via a namespace (and not attached): [1] Rcpp_0.12.16compiler_3.5.0 [3] BiocInstaller_1.30.0GenomeInfoDb_1.15.5 [5] bitops_1.0-6iterators_1.0.9 [7] tools_3.5.0 zlibbioc_1.25.0 [9] digest_0.6.15 bit_1.1-12 [11] memoise_1.1.0 preprocessCore_1.41.0 [13] lattice_0.20-35 ff_2.2-13 [15] pkgconfig_2.0.1 Matrix_1.2-14 [17] foreach_1.4.4 DelayedArray_0.5.31 [19] yaml_2.1.18 GenomeInfoDbData_1.1.0 [21] affxparser_1.52.0 bit64_0.9-7 [23] grid_3.5.0 BiocParallel_1.13.3 [25] blob_1.1.1 codetools_0.2-15 [27] matrixStats_0.53.1 GenomicRanges_1.31.23 [29] splines_3.5.0 SummarizedExperiment_1.9.17 [31] RCurl_1.95-4.10 affyio_1.49.2 This email message may contain legally privileged and/or...{{dropped:2}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Why R should never move to git
On 01/25/2018 07:09 AM, Duncan Murdoch wrote: On 25/01/2018 6:49 AM, Dirk Eddelbuettel wrote: On 25 January 2018 at 06:20, Duncan Murdoch wrote: | On 25/01/2018 2:57 AM, Iñaki Úcar wrote: | > For what it's worth, this is my workflow: | > | > 1. Get a fork. | > 2. From the master branch, create a new branch called fix-[something]. | > 3. Put together the stuff there, commit, push and open a PR. | > 4. Checkout master and repeat from 2 to submit another patch. | > | > Sometimes, I forget the step of creating the new branch and I put my | > fix on top of the master branch, which complicates things a bit. But | > you can always rename your fork's master and pull it again from | > upstream. | | I saw no way to follow your renaming suggestion. Can you tell me the | steps it would take? Remember, there's already a PR from the master | branch on my fork. (This is for future reference; I already followed | Gabor's more complicated instructions and have solved the immediate | problem.) 1) Via GUI: fork or clone at github so that you have URL to use in 2) Github would not allow me to fork, because I already had a fork of the same repository. I suppose I could have set up a new user and done it. I don't know if cloning the original would have made a difference. I don't have permission to commit to the original, and the manipulateWidget maintainers wouldn't be able to see my private clone, so I don't see how I could create a PR that they could use. Once again, let me repeat: this should be an easy thing to do. So far I'm pretty convinced that it's actually impossible to do it on the Github website without hacks like creating a new user. It's not trivial but not that difficult for a git expert using command line git. If R Core chose to switch the R sources to use git and used Github to host a copy, problems like mine would come up fairly regularly. I don't think R Core would gain enough from the switch to compensate for the burden of dealing with these problems. A different starting point gives R-core members write access to the R-core git, which is analogous to the current svn setup. A restricted set of commands are needed, mimicking svn git clone ... # svn co git pull# svn up [...; git commit ...] git push ...# svn ci Probably this would mature quickly into a better practice where new features / bug fixes are developed on a local branch. A subset of R-core might participate in managing pull requests on a 'read only' Github mirror. Incorporating mature patches would involve git, rather than the Github GUI. In one's local repository, create a new branch and pull from the repository making the request git checkout -b a-pull-request master git pull https://github.com/a-user/their.git their-branch Check and modify, then merge locally and push to the R-core git ## identify standard / best practice for merging branches git checkout master git merge ... a-pull-request git push ... Creating pull requests is a problem for the developer wanting to contribute to R, not for the R-core developer. As we've seen in this thread, R-core would not need to feel responsible for helping developers create pull requests. Martin Morgan Maybe Gitlab or some other front end would be better. Duncan Murdoch 2) Run git clone giturl to fetch local instance 3) Run git checkout -b feature/new_thing_a (this is 2. above by Inaki) 4) Edit, save, compile, test, revise, ... leading to 1 or more commits 5) Run git push origin standard configuration should have remote branch follow local branch, I think the "long form" is git push --set-upstream origin feature/new_thing_a 6) Run git checkout - or git checkout master and you are back in master. Now you can restart at my 3) above for branches b, c, d and create independent pull requests I find it really to have a bash prompt that shows the branch: edd@rob:~$ cd git/rcpp edd@rob:~/git/rcpp(master)$ git checkout -b feature/new_branch_to_show Switched to a new branch 'feature/new_branch_to_show' edd@rob:~/git/rcpp(feature/new_branch_to_show)$ git checkout - Switched to branch 'master' Your branch is up-to-date with 'origin/master'. edd@rob:~/git/rcpp(master)$ git branch -d feature/new_branch_to_show Deleted branch feature/new_branch_to_show (was 5b25fe62). edd@rob:~/git/rcpp(master)$ There are few tutorials out there about how to do it, I once got mine from Karthik when we did a Software Carpentry workshop. Happy to detail off-list, it adds less than 10 lines to ~/.bashrc. Dirk | | Duncan Murdoch | | > Iñaki | > | > | > | > 2018-01-25 0:17 GMT+01:00 Duncan Murdoch : | >> Lately I've been doing so
Re: [Rd] How to address the following: CRAN packages not using Suggests conditionally
On 01/22/2018 08:40 AM, Ulrich Bodenhofer wrote: Thanks a lot, Iñaki, this is a perfect solution! I already implemented it and it works great. I'll wait for 2 more days before I submit the revised package to CRAN - in order to give others to comment on it. It's very easy for 'pictures of code' (unevaluated code chunks in vignettes) to drift from the actual implementation. So I'd really encourage your conditional evaluation to be as narrow as possible -- during CRAN or even CRAN fedora checks. Certainly trying to use uninstalled Suggest'ed packages in vignettes should provide an error message that is informative to users. Presumably the developer or user intends actually to execute the code, and needs to struggle through whatever issues come up. I'm not sure whether my comments are consistent with Writing R Extensions or not. There is a fundamental tension between the CRAN and Bioconductor release models. The Bioconductor 'devel' package repositories and nightly builds are meant to be a place where new features and breaking changes can be introduced and problems resolved before being exposed to general users as a stable 'release' branch, once every six months. This means that the Bioconductor devel branch periodically (as recently and I suspect over the next several days) contains considerable carnage that propagates to CRAN devel builds, creating additional work for CRAN maintainers. Martin Morgan Bioconductor Best regards, Ulrich On 01/22/2018 10:16 AM, Iñaki Úcar wrote: Re-sending, since I forgot to include the list, sorry. I'm including r-package-devel too this time, as it seems more appropriate for this list. El 22 ene. 2018 10:11, "Iñaki Úcar" <mailto:i.uca...@gmail.com>> escribió: El 22 ene. 2018 8:12, "Ulrich Bodenhofer" mailto:bodenho...@bioinf.jku.at>> escribió: Dear colleagues, dear members of the R Core Team, This was an issue raised by Prof. Brian Ripley and sent privately to all developers of CRAN packages that suggest Bioconductor packages (see original message below). As mentioned in my message enclosed below, it was easy for me to fix the error in examples (new version not submitted to CRAN yet), but it might turn into a major effort for the warnings raised by the package vignette. Since I have not gotten any advice yet, I take the liberty to post it here on this list - hoping that we reach a conclusion here how to deal with this matter. Just disable code chunk evaluation if suggested packages are missing (see [1]). As explained by Prof. Ripley, it will only affect Fedora checks on r-devel, i.e., your users will still see fully evaluated vignettes on CRAN. [1] https://www.enchufa2.es/archives/suggests-and-vignettes.html <https://www.enchufa2.es/archives/suggests-and-vignettes.html> Iñaki Thanks in advance for your kind assistance, Ulrich Bodenhofer Forwarded Message Subject: Re: CRAN packages not using Suggests conditionally Date: Mon, 15 Jan 2018 08:44:40 +0100 From: Ulrich Bodenhofer mailto:bodenho...@bioinf.jku.at>> To: Prof Brian Ripley mailto:rip...@stats.ox.ac.uk>> CC: [...stripped for the sake of privacy ...] Dear Prof. Ripley, Thank you very much for bringing this important issue to my attention. I am the maintainer of the 'apcluster' package. My package refers to 'Biostrings' in an example section of a help page (a quite insignificant one, by the way), which creates errors on some platforms. It also refers to 'kebabs' in the package vignette, which leads to warnings. I could fix the first, more severe, problem quite easily, (1) since it is relatively easy to wrap an entire examples section in a conditional, and (2), as I have mentioned, it is not a particularly important help page. Regarding the vignette, I want to ask for your advice now, since the situation appears more complicated to me. While it is, of course, only one code chunk that loads the 'kebabs' package, five more code chunks depend on the package (more specifically, the data objects created by a method implemented in the package) - with quite some text in between. So the handling of the conditional loading of the package would propagate to multiple code chunks and also affect the validity of the explanations in between. I would see the following options: 1. Remove the entire section of the vignette. That would be a pity, sin
Re: [Rd] R CMD check warning about compiler warning flags
On 12/21/2017 01:02 PM, Winston Chang wrote: On recent builds of R-devel, R CMD check gives a WARNING when some compiler warning flags are detected, such as -Werror, because they are non-portable. This appears to have been added in this commit: https://github.com/wch/r-source/commit/2e80059 That is not the canonical R sources. Yes, that is obvious. The main page for that repository says it is a mirror of the R sources, right at the top. I know that because I put the message there, and because I see it every time I visit the repository. If you have a good way of pointing people to the changes made in a commit with the canonical R sources, please let us know. I and many others would be happy to use it. In case 'pointing to' is not to mean exclusively 'pointing a mouse at', 'a good way' can include typing at the console and living with the merits and demerits of svn, and the question is not rhetorical(probably FALSE on all accounts, but one never knows...) Check out or update the source (linux, mac, or Windows) svn co https://svn.r-project.org/R/trunk R-devel cd R-devel svn up browse the commit history svn log | less and review the change svn diff -c73909 Restrict by specifying a path svn diff -c73909 src/library/tools/R/check.R (I don't think one gets finer resolution, other than referencing the line number in the diff) View a range of revisions, e.g., svn diff -r73908:73909 And find commits associated with lines of code svn annotate doc/manual/R-exts.texi | less A quick google search (svn diff visual display) lead me to svn diff --diff-cmd meld -c73909 for my platform, which pops up the diffs in a visual context. Martin Morgan And your description seems wrong: there is now an _optional_ check controlled by an environment variable, primarily for CRAN checks. The check is "optional", but not for packages submitted to CRAN. I'm working on a package where these compiler warning flags are present in a Makefile generated by a configure script -- that is, the configure script detects whether the compiler supports these flags, and if so, puts them in the Makefile. (The configure script is for a third-party C library which is in a subdirectory of src/.) Because the flags are added only if the system supports them, there shouldn't be any worries about portability in practice. Please read the explanation in the manual: there are serious concerns about such flags which have bitten CRAN users several times. To take your example, you cannot know what -Werror does on all compilers (past, present or future) where it is supported (and -W flags do do different things on different compilers). On current gcc it does -Werror Make all warnings into errors. and so its effect depends on what other flags are used (people typically use -Wall, and most new versions of both gcc and clang add more warnings to -Wall -- I read this week exactly such a discussion about the interaction of -Werror with -Wtautological-constant-compare as part of -Wall in clang trunk). Is there a way to get R CMD check to not raise warnings in cases like this? I know I could modify the C library's configure.ac (which is used to generate the configure script) but I'd prefer to leave the library's code untouched if possible. You don't need to (and most likely should not) use the C[XX]FLAGS it generates ... just use the flags which R passes to the package to use. It turns out that there isn't even a risk of these compiler flags being used -- I learned from of my colleagues that the troublesome compiler flags, like -Werror, never actually appear in the Makefile. The configure script prints out those compiler flags out when it checks for them, but in the end it creates a Makefile with the CFLAGS inherited from R. So there's no chance that the library would be compiled using those flags (unless R passed them along). His suggested workaround is to silence the output of the configure script. That also hides some useful information, but it does work for this issue. -Winston __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel This email message may contain legally privileged and/or...{{dropped:2}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Are Rprintf and REprintf thread-safe?
On 11/21/2017 04:12 PM, Winston Chang wrote: Thanks - I'll find another way to send messages to the main thread for printing. The CRAN synchronicity and Bioconductor BiocParallel packages provide inter-process locks that you could use to surround writes (instead of sending message to the main thread), also easy enough to incorporate at the C level using the BH package as source for relevant boost header. Martin -Winston On Tue, Nov 21, 2017 at 12:42 PM, wrote: On Tue, 21 Nov 2017, Winston Chang wrote: Is it safe to call Rprintf and REprintf from a background thread? I'm working on a package that makes calls to fprintf(stderr, ...) on a background thread when errors happen, but when I run R CMD check, it says: Compiled code should not call entry points which might terminate R nor write to stdout/stderr instead of to the console, nor the system RNG. Is it safe to replace these calls with REprintf()? Only if you enjoy race conditions or segfaults. Rprintf and REprintf are not thread-safe. Best, luke -Winston __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Luke Tierney Ralph E. Wareham Professor of Mathematical Sciences University of Iowa Phone: 319-335-3386 Department of Statistics andFax: 319-335-3017 Actuarial Science 241 Schaeffer Hall email: luke-tier...@uiowa.edu Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel This email message may contain legally privileged and/or...{{dropped:2}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] [PATCH] Fix bad free in connections
On 07/20/2017 05:04 PM, Steve Grubb wrote: Hello, There are times when b points to buf which is a stack variable. This leads to a bad free. The current test actually guarantees the stack will try to get freed. Simplest to just drop the variable and directly test if b should get freed. Signed-off-by: Steve Grubb Index: src/main/connections.c === --- src/main/connections.c (revision 72935) +++ src/main/connections.c (working copy) @@ -421,7 +421,6 @@ char buf[BUFSIZE], *b = buf; int res; const void *vmax = NULL; /* -Wall*/ -int usedVasprintf = FALSE; va_list aq; va_copy(aq, ap); @@ -434,7 +433,7 @@ b = buf; buf[BUFSIZE-1] = '\0'; warning(_("printing of extremely long output is truncated")); - } else usedVasprintf = TRUE; + } } #else if(res >= BUFSIZE) { /* res is the desired output length */ @@ -481,7 +480,7 @@ } else con->write(b, 1, res, con); if(vmax) vmaxset(vmax); -if(usedVasprintf) free(b); +if(b != buf) free(b); The code can be exercised with z = paste(rep("a", 11000), collapse="") f = fifo("foo", "w+") writeLines(z, f) If the macro HAVE_VASPRINTF is not defined, then b is the result of R_alloc(), and it is not appropriate to free(b). If the macro is defined we go through res = vasprintf(&b, format, ap); if (res < 0) { b = buf; buf[BUFSIZE-1] = '\0'; warning(_("printing of extremely long output is truncated")); } else usedVasprintf = TRUE; b gets reallocated when res = vasprintf(&b, format, ap); is successful and res >= 0. usedVasprintf is then set to TRUE, and free(b) called. It seems like the code is correct as written? Martin Morgan (the real other Martin M*) return res; } __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel This email message may contain legally privileged and/or...{{dropped:2}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] [PATCH] Fix status in main
On 07/20/2017 05:31 PM, Steve Grubb wrote: Hello, This is a patch to fix what appears to be a simple typo. The warning says "invalid status assuming 0", but then instead sets runLast to 0. Signed-of-by: Steve Grubb fixed in 72938 / 39. This seemed not to have consequence, since exit() reports NA & 0377 (i.e., 0) and the incorrect assignment to runLast is immediately over-written by the correct value. Martin Morgan Index: src/main/main.c === --- src/main/main.c (revision 72935) +++ src/main/main.c (working copy) @@ -1341,7 +1341,7 @@ status = asInteger(CADR(args)); if (status == NA_INTEGER) { warning(_("invalid 'status', 0 assumed")); - runLast = 0; + status = 0; } runLast = asLogical(CADDR(args)); if (runLast == NA_LOGICAL) { __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel This email message may contain legally privileged and/or...{{dropped:2}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] [PATCH] Fix missing break
On 07/20/2017 05:02 PM, Steve Grubb wrote: Hello, There appears to be a break missing in the switch/case for the LISTSXP case. If this is supposed to fall through, I'd suggest a comment so that others know its by design. Signed-off-by: Steve Grubb An example is $ R --vanilla -e "pl = pairlist(1, 2); length(pl) = 1; pl" > pl = pairlist(1, 2); length(pl) = 1; pl Error in length(pl) = 1 : SET_VECTOR_ELT() can only be applied to a 'list', not a 'pairlist' Execution halted fixed in r72936 (R-devel) / 72937 (R-3-4-branch). Martin Morgan Index: src/main/builtin.c === --- src/main/builtin.c (revision 72935) +++ src/main/builtin.c (working copy) @@ -888,6 +888,7 @@ SETCAR(t, CAR(x)); SET_TAG(t, TAG(x)); } + break; case VECSXP: for (i = 0; i < len; i++) if (i < lenx) { __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel This email message may contain legally privileged and/or...{{dropped:2}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Question about R developpment
Thank you all for these explanations. Kind regards, Morgan On 11 Jun 2017 02:47, "Duncan Murdoch" wrote: > On 10/06/2017 6:09 PM, Duncan Murdoch wrote: > >> On 10/06/2017 2:38 PM, Morgan wrote: >> >>> Hi, >>> >>> I had a question that might not seem obvious to me. >>> >>> I was wondering why there was no patnership between microsoft the R core >>> team and eventually other developpers to improve R in one unified version >>> instead of having different teams developping their own version of R. >>> >> >> As far as I know, there's only one version of R currently being >> developed. Microsoft doesn't offer anything different; they just offer >> a build of a slightly older version of base R, and a few packages that >> are not in the base version. >> > > Actually, I think my first sentence above is wrong. Besides the base R > that the core R team works on, there are a few other implementations of the > language: pqR, for instance. But as others have said, the Microsoft > product is simply a repackaging of the core R, so my second sentence is > right. > > Duncan Murdoch > > [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Question about R developpment
Hi, I had a question that might not seem obvious to me. I was wondering why there was no patnership between microsoft the R core team and eventually other developpers to improve R in one unified version instead of having different teams developping their own version of R. Is it because they don't want to team up? Is it because you don't want? Any particular reasons? Different philosophies? Thank you Kind regards Morgan [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R libcurl does not recognize server certs
On 03/27/2017 03:09 PM, Roman, John wrote: Dirk, ive changed the subject given the nature of the present debugging. Im aware i can extend extras from download.file to install.packages however im curious to know why libcurl in the R invocation does not honor the CA bundle on my system. how would I pass a CA bundle to install.packages? the function has numerous arguments before the extras are taken. A little shot-in-the-dark but on Linux I have $ curl-config --ca /etc/ssl/certs/ca-certificates.crt and in R ?download.file I'm told (the documentation may read as window-specific, but I don't think that's the case) set environment variable 'CURL_CA_BUNDLE' to the path to a certificate bundle file, usually named 'ca-bundle.crt' or 'curl-ca-bundle.crt'. (This is normally done for a binary So if I were having trouble I might say (or set the environment variable in some other way, e.g., as part of an alias to R) > Sys.setenv(CURL_CA_BUNDLE="/etc/ssl/certs/ca-certificates.crt") > download.file("https://, tempfile()) Maybe with more info about your OS and R installation a more transparent solution would offer itself; I'd guess that the bundle location is inferred when R is built from source, and somehow there has been a disconnect between your R installation and certificate location, e.g., moving the certificate location after R installation. Martin Morgan John Roman Linux System Administrator RAND Corporation joro...@rand.org X7302 From: Dirk Eddelbuettel [dirk.eddelbuet...@gmail.com] on behalf of Dirk Eddelbuettel [e...@debian.org] Sent: Monday, March 27, 2017 11:33 AM To: Roman, John Cc: Dirk Eddelbuettel; R-devel@r-project.org Subject: RE: [Rd] R fails to read repo index on NGINX On 27 March 2017 at 18:27, Roman, John wrote: | Thank you for your elaboration. This issue is related to curl trusting a CA cert as its called by R. | curl called from bash recognizes the system cert bundle for CA's, curl called from R does not. | | may I know how to trust the system certificate bundle from within R? See 'help(download.file)' -- it's a little hidden but you can just make the external curl (which, as you say, works in your particular circumstances) the default for remote file access from R too. Next time please try to be a little more specific with your questions and their subject line. Methinks nothing here has anything to do with the httpd server you employ. Dirk -- http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org __ This email message is for the sole use of the intended...{{dropped:10}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] different compilers and mzR build fails
On 12/21/2016 01:56 PM, lejeczek wrote: I do this on a vanilla-clean R installation, simply: biocLite("mzR") it pulls some deps in which compile fine, only mzR fails. ... meanwhile... I grabbed devtools and comiled github master - still fails. Should I attach build log? One should not send attachments to the list.. I don't suppose? My opinion is that the appropriate forum is the Bioconductor support site. I think you should EDIT your question on the Bioconductor support site to add the compiler output. If you feel like you can spot where things are going wrong, then edited to include those parts otherwise post the output in its entirety; the support site can mangle formatting, so I'd copy-and-paste the compiler output, and then select it and format it as 'code'. If you feel that the current forum is more appropriate, then cut-and-paste the compiler output into an email message, avoding attachments. Martin On 21/12/16 17:06, Martin Morgan wrote: mzR is a Bioconductor package, so better to ask on the Bioconductor support forum https://support.bioconductor.org Oh, I see you did, and then the advice is to avoid cross-posting! The missing .o files would have been produced in an earlier compilation step; they likely failed in some way, so you need to provide the complete compilation output. Did you do this on a version of the package that did not have any previous build artifacts (e.g., via biocLite() or from a fresh svn checkout)? Martin On 12/21/2016 12:00 PM, lejeczek via R-devel wrote: I'm not sure if I should bother you team with this, apologies in case it's a bother. I'm trying gcc 6.2.1 (from devtoolset-6) with R, everything seems to work just fine, except for mzR. Here is failed build: g++ -m64 -shared -L/usr/lib64/R/lib -Wl,-z,relro -o mzR.so cramp.o ramp_base64.o ramp.o RcppRamp.o RcppRampModule.o rnetCDF.o RcppPwiz.o RcppPwizModule.o RcppIdent.o RcppIdentModule.o ./boost/system/src/error_code.o ./boost/regex/src/posix_api.o ./boost/regex/src/fileiter.o ./boost/regex/src/regex_raw_buffer.o ./boost/regex/src/cregex.o ./boost/regex/src/regex_debug.o ./boost/regex/src/instances.o ./boost/regex/src/icu.o ./boost/regex/src/usinstances.o ./boost/regex/src/regex.o ./boost/regex/src/wide_posix_api.o ./boost/regex/src/regex_traits_defaults.o ./boost/regex/src/winstances.o ./boost/regex/src/wc_regex_traits.o ./boost/regex/src/c_regex_traits.o ./boost/regex/src/cpp_regex_traits.o ./boost/regex/src/static_mutex.o ./boost/regex/src/w32_regex_traits.o ./boost/iostreams/src/zlib.o ./boost/iostreams/src/file_descriptor.o ./boost/thread/pthread/once.o ./boost/thread/pthread/thread.o ./boost/filesystem/src/operations.o ./boost/filesystem/src/path.o ./boost/filesystem/src/utf8_codecvt_facet.o ./boost/chrono/src/chrono.o ./boost/chrono/src/process_cpu_clocks.o ./boost/chrono/src/thread_clock.o ./pwiz/data/msdata/Version.o ./pwiz/data/common/MemoryIndex.o ./pwiz/data/common/CVTranslator.o ./pwiz/data/common/cv.o ./pwiz/data/common/ParamTypes.o ./pwiz/data/common/BinaryIndexStream.o ./pwiz/data/common/diff_std.o ./pwiz/data/common/Unimod.o ./pwiz/data/msdata/SpectrumList_MGF.o ./pwiz/data/msdata/DefaultReaderList.o ./pwiz/data/msdata/ChromatogramList_mzML.o ./pwiz/data/msdata/examples.o ./pwiz/data/msdata/Serializer_mzML.o ./pwiz/data/msdata/Serializer_MSn.o ./pwiz/data/msdata/Reader.o ./pwiz/data/msdata/Serializer_MGF.o ./pwiz/data/msdata/Serializer_mzXML.o ./pwiz/data/msdata/SpectrumList_mzML.o ./pwiz/data/msdata/SpectrumList_MSn.o ./pwiz/data/msdata/BinaryDataEncoder.o ./pwiz/data/msdata/Diff.o ./pwiz/data/msdata/MSData.o ./pwiz/data/msdata/References.o ./pwiz/data/msdata/SpectrumList_mzXML.o ./pwiz/data/msdata/IO.o ./pwiz/data/msdata/SpectrumList_BTDX.o ./pwiz/data/msdata/SpectrumInfo.o ./pwiz/data/msdata/RAMPAdapter.o ./pwiz/data/msdata/LegacyAdapter.o ./pwiz/data/msdata/SpectrumIterator.o ./pwiz/data/msdata/MSDataFile.o ./pwiz/data/msdata/MSNumpress.o ./pwiz/data/msdata/SpectrumListCache.o ./pwiz/data/msdata/Index_mzML.o ./pwiz/data/msdata/SpectrumWorkerThreads.o ./pwiz/data/identdata/IdentDataFile.o ./pwiz/data/identdata/IdentData.o ./pwiz/data/identdata/DefaultReaderList.o ./pwiz/data/identdata/Reader.o ./pwiz/data/identdata/Serializer_protXML.o ./pwiz/data/identdata/Serializer_pepXML.o ./pwiz/data/identdata/Serializer_mzid.o ./pwiz/data/identdata/IO.o ./pwiz/data/identdata/References.o ./pwiz/data/identdata/MascotReader.o ./pwiz/data/proteome/Modification.o ./pwiz/data/proteome/Digestion.o ./pwiz/data/proteome/Peptide.o ./pwiz/data/proteome/AminoAcid.o ./pwiz/utility/minimxml/XMLWriter.o ./pwiz/utility/minimxml/SAXParser.o ./pwiz/utility/chemistry/Chemistry.o ./pwiz/utility/chemistry/ChemistryData.o ./pwiz/utility/chemistry/MZTolerance.o ./pwiz/utility/misc/IntegerSet.o ./pwiz/utility/misc/Base64.o ./pwiz/utility/misc/IterationListener.o ./pwiz/utility/misc/MSIHandler.o ./pwiz/utility/misc/Filesystem.o ./pwiz
Re: [Rd] different compilers and mzR build fails
mzR is a Bioconductor package, so better to ask on the Bioconductor support forum https://support.bioconductor.org Oh, I see you did, and then the advice is to avoid cross-posting! The missing .o files would have been produced in an earlier compilation step; they likely failed in some way, so you need to provide the complete compilation output. Did you do this on a version of the package that did not have any previous build artifacts (e.g., via biocLite() or from a fresh svn checkout)? Martin On 12/21/2016 12:00 PM, lejeczek via R-devel wrote: I'm not sure if I should bother you team with this, apologies in case it's a bother. I'm trying gcc 6.2.1 (from devtoolset-6) with R, everything seems to work just fine, except for mzR. Here is failed build: g++ -m64 -shared -L/usr/lib64/R/lib -Wl,-z,relro -o mzR.so cramp.o ramp_base64.o ramp.o RcppRamp.o RcppRampModule.o rnetCDF.o RcppPwiz.o RcppPwizModule.o RcppIdent.o RcppIdentModule.o ./boost/system/src/error_code.o ./boost/regex/src/posix_api.o ./boost/regex/src/fileiter.o ./boost/regex/src/regex_raw_buffer.o ./boost/regex/src/cregex.o ./boost/regex/src/regex_debug.o ./boost/regex/src/instances.o ./boost/regex/src/icu.o ./boost/regex/src/usinstances.o ./boost/regex/src/regex.o ./boost/regex/src/wide_posix_api.o ./boost/regex/src/regex_traits_defaults.o ./boost/regex/src/winstances.o ./boost/regex/src/wc_regex_traits.o ./boost/regex/src/c_regex_traits.o ./boost/regex/src/cpp_regex_traits.o ./boost/regex/src/static_mutex.o ./boost/regex/src/w32_regex_traits.o ./boost/iostreams/src/zlib.o ./boost/iostreams/src/file_descriptor.o ./boost/thread/pthread/once.o ./boost/thread/pthread/thread.o ./boost/filesystem/src/operations.o ./boost/filesystem/src/path.o ./boost/filesystem/src/utf8_codecvt_facet.o ./boost/chrono/src/chrono.o ./boost/chrono/src/process_cpu_clocks.o ./boost/chrono/src/thread_clock.o ./pwiz/data/msdata/Version.o ./pwiz/data/common/MemoryIndex.o ./pwiz/data/common/CVTranslator.o ./pwiz/data/common/cv.o ./pwiz/data/common/ParamTypes.o ./pwiz/data/common/BinaryIndexStream.o ./pwiz/data/common/diff_std.o ./pwiz/data/common/Unimod.o ./pwiz/data/msdata/SpectrumList_MGF.o ./pwiz/data/msdata/DefaultReaderList.o ./pwiz/data/msdata/ChromatogramList_mzML.o ./pwiz/data/msdata/examples.o ./pwiz/data/msdata/Serializer_mzML.o ./pwiz/data/msdata/Serializer_MSn.o ./pwiz/data/msdata/Reader.o ./pwiz/data/msdata/Serializer_MGF.o ./pwiz/data/msdata/Serializer_mzXML.o ./pwiz/data/msdata/SpectrumList_mzML.o ./pwiz/data/msdata/SpectrumList_MSn.o ./pwiz/data/msdata/BinaryDataEncoder.o ./pwiz/data/msdata/Diff.o ./pwiz/data/msdata/MSData.o ./pwiz/data/msdata/References.o ./pwiz/data/msdata/SpectrumList_mzXML.o ./pwiz/data/msdata/IO.o ./pwiz/data/msdata/SpectrumList_BTDX.o ./pwiz/data/msdata/SpectrumInfo.o ./pwiz/data/msdata/RAMPAdapter.o ./pwiz/data/msdata/LegacyAdapter.o ./pwiz/data/msdata/SpectrumIterator.o ./pwiz/data/msdata/MSDataFile.o ./pwiz/data/msdata/MSNumpress.o ./pwiz/data/msdata/SpectrumListCache.o ./pwiz/data/msdata/Index_mzML.o ./pwiz/data/msdata/SpectrumWorkerThreads.o ./pwiz/data/identdata/IdentDataFile.o ./pwiz/data/identdata/IdentData.o ./pwiz/data/identdata/DefaultReaderList.o ./pwiz/data/identdata/Reader.o ./pwiz/data/identdata/Serializer_protXML.o ./pwiz/data/identdata/Serializer_pepXML.o ./pwiz/data/identdata/Serializer_mzid.o ./pwiz/data/identdata/IO.o ./pwiz/data/identdata/References.o ./pwiz/data/identdata/MascotReader.o ./pwiz/data/proteome/Modification.o ./pwiz/data/proteome/Digestion.o ./pwiz/data/proteome/Peptide.o ./pwiz/data/proteome/AminoAcid.o ./pwiz/utility/minimxml/XMLWriter.o ./pwiz/utility/minimxml/SAXParser.o ./pwiz/utility/chemistry/Chemistry.o ./pwiz/utility/chemistry/ChemistryData.o ./pwiz/utility/chemistry/MZTolerance.o ./pwiz/utility/misc/IntegerSet.o ./pwiz/utility/misc/Base64.o ./pwiz/utility/misc/IterationListener.o ./pwiz/utility/misc/MSIHandler.o ./pwiz/utility/misc/Filesystem.o ./pwiz/utility/misc/TabReader.o ./pwiz/utility/misc/random_access_compressed_ifstream.o ./pwiz/utility/misc/SHA1.o ./pwiz/utility/misc/SHA1Calculator.o ./pwiz/utility/misc/sha1calc.o ./random_access_gzFile.o ./RcppExports.o rampR.o R_init_mzR.o -lpthread -lnetcdf -L/usr/lib64/R/lib -lR g++: error: cramp.o: No such file or directory g++: error: ramp_base64.o: No such file or directory g++: error: ramp.o: No such file or directory g++: error: RcppRamp.o: No such file or directory g++: error: RcppRampModule.o: No such file or directory g++: error: rnetCDF.o: No such file or directory g++: error: RcppPwiz.o: No such file or directory g++: error: RcppPwizModule.o: No such file or directory g++: error: RcppIdent.o: No such file or directory g++: error: RcppIdentModule.o: No such file or directory /usr/share/R/make/shlib.mk:6: recipe for target 'mzR.so' failed make: *** [mzR.so] Error 1 It did compile with 5.2.x (from devtoolset-4) and worked fine. I'm hoping you guys could confirm it is purely compiler problem? Or point me(not a real programme
Re: [Rd] methods(`|`) lists all functions?
On 12/08/2016 05:16 PM, frede...@ofb.net wrote: Dear R-Devel, I was attempting an exercise in Hadley Wickam's book "Advanced R". The exercise is to find the generic with the greatest number of methods. I found that 'methods(`|`)' produces a list of length 2506, in R 3.3.1. Similar behavior is found in 3.4.0. It seems to include all functions and methods. I imagine something is being passed to "grep" without being escaped. Exactly; I've fixed this in r71763 (R-devel). Martin Morgan I hope I didn't miss something in the documentation, and that I'm good to report this as a bug. I can send it to Bugzilla if that's better. By the way, how do I produce such a list of functions (or variables) in a "normal" way? I used 'ls("package:base")' for the exercise, because I saw this call used somewhere as an example, but I couldn't find that "package:" syntax documented under ls()... Also found this confusing: > environmentName(globalenv()) [1] "R_GlobalEnv" > ls("R_GlobalEnv") Error in as.environment(pos) : no item called "R_GlobalEnv" on the search list So I'm not sure if "package:base" is naming an environment, or if there are different ways to name environments and ls() is using one form while environmentName is returning another ... It might be good to add some clarifying examples under "?ls". Thanks, Frederick __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel This email message may contain legally privileged and/or...{{dropped:2}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] On implementing zero-overhead code reuse
On 10/03/2016 01:51 PM, Kynn Jones wrote: Thank you all for your comments and suggestions. @Frederik, my reason for mucking with environments is that I want to minimize the number of names that import adds to my current environment. For instance, if module foo defines a function bar, I want my client code to look like this: import("foo") foo$bar(1,2,3) rather than import("foo") bar(1,2,3) (Just a personal preference.) @Dirk, @Kasper, as I see it, the benefit of scripting languages like Python, Perl, etc., is that they allow very quick development, with minimal up-front cost. Their main strength is precisely that one can, without much difficulty, *immediately* start *programming productively*, without having to worry at all about (to quote Dirk) "repositories. And package management. And version control (at the package level). And ... byte compilation. And associated documentation. And unit tests. And continuous integration." Of course, *eventually*, and for a fraction of one's total code base (in my case, a *very small* fraction), one will want to worry about all those things, but I see no point in burdening *all* my code with all those concerns from the start. Again, please keep in mind that those concerns come into play for at most 5% of the code I write. Also, I'd like to point out that the Python, Perl, etc. communities are no less committed to all the concerns that Dirk listed (version control, package management, documentation, testing, etc.) than the R community is. And yet, Python, Perl, etc. support the "zero-overhead" model of code reuse. There's no contradiction here. Support for "zero-overhead" code reuse does not preclude forms of code reuse with more overhead. One benefit the zero-overhead model is that the concerns of documentation, testing, etc. can be addressed with varying degrees of thoroughness, depending on the situation's demands. (For example, documentation that would be perfectly adequate for me as the author of a function would not be adequate for the general user.) This means that the transition from writing private code to writing code that can be shared with the world can be made much more gradually, according to the programmer's needs and means. Currently, in the R world, the choice for programmers is much starker: either stay writing little scripts that one sources from an interactive session, or learn to implement packages. There's too little in-between. I know it's flogging the same horse, but for the non-expert I create and attach a complete package devtools::create("myutils") library(myutils) Of course it doesn't do anything, so I write my code by editing a plain text file myutils/R/foo.R to contain foo = function() "hello wirld" then return to my still-running R session and install the updated package and use my new function devtools::install("myutils") foo() myutils::foo() # same, but belt-and-suspenders I notice my typo, update the file, and use the updated package devtools::install("myutils") foo() The transition from here to a robust package can be gradual, updating the DESCRIPTION file, adding roxygen2 documentation, unit tests, using version control, etc... in a completely incremental way. At the end of it all, I'll still install and use my package with devtools::install("myutils") foo() maybe graduating to devtools::install_github("mtmorgan/myutils") library(myutils) foo() when it's time to share my work with the wirld. Martin Of course, from the point of view of someone who has already written several packages, the barrier to writing a package may seem too small to fret over, but adopting the expert's perspective is likely to result in excluding the non-experts. Best, kj On Mon, Oct 3, 2016 at 12:06 PM, Kasper Daniel Hansen wrote: On Mon, Oct 3, 2016 at 10:18 AM, wrote: Hi Kynn, Thanks for expanding. I wrote a function like yours when I first started using R. It's basically the same up to your "new.env()" line, I don't do anything with environmentns. I just called my function "mysource" and it's essentially a "source with path". That allows me to find code I reuse in standard locations. I don't know why R does not have built-in support for such a thing. You can get it in C compilers with CPATH, and as you say in Perl with PERL5LIB, in Python, etc. Obviously when I use my "mysource" I have to remember that my code is now not portable without copying over some files from other locations in my home directory. However, as a beginner I find this tool to be indispensable, as R lacks several functions which I use regularly, and I'm not necessarily ready to confront the challenges associated with creating a package. I can pretty much guarantee that when you finally confront the "challenge" of making your own package you'll realize (1) it is pretty easy if the intention is only to use it yourself (and perhaps a couple of collaborators) - by easy I mean I can make a pac
Re: [Rd] failed to assign RegisteredNativeSymbol for splitString
On 07/18/2016 03:45 PM, Andrew Piskorski wrote: I saw a warning from R that I don't fully understand. Here's one way to reproduce it: $ /usr/local/pkg/R-3.2-branch-20160718/bin/R --version | head -n 3 R version 3.2.5 Patched (2016-05-05 r70929) -- "Very, Very Secure Dishes" Copyright (C) 2016 The R Foundation for Statistical Computing Platform: x86_64-pc-linux-gnu/x86_64 (64-bit) $ /usr/local/pkg/R-3.2-branch-20160718/bin/R --vanilla --no-restore --no-save --silent > splitString <- function(...) { print("Test, do nothing") } > invisible(tools::toTitleCase) Warning message: failed to assign RegisteredNativeSymbol for splitString to splitString since splitString is already defined in the 'tools' namespace Another way to trigger that warning is by loading the knitr package, e.g.: or splitString = NULL; loadNamespace("tools") Thanks, it's a bug fixed with ---- r70933 | morgan | 2016-07-18 16:35:39 -0400 (Mon, 18 Jul 2016) | 5 lines assignNativeRoutines looks only in package namespace - previously looked for symbols in inherited environments - https://stat.ethz.ch/pipermail/r-devel/2016-July/072909.html > require("knitr") Loading required package: knitr Warning: failed to assign RegisteredNativeSymbol for splitString to splitString since splitString is already defined in the 'tools' namespace The warning only happens the FIRST time I run any code that triggers it. To get it to happen again, I need to restart R. R 3.1.0 and all earlier versions do not throw that warning, because they do not have any splitString C function (see below) at all. R 3.2.5 does throw the warning, and I believe 3.3 and all later versions of R do also (but I cannot currently test that on this machine). In my case, normally I start R without "--vanilla", and load various custom libraries of my own, one of which contained an R function "splitString". That gave the exact same symptoms as the simpler way of reproducing the warning above. In practice, I solved the problem by renaming my "splitString" function to something else. But I still wonder what exactly was going on with that warning. I noticed that the toTitleCase() R code calls .Call() with a bare splitString identifier, no quotes around it: $ grep -n splitString R-3-[234]*/src/library/tools/R/utils.R R-3-2-branch/src/library/tools/R/utils.R:1988:xx <- .Call(splitString, x, ' -/"()') R-3-3-branch/src/library/tools/R/utils.R:2074:xx <- .Call(splitString, x, ' -/"()\n') R-3-4-trunk/src/library/tools/R/utils.R:2074:xx <- .Call(splitString, x, ' -/"()\n') $ find R-3-4-trunk -name .svn -prune -o -type f -print0 | xargs -0 grep -n splitString R-3-4-trunk/src/library/tools/R/utils.R:2074:xx <- .Call(splitString, x, ' -/"()\n') R-3-4-trunk/src/library/tools/src/text.c:264:SEXP splitString(SEXP string, SEXP delims) R-3-4-trunk/src/library/tools/src/tools.h:45:SEXP splitString(SEXP string, SEXP delims); R-3-4-trunk/src/library/tools/src/init.c:53:CALLDEF(splitString, 2), Doing that is perfectly legal according to help(".Call"), and interestingly, it apparently does NOT matter whether that code puts quotes around the splitString or not - I tried it, and it made no difference. Is it generally the case the users MUST NOT define R functions with the same names as "registered" C functions? Will something break if we do? This email message may contain legally privileged and/or...{{dropped:2}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] dowload.file(method="libcurl") and GET vs. HEAD requests
No I don't think there is a way to avoid the HEAD request. From: Winston Chang Sent: Wednesday, June 22, 2016 12:01:39 PM To: Morgan, Martin Cc: R Devel List Subject: Re: [Rd] dowload.file(method="libcurl") and GET vs. HEAD requests Thanks for looking into it. Is there a way to avoid the HEAD request in R 3.3.0? I'm asking because if there isn't, then I'll add a workaround in a package I'm working on. -Winston On Tue, Jun 21, 2016 at 9:45 PM, Martin Morgan wrote: > On 06/21/2016 09:35 PM, Winston Chang wrote: >> >> In R 3.2.4, if you ran download.file(method="libcurl"), it issues a >> HTTP GET request for the file. However, in R 3.3.0, it issues a HTTP >> HEAD request first, and then a GET requet. This can result in problems >> when the web server gives an error for a HEAD request, even if the >> file is available with a GET request. >> >> Is it possible to tell download.file to simply send a GET request, >> without first sending a HEAD request? >> >> >> In theory, web servers should give the same response for HEAD and GET >> requests, except that for a HEAD request, it sends only headers, and >> not the content. However, not all web servers do this for all files. >> I've seen this problem come up in two different places. >> >> The first is from an issue that someone filed for the downloader >> package. The following works in R 3.2.4, but in R 3.3.0, it fails with >> a 404 (tested on a Mac): >>options(internet.info=1) # Show verbose download info >>url <- >> "https://census.edina.ac.uk/ukborders/easy_download/prebuilt/shape/England_lad_2011_gen.zip"; >> download.file(url, destfile = "out.zip", method="libcurl") >> >> In R 3.3.0, the download succeeds with method="wget", and >> method="curl". It's only method="libcurl" that has problems. >> >> >> The second place I've encountered a problem is in downloading attached >> files from a GitHub release. >>options(internet.info=1) # Show verbose download info >>url <- >> "https://github.com/wch/webshot/releases/download/v0.3/phantomjs-2.1.1-macosx.zip"; >>download.file(url, destfile = "out.zip") >> >> This one fails with a 403 Forbidden because it gets redirected to a >> URL in Amazon S3, where a signature of the file is embedded in the >> URL. However, the signature is computed with the request type (HEAD >> vs. GET), and so the same URL doesn't work for both. (See >> http://stackoverflow.com/a/20580036/412655) >> >> Any help would be appreciated! > > > I think I introduced this, in > > > r69280 | morgan | 2015-09-03 06:24:49 -0400 (Thu, 03 Sep 2015) | 4 lines > > don't create empty file on 404 and similar errors > > - download.file(method="libcurl") > > > > The idea was to test that the file can be downloaded before trying to > download it; previously R would download the error page as though it were > the content. > > I'll give this some thought. > > Martin Morgan > > >> -Winston >> >> __ >> R-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >> > > > This email message may contain legally privileged and/or confidential > information. If you are not the intended recipient(s), or the employee or > agent responsible for the delivery of this message to the intended > recipient(s), you are hereby notified that any disclosure, copying, > distribution, or use of this email message is prohibited. If you have > received this message in error, please notify the sender immediately by > e-mail and delete this email message from your computer. Thank you. This email message may contain legally privileged and/or confidential information. If you are not the intended recipient(s), or the employee or agent responsible for the delivery of this message to the intended recipient(s), you are hereby notified that any disclosure, copying, distribution, or use of this email message is prohibited. If you have received this message in error, please notify the sender immediately by e-mail and delete this email message from your computer. Thank you. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] dowload.file(method="libcurl") and GET vs. HEAD requests
On 06/21/2016 09:35 PM, Winston Chang wrote: In R 3.2.4, if you ran download.file(method="libcurl"), it issues a HTTP GET request for the file. However, in R 3.3.0, it issues a HTTP HEAD request first, and then a GET requet. This can result in problems when the web server gives an error for a HEAD request, even if the file is available with a GET request. Is it possible to tell download.file to simply send a GET request, without first sending a HEAD request? In theory, web servers should give the same response for HEAD and GET requests, except that for a HEAD request, it sends only headers, and not the content. However, not all web servers do this for all files. I've seen this problem come up in two different places. The first is from an issue that someone filed for the downloader package. The following works in R 3.2.4, but in R 3.3.0, it fails with a 404 (tested on a Mac): options(internet.info=1) # Show verbose download info url <- "https://census.edina.ac.uk/ukborders/easy_download/prebuilt/shape/England_lad_2011_gen.zip"; download.file(url, destfile = "out.zip", method="libcurl") In R 3.3.0, the download succeeds with method="wget", and method="curl". It's only method="libcurl" that has problems. The second place I've encountered a problem is in downloading attached files from a GitHub release. options(internet.info=1) # Show verbose download info url <- "https://github.com/wch/webshot/releases/download/v0.3/phantomjs-2.1.1-macosx.zip"; download.file(url, destfile = "out.zip") This one fails with a 403 Forbidden because it gets redirected to a URL in Amazon S3, where a signature of the file is embedded in the URL. However, the signature is computed with the request type (HEAD vs. GET), and so the same URL doesn't work for both. (See http://stackoverflow.com/a/20580036/412655) Any help would be appreciated! I think I introduced this, in r69280 | morgan | 2015-09-03 06:24:49 -0400 (Thu, 03 Sep 2015) | 4 lines don't create empty file on 404 and similar errors - download.file(method="libcurl") The idea was to test that the file can be downloaded before trying to download it; previously R would download the error page as though it were the content. I'll give this some thought. Martin Morgan -Winston __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel This email message may contain legally privileged and/or...{{dropped:2}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Is it possible to increase MAX_NUM_DLLS in future R releases?
On 05/04/2016 05:15 AM, Prof Brian Ripley wrote: On 04/05/2016 08:44, Martin Maechler wrote: Qin Zhu on Mon, 2 May 2016 16:19:44 -0400 writes: > Hi, > I’m working on a Shiny app for statistical analysis. I ran into this "maximal number of DLLs reached" issue recently because my app requires importing many other packages. > I’ve posted my question on stackoverflow (http://stackoverflow.com/questions/36974206/r-maximal-number-of-dlls-reached <http://stackoverflow.com/questions/36974206/r-maximal-number-of-dlls-reached>). > I’m just wondering is there any reason to set the maximal number of DLLs to be 100, and is there any plan to increase it/not hardcoding it in the future? It seems many people are also running into this problem. I know I can work around this problem by modifying the source, but since my package is going to be used by other people, I don’t think this is a feasible solution. > Any suggestions would be appreciated. Thanks! > Qin Increasing that number is of course "possible"... but it also costs a bit (adding to the fixed memory footprint of R). And not only that. At the time this was done (and it was once 50) the main cost was searching DLLs for symbols. That is still an issue, and few packages exclude their DLL from symbol search so if symbols have to searched for a lot of DLLs will be searched. (Registering all the symbols needed in a package avoids a search, and nowadays by default searches from a namespace are restricted to that namespace.) See https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Registering-native-routines for some further details about the search mechanism. I did not set that limit, but I'm pretty sure it was also meant as reminder for the useR to "clean up" a bit in her / his R session, i.e., not load package namespaces unnecessarily. I cannot yet imagine that you need > 100 packages | namespaces loaded in your R session. OTOH, some packages nowadays have a host of dependencies, so I agree that this at least may happen accidentally more frequently than in the past. I am not convinced that it is needed. The OP says he imports many packages, and I doubt that more than a few are required at any one time. Good practice is to load namespaces as required, using requireNamespace. Extensive package dependencies in Bioconductor make it pretty easy to end up with dozen of packages attached or loaded. For instance library(GenomicFeatures) library(DESeq2) > length(loadedNamespaces()) [1] 63 > length(getLoadedDLLs()) [1] 41 Qin's use case is a shiny app, presumably trying to provide relatively comprehensive access to a particular domain. Even if the app were to load / requireNamespace() (this requires considerable programming discipline to ensure that the namespace is available on all programming paths where it is used), it doesn't seem at all improbable that the user in an exploratory analysis would end up accessing dozens of packages with orthogonal dependencies. This is also the use case with Karl Forner's post https://stat.ethz.ch/pipermail/r-devel/2015-May/071104.html (adding library(crlmm) to the above gets us to 53 DLLs). The real solution of course would be a code improvement that starts with a relatively small number of "DLLinfo" structures (say 32), and then allocates more batches (of size say 32) if needed. The problem of course is that such code will rarely be exercised, and people have made errors on the boundaries (here multiples of 32) many times in the past. (Note too that DLLs can be removed as well as added, another point of coding errors.) That argues for a simple increase in the maximum number of DLLs. This would enable some people to have very bulky applications that pay a performance cost (but the cost here is in small fractions of a second...) in terms of symbol look-up (and collision?), but would have no consequence for those of us with more sane use cases. Martin Morgan Patches to the R sources (development trunk in subversion at https://svn.r-project.org/R/trunk/ ) are very welcome! Martin Maechler ETH Zurich & R Core Team This email message may contain legally privileged and/or confidential information. If you are not the intended recipient(s), or the employee or agent responsible for the delivery of this message to the intended recipient(s), you are hereby notified that any disclosure, copying, distribution, or use of this email message is prohibited. If you have received this message in error, please notify the sender immediately by e-mail and delete this email message from your computer. Thank you. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] vignette index
Within R these are determined by \VignetteIndexEntry{}. I think you are referring to the order on the CRAN landing page for your package https://cran.r-project.org/web/packages/bst/index.html, and then the question is for a CRAN member. In Rnw \documentclass{article} % \VignetteIndexEntry{01-Foo} \begin{document} 01-Foo \end{document} or Rmd as --- title: "Demo" author: Ima Scientist vignette: > % \VignetteIndexEntry{01-Foo} % \VignetteEngine{knitr::rmarkdown} --- From: R-devel on behalf of Wang, Zhu Sent: Friday, March 4, 2016 11:18 AM To: Duncan Murdoch; r-devel@r-project.org Subject: Re: [Rd] vignette index I think the online order of vignette files are not based on vignette title or filename alphabetically. I am just curious: by what order these vignette files were displayed online so I can make changes accordingly? Thanks, Zhu -Original Message- From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com] Sent: Friday, March 04, 2016 10:47 AM To: Wang, Zhu; r-devel@r-project.org Subject: Re: [Rd] vignette index On 04/03/2016 9:44 AM, Wang, Zhu wrote: > Dear helpers, > > I have multiple vignette files for a package, and I would like to have the > "right" order of these files when displayed online. For instance, see below: > > https://cran.r-project.org/web/packages/bst/index.html > > The order of vignette links on CRAN is different from what I hoped for: > > > vignette(package="bst") > Vignettes in package 'bst': > > prosCancer Classification Using Mass > Spectrometry-based Proteomics Data (source, > pdf) > static_khan Classification of Cancer Types Using Gene > Expression Data (Long) (source, pdf) > khanClassification of Cancer Types Using Gene > Expression Data (Short) (source, pdf) > static_mcl Classification of UCI Machine Learning Datasets > (Long) (source, pdf) > mcl Classification of UCI Machine Learning Datasets > (Short) (source, pdf) > > The package bst already has an index.html, and I thought that should have > done the job, but apparently not. Any suggestions? > The index.html file should be used in the online help system, but vignette() doesn't use that, it looks in the internal database of vignettes. I don't think you can control the order in which it displays things. This could conceivably be changed, but not by consulting your index.html file --- it is not required to follow a particular structure, so we can't find what order you want from it. One more likely possibility would be to sort alphabetically in the current locale according to filename or vignette title. So then you could get what you want by naming your vignettes 1pros, 2static_khan, etc. It would also be possible to add a new \Vignette directive so affect collation order, but that seems like overkill. Duncan Murdoch **Connecticut Children's Confidentiality Notice** This e-mail message, including any attachments, is for =...{{dropped:15}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] as.vector in R-devel loaded 3/3/2016
I see as below, where getGeneric and getMethod imply a different signature; the signature is mode="any" for both cases in R version 3.2.3 Patched (2016-01-28 r70038)I don't know how to reproduce Jeff's error, though. > library(Matrix) > as.vector function (x, mode = "any") .Internal(as.vector(x, mode)) > getGeneric("as.vector") standardGeneric for "as.vector" defined from package "base" function (x, mode) standardGeneric("as.vector") Methods may be defined for arguments: x Use showMethods("as.vector") for currently available ones. > selectMethod("as.vector", "ANY") Method Definition (Class "internalDispatchMethod"): function (x, mode) .Internal(as.vector(x, mode)) Signatures: x target "ANY" defined "ANY" > sessionInfo() R Under development (unstable) (2016-02-27 r70232) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 14.04.4 LTS locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] Matrix_1.2-4 loaded via a namespace (and not attached): [1] grid_3.3.0 lattice_0.20-33 From: R-devel on behalf of Martin Maechler Sent: Friday, March 4, 2016 6:05 AM To: peter dalgaard Cc: r-devel@r-project.org; Jeff Laake - NOAA Federal Subject: Re: [Rd] as.vector in R-devel loaded 3/3/2016 > peter dalgaard > on Fri, 4 Mar 2016 09:21:48 +0100 writes: > Er, until _what_ is fixed? > I see no anomalies with the version in R-pre: Indeed. The problem ... I also have stumbled over .. is that I'm sure Jeff is accidentally loading a different version of 'Matrix' than the one that is part of R-devel. Jeff you must accidentally be loading a version Matrix made with R 3.2.x in R 3.3.0 and that will fail with the as.vector() mismatch error message. (and IIRC, you also get such an error message if you load a 3.3.0-built version of Matrix into a non-3.3.0 version of R). Martin __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel This email message may contain legally privileged and/or confidential information. If you are not the intended recipient(s), or the employee or agent responsible for the delivery of this message to the intended recipient(s), you are hereby notified that any disclosure, copying, distribution, or use of this email message is prohibited. If you have received this message in error, please notify the sender immediately by e-mail and delete this email message from your computer. Thank you. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Error in socketConnection(master, port = port, blocking = TRUE, open = "a+b", : cannot open the connection
Arrange to make the ssh connection passwordless. Do this by copying your 'public key' to the machine that you are trying to connect to. Google will be your friend in accomplishing this. It might be that a firewall stands between you and the other machine, or that the other machine does not allow connections to port 11001. Either way, the direction toward a solution is to speak with your system administrator. If it is firewall, then they are unlikely to accommodate you; the strategy is to run your cluster exclusively on one side of the firewall. Martin Morgan From: R-devel [r-devel-boun...@r-project.org] on behalf of Soumen Pal via R-devel [r-devel@r-project.org] Sent: Friday, January 15, 2016 2:05 AM To: r-devel@r-project.org Subject: [Rd] Error in socketConnection(master, port = port, blocking = TRUE, open = "a+b", :cannot open the connection Dear All I have sucessfully created cluster of four nodes using localhost in my local machine by executing the following command > cl<-makePSOCKcluster(c(rep("localhost",4)),outfile='',homogeneous=FALSE,port=11001) starting worker pid=4271 on localhost:11001 at 12:12:26.164 starting worker pid=4280 on localhost:11001 at 12:12:26.309 starting worker pid=4289 on localhost:11001 at 12:12:26.456 starting worker pid=4298 on localhost:11001 at 12:12:26.604 > > stopCluster(cl) Now I am trying to create a cluster of 2 nodes (one in my local machine and another remote machine) by using "makePSOCKcluster" command. Both machine have identical settings and connected by SSH. OS is Ubuntu 14.04 LTS & R version 3.2.1. I have executed the follwoing command to create cluster but getting the following error message and R Session is getting hanged. cl<-makePSOCKcluster(c(rep("soumen@10.10.2.32",1)),outfile='',homogeneous=FALSE,port=11001) soumen@10.10.2.32's password: starting worker pid=2324 on soumen-HP-ProBook-440-G2:11001 at 12:11:59.349 Error in socketConnection(master, port = port, blocking = TRUE, open = "a+b", : cannot open the connection Calls: ... doTryCatch -> recvData -> makeSOCKmaster -> socketConnection In addition: Warning message: In socketConnection(master, port = port, blocking = TRUE, open = "a+b", : soumen-HP-ProBook-440-G2:11001 cannot be opened Execution halted My sessionInfo() is as follwos sessionInfo() R version 3.2.1 (2015-06-18) Platform: x86_64-unknown-linux-gnu (64-bit) Running under: Ubuntu 14.04.1 LTS locale: [1] LC_CTYPE=en_IN LC_NUMERIC=C LC_TIME=en_IN [4] LC_COLLATE=en_IN LC_MONETARY=en_INLC_MESSAGES=en_IN [7] LC_PAPER=en_IN LC_NAME=CLC_ADDRESS=C [10] LC_TELEPHONE=C LC_MEASUREMENT=en_IN LC_IDENTIFICATION=C attached base packages: [1] parallel stats graphics grDevices utils datasets methods [8] base I dont know how to solve this problem.Plese help me to solve this problem. Thanks Soumen Pal [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel This email message may contain legally privileged and/or confidential information. If you are not the intended recipient(s), or the employee or agent responsible for the delivery of this message to the intended recipient(s), you are hereby notified that any disclosure, copying, distribution, or use of this email message is prohibited. If you have received this message in error, please notify the sender immediately by e-mail and delete this email message from your computer. Thank you. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] For integer vectors, `as(x, "numeric")` has no effect.
>From the Bioconductor side of things, the general feeling is that this is a >step in the right direction and worth the broken packages. Martin Morgan From: R-devel [r-devel-boun...@r-project.org] on behalf of Martin Maechler [maech...@stat.math.ethz.ch] Sent: Friday, December 11, 2015 4:25 AM To: John Chambers; r-devel@r-project.org; bioc-devel list; Benjamin Tyner Cc: Martin Maechler Subject: Re: [Rd] For integer vectors, `as(x, "numeric")` has no effect. >>>>> Martin Maechler >>>>> on Tue, 8 Dec 2015 15:25:21 +0100 writes: >>>>> John Chambers >>>>> on Mon, 7 Dec 2015 16:05:59 -0800 writes: >> We do need an explicit method here, I think. >> The issue is that as() uses methods for the generic function coerce() but cannot use inheritance in the usual way (if it did, you would be immediately back with no change, since "integer" inherits from "numeric"). >> Copying in the general method for coercing to "numeric" as an explicit method for "integer" gives the expected result: >>> setMethod("coerce", c("integer", "numeric"), getMethod("coerce", c("ANY", "numeric"))) >> [1] "coerce" >>> typeof(as(1L, "numeric")) >> [1] "double" >> Seems like a reasonable addition to the code, unless someone sees a problem. >> John > I guess that that some package checks (in CRAN + Bioc + ... - > land) will break, > but I still think we should add such a coercion to R. > Martin Hmm... I've tried to add the above to R and do notice that there are consequences that may be larger than anticipated: Here is example code: myN <- setClass("myN", contains="numeric") myNid <- setClass("myNid", contains="numeric", representation(id="character")) NN <-setClass("NN", representation(x="numeric")) (m1 <- myN (1:3)) (m2 <- myNid(1:3, id = "i3")) tools::assertError(NN (1:3))# in all R versions ## # current R | new R ## # ---|-- class(getDataPart(m1)) # integer| numeric class(getDataPart(m2)) # integer| numeric In other words, with the above setting, the traditional gentleperson's agreement in S and R, __ "numeric" sometimes conveniently means "integer" or "double" __ will be slightly less often used ... which of course may be a very good thing. However, it breaks strict back compatibility also in cases where the previous behavior may have been preferable: After all integer vectors need only have the space of doubles. Shall we still go ahead and do apply this change to R-devel and then all package others will be willing to update where necessary? As this may affect the many hundreds of bioconductor packages using S4 classes, I am -- exceptionally -- cross posting to the bioc-devel list. Martin Maechler >> On Dec 7, 2015, at 3:37 PM, Benjamin Tyner wrote: >>> Perhaps it is not that surprising, given that >>> >>> > mode(1L) >>> [1] "numeric" >>> >>> and >>> >>> > is.numeric(1L) >>> [1] TRUE >>> >>> On the other hand, this is curious, to say the least: >>> >>> > is.double(as(1L, "double")) >>> [1] FALSE >>> >>>> Here's the surprising behavior: >>>> >>>> x <- 1L >>>> xx <- as(x, "numeric") >>>> class(xx) >>>> ## [1] "integer" >>>> >>>> It occurs because the call to `as(x, "numeric")` dispatches the coerce >>>> S4 method for the signature `c("integer", "numeric")`, whose body is >>>> copied in below. >>>> >>>> function (from, to = "numeric", strict = TRUE) >>>> if (strict) { >>>> class(from) <- "numeric" >>>> from >>>> } else from >>>> >>>> This in turn does nothing, even when strict=TRUE, because that >>>> assignment to class "numeric" has no effect: >>>> >>>> x <- 10L >>>> class(x) <- "numeric" >>>> class(x) >>>> [1] "integer" >>>> >>>> Is thi
Re: [Rd] Error generated by .Internal(nchar) disappears when debugging
> -Original Message- > From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Cook, > Malcolm > Sent: Wednesday, October 07, 2015 3:52 PM > To: 'Duncan Murdoch'; Matt Dowle; r-de...@stat.math.ethz.ch > Subject: Re: [Rd] Error generated by .Internal(nchar) disappears when > debugging > > What other packages do you have loaded? Perhaps a BioConductor one that > loads S4Vectors that announces upon load: > > Creating a generic function for 'nchar' from package 'base' in package > 'S4Vectors' This was introduced as a way around the problem, where the declaration of a method was moved to the .onLoad hook .onLoad <- function(libname, pkgname) setMethod("nchar", "Rle", .nchar_Rle) instead of in the body of the package. The rationale was that the method is then created at run-time, when the generic is defined on the user's R, rather than at compile time, when the generic is defined on the build system's R. There was a subsequent independent report that this did not solve the problem, but we were not able to follow up on that. This is only defined in the current release version of S4Vectors, which is the only version expected to straddle R versions with different signatures. Martin Morgan > > Maybe a red herring... > > ~Malcolm > > > -Original Message- > > From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of > Duncan > Murdoch > Sent: Monday, October 05, 2015 6:57 PM > To: Matt > Dowle ; r-de...@stat.math.ethz.ch > Subject: Re: > [Rd] Error generated by .Internal(nchar) disappears when > debugging > > > On 05/10/2015 7:24 PM, Matt Dowle wrote: > > > Joris Meys gmail.com> writes: > > > > > >> > > >> Hi all, > > >> > > >> I have a puzzling problem related to nchar. In R 3.2.1, the internal > > > > nchar > >> gained an extra argument (see > >> > https://stat.ethz.ch/pipermail/r-announce/2015/000586.html) > > >> > > >> I've been testing code using the package copula, and at home I'm > >> > still running R 3.2.0 (I know, I know...). When trying the following > >> > code, I > > > got > >> an error: > > >> > > >>> library(copula) > > >>> fgmCopula(0.8) > > >> Error in substr(sc[i], 2, nchar(sc[i]) - 1) : > > >> 4 arguments passed to .Internal(nchar) which requires 3 > > >> > > >> Cheers > > >> Joris > > > > > > > > > I'm seeing a similar problem. IIUC, the Windows binary .zip from CRAN > > > of any package using base::nchar is affected. Could someone check my > > > answer here is correct please : > > > http://stackoverflow.com/a/32959306/403310 > > > > Nobody has posted a simple reproducible example here, so it's kind of > hard to > say. > > > > I would have guessed that a change to the internal signature of the C code > > underlying nchar() wouldn't have any effect on a package that called the R > > nchar() function. > > > > When I put together my own example (a tiny package containing a > function > calling nchar(), built to .zip using R 3.2.2, installed into R > 3.2.0), it > confirmed > my guess. > > > > On the other hand, if some package is calling the .Internal function > directly, > I'd > expect that to break. Packages shouldn't do that. > > > > So I'd say there's been no evidence posted of a problem in R here, though > > there may be problems in some of the packages involved. I'd welcome an > > example that provided some usable evidence. > > > > Duncan Murdoch > > > > __ > > R-devel@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel This email message may contain legally privileged and/or confidential information. If you are not the intended recipient(s), or the employee or agent responsible for the delivery of this message to the intended recipient(s), you are hereby notified that any disclosure, copying, distribution, or use of this email message is prohibited. If you have received this message in error, please notify the sender immediately by e-mail and delete this email message from your computer. Thank you. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Issues with libcurl + HTTP status codes (eg. 403, 404)
R-devel r69197 returns appropriate errors for the cases below; I know of a few rough edges - ftp error codes are not reported correctly - download.file creates destfile before discovering that http fails, leaving an empty file on disk and am happy to hear of more. Martin On 08/27/2015 08:46 AM, Jeroen Ooms wrote: On Thu, Aug 27, 2015 at 5:16 PM, Martin Maechler wrote: Probably I'm confused now... Both R-patched and R-devel give an error (after a *long* wait!) for download.file("https://someserver.com/mydata.csv";, "mydata.csv") So that problem is I think solved now. I'm sorry for the confusion, this was a hypothetical example. Connection failures are different from http status errors. Below some real examples of servers returning http errors. For each example the "internal" method correctly raises an R error, whereas the "libcurl" method does not. # File not found (404) download.file("http://httpbin.org/data.csv";, "data.csv", method = "internal") download.file("http://httpbin.org/data.csv";, "data.csv", method = "libcurl") readLines(url("http://httpbin.org/data.csv";, method = "internal")) readLines(url("http://httpbin.org/data.csv";, method = "libcurl")) # Unauthorized (401) download.file("https://httpbin.org/basic-auth/user/passwd";, "data.csv", method = "internal") download.file("https://httpbin.org/basic-auth/user/passwd";, "data.csv", method = "libcurl") readLines(url("https://httpbin.org/basic-auth/user/passwd";, method = "internal")) readLines(url("https://httpbin.org/basic-auth/user/passwd";, method = "libcurl")) -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Issues with libcurl + HTTP status codes (eg. 403, 404)
On 08/27/2015 08:16 AM, Martin Maechler wrote: "DM" == Duncan Murdoch on Wed, 26 Aug 2015 19:07:23 -0400 writes: DM> On 26/08/2015 6:04 PM, Jeroen Ooms wrote: >> On Tue, Aug 25, 2015 at 10:33 PM, Martin Morgan wrote: >>> >>> actually I don't know that it does -- it addresses the symptom but I think there should be an error from libcurl on the 403 / 404 rather than from read.dcf on error page... >> >> Indeed, the only correct behavior is to turn the protocol error code >> into an R exception. When the server returns a status code >= 400, it >> indicates that the request was unsuccessful and the response body does >> not contain the content the client had requested, but should instead >> be interpreted as an error message/page. Ignoring this fact and >> proceeding with parsing the body as usual is incorrect and leads to >> all kind of strange errors downstream. DM> Yes. I haven't been following this long thread. Is it only in R-devel, DM> or is this happening in 3.2.2 or R-patched? DM> If the latter, please submit a bug report. If it is only R-devel, DM> please just be patient. When R-devel becomes R-alpha next year, if the DM> bug still exists, please report it. DM> Duncan Murdoch Probably I'm confused now... Both R-patched and R-devel give an error (after a *long* wait!) for download.file("https://someserver.com/mydata.csv";, "mydata.csv") So that problem is I think solved now. Ideally, it would nice to set the *timeout* as an R function argument ourselves.. though. Kevin Ushey's original problem however is still in R-patched and R-devel: ap <- available.packages("http://www.stats.ox.ac.uk/pub/RWin";, method="libcurl") ap giving ap <- available.packages("http://www.stats.ox.ac.uk/pub/RWin";, method="libcurl")Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin: Line starting ' ap Package Version Priority Depends Imports LinkingTo Suggests Enhances License License_is_FOSS License_restricts_use OS_type Archs MD5sum NeedsCompilation File Repository and the resulting 'ap' is the same as e.g., with the the default method which also gives a warning and then an empty list (well "data.frame") of packages. I don't see a big problem with the above. It would be better if the warning did not contain the extra "Line starting ' In Kevin's original post, he was using an earlier version of R, and the code in available.packages was returning an error. The code had been updated (by me) in the version that you are using to return a warning, which was the original design and intention (to convert errors during repository queries into warnings, so other repositories could be queried; this was Kevin's original point). The fix I provided does not address the underlying problem, which is that download.file("http://www.stats.ox.ac.uk/pub/RWin/PACKAGES.gz";, fl <- tempfile(), method="libcurl") actually downloads the error file, without throwing an error > download.file("http://www.stats.ox.ac.uk/pub/RWin/PACKAGES.gz";, fl <- tempfile(), method="libcurl") trying URL 'http://www.stats.ox.ac.uk/pub/RWin/PACKAGES.gz' Content type 'text/html; charset=iso-8859-1' length 302 bytes == downloaded 302 bytes > cat(paste(readLines(fl), collapse="\n")) 404 Not Found Not Found The requested URL /pub/RWin/PACKAGES.gz was not found on this server. Apache/2.2.22 (Debian) Server at www.stats.ox.ac.uk Port 80 > I do have a patch for this, which I will share off-list before committing. Martin -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Issues with libcurl + HTTP status codes (eg. 403, 404)
On 08/25/2015 01:30 PM, Kevin Ushey wrote: Hi Martin, Indeed it does (and I should have confirmed myself with R-patched and R-devel before posting...) actually I don't know that it does -- it addresses the symptom but I think there should be an error from libcurl on the 403 / 404 rather than from read.dcf on error page... Martin Thanks, and sorry for the noise. Kevin On Tue, Aug 25, 2015, 13:11 Martin Morgan mailto:mtmor...@fredhutch.org>> wrote: On 08/25/2015 12:54 PM, Kevin Ushey wrote: > Hi all, > > The following fails for me (on OS X, although I imagine it's the same > on other platforms using libcurl): > > options(download.file.method = "libcurl") > options(repos = c(CRAN = "https://cran.rstudio.com/";, CRANextra = > "http://www.stats.ox.ac.uk/pub/RWin";)) > install.packages("lattice") ## could be any package > > gives me: > > > options(download.file.method = "libcurl") > > options(repos = c(CRAN = "https://cran.rstudio.com/";, CRANextra > = "http://www.stats.ox.ac.uk/pub/RWin";)) > > install.packages("lattice") ## coudl be any package > Installing package into ‘/Users/kevinushey/Library/R/3.2/library’ > (as ‘lib’ is unspecified) > Error: Line starting ' > This seems to come from a call to `available.packages()` to a URL that > doesn't exist on the server (likely when querying PACKAGES on the > CRANextra repo) > > Eg. > > > URL <- "http://www.stats.ox.ac.uk/pub/RWin"; > > available.packages(URL, method = "internal") > Warning: unable to access index for repository > http://www.stats.ox.ac.uk/pub/RWin > Package Version Priority Depends Imports LinkingTo Suggests > Enhances License License_is_FOSS > License_restricts_use OS_type Archs MD5sum NeedsCompilation > File Repository > > available.packages(URL, method = "libcurl") > Error: Line starting ' > It looks like libcurl downloads and retrieves the 403 page itself, > rather than reporting that it was actually forbidden, e.g.: > > > download.file("http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/mavericks/contrib/3.2/PACKAGES.gz";, > tempfile(), method = "libcurl") > trying URL 'http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/mavericks/contrib/3.2/PACKAGES.gz' > Content type 'text/html; charset=iso-8859-1' length 339 bytes > == > downloaded 339 bytes > > Using `method = "internal"` gives an error related to the inability to > access that URL due to the HTTP status 403. > > The overarching issue here is that package installation shouldn't fail > even if libcurl fails to access one of the repositories set. > With > R.version.string [1] "R version 3.2.2 Patched (2015-08-25 r69179)" the behavior is to warn with an indication of the repository for which the problem occurs > URL <- "http://www.stats.ox.ac.uk/pub/RWin"; > available.packages(URL, method="libcurl") Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin: Line starting ' available.packages(URL, method="internal") Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin: cannot open URL 'http://www.stats.ox.ac.uk/pub/RWin/PACKAGES' Package Version Priority Depends Imports LinkingTo Suggests Enhances License License_is_FOSS License_restricts_use OS_type Archs MD5sum NeedsCompilation File Repository Does that work for you / address the problem? Martin >> sessionInfo() > R version 3.2.2 (2015-08-14) > Platform: x86_64-apple-darwin13.4.0 (64-bit) > Running under: OS X 10.10.4 (Yosemite) > > locale: > [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] testthat_0.8.1.0.99 knitr_1.11 devtools_1.5.0.9001 > [4] BiocInstaller_1.15.5 > > loaded via a namespace (and not attached): > [1] httr_1.0.0 R6_2.0.0.9000 tools_3.2.2parallel_3.2.2 whisker_0.3-2 > [6] RCurl_1
Re: [Rd] Issues with libcurl + HTTP status codes (eg. 403, 404)
On 08/25/2015 12:54 PM, Kevin Ushey wrote: Hi all, The following fails for me (on OS X, although I imagine it's the same on other platforms using libcurl): options(download.file.method = "libcurl") options(repos = c(CRAN = "https://cran.rstudio.com/";, CRANextra = "http://www.stats.ox.ac.uk/pub/RWin";)) install.packages("lattice") ## could be any package gives me: > options(download.file.method = "libcurl") > options(repos = c(CRAN = "https://cran.rstudio.com/";, CRANextra = "http://www.stats.ox.ac.uk/pub/RWin";)) > install.packages("lattice") ## coudl be any package Installing package into ‘/Users/kevinushey/Library/R/3.2/library’ (as ‘lib’ is unspecified) Error: Line starting ' URL <- "http://www.stats.ox.ac.uk/pub/RWin"; > available.packages(URL, method = "internal") Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin Package Version Priority Depends Imports LinkingTo Suggests Enhances License License_is_FOSS License_restricts_use OS_type Archs MD5sum NeedsCompilation File Repository > available.packages(URL, method = "libcurl") Error: Line starting ' download.file("http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/mavericks/contrib/3.2/PACKAGES.gz";, tempfile(), method = "libcurl") trying URL 'http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/mavericks/contrib/3.2/PACKAGES.gz' Content type 'text/html; charset=iso-8859-1' length 339 bytes == downloaded 339 bytes Using `method = "internal"` gives an error related to the inability to access that URL due to the HTTP status 403. The overarching issue here is that package installation shouldn't fail even if libcurl fails to access one of the repositories set. With > R.version.string [1] "R version 3.2.2 Patched (2015-08-25 r69179)" the behavior is to warn with an indication of the repository for which the problem occurs > URL <- "http://www.stats.ox.ac.uk/pub/RWin"; > available.packages(URL, method="libcurl") Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin: Line starting ' available.packages(URL, method="internal") Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin: cannot open URL 'http://www.stats.ox.ac.uk/pub/RWin/PACKAGES' Package Version Priority Depends Imports LinkingTo Suggests Enhances License License_is_FOSS License_restricts_use OS_type Archs MD5sum NeedsCompilation File Repository Does that work for you / address the problem? Martin sessionInfo() R version 3.2.2 (2015-08-14) Platform: x86_64-apple-darwin13.4.0 (64-bit) Running under: OS X 10.10.4 (Yosemite) locale: [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] testthat_0.8.1.0.99 knitr_1.11 devtools_1.5.0.9001 [4] BiocInstaller_1.15.5 loaded via a namespace (and not attached): [1] httr_1.0.0 R6_2.0.0.9000 tools_3.2.2parallel_3.2.2 whisker_0.3-2 [6] RCurl_1.95-4.1 memoise_0.2.1 stringr_0.6.2 digest_0.6.4 evaluate_0.7.2 Thanks, Kevin __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] download.file() on ftp URL fails in windows with default download method
In r69089 (R-devel) and 69090 (R-3-2-branch) the "wininet" ftp download method tries EPSV / PASV first. Success requires that the client (user) be able to open outgoing unprivileged ports, which will usually be the case. Proxies and so on should be handled by the OS and virtualization layer. Reports to the contrary welcome... Martin - Original Message - > Hi David, > > - Original Message - > > From: "David Smith" > > To: "Dan Tenenbaum" , "Uwe Ligges" > > , "Elliot Waingold" > > > > Cc: "R-devel@r-project.org" > > Sent: Wednesday, August 12, 2015 12:42:39 PM > > Subject: RE: [Rd] download.file() on ftp URL fails in windows with > > default download method > > > > We were also able to reproduce the issue on Windows Server 2012. If > > there's anything we can do to help please let me know; Elliot > > Waingold (CC'd here) can provide access to the VM we used for > > testing if that's of any help. > > > > Thanks! > > I have just been looking at this issue with Martin Morgan. We found > that if we "or" the additional flag INTERNET_FLAG_PASSIVE on line > 1012 of src/modules/internet/internet.c (R-3.2 branch, last changed > in r68393) > that the ftp connection works. > > Further investigation reveals that in a passive ftp connection, > certain ports on the client need to be open. > This machine is in the Amazon cloud so it was easy to open the ports. > But we still have a problem and I believe it's that the wrong IP > address is being sent to the server (on an AWS machine, the machine > thinks of itself as having one IP address, but that is a private > address that is valid inside AWS only). > > Here's a curl command line that gets around this by sending the > correct address (or hostname): > > curl --ftp-port myhostname.com > ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/All/GCF_01405.13.assembly.txt > > Curl normally uses passive mode which is why it works, but the > --ftp-port switch tells it to use active mode with the specified ip > address or hostname. > > So I'm not sure where we go from here. One easy fix is just to add > the INTERNET_FLAG_PASSIVE flag as described above. Another would be > to first check if active mode works, and if not, use passive mode. > > Dan > > > > # David Smith > > > > -- > > David M Smith > > R Community Lead, Revolution Analytics (a Microsoft company) > > Tel: +1 (312) 9205766 (Chicago IL, USA) > > Twitter: @revodavid | Blog: http://blog.revolutionanalytics.com > > We are hiring engineers for Revolution R and Azure Machine > > Learning. > > > > -Original Message- > > From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of > > Dan > > Tenenbaum > > Sent: Tuesday, August 11, 2015 09:51 > > To: Uwe Ligges > > Cc: R-devel@r-project.org > > Subject: Re: [Rd] download.file() on ftp URL fails in windows with > > default download method > > > > > > > > - Original Message - > > > From: "Dan Tenenbaum" > > > To: "Uwe Ligges" > > > Cc: "R-devel@r-project.org" > > > Sent: Saturday, August 8, 2015 4:02:54 PM > > > Subject: Re: [Rd] download.file() on ftp URL fails in windows > > > with > > > default download method > > > > > > > > > > > > - Original Message - > > > > From: "Uwe Ligges" > > > > To: "Dan Tenenbaum" , > > > > "R-devel@r-project.org" > > > > Sent: Saturday, August 8, 2015 3:57:34 PM > > > > Subject: Re: [Rd] download.file() on ftp URL fails in windows > > > > with > > > > default download method > > > > > > > > > > > > > > > > On 08.08.2015 01:11, Dan Tenenbaum wrote: > > > > > Hi, > > > > > > > > > >> url <- > > > > >> "ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/All/GCF_01405.13.assembly.txt"; > > > > >> download.file(url, tempfile()) > > > > > trying URL > > > > > 'ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/All/GCF_01405.13.assembly.txt' > > > > > Error in download.file(url, tempfile()) : > > > > >cannot open URL > > > > > > > > > > 'ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/All/GCF_01405.13.assembly.txt' > > &g
Re: [Rd] List S3 methods and defining packages
On 07/07/2015 02:05 AM, Renaud Gaujoux wrote: Hi, from the man page ?methods, I expected to be able to build pairs (class,package) for a given S3 method, e.g., print, using attr(methods(print), 'info'). However all the methods, except the ones defined in base or S4 methods, get the 'from' value "registered S3method for print", instead of the actual package name (see below for the first rows). Is this normal behaviour? If so, is there a way to get what I want: a character vector mapping class to package (ideally in loading order, but this I can re-order from search()). It's the way it has always been, so normal in that sense. There could be two meanings of 'from' -- the namespace in which the generic to which the method belongs is defined, and the namespace in which the method is defined. I think the former is what you're interested in, but the latter likely what methods() might be modified return. For your use case, maybe something like .S3methodsInNamespace <- function(envir, pattern) { mtable <- get(".__S3MethodsTable__.", envir = asNamespace(envir)) methods <- ls(mtable, pattern = pattern) env <- vapply(methods, function(x) { environmentName(environment(get(x, mtable))) }, character(1)) setNames(names(env), unname(env)) } followed by nmspc = loadedNamespaces() lapply(setNames(nmspc, nmspc), .S3methodsInNamespace, "^plot.") which reveals the different meanings of 'from', e.g., > lapply(setNames(nmspc, nmspc), .S3methodsInNamespace, "^plot.")["graphics"] $graphics stats graphicsstats "plot.acf""plot.data.frame" "plot.decomposed.ts" graphicsstatsstats "plot.default""plot.dendrogram" "plot.density" stats graphics graphics "plot.ecdf""plot.factor" "plot.formula" graphicsstats graphics "plot.function""plot.hclust" "plot.histogram" statsstatsstats "plot.HoltWinters""plot.isoreg""plot.lm" statsstatsstats "plot.medpolish" "plot.mlm" "plot.ppr" statsstatsstats "plot.prcomp" "plot.princomp" "plot.profile.nls" graphicsstatsstats "plot.raster" "plot.spec" "plot.stepfun" stats graphicsstats "plot.stl" "plot.table""plot.ts" statsstats "plot.tskernel" "plot.TukeyHSD" Also this is for loaded, rather than attached, namespaces. Martin Morgan Thank you. Bests, Renaud visible from generic isS4 print.abbrev FALSE registered S3method for print print FALSE print.acf FALSE registered S3method for print print FALSE print.AES FALSE registered S3method for print print FALSE print.agnesFALSE registered S3method for print print FALSE print.anovaFALSE registered S3method for print print FALSE print.AnovaFALSE registered S3method for print print FALSE print.anova.loglm FALSE registered S3method for print print FALSE print,ANY-methodTRUE base print TRUE print.aov FALSE registered S3method for print print FALSE __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] S4 inheritance and old class
On 05/28/2015 02:49 AM, Julien Idé wrote: Hey everyone, I would like to develop a package using S4 classes. I have to define several S4 classes that inherits from each others as follow: # A <- B <- C <- D I also would like to define .DollarNames methods for these class so, if I have understood well, I also have to define an old class as follow: # AOld <- A <- B <- C <- D setOldClass(Classes = "AOld") setClass( Class = "A", contains = "AOld", slots = list(A = "character") ) .DollarNames.A <- function(x, pattern) grep(pattern, slotNames(x), value = TRUE) Instead of setOldClass, define a $ method on A setMethod("$", "A", function(x, name) slot(x, name)) And then a = new("A") a$ d = new("D") d$ I don't know about the setOldClass problem; it seems like a bug. Martin Morgan setClass( Class = "B", contains = "A", slots = list(B = "character"), validity = function(object){ cat("Testing an object of class '", class(object), "'' with valitity function of class 'B'", sep = "") cat("Validity test for class 'B': ", object@A, sep = "") return(TRUE) } ) setClass( Class = "C", contains = c("B"), slots = list(C = "character"), validity = function(object){ cat("Testing an object of class '", class(object), "'' with valitity function of class 'C'", sep = "") cat("Validity test for class 'C': ", object@A, sep = "") return(TRUE) } ) setClass( Class = "D", contains = "C", slots = list(D = "character"), validity = function(object){ cat("Testing an object of class '", class(object), "'' with valitity function of class 'D'", sep = "") cat("Validity test for class 'D': ", object@A, sep = "") return(TRUE) } ) My problem is that when I try to create an object of class "D" and test its validity validObject(new("D")) it seems that at some point the object is coerced to an object of class "AOld" and tested by the validity function of class "B". What am I missing here? Julien [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] example fails during R CMD CHECK but works interactively?
On 05/15/2015 05:05 AM, Charles Determan wrote: Does anyone else have any thoughts about troubleshooting the R CMD check environment? In the pkg.Rcheck directory there is a file pkg-Ex.R. LANGUAGE=en _R_CHECK_INTERNALS2_=1 $(R_HOME)/bin/R --vanilla pkge-Ex.R followed by the usual strategy of bisecting the file into smaller chunks that still reproduce the example. (this is based on my parsing of the complicated source, most relevant at https://github.com/wch/r-source/blob/trunk/src/library/tools/R/check.R#L2467 and https://github.com/wch/r-source/blob/trunk/src/library/tools/R/check.R#L36 ) Martin Charles On Wed, May 13, 2015 at 1:57 PM, Charles Determan wrote: Thank you Dan but it isn't my tests that are failing (all of them pass without problem) but one of the examples from the inst/examples directory. I did try, however, to start R with the environmental variables as you suggest but it had no effect on my tests. Charles On Wed, May 13, 2015 at 1:51 PM, Dan Tenenbaum wrote: - Original Message - From: "Charles Determan" To: r-devel@r-project.org Sent: Wednesday, May 13, 2015 11:31:36 AM Subject: [Rd] example fails during R CMD CHECK but works interactively? Greetings, I am collaborating with developing the bigmemory package and have run in to a strange problem when we run R CMD CHECK. For some reason that isn't clear to us one of the examples crashes stating: Error: memory could not be allocated for instance of type big.matrix You can see the output on the Travis CI page at https://travis-ci.org/kaneplusplus/bigmemory where the error starts at line 1035. This is completely reproducible when running devtools::check(args='--as-cran') locally. The part that is confusing is that the calls work perfectly when called interactively. Hadley comments on the 'check' page of his R packages website ( http://r-pkgs.had.co.nz/check.html) regarding test failing following R CMD check: Occasionally you may have a problem where the tests pass when run interactively with devtools::test(), but fail when in R CMD check. This usually indicates that you’ve made a faulty assumption about the testing environment, and it’s often hard to figure it out. Any thoughts on how to troubleshoot this problem? I have no idea what assumption we could have made. Note that R CMD check runs R with environment variables set as follows (at least on my system; you can check $R_HOME/bin/check to see what it does on yours): R_DEFAULT_PACKAGES= LC_COLLATE=C So try staring R like this: R_DEFAULT_PACKAGES= LC_COLLATE=C R And see if that reproduces the test failure. The locale setting could affect tests of sort order, and the default package setting could potentially affect other things. Dan Regards, Charles [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Creating a vignette which depends on a non-distributable file
On 05/14/2015 04:33 PM, Henrik Bengtsson wrote: On May 14, 2015 15:04, "January Weiner" wrote: Dear all, I am writing a vignette that requires a file which I am not allowed to distribute, but which the user can easily download manually. Moreover, it is not possible to download this file automatically from R: downloading requires a (free) registration that seems to work only through a browser. (I'm talking here about the MSigDB from the Broad Institute, http://www.broadinstitute.org/gsea/msigdb/index.jsp). In the vignette, I tell the user to download the file and then show how it can be parsed and used in R. Thus, I can compile the vignette only if this file is present in the vignettes/ directory of the package. However, it would then get included in the package -- which I am not allowed to do. What should I do? (1) finding an alternative to MSigDB is not a solution -- there simply is no alternative. (2) I could enter the code (and the results) in a verbatim environment instead of using Sweave. This has obvious drawbacks (for one thing, it would look incosistent). use the chunk argument eval=FALSE instead of placing the code in a verbatim argument. See ?RweaveLatex if you're compiling a PDF vignette from Rnw or the knitr documentation for (much nicer for users of your vignette, in my opinion) Rmd vignettes processed to HTML. A common pattern is to process chunks 1, 2, 3, 4, and then there is a 'leap of faith' in chunk 5 (with eval=FALSE) and a second chunk (maybe with echo=FALSE, eval=TRUE) that reads the _result_ that would have been produced by chunk 5 from a serialized instance into the R session for processing in chunks 6, 7, 8... Also very often while it might make sense to analyse an entire data set as part of a typical work flow, for illustrative purposes a much smaller subset or simulated data might be relevant; again a strategy would be to illustrate the problematic steps with simulated data, and then resume the narrative with the analyzed full data. A secondary consideration may be that if your package _requires_ MSigDB to function, then it can't be automatically tested by repository build machines -- you'll want to have unit tests or other approaches to ensure that 'bit rot' does not set in without you being aware of it. If this is a Bioconductor package, then it's appropriate to ask on the Bioconductor devel mailing list. http://bioconductor.org/developers/ http://bioconductor.org/packages/BiocStyle/ might be your friend for producing stylish vignettes. Martin (3) I could build vignette outside of the package and put it into the inst/doc directory. This also has obvious drawbacks. (4) Leaving this example out defies the purpose of my package. I am tending towards solution (2). What do you think? Not clear how big of a static piece you're taking about, but maybe you could set it up such that you use (2) as a fallback, i.e. have the vignette include a static/pre-generated piece (which is clearly marked as such) only if the external dependency is not available. Just a thought Henrik Kind regards, j. -- January Weiner -- [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] S4 method dispatch sometimes leads to incorrect when object loaded from file?
On 05/12/2015 05:31 AM, Martin Maechler wrote: Martin Morgan on Mon, 11 May 2015 10:18:07 -0700 writes: > On 05/10/2015 08:19 AM, Martin Morgan wrote: >> Loading an S4 object from a file without first loading the library sometimes (?, >> the example below and actual example involves a virtual base class and the show >> generic) leads to incorrect dispatch (to the base class method). "Of course", this is not as desired. Other code automatically does try and typically succeed to load the package (yes "package" ! ;-)) when 'needed', right, so show() is an exception here, no ? I added dim() methods, which also misbehave (differently) setMethod("dim", "A", function(x) "A-dim") setMethod("dim", "B", function(x) "B-dim") ~/tmp$ R --vanilla --slave -e "load('b.Rda'); dim(b)" Loading required package: PkgA NULL ~/tmp$ R --vanilla --slave -e "require('PkgA'); load('b.Rda'); dim(b)" [1] "B-dim" but sort of auto-heal (versus show, which is corrupted) ~/tmp$ R --vanilla --slave -e "load('b.Rda'); dim(b); dim(b)" Loading required package: PkgA NULL [1] "B-dim" ~/tmp$ R --vanilla --slave -e "load('b.Rda'); b; b" Loading required package: PkgA A A >> The attached package reproduces the problem. It has > The package was attached but stripped; a version is at > https://github.com/mtmorgan/PkgA > FWIW the sent mail was a multi-part MIME with the header on the package part > Content-Type: application/gzip; > name="PkgA.tar.gz" > Content-Transfer-Encoding: base64 > Content-Disposition: attachment; > filename="PkgA.tar.gz" > From http://www.r-project.org/mail.html#instructions "we allow application/pdf, > application/postscript, and image/png (and x-tar and gzip on R-devel)" so I > thought that this mime type would not be stripped? You were alright in your assumptions -- but unfortunately, the accepted type has been application/x-gzip instead of .../gzip. I now *have* added the 2nd one as well. Sorry for that. The other Martin M.. > Martin Morgan >> >> setClass("A") >> setClass("B", contains="A") >> setMethod("show", "A", function(object) cat("A\n")) >> setMethod("show", "B", function(object) cat("B\n")) >> >> with NAMESPACE >> >> import(methods) >> exportClasses(A, B) >> exportMethods(show) >> >> This creates the object and illustrated expected behavior >> >> ~/tmp$ R --vanilla --slave -e "library(PkgA); b = new('B'); save(b, >> file='b.Rda'); b" >> B >> >> Loading PkgA before the object leads to correct dispatch >> >> ~/tmp$ R --vanilla --slave -e "library(PkgA); load(file='b.Rda'); b" >> B >> >> but loading the object without first loading PkgA leads to dispatch to >> show,A-method. >> >> ~/tmp$ R --vanilla --slave -e "load(file='b.Rda'); b" >> Loading required package: PkgA >> A >> >> Martin Morgan > -- > Computational Biology / Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N. > PO Box 19024 Seattle, WA 98109 > Location: Arnold Building M1 B861 > Phone: (206) 667-2793 > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] S4 method dispatch sometimes leads to incorrect when object loaded from file?
On 05/10/2015 08:19 AM, Martin Morgan wrote: Loading an S4 object from a file without first loading the library sometimes (?, the example below and actual example involves a virtual base class and the show generic) leads to incorrect dispatch (to the base class method). The attached package reproduces the problem. It has The package was attached but stripped; a version is at https://github.com/mtmorgan/PkgA FWIW the sent mail was a multi-part MIME with the header on the package part Content-Type: application/gzip; name="PkgA.tar.gz" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="PkgA.tar.gz" From http://www.r-project.org/mail.html#instructions "we allow application/pdf, application/postscript, and image/png (and x-tar and gzip on R-devel)" so I thought that this mime type would not be stripped? Martin Morgan setClass("A") setClass("B", contains="A") setMethod("show", "A", function(object) cat("A\n")) setMethod("show", "B", function(object) cat("B\n")) with NAMESPACE import(methods) exportClasses(A, B) exportMethods(show) This creates the object and illustrated expected behavior ~/tmp$ R --vanilla --slave -e "library(PkgA); b = new('B'); save(b, file='b.Rda'); b" B Loading PkgA before the object leads to correct dispatch ~/tmp$ R --vanilla --slave -e "library(PkgA); load(file='b.Rda'); b" B but loading the object without first loading PkgA leads to dispatch to show,A-method. ~/tmp$ R --vanilla --slave -e "load(file='b.Rda'); b" Loading required package: PkgA A Martin Morgan -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] S4 method dispatch sometimes leads to incorrect when object loaded from file?
Loading an S4 object from a file without first loading the library sometimes (?, the example below and actual example involves a virtual base class and the show generic) leads to incorrect dispatch (to the base class method). The attached package reproduces the problem. It has setClass("A") setClass("B", contains="A") setMethod("show", "A", function(object) cat("A\n")) setMethod("show", "B", function(object) cat("B\n")) with NAMESPACE import(methods) exportClasses(A, B) exportMethods(show) This creates the object and illustrated expected behavior ~/tmp$ R --vanilla --slave -e "library(PkgA); b = new('B'); save(b, file='b.Rda'); b" B Loading PkgA before the object leads to correct dispatch ~/tmp$ R --vanilla --slave -e "library(PkgA); load(file='b.Rda'); b" B but loading the object without first loading PkgA leads to dispatch to show,A-method. ~/tmp$ R --vanilla --slave -e "load(file='b.Rda'); b" Loading required package: PkgA A Martin Morgan -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R CMD check and missing imports from base packages
On 04/28/2015 01:04 PM, Gábor Csárdi wrote: When a symbol in a package is resolved, R looks into the package's environment, and then into the package's imports environment. Then, if the symbol is still not resolved, it looks into the base package. So far so good. If still not found, it follows the 'search()' path, starting with the global environment and then all attached packages, finishing with base and recommended packages. This can be a problem if a package uses a function from a base package, but it does not formally import it via the NAMESPACE file. If another package on the search path also defines a function with the same name, then this second function will be called. E.g. if package 'ggplot2' uses 'stats::density()', and package 'igraph' also defines 'density()', and 'igraph' is on the search path, then 'ggplot2' will call 'igraph::density()' instead of 'stats::density()'. stats::density() is an S3 generic, so igraph would define an S3 method, right? And in general a developer would avoid masking a function in a base package, so as not to require the user to distinguish between stats::density() and igraph::density(). Maybe the example is not meant literally. Being able to easily flag non-imported, non-base symbols would definitely improve the robustness of package code, even if not helping the end user disambiguate duplicate symbols. Martin Morgan I think that for a better solution, either 1) the search path should not be used at all to resolve symbols in packages, or 2) only base packages should be searched. I realize that this is something that is not easy to change, especially 1) would break a lot of packages. But maybe at least 'R CMD check' could report these cases. Currently it reports missing imports for non-base packages only. Is it reasonable to have a NOTE for missing imports from base packages as well? [As usual, please fix me if I am missing or misunderstood something.] Thank you, Best, Gabor [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R-devel does not update the C++ returned variables
On 03/02/2015 11:39 AM, Dirk Eddelbuettel wrote: On 2 March 2015 at 16:37, Martin Maechler wrote: | | > On 2 March 2015 at 09:09, Duncan Murdoch wrote: | > | I generally recommend that people use Rcpp, which hides a lot of the | > | details. It will generate your .Call calls for you, and generate the | > | C++ code that receives them; you just need to think about the real | > | problem, not the interface. It has its own learning curve, but I think | > | it is easier than using the low-level code that you need to work with .Call. | | > Thanks for that vote, and I second that. | | > And these days the learning is a lot flatter than it was a decade ago: | | > R> Rcpp::cppFunction("NumericVector doubleThis(NumericVector x) { return(2*x); }") | > R> doubleThis(c(1,2,3,21,-4)) | > [1] 2 4 6 42 -8 | > R> | | > That defined, compiled, loaded and run/illustrated a simple function. | | > Dirk | | Indeed impressive, ... and it also works with integer vectors | something also not 100% trivial when working with compiled code. | | When testing that, I've went a step further: As you may know, int can be 'casted up' to double which is what happens here. So in what follows you _always_ create a copy from an int vector to a numeric vector. For pure int, use eg Rcpp::cppFunction("IntegerVector doubleThis(IntegeerVector x) { return(2*x); }") and rename the function names as needed to have two defined concurrently. avoiding duplication, harmless in the doubleThis() case, comes at some considerable hazard in general > Rcpp::cppFunction("IntegerVector incrThisAndThat(IntegerVector x) { x[0] += 1; return x; }") > x = y = 1:5 > incrThisAndThat(x) [1] 2 2 3 4 5 > x [1] 2 2 3 4 5 > y [1] 2 2 3 4 5 (how often this happens in the now relatively large number of user-contributed packages using Rcpp?). It seems like 'one-liners' should really encourage something safer (sometimes at the expense of 'speed'), Rcpp::cppFunction("IntegerVector doubleThis(const IntegerVector x) { return x * 2; }") Rcpp::cppFunction("std::vector incrThis(std::vector x) { x[0] += 1; return x; }") or that Rcpp should become more careful (i.e., should not allow!) modifying arguments with NAMED != 0. Martin (Morgan) Dirk | | ## now "test": | require(microbenchmark) | i <- 1:10 | (mb <- microbenchmark(doubleThis(i), i*2, 2*i, i*2L, 2L*i, i+i, times=2^12)) | ## Lynne (i7; FC 20), R Under development ... (2015-03-02 r67924): | ## Unit: nanoseconds | ## expr min lq mean median uq max neval cld | ## doubleThis(i) 762 985 1319.5974 1124 1338 17831 4096 b | ## i * 2 124 151 258.4419164 221 4 4096 a | ## 2 * i 127 154 266.4707169 216 20213 4096 a | ## i * 2L 143 164 250.6057181 234 16863 4096 a | ## 2L * i 144 177 269.5015193 237 16119 4096 a | ## i + i 152 183 272.6179199 243 10434 4096 a | | plot(mb, log="y", notch=TRUE) | ## hmm, looks like even the simple arithm. differ slightly ... | ## | ## ==> zoom in: | plot(mb, log="y", notch=TRUE, ylim = c(150,300)) | | dev.copy(png, file="mbenchm-doubling.png") | dev.off() # [ <- why do I need this here for png ??? ] | ##--> see the appended *png graphic | | Those who've learnt EDA or otherwise about boxplot notches, will | know that they provide somewhat informal but robust pairwise tests on | approximate 5% level. | >From these, one *could* - possibly wrongly - conclude that | 'i * 2' is significantly faster than both 'i * 2L' and also | 'i + i' which I find astonishing, given that i is integer here... | | Probably no reason for deep thoughts here, but if someone is | enticed, this maybe slightly interesting to read. | | Martin Maechler, ETH Zurich | | [DELETED ATTACHMENT mbenchm-doubling.png, PNG image] -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] vapply definition question
On 12/16/2014 08:20 PM, Mick Jordan wrote: vapply <- function(X, FUN, FUN.VALUE, ..., USE.NAMES = TRUE) { FUN <- match.fun(FUN) if(!is.vector(X) || is.object(X)) X <- as.list(X) .Internal(vapply(X, FUN, FUN.VALUE, USE.NAMES)) } This is an implementor question. Basically, what happened to the '...' args in the call to the .Internal? cf lapply:, where the ... is passed. lapply <- function (X, FUN, ...) { FUN <- match.fun(FUN) ## internal code handles all vector types, including expressions ## However, it would be OK to have attributes which is.vector ## disallows. if(!is.vector(X) || is.object(X)) X <- as.list(X) ##TODO ## Note ... is not passed down. Rather the internal code ## evaluates FUN(X[i], ...) in the frame of this function .Internal(lapply(X, FUN, ...)) } Now both of these functions work when extra arguments are passed, so evidently the implementation can function whether the .Internal "call" contains the ... or not. I found other cases, notably in S3 generic methods where the ... is not passed down. Hi Mick -- You can see that the source code doesn't contain '...' in the final line ~/src/R-devel/src/library/base/R$ svn annotate lapply.R | grep Internal\(l 38631 ripley .Internal(lapply(X, FUN)) and that it's been there for a long time (I'd guess 'forever') ~/src/R-devel/src/library/base/R$ svn log -r38631 r38631 | ripley | 2006-07-17 14:30:55 -0700 (Mon, 17 Jul 2006) | 2 lines another attempt at a faster lapply so I guess you're looking at a modified version of the function... The implementation detail is in the comment -- FUN(X[i], ...) is evaluated in the frame of lapply. Martin Morgan So, essentially, my question is whether the vapply code "should" be changed or whether a .Internal implementation should always assume an implicit ... regardless of the code, if the semantics requires it. Thanks Mick __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R string comparisons may vary with platform (plain text)
For many scientific applications one is really dealing with ASCII characters and LC_COLLATE="C", even if the user is running in non-C locales. What robust approaches (if any?) are available to write code that sorts in a locale-independent way? The Note in ?Sys.setlocale is not overly optimistic about setting the locale within a session. Martin Morgan On 11/23/2014 03:44 AM, Prof Brian Ripley wrote: On 23/11/2014 09:39, peter dalgaard wrote: On 23 Nov 2014, at 01:05 , Henrik Bengtsson wrote: On Sat, Nov 22, 2014 at 12:42 PM, Duncan Murdoch wrote: On 22/11/2014, 2:59 PM, Stuart Ambler wrote: A colleague¹s R program behaved differently when I ran it, and we thought we traced it probably to different results from string comparisons as below, with different R versions. However the platforms also differed. A friend ran it on a few machines and found that the comparison behavior didn¹t correlate with R version, but rather with platform. I wonder if you¹ve seen this. If it¹s not some setting I¹m unaware of, maybe someone should look into it. Sorry I haven¹t taken the time to read the source code myself. Looks like a collation order issue. See ?Comparison. With the oddity that both platforms use what look like similar locales: LC_COLLATE=en_US.UTF-8 LC_COLLATE=en_US.utf8 It's the sort of thing thay I've tried to wrap my mind around multiple times and failed, but have a look at http://stackoverflow.com/questions/19967555/postgres-collation-differences-osx-v-ubuntu which seems to be essentially the same issue, just for Postgres. If you have the stamina, also look into the python question that it links to. As I understand it, there are two potential reasons: Either the two platforms are not using the same collation table for en_US, or at least one of them is not fully implementing the Unicode Collation Algorithm. And I have seen both with R. At the very least, check if ICU is being used (capabilities("ICU") in current R, maybe not in some of the obsolete versions seen in this thread). As a further possibility, there are choices in the UCA (in R, see ?icuSetCollate) and ICU can be compiled with different default choices. It is not clear to me what (if any) difference ICU versions make, but in R-devel extSoftVersion() reports that. In general, collation is a minefield: Some languages have the same letters in different order (e.g. Estonian with Z between S and T); accented characters sort with the unaccented counterpart in some languages but as separate characters in others; some locales sort ABab, others AaBb, yet others aAbB; sometimes punctuation is ignored, sometimes not; sometimes multiple characters count as one, etc. As ?Comparison has long said. -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Changing style for the Sweave vignettes
On 11/13/2014 03:09 AM, January Weiner wrote: As a user, I am always annoyed beyond measure that Sweave vignettes precede the code by a command line prompt. It makes running examples by simple copying of the commands from the vignette to the console a pain. I know the idea is that it is clear what is the command, and what is the output, but I'd rather precede the output with some kind of marking. Is there any other solution possible / allowed in vignettes? I would much prefer to make my vignettes easier to use for people like me. Vignettes do not need to be generated by Sweave and to pdf documents. My current favorite (e.g., recent course material at http://bioconductor.org/help/course-materials/ which uses styling from the BiocStyle package http://bioconductor.org/packages/release/bioc/html/BiocStyle.html) uses the knitr package (see http://yihui.name/knitr/) to produce HTML vignettes (knitr will also process Rnw files to pdf with perhaps more appealing styling, see, e.g., http://bit.ly/117OLVl for an example of PDF output). The mechanics are discussed in Writing R Extensions (RShowDoc('R-exts')), section 1.4.2 Non-Sweave vignettes. There are three steps involved: specifying a \VignetteEngine in the vignette itself, specifying VignetteBuilder: field in the DESCRIPTION file, and including the package providing the engine (knitr, in my case) in the Suggests: field of the DESCRIPTION file. Brian mentioned processing the vignette to it's underlying code; see ?browseVignettes and ?vignette for installed packages, and ?Stangle in R and R CMD Stangle for extracting the R code from stand-alone vignettes to .R files. Martin Morgan Kind regards, j. -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] How to maintain memory in R extension
On 11/12/2014 05:36 AM, Zheng Da wrote: Hello, I wrote a system to perform data analysis in C++. Now I am integrating it to R. I need to allocate memory for my own C++ data structures, which can't be represented by any R data structures. I create a global hashtable to keep a reference to the C++ data structures. Whenever I allocate one, I register it in the hashtable and return its key to the R code. So later on, the R code can access the C++ data structures with their keys. The problem is how to perform garbage collection on the C++ data structures. Once an R object that contains the key is garbage collected, the R code can no longer access the corresponding C++ data structure, so I need to deallocate it. Is there any way that the C++ code can get notification when an R object gets garbage collected? If not, what is the usual way to manage memory in R extensions? register a finalizer that runs when there are no longer references to the R object, see ?reg.finalizer or the interface to R and C finalizers in Rinternals.h. If you return more than one reference to a key, then of course you'll have to manage these in your own C++ code. Martin Morgan Thanks, Da __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] package vignettes build in the same R process?
If I understand correctly, all vignettes in a package are built in the same R process. Global options, loaded packages, etc., in an earlier vignette persist in later vignettes. This can introduce user confusion (e.g., when a later vignette builds successfully because a package is require()'ed in an earlier vignette, but not the current one), difficult-to-identify bugs (e.g., when a setting in an earlier vignette influences calculation in a latter vignette), and misleading information about reproducibility (e.g., when the sessionInfo() of a later vignette reflects packages used in earlier vignettes). I believe the relevant code is at src/library/tools/R/Vignettes.R:505 output <- tryCatch({ ## FIXME: run this in a separate process engine$weave(file, quiet = quiet) setwd(startdir) find_vignette_product(name, by = "weave", engine = engine) }, error = function(e) { stop(gettextf("processing vignette '%s' failed with diagnostics:\n%s", file, conditionMessage(e)), domain = NA, call. = FALSE) }) Is building of each vignette in separate processes a reasonable feature request? Martin -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Options that are local to the package that sets them
On 10/31/2014 05:55 PM, Gábor Csárdi wrote: On Fri, Oct 31, 2014 at 8:16 PM, William Dunlap wrote: You can put the following 3 objects, an environment and 2 functions that access it, in any package that need some package-specific storage (say your pkgB1 and pkgB2). .pkgLocalStorage <- new.env(parent = emptyenv()) assignInPkgLocalStorage <- function(name, object) { .pkgLocalStorage[[name]] <- object } getFromPkgLocalStorage <- function(name, object) { .pkgLocalStorage[[name]] } Leave the environment private and export the functions. Then a user can use them as pkgB1::assignInPkgLocalStorage("myPallete", makeAPallete(1,2,3)) pkgB2::assignInPkgLocalStorage("myPallete", makeAPallete(5,6,7)) pkgB1::getFromPkgLocalStorage("myPallete") # get the 1,2,3 pallete I am trying to avoid requiring pkgBn to do this kind of magic. I just want it to call function(s) from pkgA. But maybe something like this would work. In pkgBn: my_palettes <- pkgA::palette_factory() and my_palettes is a function or an environment that has the API functions to modify my_palettes itself (via closure if it is a function), e.g. my_palettes$add_palette(...) my_palettes$get_palette(...) or if it is a function, then my_palettes(add(...), ...) my_palettes(get(...), ...) etc. This would work, right? I'll try it in a minute. You'll need pkgA to be able to know that pkgB1's invokation is to use pkgB1's parameters, so coupling state (parameters) with function, i.e., a class with methods. So a solution is to use an S4 or reference class and generator to encapsulate state and dispatch to appropriate functions, E.g., .Plotter <- setRefClass("Plotter", fields=list(palette="character"), methods=list( update(palette) { .self$palette <- palette }, plot=function(...) { graphics::plot(..., col=.self$palette) })) APlotter <- function(palette=c("red", "green", "blue")) .Plotter(palette=palette) PkgB1, 2 would then plt = APlotter() plt$plot(mpg ~ disp, mtcars) plt$update(c("blue", "green")) plt$plot(mpg ~ disp, mtcars) or .S4Plotter <- setClass("S4Plotter", representation(palette="character") S4Plotter <- function(palette=c("red", "blue", "green")) s4plot <- function(x, ...) graphics::plot(..., col=x@palette)) (make s4plot a generic with method for class S4Plotter to enforce type). Seems like this interface could be generated automatically in .onLoad() of pkgA, especially if adopting a naming convention of some sort. Martin Gabor If only one of pkgB1 and pkgB2 is loaded you can leave off the pkgBn::. A package writer can always leave off the pkgBn:: as well. Bill Dunlap TIBCO Software wdunlap tibco.com On Fri, Oct 31, 2014 at 4:34 PM, Gábor Csárdi wrote: Dear All, I am trying to do the following, and could use some hints. Suppose I have a package called pkgA. pkgA exposes an API that includes setting some options, e.g. pkgA works with color palettes, and the user of the package can define new palettes. pkgA provides an API to manipulate these palettes, including defining them. pkgA is intended to be used in other packages, e.g. in pkgB1 and pkgB2. Now suppose pkgB1 and pkgB2 both set new palettes using pkgA. They might set palettes with the same name, of course, they do not know about each other. My question is, is there a straightforward way to implement pkgA's API, such that pkgB1 and pkgB2 do not interfere? In other words, if pkgB1 and pkgB2 both define a palette 'foo', but they define it differently, each should see her own version of it. I guess this requires that I put something (a function?) in both pkgB1's and pkgB2's package namespace. As I see it, this can only happen when pkgA's API is called from pkgB1 (and pkgB2). So at this time I could just walk up the call tree and put the palette definition in the first environment that is not pkgA's. This looks somewhat messy, and I am probably missing some caveats. Is there a better way? I have a feeling that this is already supported somehow, I just can't find out how. Thanks, Best Regards, Gabor __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] mpi.h errors on Mavericks packages
On 10/03/2014 04:58 PM, Martin Morgan wrote: On 10/03/2014 04:17 PM, Daniel Fuka wrote: Dear mac folks, I have started porting a large legacy toolset maintained in windows and heavily mpi laden so it can be used across platforms in R... so I am building a package out of it. On this note, I am noticing that almost all of the mpi dependent packages do not compile on the CRAN repositories with the basic issue that it appears it can not find mpi installed: configure: error: "Cannot find mpi.h header file" sorry for the noise! you're after mpi and not openMP. Arrgh Martin Hi Dan -- not a mac folk, or particularly expert on the subject, but have you looked at section 1.2.1.1 of RShowDoc("R-exts")? The basic idea is a) check for compiler support via a src/Makevars file that might be like PKG_CFLAGS = $(SHLIB_OPENMP_CFLAGS) PKG_LIBS = $(SHLIB_OPENMP_CFLAGS) b) conditionally include mpi header files and execute mpi code with #ifdef SUPPORT_OPENMP #include #endif and similarly for #pragma's and other mpi-isms littered through your code? Likely this gets quite tedious for projects making extensive use of openMP. Martin I do not see any chatter about mpi issues in the lists since the inception of mavericks.. and possibly this question should go to Simon.. but in case I missed a discussion, or if anyone has any suggestions on how to proceed, or what might be missing from the Rmpi, npRmpi, etc. packages for compilation on Mavericks, it would be greatly appreciated if you could let me know.. and maybe I can help fix the other packages as well. Thanks for any help or pointers to guide me! dan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] mpi.h errors on Mavericks packages
On 10/03/2014 04:17 PM, Daniel Fuka wrote: Dear mac folks, I have started porting a large legacy toolset maintained in windows and heavily mpi laden so it can be used across platforms in R... so I am building a package out of it. On this note, I am noticing that almost all of the mpi dependent packages do not compile on the CRAN repositories with the basic issue that it appears it can not find mpi installed: configure: error: "Cannot find mpi.h header file" Hi Dan -- not a mac folk, or particularly expert on the subject, but have you looked at section 1.2.1.1 of RShowDoc("R-exts")? The basic idea is a) check for compiler support via a src/Makevars file that might be like PKG_CFLAGS = $(SHLIB_OPENMP_CFLAGS) PKG_LIBS = $(SHLIB_OPENMP_CFLAGS) b) conditionally include mpi header files and execute mpi code with #ifdef SUPPORT_OPENMP #include #endif and similarly for #pragma's and other mpi-isms littered through your code? Likely this gets quite tedious for projects making extensive use of openMP. Martin I do not see any chatter about mpi issues in the lists since the inception of mavericks.. and possibly this question should go to Simon.. but in case I missed a discussion, or if anyone has any suggestions on how to proceed, or what might be missing from the Rmpi, npRmpi, etc. packages for compilation on Mavericks, it would be greatly appreciated if you could let me know.. and maybe I can help fix the other packages as well. Thanks for any help or pointers to guide me! dan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] install.packages misleads about package availability?
In the context of installing a Bioconductor package using our biocLite() function, install.packages() warns > install.packages("RUVSeq", repos="http://bioconductor.org/packages/2.14/bioc";) Installing package into '/home/mtmorgan/R/x86_64-unknown-linux-gnu-library/3.1-2.14' (as 'lib' is unspecified) Warning message: package 'RUVSeq' is not available (for R version 3.1.1 Patched) but really the problem is that the package is not available at the specified repository (it is available, for the same version of R, in the Bioc devel repository http://bioconductor.org/packages/3.0/bioc). I can see the value of identifying the R version, and see that mentioning something about 'specified repositories' would not necessarily be helpful. Also, since the message is translated and our user base is international, it is difficult to catch and process by the biocLite() script. Is there a revised wording that could be employed to more accurately convey the reason for the failure, or is this an opportunity to use the condition system? Index: src/library/utils/R/packages2.R === --- src/library/utils/R/packages2.R (revision 66562) +++ src/library/utils/R/packages2.R (working copy) @@ -46,12 +46,12 @@ p0 <- unique(pkgs) miss <- !p0 %in% row.names(available) if(sum(miss)) { - warning(sprintf(ngettext(sum(miss), -"package %s is not available (for %s)", -"packages %s are not available (for %s)"), - paste(sQuote(p0[miss]), collapse=", "), - sub(" *\\(.*","", R.version.string)), -domain = NA, call. = FALSE) +txt <- ngettext(sum(miss), "package %s is not available (for %s)", +"packages %s are not available (for %s)") +msg <- simpleWarning(sprintf(txt, paste(sQuote(p0[miss]), collapse=", "), + sub(" *\\(.*","", R.version.string))) +class(msg) <- c("packageNotAvailable", class(msg)) +warning(msg) if (sum(miss) == 1L && !is.na(w <- match(tolower(p0[miss]), tolower(row.names(available) { -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Re R CMD check checking in development version of R
On 08/28/2014 05:52 AM, Hadley Wickham wrote: I'd say: Depends is a historical artefact from ye old days before package namespaces. Apart from depending on a specific version of R, you should basically never use depends. (The one exception is, as mentioned in R-exts, if you're writing something like latticeExtras that doesn't make sense unless lattice is already loaded). Keeping this nuance in mind when when discussing Depends vs Imports is important so as to not suggest that there isn't any reason to use Depends any longer. A common case in Bioconductor is that a package defines a class and methods intended for the user; this requires the package to be on the search path (else the user wouldn't be able to do anything with the returned object). A class and supporting methods can represent significant infrastructure, so that it makes sense to separate these in distinct packages. It is not uncommon to find 3-5 or more packages in the Depends: field of derived packages for this reason. For that scenario, is it reasonable to say that every package in depends must also be in imports? Important to pay attention to capitalization here. A package listed in Depends: _never_ needs to be listed in Imports:, but will often be import()'ed (in one way or another) in the NAMESPACE. Some would argue that listing a package in Depends: and Imports: in this case clarifies intent -- provides functionality available to the user, and important for the package itself. Others (such as R CMD check) view the replication as redundancy. I think one can imagine scenarios where a package in the Depends: fields does not actually have anything import()'ed, e.g., PkgA defines a class, PkgB provides some special functionality that returns the class PkgC use PkgB's special functionality without ever manipulating the object of PkgA. Martin Morgan Hadley -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Re R CMD check checking in development version of R
On 08/27/2014 08:33 PM, Gavin Simpson wrote: On Aug 27, 2014 5:24 PM, "Hadley Wickham" I'd say: Depends is a historical artefact from ye old days before package namespaces. Apart from depending on a specific version of R, you should basically never use depends. (The one exception is, as mentioned in R-exts, if you're writing something like latticeExtras that doesn't make sense unless lattice is already loaded). Keeping this nuance in mind when when discussing Depends vs Imports is important so as to not suggest that there isn't any reason to use Depends any longer. A common case in Bioconductor is that a package defines a class and methods intended for the user; this requires the package to be on the search path (else the user wouldn't be able to do anything with the returned object). A class and supporting methods can represent significant infrastructure, so that it makes sense to separate these in distinct packages. It is not uncommon to find 3-5 or more packages in the Depends: field of derived packages for this reason. Martin I am in full agreement that its use should be limited to exceptional situations, and have modified my packages accordingly. Cheers, G This check (whilst having found some things I should have imported and didn't - which is a good thing!) seems to be circumventing the intention of having something in Depends. Is Depends going to go away? I don't think it's going to go away anytime soon, but you should consider it to be largely deprecated and you should avoid it wherever possible. (And really you shouldn't have any packages in depends, they should all be in imports) I disagree with *any*; having say vegan loaded when one is using analogue is a design decision as the latter borrows heavily from and builds upon vegan. In general I have moved packages that didn't need to be in Depends into Imports; in the version I am currently doing final tweaks on before it goes to CRAN I have remove all but vegan from Depends. I think that is a reasonable use case for depends. Here's the exact text from R-exts: "Field ‘Depends’ should nowadays be used rarely, only for packages which are intended to be put on the search path to make their facilities available to the end user (and not to the package itself): for example it makes sense that a user of package latticeExtra would want the functions of package lattice made available." Personally I avoid even this use, requiring users of my packages to be explicit about exactly what packages are on the search path. You are of course welcome to your own approach, but I think you'll find it will become more and more difficult to maintain in time. I recommend that you bite the bullet now. Put another way, packages should be extremely conservative about global side effects (and modifying the search path is such a side-effect) Hadley -- http://had.co.nz/ [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Should a package that indirectly Suggests: a vignette engine pass R CMD check?
A package uses VignetteEngine: knitr; the package itself does not Suggests: knitr, but it Suggests: BiocStyle which in turn Suggests: knitr. Nonetheless, R CMD check fails indicating that a package required for checking is not declared. Is it really the intention that the original package duplicate Suggests: knitr? This is only with a recent R. In detail, with $ Rdev --version|head -3 R Under development (unstable) (2014-06-14 r65947) -- "Unsuffered Consequences" Copyright (C) 2014 The R Foundation for Statistical Computing Platform: x86_64-unknown-linux-gnu (64-bit) trying to check the Bioconductor genefilter package leads to $ Rdev --vanilla CMD check genefilter_1.47.5.tar.gz * using log directory ‘/home/mtmorgan/b/Rpacks/genefilter.Rcheck’ * using R Under development (unstable) (2014-06-13 r65941) * using platform: x86_64-unknown-linux-gnu (64-bit) * using session charset: UTF-8 * checking for file ‘genefilter/DESCRIPTION’ ... OK * this is package ‘genefilter’ version ‘1.47.5’ * checking package namespace information ... OK * checking package dependencies ... ERROR VignetteBuilder package not declared: ‘knitr’ See the information on DESCRIPTION files in the chapter ‘Creating R packages’ of the ‘Writing R Extensions’ manual. I interpret this to mean that knitr should be mentioned in Suggests: or other dependency field. The package does not Suggests: knitr, but it does Suggests: BiocStyle, which itself Suggests: knitr. The author knows that they are using the BiocStyle package for their vignette, and the BiocStyle package suggests the appropriate builder. Martin Morgan -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R CMD check for the R code from vignettes
On 05/31/2014 03:52 PM, Yihui Xie wrote: Note the test has been done once in weave, since R CMD check will try to rebuild vignettes. The problem is whether the related tools in R should change their tangle utilities so we can **repeat** the test, and it seems the answer is "no" in my eyes. Regards, Yihui -- Yihui Xie Web: http://yihui.name On Sat, May 31, 2014 at 4:54 PM, Gabriel Becker wrote: On Fri, May 30, 2014 at 9:22 PM, Yihui Xie wrote: Hi Kevin, I tend to adopt Henrik's idea, i.e., to provide vignette engines that just ignore tangle. At the moment, it seems R CMD check It is very useful, pedagogically and when reproducing analyses, to be able to source() the tangled .R code into an R session, analogous to running example code with example(). The documentation for ?Stangle does read (Code inside '\Sexpr{}' statements is ignored by 'Stangle'.) So my 'vote' (recognizing that I don't have one of those) is to incorporate \Sexpr{} expressions into the tangled code, or to continue to flag use of Sexpr with side effects as errors (indirectly, by source()ing the tangled code), rather than writing engines that ignore tangle. It is very valuable to all parties to write a vignette with code that is fully evaluated; otherwise, it is too easy for bit rot to seep in, or to 'fake' it in a way that seems innocent but is misleading. Martin Morgan is comfortable with vignettes that do not have corresponding R scripts, and I hope these R scripts will not become mandatory in the future. I'm not sure this is the right approach. This would essentially make the test optional based on decisions by the package author. I'm not arguing in favor if this particular test, but if package authors are able to turn a test off then the test loses quite a bit of it's value. I think that R CMD check has done a great deal for the R community by presenting a uniform, minimum "barrier to entry" for R packages. Allowing package developers to alter the tests it does (other than the obvious case of their own unit tests) would remove that. That having been said, it seems to me that tangle-like utilities should have the option of extracting inline code, and that during R CMD check that option should *always* be turned on. That would solve the problem in question while retaining the test would it not? ~G __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] citEntry handling of encoded URLs
On 05/23/2014 05:35 AM, Achim Zeileis wrote: On Thu, 22 May 2014, Martin Morgan wrote: The following citEntry includes a url with %3A and other encodings citEntry(entry="article", title = "Software for Computing and Annotating Genomic Ranges", author = personList( as.person("Michael Lawrence" )), year = 2013, journal = "{PLoS} Computational Biology", volume = "9", issue = "8", doi = "10.1371/journal.pcbi.1003118", url = "http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118";, textVersion = "Lawrence M..." ) Evaluating this as R code doesn't parse correctly and generates a warning The citEntry (or bibentry) itself is parsed without problem. Some printing styles cause the warning, specifically when the Rd parser is used for formatting. Depending on how you want to print it, the warning doesn't occur though. Using bibentry() directly, we can do: b <- bibentry("Article", title = "Software for Computing and Annotating Genomic Ranges", author = "Michael Lawrence and others", year = "2013", journal = "PLoS Comptuational Biology", volume = "9", number = "8", doi = "10.1371/journal.pcbi.1003118", url = "http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118";, textVersion = "Lawrence M et al. (2013) ..." ) Then the default print(b) issues a warning because the Rd parser thinks that the % are comments. However, print(b, style = "BibTeX") print(b, style = "citation") don't issue warnings and also produce output that one might expect. Thanks for clarifying. For what it's worth, I was aiming for print(b, style="html") A work-around is, apparently, to quote the %, \\%3A etc., but is this the intention? In that case the default print(b) yields the desired output without warning but print(b, style = "BibTeX") or print(b, style = "citation") are possibly not in the desired format. I'm not sure though how the different BibTeX style files actually handle the URLs. I think some .bst files handle the "url" field verbatim (i.e., don't need escaping) while others treat it as text (i.e., need escaping). Personally, I would hence avoid the problem and only use the DOI URL here as this will be robust across BibTeX styles. Nevertheless it is not ideal that there is a discrepancy between the different printing styles. I think currently this can only be avoided if custom macros are employed. But Duncan might be able to say more about this. A similar situation occurs if you use commands that are not part of the Rd markup, e.g. n01 <- bibentry("Misc", title = "The $\\mathcal{N}(0, 1)$ Distribution", author = "Foo Bar", year = "2014") print(n01) # warning print(n01, style = "BibTeX") # ok Also, citEntry points to bibentry points to *Entry Fields*, but the 'url' tag is not mentioned there, even though url appears in the examples; if the list of supported tags is not easy to enumerate, perhaps some insight can be provided at this point as to how the supported tags are determined? This follows the BibTeX conventions. Thus, you can use any tag that you wish to use and it will depend on the style whether it is displayed or not. The only restriction is that certain bibtypes require certain fields, e.g., an "Article" has to specify: author, title, journal, year. But beyond that you can add any additional field. For example, in your bibentry above you used the "issue" field which is ignored by most BibTeX styles. My adaptation uses the "number" field instead which is processed by most standard BibTeX styles. The default print(..., style = "text") uses a bibstyle that is modeled after jss.bst, the BibTeX style employed by the Journal of Statistical Software. But you could plug in other .bibstyle arguments, e.g. one that processes the "issue" field etc. Hope that helps, Yes, that helps a lot, thanks, Martin Z Thanks Martin Morgan -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] citEntry handling of encoded URLs
The following citEntry includes a url with %3A and other encodings citEntry(entry="article", title = "Software for Computing and Annotating Genomic Ranges", author = personList( as.person("Michael Lawrence" )), year = 2013, journal = "{PLoS} Computational Biology", volume = "9", issue = "8", doi = "10.1371/journal.pcbi.1003118", url = "http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118";, textVersion = "Lawrence M..." ) Evaluating this as R code doesn't parse correctly and generates a warning Lawrence M (2013). “Software for Computing and Annotating Genomic Ranges.” _PLoS Computational Biology_, *9*. http://dx.doi.org/10.1371/journal.pcbi.1003118>, http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118}.> Warning message: In parse_Rd(Rd, encoding = encoding, fragment = fragment, ...) : :5: unexpected END_OF_INPUT ' ' A work-around is, apparently, to quote the %, \\%3A etc., but is this the intention? Also, citEntry points to bibentry points to *Entry Fields*, but the 'url' tag is not mentioned there, even though url appears in the examples; if the list of supported tags is not easy to enumerate, perhaps some insight can be provided at this point as to how the supported tags are determined? Thanks Martin Morgan -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel