[Rd] Error "Warning in read_symbols_from_dll(so, rarch): this requires 'objdump.exe' to be on the PATH
Hi all,

I tried to compile my package kml and I got the message "Warning in read_symbols_from_dll(so, rarch): this requires 'objdump.exe' to be on the PATH". I checked 'Writing R Extensions' but did not find any reference to this error. Does someone know how to fix that?

Thank you very much for your help.

Christophe

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] [PATCH] Improve utf8clen and remove utf8_table4
Some of the code that uses utf8clen checks the validity of the utf8 string before making the call. However, there were some hairy areas where I felt that the new semantics may cause issues (if not now, then in future changes). I've attached two patches:

* new_semantics.diff keeps the new semantics and updates those hairy areas above.
* old_semantics.diff maintains the old semantics (return 1 even for continuation bytes).

I don't think the new semantics will cause issues, especially with the updates, but we can err on the side of caution and keep the old semantics. I feel that the new semantics provide a clearer interface though (the function expects a start byte and should return an error if a start byte is not supplied). In either case, the utf8_table4 array has been removed.

Sahil

On 03/19/2017 05:38 AM, Duncan Murdoch wrote:
> On 19/03/2017 2:31 AM, Sahil Kang wrote:
>> Given a char `c' which should be the start byte of a utf8 character,
>> the utf8clen function returns the byte length of the utf8 character.
>>
>> Before this patch, the utf8clen function would return either:
>> * 1 if `c' was an ascii character or a utf8 continuation byte
>> * An int in the range [2, 6] indicating the byte length of the utf8 character
>>
>> With this patch, the utf8clen function will now return either:
>> * -1 if `c' is not a valid utf8 start byte
>> * The byte length of the utf8 character (the number of leading 1's, really)
>>
>> I believe returning -1 for continuation bytes makes utf8clen less error
>> prone. The utf8_table4 array is no longer needed and has been removed.
>
> utf8clen is used internally by R in more than a dozen places, and is
> likely used in packages as well. Have you checked that this change in
> semantics won't break any of those uses?
Duncan Murdoch

Index: src/main/util.c
===
--- src/main/util.c (revision 72365)
+++ src/main/util.c (working copy)
@@ -1183,18 +1183,23 @@
     return TRUE;
 }

-/* Number of additional bytes */
-static const unsigned char utf8_table4[] = {
-  1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
-  1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
-  2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
-  3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5 };
-
+/*
+ * If `c' is not a valid utf8 start byte, return 1.
+ * Otherwise, return the number of bytes in the utf8 string with start byte `c'
+ */
 int attribute_hidden utf8clen(char c)
 {
-    /* This allows through 8-bit chars 10xx, which are invalid */
-    if ((c & 0xc0) != 0xc0) return 1;
-    return 1 + utf8_table4[c & 0x3f];
+    int n = 0;    /* number of leading 1's */
+    int m = 0x80; /* byte mask */
+
+    while (c & m) {
+        ++n;
+        m >>= 1;
+    }
+
+    if (n == 0) return 1;      /* an ascii char of the form 0xxx */
+    else if (n == 1) return 1; /* invalid start byte of the form 10xx */
+    else return n;
 }

 /* These return the result in wchar_t, but does assume

Index: src/main/valid_utf8.h
===
--- src/main/valid_utf8.h (revision 72365)
+++ src/main/valid_utf8.h (working copy)
@@ -75,7 +75,7 @@
     if (c < 0xc0) return 1;  /* Isolated 10xx byte */
     if (c >= 0xfe) return 1; /* Invalid 0xfe or 0xff bytes */
-    ab = utf8_table4[c & 0x3f]; /* Number of additional bytes */
+    ab = utf8clen(c) - 1;       /* Number of additional bytes */
     if (length < ab) return 1;
     length -= ab; /* Length remaining */

Index: src/main/character.c
===
--- src/main/character.c (revision 72365)
+++ src/main/character.c (working copy)
@@ -276,7 +276,9 @@
     if (ienc == CE_UTF8) {
         const char *end = str + strlen(str);
         for (i = 0; i < so && str < end; i++) {
-            int used = utf8clen(*str);
+            used = utf8clen(*str);
+            if (used < 0) used = 1; /* gobble up invalid utf8 start byte */
+
             if (i < sa - 1) { str += used; continue; }
             for (j = 0; j < used; j++) *buf++ = *str++;
         }
@@ -459,10 +461,18 @@
     int i, in = 0, out = 0;
     if (ienc == CE_UTF8) {
-        for (i = 1; i < sa; i++) buf += utf8clen(*buf);
+        int len;
+        for (i = 1; i < sa; i++) {
+            len = utf8clen(*buf);
+            buf += len < 0 ? 1 : len;
+        }
         for (i = sa; i <= so && in < strlen(str); i++) {
-            in += utf8clen(str[in]);
-            out += utf8clen(buf[out]);
+            len = utf8clen(str[in]);
+            in += len < 0 ? 1 : len;
+
+            len = utf8clen(buf[out]);
+            out += len < 0 ? 1 : len;
+
             if (!str[in]) break;
         }
         if (in != out) memmove(buf+in, buf+out, strlen(buf+out)+1);

Index: src/main/connections.c
===
--- src/main/connections.c (revision 72365)
+++ src/main/connections.c (working copy)
@@ -4408,6 +4408,7 @@
         if (iread >= nbytes) break;
         q = bytes + iread;
         clen = utf8clen(*q);
+        if (clen < 0) clen = 1; /* gobble up invalid utf8 start byte */
        if
Re: [Rd] Experimental CXX_STD problem in R 3.4
C++ support across different platforms is now very heterogeneous. The standard is evolving rapidly, but there are also platforms in current use that do not support the recent iterations of the standard. Our goal for R 3.4.0 is to give as much flexibility as possible.

The default compiler is whatever you get "out of the box" without setting the "-std=" flag. This means different things on different platforms. If you need a specific standard there are various ways to request one, as described in the R-exts manual. On unix-alikes, the capabilities of the compiler are tested at configure time and appropriate flags chosen to support each standard. On Windows, the capabilities are hard-coded and correspond to the current version of Rtools, i.e. only C++98 and C++11 are currently supported.

C++17 support is experimental and was added very recently. Clang 4.0.0, which was released last week, passes the configuration tests for C++17, and so does gcc 7.0.1, the pre-release version of gcc 7.1.0 which is due out later this year. The tests for C++17 features are, however, incomplete.

I have just added some code to ensure that the compilation fails with an informative error message if a specific C++ standard is requested but the corresponding compiler has not been defined. Please test this.

Martyn

From: R-devel on behalf of Dirk Eddelbuettel
Sent: 18 March 2017 15:55
To: Jeroen Ooms
Cc: r-devel
Subject: Re: [Rd] Experimental CXX_STD problem in R 3.4

On 18 March 2017 at 14:21, Jeroen Ooms wrote:
| R 3.4 has 'experimental' support for setting CXX_STD to CXX98 / CXX11
| / CXX14 / CXX17.

R 3.1.0 introduced CXX11 support. R 3.4.0 will have CXX14 support. So I would only refer to the CXX17 part as experimental.

| However on most platforms, the R configuration seems to leave the
| CXX1Y and CXX1Z fields blank in "${R_HOME}/etc/Makeconf" (rather than
| falling back on default CXX). Therefore specifying e.g. CXX_STD=CXX14
| will fail build with cryptic errors (due to compiling with CXX="")

That depends of course on the compiler found on the system. On my box (with g++ being g++-6.2 which _defaults_ to C++14) all is well up to CXX1Y. But I also have CXX1Z empty.

| I don't think this is intended? Some examples from r-devel on Windows:
|
| CXX11: https://win-builder.r-project.org/R8gg703OQSq5/
| CXX98: https://win-builder.r-project.org/mpVfXxk79FaN/
| CXX14: https://win-builder.r-project.org/L3BSMgAk4cQ7/
| CXX17: https://win-builder.r-project.org/3ETZXrgkg77I/

You can't expect CXX14 and CXX17 to work with the only available compiler there, g++-4.9.3.

| Similar problems appear on Linux. I think the problem is that Makeconf
| contains e.g:
|
| CXX1Z =
| CXX1ZFLAGS =
| CXX1ZPICFLAGS =
| CXX1ZSTD =
|
| When CXX_STD contains any other unsupported value (e.g. CXX24) R
| simply falls back on the default CXX configuration. The same should
| probably happen for e.g. CXX17 when CXX1Z is unset in Makeconf?

Probably.

Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
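[Editor's note: for package authors following this thread, a package opts into a specific standard via the CXX_STD makefile variable, as documented in Writing R Extensions; a minimal src/Makevars sketch of what the posters are testing:]

```make
# src/Makevars -- request C++14 for this package's compiled code.
# With the behavior discussed above, this fails (or compiles with an
# empty CXX) on toolchains where no C++14 compiler was detected at
# configure time, e.g. the g++-4.9.3 Rtools toolchain on Windows.
CXX_STD = CXX14
```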
Re: [Rd] RFC: (in-principle) native unquoting for standard evaluation
On Mon, Mar 20, 2017 at 7:36 AM, Radford Neal wrote:
> Michael Lawrence (as last in long series of posters)...
>
>> Yes, it would bind the language object to the environment, like an
>> R-level promise (but "promise" of course refers specifically to just
>> _lazy_ evaluation).
>>
>> For the uqs() thing, expanding calls like that is somewhat orthogonal
>> to NSE. It would be nice in general to be able to write something like
>> mean(x, extra_args...) without resorting to do.call(mean, c(list(x),
>> extra_args)). If we had that then uqs() would just be the combination
>> of unquote and expansion, i.e., mean(x, @extra_args...). The "..."
>> postfix would not work since it's still a valid symbol name, but we
>> could come up with something.
>
> I've been trying to follow this proposal, though without tracking down
> all the tweets, etc. that are referenced. I suspect I'm not the only
> reader who isn't clear exactly what is being proposed. I think a
> detailed, self-contained proposal would be useful.

We have a working implementation (which I'm calling tidyeval) in https://github.com/hadley/rlang, but we have yet to write it up. We'll spend some time documenting it since it seems to be of broader interest.

> One thing I'm not clear on is whether the proposal would add anything
> semantically beyond what the present "eval" and "substitute" functions
> can do fairly easily. If not, is there really any need for a slightly
> more concise syntax? Is it expected that the new syntax would be used
> lots by ordinary users, or is it only for the convenience of people
> who are writing fairly esoteric functions (which might then be used by
> many)? If the latter, it seems undesirable to me.
I accidentally responded off-list to Michael, but I think there are three legs to the "tidy" style of NSE:

1) capturing a quosure from a promise
2) quasiquotation (unquote + unquote-splice)
3) pronouns, so you can be explicit about where a variable should be looked up (.data vs .env)

These are largely orthogonal, but I don't think you can solve the most important NSE problems without all three. Just having 1) in base R would be a big step forward.

> There is an opportunity cost to grabbing the presently-unused unary @
> operator for this, in that it might otherwise be used for some other
> extension. For example, see the last five slides in my talk at
> http://www.cs.utoronto.ca/~radford/ftp/R-lang-ext.pdf for a different
> proposal for a new unary @ operator. I'm not necessarily advocating
> that particular use (my ideas in this respect are still undergoing
> revisions), but the overall point is that there may well be several
> good uses of a unary @ operator (and there aren't many other good
> characters to use for a unary operator besides @). It is unclear to
> me that the current proposal is the highest-value use of @.

A further extension would be to allow binary @ in function argument names; then the LHS could be an arbitrary string used as an extension mechanism.

Hadley
--
http://hadley.nz
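[Editor's note: for readers wanting a concrete picture of unquoting using only base R, here is a sketch with bquote(), whose .() plays the role of the proposed unary @ unquote marker. This illustrates the idea only; it is not the tidyeval implementation.]

```r
# bquote() quotes its argument, but evaluates anything wrapped in .()
# and splices the result into the expression -- i.e. unquoting.
v <- quote(cyl)
expr <- bquote(mean(.(v), na.rm = TRUE))
print(expr)         # mean(cyl, na.rm = TRUE)
eval(expr, mtcars)  # evaluates the built call with mtcars as the data
```

What bquote() cannot easily give you are the other two legs above: quosures (an expression bundled with its environment) and pronouns, which is where the rlang work goes beyond base R.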
Re: [Rd] RFC: (in-principle) native unquoting for standard evaluation
Michael Lawrence (as last in long series of posters)...

> Yes, it would bind the language object to the environment, like an
> R-level promise (but "promise" of course refers specifically to just
> _lazy_ evaluation).
>
> For the uqs() thing, expanding calls like that is somewhat orthogonal
> to NSE. It would be nice in general to be able to write something like
> mean(x, extra_args...) without resorting to do.call(mean, c(list(x),
> extra_args)). If we had that then uqs() would just be the combination
> of unquote and expansion, i.e., mean(x, @extra_args...). The "..."
> postfix would not work since it's still a valid symbol name, but we
> could come up with something.

I've been trying to follow this proposal, though without tracking down all the tweets, etc. that are referenced. I suspect I'm not the only reader who isn't clear exactly what is being proposed. I think a detailed, self-contained proposal would be useful.

One thing I'm not clear on is whether the proposal would add anything semantically beyond what the present "eval" and "substitute" functions can do fairly easily. If not, is there really any need for a slightly more concise syntax? Is it expected that the new syntax would be used lots by ordinary users, or is it only for the convenience of people who are writing fairly esoteric functions (which might then be used by many)? If the latter, it seems undesirable to me.

There is an opportunity cost to grabbing the presently-unused unary @ operator for this, in that it might otherwise be used for some other extension. For example, see the last five slides in my talk at http://www.cs.utoronto.ca/~radford/ftp/R-lang-ext.pdf for a different proposal for a new unary @ operator. I'm not necessarily advocating that particular use (my ideas in this respect are still undergoing revisions), but the overall point is that there may well be several good uses of a unary @ operator (and there aren't many other good characters to use for a unary operator besides @).
It is unclear to me that the current proposal is the highest-value use of @.

Radford Neal
Re: [Bioc-devel] Writing examples for accessing web API
A belated thanks for this update.

Martin

On 03/08/2017 09:53 AM, Welliton Souza wrote:

Martin,

I am using the "Template for Resource Queries" (http://bioconductor.org/developers/how-to/web-query/). I think the correct version is:

while (N.TRIES > 0L) { # line 5

instead of

while (N.TRIES >= 0L) { # line 5

because the loop will run twice when N.TRIES = 1L. At the end of the loop N.TRIES should be 0, not -1, or this line will fail:

if (N.TRIES == 0L) { # line 12

Welliton

On Tue, Mar 7, 2017 at 5:15 PM Welliton Souza wrote:

Thank you Martin for the response, it was very helpful. I am facing difficulties in writing code examples, unit tests and vignettes for accessing web APIs. I understand that the use of \dontrun{} is not the best solution for examples. I am using testthat::skip_on_bioc() to avoid time-consuming steps and internet connections during execution of unit tests. I am also using pre-cached response data (Rda file) for vignettes because I didn't find a cache solution for knitr/Rmarkdown. A package that helps us cache web response data and time-consuming results for use in examples, unit tests and vignettes would be very useful.

Welliton

On Tue, Mar 7, 2017 at 4:49 PM Martin Morgan wrote:

On 03/07/2017 02:35 PM, Welliton Souza wrote:
> Thank you Nan Xiao, I think it is a good solution for my case. I will put
> the definition of host outside the \dontrun{}.

Actually, generally, one doesn't want to work around tests with hacks, but either (a) address the underlying problem or (b) justify what the current behavior is and live with the warning. Writing \dontrun{} doesn't help at all -- it might work now, but will suffer from 'bit rot' in the future; example and vignette code is littered with examples like this.

The usual problem with web resources is that the functions that access them assume that the resource is reliably available, and that bandwidth is infinite.
In reality, the web resource is likely to experience periodic down time, and the bandwidth can be cripplingly slow. There are a few notes here http://bioconductor.org/developers/how-to/web-query/ discussing better practice for querying web resources. Avoiding infinitely querying a web resource and including a time out are two important steps.

Ideally one would like unit tests that verify that the web resource is in fact alive, and that the expected API is available. One would not necessarily test the entire API, but rely instead on serialized instances to ensure that one parses the correct information.

With examples, there is a combination of judicious use of web calls coupled with re-use of data from previous calls. The forthcoming BiocFileCache (https://github.com/Bioconductor/BiocFileCache) might be used to create a temporary cache used during the duration of the build / check, limiting the number of unique calls required while still exposing substantial functionality in the example pages.

Martin

>
> Welliton
>
> On Tue, Mar 7, 2017 at 16:27, Nan Xiao wrote:
>
>> - this check can be bypassed by writing "partially working" examples,
>> for instance:
>>
>> #' token = "foo"
>> #' \dontrun{
>> #' api(..., token)}
>>
>> Best,
>> -Nan
>>
>> On Tue, Mar 7, 2017 at 2:13 PM, Welliton Souza wrote:
>>
>> Hi Sean,
>>
>> It doesn't require authentication. I've been using this server
>> (http://1kgenomes.ga4gh.org) to provide working examples and do unit tests
>> but I am not responsible for this server. However the package was developed
>> to access any server endpoint that uses the same API. There are many web
>> resources to cover, and the package takes some time to run all examples.
>>
>> Welliton
>>
>> On Tue, Mar 7, 2017 at 16:03, Sean Davis wrote:
>>
>> Hi, Welliton.
>>
>> Great question. Just out of curiosity, what are the internet connection
>> requirements that preclude running examples? Is authentication required?
>> Or are you connecting to a server that runs only intermittently?
>>
>> Sean
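[Editor's note: a sketch of the retry loop from the web-query template with Welliton's `> 0L` fix applied; `query_fun` is a placeholder for whatever function performs the web request, not part of the template.]

```r
N.TRIES <- 3L
query_fun <- function() "response"  # placeholder; a real query may fail

while (N.TRIES > 0L) {                        # line 5, with the '>' fix
    result <- tryCatch(query_fun(), error = identity)
    if (!inherits(result, "error"))
        break                                 # success: stop retrying
    N.TRIES <- N.TRIES - 1L
}
if (N.TRIES == 0L)                            # line 12: reached only on failure
    stop("request failed: ", conditionMessage(result))
```

With the original `>= 0L` condition, exhausting the retries would leave N.TRIES at -1, so the `== 0L` check on line 12 could never fire.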
Re: [Rd] [PATCH] Improve utf8clen and remove utf8_table4
On 19/03/2017 2:31 AM, Sahil Kang wrote:
> Given a char `c' which should be the start byte of a utf8 character,
> the utf8clen function returns the byte length of the utf8 character.
>
> Before this patch, the utf8clen function would return either:
> * 1 if `c' was an ascii character or a utf8 continuation byte
> * An int in the range [2, 6] indicating the byte length of the utf8 character
>
> With this patch, the utf8clen function will now return either:
> * -1 if `c' is not a valid utf8 start byte
> * The byte length of the utf8 character (the number of leading 1's, really)
>
> I believe returning -1 for continuation bytes makes utf8clen less error
> prone. The utf8_table4 array is no longer needed and has been removed.

utf8clen is used internally by R in more than a dozen places, and is likely used in packages as well. Have you checked that this change in semantics won't break any of those uses?

Duncan Murdoch
[Rd] outer not applying a constant function
Hi,

the function outer cannot apply a constant function, as in the last line of the following example:

> xg <- 1:4
> yg <- 1:4
> fxyg <- outer(xg, yg, function(x, y) x*y)
> fconstg <- outer(xg, yg, function(x, y) 1.0)
Error in outer(xg, yg, function(x, y) 1) :
  dims [product 16] do not match the length of object [1]

Of course there are simpler ways to construct a constant matrix; that is not my point. It happens for me in the context of generating matrices of partial derivatives, and if one of these partial derivatives happens to be constant it fails. So e.g. this works:

library(Deriv)
f <- function(x, y) (x-1.5)*(y-1)*(x-1.8)+(y-1.9)^2*(x-1.1)^3
fx <- Deriv(f, "x")
fy <- Deriv(f, "y")
fxy <- Deriv(Deriv(f, "y"), "x")
fxx <- Deriv(Deriv(f, "x"), "x")
fyy <- Deriv(Deriv(f, "y"), "y")
fg <- outer(xg, yg, f)
fxg <- outer(xg, yg, fx)
fyg <- outer(xg, yg, fy)
fxyg <- outer(xg, yg, fxy)
fxxg <- outer(xg, yg, fxx)
fyyg <- outer(xg, yg, fyy)

And with f <- function(x, y) x+y it stops working. Of course I can manually fix this for that special case, but that's not my point. I simply thought "outer" should be able to handle constant functions.

Best regards
Albrecht
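[Editor's note: the error arises because outer() calls FUN once with the fully expanded grids, so FUN must return a vector of length length(X)*length(Y); a constant function returns length 1. A sketch of the usual workaround, wrapping the scalar function with Vectorize():]

```r
xg <- 1:4
yg <- 1:4

# Vectorize() turns the scalar constant function into one that
# returns a value for every element of the expanded grid, which is
# what outer() requires of FUN.
fconstg <- outer(xg, yg, Vectorize(function(x, y) 1.0))
dim(fconstg)  # 4 4
```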
[Rd] [PATCH] Improve utf8clen and remove utf8_table4
Given a char `c' which should be the start byte of a utf8 character, the utf8clen function returns the byte length of the utf8 character.

Before this patch, the utf8clen function would return either:
* 1 if `c' was an ascii character or a utf8 continuation byte
* An int in the range [2, 6] indicating the byte length of the utf8 character

With this patch, the utf8clen function will now return either:
* -1 if `c' is not a valid utf8 start byte
* The byte length of the utf8 character (the number of leading 1's, really)

I believe returning -1 for continuation bytes makes utf8clen less error prone. The utf8_table4 array is no longer needed and has been removed.

Sahil

Index: src/main/util.c
===
--- src/main/util.c (revision 72365)
+++ src/main/util.c (working copy)
@@ -1183,18 +1183,23 @@
     return TRUE;
 }

-/* Number of additional bytes */
-static const unsigned char utf8_table4[] = {
-  1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
-  1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
-  2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
-  3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5 };
-
+/*
+ * If `c' is not a valid utf8 start byte, return -1.
+ * Otherwise, return the number of bytes in the utf8 string with start byte `c'
+ */
 int attribute_hidden utf8clen(char c)
 {
-    /* This allows through 8-bit chars 10xx, which are invalid */
-    if ((c & 0xc0) != 0xc0) return 1;
-    return 1 + utf8_table4[c & 0x3f];
+    int n = 0;    /* number of leading 1's */
+    int m = 0x80; /* byte mask */
+
+    while (c & m) {
+        ++n;
+        m >>= 1;
+    }
+
+    if (n == 0) return 1;       /* an ascii char of the form 0xxx */
+    else if (n == 1) return -1; /* invalid start byte of the form 10xx */
+    else return n;
 }

 /* These return the result in wchar_t, but does assume

Index: src/main/valid_utf8.h
===
--- src/main/valid_utf8.h (revision 72365)
+++ src/main/valid_utf8.h (working copy)
@@ -75,7 +75,7 @@
     if (c < 0xc0) return 1;  /* Isolated 10xx byte */
     if (c >= 0xfe) return 1; /* Invalid 0xfe or 0xff bytes */
-    ab = utf8_table4[c & 0x3f]; /* Number of additional bytes */
+    ab = utf8clen(c) - 1;       /* Number of additional bytes */
     if (length < ab) return 1;
     length -= ab; /* Length remaining */