Re: [Rd] [EXTERNAL] Re: NOTE: multiple local function definitions for 'fun' with different formal arguments

2024-02-06 Thread Martin Morgan
I went looking and found this in codetools, where it's been for 20 years

https://gitlab.com/luke-tierney/codetools/-/blame/master/R/codetools.R?ref_type=heads#L951

I think the call stack in codetools is checkUsagePackage -> checkUsageEnv -> 
checkUsage, and these are similarly established. The call from the tools 
package 
https://github.com/wch/r-source/blame/95146f0f366a36899e4277a6a722964a51b93603/src/library/tools/R/QC.R#L4585
 is also quite old.
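For anyone wanting to see the NOTE outside of R CMD check, codetools reproduces 
it directly on a definition pattern like Hervé's (a sketch; the exact message 
wording may differ slightly):

library(codetools)
toto <- function(mode) {
    if (mode == 1)
        fun <- function(a, b) a * b
    else
        fun <- function(u, v, w) (u + v) / w
    fun
}
checkUsage(toto)
## toto: multiple local function definitions for 'fun' with
## different formal arguments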

I'm not sure this has been said explicitly, but perhaps the original intent was 
to protect against accidentally redefining a local function. Obviously one could 
accidentally redefine a local variable too, though that might less often be an 
error…

toto <- function(mode) {
    tata <- function(a, b) a * b  # intended
    tata <- function(a, b) a / b  # oops
    …
}

Another workaround is to give the local functions distinct names

toto <- function(mode) {
    tata <- function(a, b) a * b
    titi <- function(u, v, w) (u + v) / w
    if (mode == 1)
        tata
    else
        titi
}

… or to use a switch statement

toto <- function(mode) {
    ## fun <- switch(…) for use of `fun()` in toto
    switch(
        mode,
        tata = function(a, b) a * b,
        titi = function(u, v, w) (u + v) / w,
        stop("unknown `mode = '", mode, "'`")
    )
}
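For what it's worth, the string-valued dispatch is then used as

    toto("tata")(3, 5)      # 15
    toto("titi")(3, 5, 7)   # (3 + 5) / 7
    try(toto("nope"))       # Error: unknown `mode = 'nope'`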

… or similarly to write `fun <- if … else …`, assigning the result of the `if` 
to `fun`. I guess this last formulation points to the fact that a more careful 
analysis of Hervé's original code would show that `fun` can only take one value 
(only one branch of the `if` can be taken), so there can only be one version of 
`fun` in any invocation of `toto()`.
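Something like this (a sketch of that formulation; whether the current codetools 
check is satisfied by it is another matter):

    toto <- function(mode) {
        fun <- if (mode == 1)
            function(a, b) a * b
        else
            function(u, v, w) (u + v) / w
        fun
    }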

Perhaps the local names (and the string-valued 'mode') are suggestive of special 
cases, and so serve as implicit documentation?

Adding `…` to `tata` doesn't seem like a good idea; toto(1)(3, 5, 7) no longer 
signals an error.

There seems to be a lot in common with S3 and S4 methods, where `toto` 
corresponds to the generic and `tata` and `titi` to methods. This 'dispatch' is 
brought out by using `switch()`. There is plenty of opportunity for thinking 
that you're invoking one method when actually you're invoking the other. For 
instance with dplyr, I like that I can `tbl |> print(n = 2)` so much that I find 
myself doing this with a data.frame, `df |> print(n = 2)`, which is an error 
(`n` partially matches `na.print`, and 2 is not a valid value); both methods 
silently ignore the typo `print(m = 2)`.

Martin Morgan

From: R-devel  on behalf of Henrik Bengtsson 

Date: Tuesday, February 6, 2024 at 4:34 PM
To: Izmirlian, Grant (NIH/NCI) [E] 
Cc: r-devel@r-project.org 
Subject: Re: [Rd] [EXTERNAL] Re: NOTE: multiple local function definitions for 
'fun' with different formal arguments
Here's a dummy example that I think illustrates the problem:

toto <- function() {
  if (runif(1) < 0.5)
function(a) a
  else
function(a,b) a+b
}

> fcn <- toto()
> fcn(1,2)
[1] 3
> fcn <- toto()
> fcn(1,2)
[1] 3
> fcn <- toto()
> fcn(1,2)
Error in fcn(1, 2) : unused argument (2)

How can you use the returned function, if you get different arguments?

In your example, you cannot use the returned function without knowing
'mode' or inspecting the returned function.  So, the warning is
there to alert you to a potential bug.  Anecdotally, I'm pretty sure
this R CMD check NOTE has caught at least one such bug in one of
my/our packages.

If you want to keep the current design pattern, one approach could be
to add ... to your function definitions:

toto <- function(mode)
{
    if (mode == 1)
        fun <- function(a, b, ...) a*b
    else
        fun <- function(u, v, w) (u + v) / w
    fun
}

to make sure that toto() returns functions that accept the same
minimal number of arguments.
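For example, both branches can then be called with three arguments, the first
simply ignoring the extra one:

    toto(1)(3, 5, 7)   # 15 -- the 7 is silently absorbed by ...
    toto(2)(3, 5, 7)   # (3 + 5) / 7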

/Henrik

On Tue, Feb 6, 2024 at 1:15 PM Izmirlian, Grant (NIH/NCI) [E] via
R-devel  wrote:
>
> Because functions get called and therefore, the calling sequence matters. 
> It’s just protecting you from yourself, but as someone pointed out, there’s a 
> way to silence such notes.
> G
>
>
> From: Hervé Pagès 
> Sent: Tuesday, February 6, 2024 2:40 PM
> To: Izmirlian, Grant (NIH/NCI) [E] ; Duncan Murdoch 
> ; r-devel@r-project.org
> Subject: Re: [EXTERNAL] Re: [Rd] NOTE: multiple local function definitions 
> for 'fun' with different formal arguments
>
>
> On 2/6/24 11:19, Izmirlian, Grant (NIH/NCI) [E] wrote:
> The note refers to the fact that the function named ‘fun’ appears to be 
> defined in two different ways.
>
> Sure I get that. But how is that any different from a variable being defined 
> in two different ways like in
>
> if (mode == 1)
> x <- -8
> else
> x <- 55
>
> This is such a common and perfectly fine pattern. Why would 

Re: [Rd] Building R from source always fails on tools:::sysdata2LazyLoadDB

2023-05-31 Thread Martin Morgan
Thank you, especially for the R-admin link and link to the underlying issue.

This is macOS Monterey version 12.6.5, so I am stuck with rebuilding -- 
empirically it seems like make distclean &&  && make all is 
needed. This doesn't seem to make sense (shouldn't `make clean` remove all the 
compiled libraries?) so I'll investigate a bit more; maybe it is because I do 
not `make install`? I know there is no value in papering over upstream issues, 
but I wonder whether unlinking all (or some?) *.so / *.dylib files before 
`make all` would be effective.

After the fact I realized that the R-SIG-mac mailing list might have been more 
appropriate, but I did not see the issue discussed there.

Martin

From: Prof Brian Ripley 
Date: Wednesday, May 31, 2023 at 3:46 AM
To: Martin Morgan 
Cc: Tomas Kalibera , R-devel 
Subject: Re: [Rd] Building R from source always fails on 
tools:::sysdata2LazyLoadDB
On 30/05/2023 22:57, Martin Morgan wrote:
> Thanks Ivan & Tomas
>
> A simpler way to trigger the problem is library(tools) or 
> library.dynam("tools", "tools", ".") so I guess it is loading src/tools.so
>
> Ivan, adding -d lldb I need to tell lldb where to find the R library
>
> (lldb) process launch --environment 
> DYLD_LIBRARY_PATH=/Users/ma38727/bin/R-devel/lib
>
> And then `library(tools)` works. To run lldb I needed to grant Xcode 
> permissions using my local administrator account.
>
> @Tomas I can't see anything in the Console app logs, but this might be 
> partly my ineptitude.

Which version of macOS is this?

- Prior to Ventura it is the known behaviour.

- With Ventura the same happens if any R process from the current build
is in use, even a crashed one. (The latter happened to me this morning:
there were reports under Crash Reports in the Console App.)

As the R-admin manual says

"Updating an 'arm64' build may fail because of the bug described at
https://openradar.appspot.com/FB8914243 but ab initio builds work. This
has been far rarer since macOS 13."

Once it happens, you need to rebuild (make clean;make all should suffice).

>
> Martin
>
> From: Tomas Kalibera 
> Date: Tuesday, May 30, 2023 at 4:54 PM
> To: Martin Morgan , R-devel 
> Subject: Re: [Rd] Building R from source always fails on 
> tools:::sysdata2LazyLoadDB
>
> On 5/30/23 22:09, Martin Morgan wrote:
>> I build my own R from source on an M1 mac. I have a clean svn checkout in 
>> one directory ~/src/R-devel. I switch to ~/bin/R-devel and the first time run
>>
>> cd ~/bin/R-devel
>> ~/src/R-devel/configure --enable-R-shlib 'CFLAGS=-g -O0' 
>> CPPFLAGS=-I/opt/R/arm64/include 'CXXFLAGS=-g -O0'
>> make -j
>>
>> At some point in the future I svn update src/R-devel, then
>>
>> cd ~/bin/R-devel
>> make -j
>>
>> This always ends with
>>
>> installing 'sysdata.rda'
>> /bin/sh: line 1: 99497 Done    echo 
>> "tools:::sysdata2LazyLoadDB(\"/Users/XXX/src/R-devel/src/library/utils/R/sysdata.rda\",\"../../../library/utils/R\")"
>>99498 Killed: 9   | R_DEFAULT_PACKAGES=NULL LC_ALL=C 
>> ../../../bin/R --vanilla --no-echo
>> make[4]: *** [sysdata] Error 137
>> make[3]: *** [all] Error 2
>> make[2]: *** [R] Error 1
>> make[1]: *** [R] Error 1
>> make: *** [R] Error 1
>>
>> what am I doing wrong? Is there a graceful way to fix this (my current 
>> solution is basically to start over, with `make distclean`)? If I cd into 
>> ~/bin/R-devel/src/library/utils I can start an interactive session and 
>> reproduce the error
>>
>> ~/bin/R-devel/src/library/utils $   R_DEFAULT_PACKAGES=NULL ../../../bin/R 
>> --vanilla
>>> tools:::sysdata2LazyLoadDB("/Users/ma38727/src/R-devel/src/library/utils/R/sysdata.rda","../../../library/utils/R")
>> zsh: killed R_DEFAULT_PACKAGES=NULL ../../../bin/R --vanilla
>>
>> or simply
>>
>>> tools:::sysdata2LazyLoadDB
>> zsh: killed R_DEFAULT_PACKAGES=NULL LC_ALL=C R_ENABLE_JIT=0 TZ=UTC 
>> ../../../bin/R
>
> If it is macOS, it might be worth checking the system logs (Console
> app). It may be some system security feature.
>
> Tomas

--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Building R from source always fails on tools:::sysdata2LazyLoadDB

2023-05-30 Thread Martin Morgan
Thanks Ivan & Tomas

A simpler way to trigger the problem is library(tools) or 
library.dynam("tools", "tools", ".") so I guess it is loading src/tools.so

Ivan, adding -d lldb I need to tell lldb where to find the R library

(lldb) process launch --environment 
DYLD_LIBRARY_PATH=/Users/ma38727/bin/R-devel/lib

And then `library(tools)` works. To run lldb I needed to grant Xcode 
permissions using my local administrator account.

@Tomas I can't see anything in the Console app logs, but this might be partly 
my ineptitude.

Martin

From: Tomas Kalibera 
Date: Tuesday, May 30, 2023 at 4:54 PM
To: Martin Morgan , R-devel 
Subject: Re: [Rd] Building R from source always fails on 
tools:::sysdata2LazyLoadDB

On 5/30/23 22:09, Martin Morgan wrote:
> I build my own R from source on an M1 mac. I have a clean svn checkout in one 
> directory ~/src/R-devel. I switch to ~/bin/R-devel and the first time run
>
> cd ~/bin/R-devel
> ~/src/R-devel/configure --enable-R-shlib 'CFLAGS=-g -O0' 
> CPPFLAGS=-I/opt/R/arm64/include 'CXXFLAGS=-g -O0'
> make -j
>
> At some point in the future I svn update src/R-devel, then
>
> cd ~/bin/R-devel
> make -j
>
> This always ends with
>
> installing 'sysdata.rda'
> /bin/sh: line 1: 99497 Done    echo 
> "tools:::sysdata2LazyLoadDB(\"/Users/XXX/src/R-devel/src/library/utils/R/sysdata.rda\",\"../../../library/utils/R\")"
>   99498 Killed: 9   | R_DEFAULT_PACKAGES=NULL LC_ALL=C 
> ../../../bin/R --vanilla --no-echo
> make[4]: *** [sysdata] Error 137
> make[3]: *** [all] Error 2
> make[2]: *** [R] Error 1
> make[1]: *** [R] Error 1
> make: *** [R] Error 1
>
> what am I doing wrong? Is there a graceful way to fix this (my current 
> solution is basically to start over, with `make distclean`)? If I cd into 
> ~/bin/R-devel/src/library/utils I can start an interactive session and 
> reproduce the error
>
> ~/bin/R-devel/src/library/utils $   R_DEFAULT_PACKAGES=NULL ../../../bin/R 
> --vanilla
>> tools:::sysdata2LazyLoadDB("/Users/ma38727/src/R-devel/src/library/utils/R/sysdata.rda","../../../library/utils/R")
> zsh: killed R_DEFAULT_PACKAGES=NULL ../../../bin/R --vanilla
>
> or simply
>
>> tools:::sysdata2LazyLoadDB
> zsh: killed R_DEFAULT_PACKAGES=NULL LC_ALL=C R_ENABLE_JIT=0 TZ=UTC 
> ../../../bin/R

If it is macOS, it might be worth checking the system logs (Console
app). It may be some system security feature.

Tomas

>
>[[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Building R from source always fails on tools:::sysdata2LazyLoadDB

2023-05-30 Thread Martin Morgan
I build my own R from source on an M1 mac. I have a clean svn checkout in one 
directory ~/src/R-devel. I switch to ~/bin/R-devel and the first time run

cd ~/bin/R-devel
~/src/R-devel/configure --enable-R-shlib 'CFLAGS=-g -O0' 
CPPFLAGS=-I/opt/R/arm64/include 'CXXFLAGS=-g -O0'
make -j

At some point in the future I svn update src/R-devel, then

cd ~/bin/R-devel
make -j

This always ends with

installing 'sysdata.rda'
/bin/sh: line 1: 99497 Done    echo 
"tools:::sysdata2LazyLoadDB(\"/Users/XXX/src/R-devel/src/library/utils/R/sysdata.rda\",\"../../../library/utils/R\")"
 99498 Killed: 9   | R_DEFAULT_PACKAGES=NULL LC_ALL=C 
../../../bin/R --vanilla --no-echo
make[4]: *** [sysdata] Error 137
make[3]: *** [all] Error 2
make[2]: *** [R] Error 1
make[1]: *** [R] Error 1
make: *** [R] Error 1

what am I doing wrong? Is there a graceful way to fix this (my current solution 
is basically to start over, with `make distclean`)? If I cd into 
~/bin/R-devel/src/library/utils I can start an interactive session and 
reproduce the error

~/bin/R-devel/src/library/utils $   R_DEFAULT_PACKAGES=NULL ../../../bin/R 
--vanilla
> tools:::sysdata2LazyLoadDB("/Users/ma38727/src/R-devel/src/library/utils/R/sysdata.rda","../../../library/utils/R")
zsh: killed R_DEFAULT_PACKAGES=NULL ../../../bin/R --vanilla

or simply

> tools:::sysdata2LazyLoadDB
zsh: killed R_DEFAULT_PACKAGES=NULL LC_ALL=C R_ENABLE_JIT=0 TZ=UTC 
../../../bin/R

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] gsub() hex character range problems in R-devel?

2022-01-06 Thread Martin Morgan
Thanks Tomas and 'Brodie' for your expert explanation; it provides great help 
in understanding and solving my immediate problem.

Tomas's observation to 'do something like e.g. "only keep ASCII digits, ASCII 
space, ASCII underscore, but remove all other characters"' points to a basic 
weakness in the code I'm looking at. E.g., removing the non-breaking space is 
probably not appropriate ('foo\ua0bar' should probably be cleaned to 'foo bar' 
and not 'foobar'). And more generally other non-ASCII characters ('fancy' 
quotes, em-dashes, ...) would require special treatment. It seems like the right 
thing to do is to handle the raw data in its original encoding, rather than to 
try to clean it to ASCII.
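A whitelist-style cleanup along those lines might look something like this (my 
sketch, not the code in question):

    x <- "fo\ua0o 1_2"
    ## keep ASCII letters, digits, space, and underscore; drop everything else
    gsub("[^A-Za-z0-9_ ]", "", x, perl = TRUE)
    ## [1] "foo 1_2"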

Martin

On 1/5/22, 4:17 AM, "Tomas Kalibera"  wrote:

Hi Martin,

I'd add a few comments to the excellent analysis of Brodie.

- \xhh is allowed and defined in Perl regular expressions, see ?regex 
(would need perl=TRUE), but to enter that in an R string, you need to 
escape the backslash.

- \xhh is not defined by POSIX for extended regular expressions, nor is 
it documented in ?regex for those; TRE supports it, but portable 
programs should still not rely on that

- literal \xhh in an R string is turned to the byte by R, but I would 
say this should not be used at all by users, because the result is 
encoding specific

- use of \u and \U in an R string is fine, it has well defined semantics 
and the corresponding string will then be flagged UTF-8 in R (so e.g. 
\ua0 is fine to represent the Unicode no-break space)

- see caveats of using character ranges with POSIX extended regular 
expressions in ?regex re encodings; using Perl regular expressions in 
UTF-8 mode is more reliable for those

So, a variant of your example might be:

 > gsub("[\\x7f-\\xff]", "", "fo\ua0o", perl=TRUE)
[1] "foo"

(note that the \ua0 ensures that the text is UTF-8, and hence the UTF-8 
mode for regular expressions is used, ?regex has more)

However, I think it is better to formulate regular expressions to cover 
all of Unicode, so do something like e.g. "only keep ASCII digits, ASCII 
space, ASCII underscore, but remove all other characters".

Best
Tomas

On 1/4/22 8:35 PM, Martin Morgan wrote:

> I'm not very good at character encoding / etc so this might be user 
error. The following code is meant to replace extended ASCII characters, in 
particular a non-breaking space, with "", and it works in R-4-1-branch
>
>> R.version.string
> [1] "R version 4.1.2 Patched (2022-01-04 r81445)"
>> gsub("[\x7f-\xff]", "", "fo\xa0o")
> [1] "foo"
>
> but fails in R-devel
>
>> R.version.string
> [1] "R Under development (unstable) (2022-01-04 r81445)"
>> gsub("[\x7f-\xff]", "", "fo\xa0o")
> Error in gsub("[\177-\xff]", "", "fo\xa0o") : invalid regular expression 
'[-�]', reason 'Invalid character range'
> In addition: Warning message:
> In gsub("[\177-\xff]", "", "fo\xa0o") :
>TRE pattern compilation error 'Invalid character range'
>
> There are other oddities, too, like
>
>> gsub("[[:alnum:]]", "", "fo\xa0o")  # R-4-1-branch
> [1] "\xfc\xbe\x8c\x86\x84\xbc"
>
>> gsub("[[:alnum:]]", "", "fo\xa0o")  # R-devel
> [1] "<>"
>
> The R-devel sessionInfo is
>
>> sessionInfo()
> R Under development (unstable) (2022-01-04 r81445)
> Platform: x86_64-apple-darwin19.6.0 (64-bit)
> Running under: macOS Catalina 10.15.7
>
> Matrix products: default
    > BLAS:   /Users/ma38727/bin/R-devel/lib/libRblas.dylib
> LAPACK: /Users/ma38727/bin/R-devel/lib/libRlapack.dylib
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_4.2.0
>
> (I have built my own R on macOS; similar behavior is observed on a Linux 
machine)
>
> Any hints welcome,
>
> Martin Morgan
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] gsub() hex character range problems in R-devel?

2022-01-04 Thread Martin Morgan
I'm not very good at character encoding / etc so this might be user error. The 
following code is meant to replace extended ASCII characters, in particular a 
non-breaking space, with "", and it works in R-4-1-branch

> R.version.string
[1] "R version 4.1.2 Patched (2022-01-04 r81445)"
> gsub("[\x7f-\xff]", "", "fo\xa0o")
[1] "foo"

but fails in R-devel

> R.version.string
[1] "R Under development (unstable) (2022-01-04 r81445)"
> gsub("[\x7f-\xff]", "", "fo\xa0o")
Error in gsub("[\177-\xff]", "", "fo\xa0o") : invalid regular expression 
'[-�]', reason 'Invalid character range'
In addition: Warning message:
In gsub("[\177-\xff]", "", "fo\xa0o") :
  TRE pattern compilation error 'Invalid character range'

There are other oddities, too, like

> gsub("[[:alnum:]]", "", "fo\xa0o")  # R-4-1-branch
[1] "\xfc\xbe\x8c\x86\x84\xbc"

> gsub("[[:alnum:]]", "", "fo\xa0o")  # R-devel
[1] "<>"

The R-devel sessionInfo is

> sessionInfo()
R Under development (unstable) (2022-01-04 r81445)
Platform: x86_64-apple-darwin19.6.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /Users/ma38727/bin/R-devel/lib/libRblas.dylib
LAPACK: /Users/ma38727/bin/R-devel/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.2.0

(I have built my own R on macOS; similar behavior is observed on a Linux 
machine)

Any hints welcome,

Martin Morgan
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] unicode in R documentation

2021-07-13 Thread Martin Morgan
I have options(useFancyQuotes = FALSE) in my ~/.Rprofile.
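For illustration of what that option changes (via sQuote(), which consults it):

    options(useFancyQuotes = FALSE)
    sQuote("device")   # [1] "'device'"  -- plain ASCII quotes, searchable with '
    options(useFancyQuotes = TRUE)
    sQuote("device")   # directional quotes in a UTF-8 locale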

Martin Morgan

On 7/13/21, 11:37 AM, "R-devel on behalf of Frederick Eaton" 
 wrote:

Dear R Team,

I am running R from the terminal command line (not RStudio). I've noticed 
that R has been using Unicode quotes in its documentation for some time, maybe 
since before I started using it.

I am wondering if it is possible to compile the documentation to use normal 
quotes instead.

I find it useful to be able to search documentation for strings with 
quotes, for example when reading "?options" I might search for "'dev" to find 
an option starting with the letters "dev". Without the single-quote at the 
front, there would be a lot of matches that I'm not interested in, but the 
single-quote at the front helps narrow it down to the parameters that are being 
indexed in the documentation. However, I can't actually search for "'dev" in 
"?options" because it is written with curly quotes "‘device’" and "'" does not 
match "‘" on my machine.

Similarly, when I read manual pages for commands on Linux, I sometimes 
search for "-r" instead of "r" because "-r" is likely to find documentation for 
the option "-r", while searching for "r" will match almost every line.

I'm wondering what other people do when reading through documentation. Do 
you search for things at all or just read it straight through? Is there a 
hyperlinked version that just lets you jump to the "device" entry in "?options" 
or do you have to type out a search string? What search string do you use? Do 
you have a way to enter Unicode quotes when doing this, or does your pager 
provide a special regular expression syntax which makes it easier to match them?

Thanks,

Frederick

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] How to get utf8 string using R externals

2021-06-02 Thread Morgan Morgan
On Wed, 2 Jun 2021, 22:31 Duncan Murdoch,  wrote:

> On 02/06/2021 4:33 p.m., xiaoyan yu wrote:
> > I have a R Script Predict.R:
> >  set.seed(42)
> >  C <- seq(1:1000)
> >  A <- rep(seq(1:200),5)
> >  E <- (seq(1:1000) * (0.8 + (0.4*runif(50, 0, 1
> >  L <- ifelse(runif(1000)>.5,1,0)
> >  df <- data.frame(cbind(C, A, E, L))
> > load("C:/Temp/tree.RData")#  load the model for scoring
> >
> >P <- as.character(predict(tree_model_1,df,type='class'))
> >
> > Then in a C++ program
> > I call eval to evaluate the script and then findVar the P variable.
> > After get each class label from P using string_elt and then
> > Rf_translateChar, the characters are unicodes () instead
> of
> > utf8 encoding of the korean characters 부실.
> > Can I know how to get UTF8 by using R externals?
> >
> > I also found the same script giving utf8 characters in RGui but unicode
> in
> > Rterm.
> > I tried to attach a screenshot but got message "The message's content
> type
> > was not explicitly allowed"
> > In RGui, I saw the output 부실, while in Rterm, .
>
> Sounds like you're using Windows.  Stop doing that.
>
> Duncan Murdoch
>

Could as well say: "Sounds like you are using R. Stop doing that." Start
using Julia. ;-)



> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R Console Bug?

2021-04-17 Thread Morgan Morgan
Hi Simon,
Thank you for the feedback.
It is really strange that you have a different output.
I have attached a picture of my R console.
I am just trying to port some pure C code that prints progress bars to R
but it does not seem to be printing properly.
It seems I am doing something wrong with REprintf and R_FlushConsole.
Best regards,
Morgan

On Sat, Apr 17, 2021 at 12:36 AM Simon Urbanek 
wrote:

> Sorry, unable to reproduce on macOS, in R console:
>
> > dyn.load("test.so")
> > .Call("printtest",1e4L)
>
>Processing data chunk 1 of 3
>  [==] 100%
>
>Processing data chunk 2 of 3
>  [==] 100%
>
>Processing data chunk 3 of 3
>  [==] 100%
> NULL
>
> But honestly I'm not sure I understand the report. R_FlushConsole is
> a no-op for terminal console and your code just prints on stderr anyway
> (which is not buffered). All this does is just a lot of \r output (which is
> highly inefficient anywhere but in Terminal by definition). Can you clarify
> what the code tries to trigger?
>
> Cheers,
> Simon
>
>
> > On Apr 16, 2021, at 23:11, Morgan Morgan 
> wrote:
> >
> > Hi,
> >
> > I am getting a really weird behaviour with the R console.
> > Here is the code to reproduce it.
> >
> > 1/ C code: ---
> >
> > SEXP printtest(SEXP x) {
> >  const int PBWIDTH = 30, loop = INTEGER(x)[0];
> >  int val, lpad;
> >  double perc;
> >  char PBSTR[PBWIDTH], PBOUT[PBWIDTH];
> >  memset(PBSTR,'=', sizeof(PBSTR));
> >  memset(PBOUT,'-', sizeof(PBOUT));
> >  for (int k = 0; k < 3; ++k) {
> >REprintf("\n   Processing data chunk %d of 3\n",k+1);
> >for (int i = 0; i < loop; ++i) {
> >  perc = (double) i/(loop-1);
> >  val  = (int) (perc * 100);
> >  lpad = (int) (perc * PBWIDTH);
> >  REprintf("\r [%.*s%.*s] %3d%%", lpad, PBSTR, PBWIDTH - lpad, PBOUT,
> > val);
> >  R_FlushConsole();
> >}
> >REprintf("\n");
> >  }
> >  return R_NilValue;
> > }
> >
> > 2/ Build so/dll: ---
> >
> > R CMD SHLIB
> >
> > 3/ Run code :  ---
> >
> > dyn.load("test.so")
> > .Call("printtest",1e4L)
> > dyn.unload("test.so")
> >
> > 4/ Issue:  ---
> > If you run the above code in RStudio, it works well both on Mac and
> Windows.
> > If you run it in Windows cmd, it is slow.
> > If you run it in Windows RGui, it is slow but also all texts are flushed.
> > If you run it in Mac terminal, it runs perfectly.
> > If you run it in Mac R Console, it prints something like :
> >> .Call("printtest",1e4L)
> > [==] 100%NULL]
>  0%
> >
> > I am using R 4.0.4 (Mac) / 4.0.5 (Windows)
> >
> > Is that a bug or am I doing something wrong?
> >
> > Thank you
> > Best regards,
> > Morgan
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
>
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] R Console Bug?

2021-04-16 Thread Morgan Morgan
Hi,

I am getting a really weird behaviour with the R console.
Here is the code to reproduce it.

1/ C code: ---

SEXP printtest(SEXP x) {
  const int PBWIDTH = 30, loop = INTEGER(x)[0];
  int val, lpad;
  double perc;
  char PBSTR[PBWIDTH], PBOUT[PBWIDTH];
  memset(PBSTR,'=', sizeof(PBSTR));
  memset(PBOUT,'-', sizeof(PBOUT));
  for (int k = 0; k < 3; ++k) {
REprintf("\n   Processing data chunk %d of 3\n",k+1);
for (int i = 0; i < loop; ++i) {
  perc = (double) i/(loop-1);
  val  = (int) (perc * 100);
  lpad = (int) (perc * PBWIDTH);
  REprintf("\r [%.*s%.*s] %3d%%", lpad, PBSTR, PBWIDTH - lpad, PBOUT,
val);
  R_FlushConsole();
}
REprintf("\n");
  }
  return R_NilValue;
}

2/ Build so/dll: ---

R CMD SHLIB

3/ Run code :  ---

dyn.load("test.so")
.Call("printtest",1e4L)
dyn.unload("test.so")

4/ Issue:  ---
If you run the above code in RStudio, it works well both on Mac and Windows.
If you run it in Windows cmd, it is slow.
If you run it in Windows RGui, it is slow but also all texts are flushed.
If you run it in Mac terminal, it runs perfectly.
If you run it in Mac R Console, it prints something like :
> .Call("printtest",1e4L)
 [==] 100%NULL]   0%

I am using R 4.0.4 (Mac) / 4.0.5 (Windows)

Is that a bug or am I doing something wrong?

Thank you
Best regards,
Morgan

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Faster sorting algorithm...

2021-03-21 Thread Morgan Morgan
My apologies to Professor Neal.
Thank you for correcting me.
Best regards
Morgan


On Mon, 22 Mar 2021, 05:05 ,  wrote:

> I think it is "Professor Neal" :)
>
> I also appreciate the pqR comparisons.
>
> On Wed, Mar 17, 2021 at 09:23:15AM +, Morgan Morgan wrote:
> >Thank you Neal. This is interesting. I will have a look at pqR.
> >Indeed radix only does C collation, I believe that is why it is not the
> >default choice for character ordering and sorting.
> >Not sure but I believe it can help address the following bugzilla item:
> >https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17400
> >
> >On the same topic of collation, there is an experimental sorting function
> >"psort" in package kit that might help address this issue.
> >
> >> library(kit)
> >Attaching kit 0.0.7 (OPENMP enabled using 1 thread)
> >> x <- c("b","A","B","a","\xe4")
> >> Encoding(x) <- "latin1"
> >> identical(psort(x, c.locale=FALSE), sort(x))
> >[1] TRUE
> >> identical(psort(x, c.locale=TRUE), sort(x, method="radix"))
> >[1] TRUE
> >
> >Coming back to the topic of fsort, I have just finished the implementation
> >for double, integer, factor and logical.
> >The implementation takes into account NA, Inf.. values. Values can be
> >sorted in a decreasing order or increasing order.
> >Comparing benchmark with the current implementation in data.table, it is
> >currently over 30% faster.
> >There might be bugs but I am sure performance can be further improved as I did
> >not really try hard.
> >If there is interest in both the implementation and cross-community
> >sharing, please let me know.
> >
> >Best regards,
> >Morgan
> >
> >On Wed, 17 Mar 2021, 00:37 Radford Neal,  wrote:
> >
> >> Those interested in faster sorting may want to look at the merge sort
> >> implemented in pqR (see pqR-project.org).  It's often used as the
> >> default, because it is stable, and does different collations, while
> >> being faster than shell sort (except for small vectors).
> >>
> >> Here are examples, with timings, for pqR-2020-07-23 and R-4.0.2,
> >> compiled identically:
> >>
> >> -
> >> pqR-2020-07-23 in C locale:
> >>
> >> > set.seed(1)
> >> > N <- 100
> >> > x <- as.character (sample(N,N,replace=TRUE))
> >> > print(system.time (os <- order(x,method="shell")))
> >>user  system elapsed
> >>   1.332   0.000   1.334
> >> > print(system.time (or <- order(x,method="radix")))
> >>user  system elapsed
> >>   0.092   0.004   0.096
> >> > print(system.time (om <- order(x,method="merge")))
> >>user  system elapsed
> >>   0.363   0.000   0.363
> >> > print(identical(os,or))
> >> [1] TRUE
> >> > print(identical(os,om))
> >> [1] TRUE
> >> >
> >> > x <- c("a","~")
> >> > print(order(x,method="shell"))
> >> [1] 1 2
> >> > print(order(x,method="radix"))
> >> [1] 1 2
> >> > print(order(x,method="merge"))
> >> [1] 1 2
> >>
> >> -
> >> R-4.0.2 in C locale:
> >>
> >> > set.seed(1)
> >> > N <- 100
> >> > x <- as.character (sample(N,N,replace=TRUE))
> >> > print(system.time (os <- order(x,method="shell")))
> >>user  system elapsed
> >>   2.381   0.004   2.387
> >> > print(system.time (or <- order(x,method="radix")))
> >>user  system elapsed
> >>   0.138   0.000   0.137
> >> > #print(system.time (om <- order(x,method="merge")))
> >> > print(identical(os,or))
> >> [1] TRUE
> >> > #print(identical(os,om))
> >> >
> >> > x <- c("a","~")
> >> > print(order(x,method="shell"))
> >> [1] 1 2
> >> > print(order(x,method="radix"))
> >> [1] 1 2
> >> > #print(order(x,method="merge"))
> >>
> >> 
> >> pqR-2020-07-23 in fr_CA.utf8 locale:
> >>
> >> > set.seed(1)
> >> > N <- 100
> >> > x <- as.character (sample(N,N,replace=TRUE))
> >> > print(system.t

Re: [Rd] Faster sorting algorithm...

2021-03-17 Thread Morgan Morgan
Thank you Neal. This is interesting. I will have a look at pqR.
Indeed radix only does C collation, I believe that is why it is not the
default choice for character ordering and sorting.
Not sure but I believe it can help address the following bugzilla item:
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17400

On the same topic of collation, there is an experimental sorting function
"psort" in package kit that might help address this issue.

> library(kit)
Attaching kit 0.0.7 (OPENMP enabled using 1 thread)
> x <- c("b","A","B","a","\xe4")
> Encoding(x) <- "latin1"
> identical(psort(x, c.locale=FALSE), sort(x))
[1] TRUE
> identical(psort(x, c.locale=TRUE), sort(x, method="radix"))
[1] TRUE

Coming back to the topic of fsort, I have just finished the implementation
for double, integer, factor and logical.
The implementation takes into account NA and Inf values. Values can be
sorted in decreasing or increasing order.
Comparing benchmarks with the current implementation in data.table, it is
currently over 30% faster.
There might be bugs but I am sure performance can be further improved as I did
not really try hard.
If there is interest in both the implementation and cross-community
sharing, please let me know.
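For reference, base R's handling of NA/Inf and of sort direction, which such an
fsort would presumably want to match:

    x <- c(3, NA, Inf, -Inf, 1)
    sort(x)                                     # -Inf 1 3 Inf   (NA dropped by default)
    sort(x, decreasing = TRUE, na.last = TRUE)  # Inf 3 1 -Inf NA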

Best regards,
Morgan

On Wed, 17 Mar 2021, 00:37 Radford Neal,  wrote:

> Those interested in faster sorting may want to look at the merge sort
> implemented in pqR (see pqR-project.org).  It's often used as the
> default, because it is stable, and does different collations, while
> being faster than shell sort (except for small vectors).
>
> Here are examples, with timings, for pqR-2020-07-23 and R-4.0.2,
> compiled identically:
>
> -
> pqR-2020-07-23 in C locale:
>
> > set.seed(1)
> > N <- 100
> > x <- as.character (sample(N,N,replace=TRUE))
> > print(system.time (os <- order(x,method="shell")))
>user  system elapsed
>   1.332   0.000   1.334
> > print(system.time (or <- order(x,method="radix")))
>user  system elapsed
>   0.092   0.004   0.096
> > print(system.time (om <- order(x,method="merge")))
>user  system elapsed
>   0.363   0.000   0.363
> > print(identical(os,or))
> [1] TRUE
> > print(identical(os,om))
> [1] TRUE
> >
> > x <- c("a","~")
> > print(order(x,method="shell"))
> [1] 1 2
> > print(order(x,method="radix"))
> [1] 1 2
> > print(order(x,method="merge"))
> [1] 1 2
>
> -
> R-4.0.2 in C locale:
>
> > set.seed(1)
> > N <- 100
> > x <- as.character (sample(N,N,replace=TRUE))
> > print(system.time (os <- order(x,method="shell")))
>user  system elapsed
>   2.381   0.004   2.387
> > print(system.time (or <- order(x,method="radix")))
>user  system elapsed
>   0.138   0.000   0.137
> > #print(system.time (om <- order(x,method="merge")))
> > print(identical(os,or))
> [1] TRUE
> > #print(identical(os,om))
> >
> > x <- c("a","~")
> > print(order(x,method="shell"))
> [1] 1 2
> > print(order(x,method="radix"))
> [1] 1 2
> > #print(order(x,method="merge"))
>
> 
> pqR-2020-07-23 in fr_CA.utf8 locale:
>
> > set.seed(1)
> > N <- 100
> > x <- as.character (sample(N,N,replace=TRUE))
> > print(system.time (os <- order(x,method="shell")))
> utilisateur système  écoulé
>   2.960   0.000   2.962
> > print(system.time (or <- order(x,method="radix")))
> utilisateur système  écoulé
>   0.083   0.008   0.092
> > print(system.time (om <- order(x,method="merge")))
> utilisateur système  écoulé
>   1.143   0.000   1.142
> > print(identical(os,or))
> [1] TRUE
> > print(identical(os,om))
> [1] TRUE
> >
> > x <- c("a","~")
> > print(order(x,method="shell"))
> [1] 2 1
> > print(order(x,method="radix"))
> [1] 1 2
> > print(order(x,method="merge"))
> [1] 2 1
>
> 
> R-4.0.2 in fr_CA.utf8 locale:
>
> > set.seed(1)
> > N <- 100
> > x <- as.character (sample(N,N,replace=TRUE))
> > print(system.time (os <- order(x,method="shell")))
> utilisateur système  écoulé
>   4.222   0.016   4.239
> > print(system.time (or <- order(x,method="radix"

Re: [Rd] Faster sorting algorithm...

2021-03-15 Thread Morgan Morgan
Default method for sort is not radix (especially for character vectors). You
might want to read the documentation of sort.
For your second question, I invite you to look at the code of fsort. It is
implemented only for positive finite doubles, and defaults to
data.table:::forder ... when the types are other than positive double...
Please read the pdf link I sent, everything is explained in it.
Thank you
Morgan

On Mon, 15 Mar 2021, 16:52 Avraham Adler,  wrote:

> Isn’t the default method now “radix” which is the data.table sort, and
> isn’t that already parallel using openmp where available?
>
> Avi
>
> On Mon, Mar 15, 2021 at 12:26 PM Morgan Morgan 
> wrote:
>
>> Hi,
>> I am not sure if this is the right mailing list, so apologies in advance
>> if
>> it is not.
>>
>> I found the following link/presentation:
>> https://www.r-project.org/dsc/2016/slides/ParallelSort.pdf
>>
>> The implementation of fsort is interesting but incomplete (not sure why?)
>> and can be improved or made faster (at least 25%  I believe). I might be
>> wrong but there are maybe a couple of bugs as well.
>>
>> My questions are:
>>
>> 1/ Is the R Core team interested in a faster sorting algo? (Multithread or
>> even single threaded)
>>
>> 2/ I see an issue with the license, which is MPL-2.0, and hence not
>> compatible with base R, Python and Julia. Is there an interest to change
>> the license of fsort so all 3 languages (and all the people using these
>> languages) can benefit from it? (Like suggested on the first page)
>>
>> Please let me know if there is an interest to address the above points, I
>> would be happy to look into it (free of charge of course!).
>>
>> Thank you
>> Best regards
>> Morgan
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> --
> Sent from Gmail Mobile
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Faster sorting algorithm...

2021-03-15 Thread Morgan Morgan
Hi,
I am not sure if this is the right mailing list, so apologies in advance if
it is not.

I found the following link/presentation:
https://www.r-project.org/dsc/2016/slides/ParallelSort.pdf

The implementation of fsort is interesting but incomplete (not sure why?)
and can be improved or made faster (at least 25%  I believe). I might be
wrong but there are maybe a couple of bugs as well.

My questions are:

1/ Is the R Core team interested in a faster sorting algo? (Multithread or
even single threaded)

2/ I see an issue with the license, which is MPL-2.0, and hence not
compatible with base R, Python and Julia. Is there an interest to change
the license of fsort so all 3 languages (and all the people using these
languages) can benefit from it? (Like suggested on the first page)

Please let me know if there is an interest to address the above points, I
would be happy to look into it (free of charge of course!).

Thank you
Best regards
Morgan

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [External] Something is wrong with the unserialize function

2020-10-29 Thread Martin Morgan
This

Index: src/main/altrep.c
===
--- src/main/altrep.c   (revision 79385)
+++ src/main/altrep.c   (working copy)
@@ -275,10 +275,11 @@
SEXP psym = ALTREP_SERIALIZED_CLASS_PKGSYM(info);
SEXP class = LookupClass(csym, psym);
if (class == NULL) {
-   SEXP pname = ScalarString(PRINTNAME(psym));
+   SEXP pname = PROTECT(ScalarString(PRINTNAME(psym)));
R_tryCatchError(find_namespace, pname,
handle_namespace_error, NULL);
class = LookupClass(csym, psym);
+   UNPROTECT(1);
}
return class;
 }

seems to remove the warning; I'm guessing that the other SEXP already exist so 
don't need protecting?

Martin Morgan


On 10/29/20, 12:47 PM, "R-devel on behalf of luke-tier...@uiowa.edu" 
 wrote:

Thanks for the report. Will look into it when I get a chance unless
someone else gets there first.

A simpler reprex:

## create and serialize a memory-mapped file object
filePath <- "x.dat"
con <- file(filePath, "wrb")
writeBin(rep(0.0,10),con)
close(con)

library(simplemmap)
x <- mmap(filePath, "double")
saveRDS(x, file = "x.Rds")

## in a separate R process:
gctorture()
readRDS("x.Rds")

Looks like a missing PROTECT somewhere.

Best,

luke

On Thu, 29 Oct 2020, Jiefei Wang wrote:

> Hi all,
>
> I am not able to export an ALTREP object when `gctorture` is on in the
> worker. The package simplemmap can be used to reproduce the problem. See
> the example below
> ```
> ## Create a temporary file
> filePath <- tempfile()
> con <- file(filePath, "wrb")
> writeBin(rep(0.0,10),con)
> close(con)
>
> library(simplemmap)
> library(parallel)
> cl <- makeCluster(1)
> x <- mmap(filePath, "double")
> ## Turn gctorture on
> clusterEvalQ(cl, gctorture())
> clusterExport(cl, "x")
> ## x is an 0-length vector on the worker
> clusterEvalQ(cl, x)
> stopCluster(cl)
> ```
>
> you can find more info on the problem if you manually build a connection
> between two R processes and export the ALTREP object. See output below
> ```
>> con <- socketConnection(port = 1234,server = FALSE)
>> gctorture()
>> x <- unserialize(con)
> Warning message:
> In unserialize(con) :
>  cannot unserialize ALTVEC object of class 'mmap_real' from package
> 'simplemmap'; returning length zero vector
> ```
> It seems like  simplemmap did not get loaded correctly on the worker. If
> you run `library( simplemmap)` before unserializing the ALTREP, there will
> be no problem. But I suppose we should be able to unserialize objects
> without preloading the library?
>
> This issue can be reproduced on Ubuntu with R version 4.0.2 (2020-06-22)
> and Windows with R Under development (unstable) (2020-09-03 r79126).
>
> Here is the link to simplemmap:
> https://github.com/ALTREP-examples/Rpkg-simplemmap
>
> Best,
> Jiefei
>
>   [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa  Phone: 319-335-3386
Department of Statistics andFax:   319-335-3017
Actuarial Science
241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Is it possible to simply the use of NULL slots (or at least improve the help files)?

2020-09-24 Thread Martin Morgan
Answering to convey the 'rules' as I know them, rather than to address the 
underlying issues that I guess you are really after...

The S4 practice is to use setOldClass() to explicitly treat an S3 character() 
vector of classes as an assertion of linear inheritance

> x <- structure (sqrt (37), class = c ("sqrt.prime", "numeric") )
> is(x, "maybeNumber")
[1] FALSE
> setOldClass(class(x))
> is(x, "maybeNumber")
[1] TRUE

There are some quite amusing things that can go on with S3 classes, since the 
class attribute is just a character vector. So

> x <- structure ("September", class = c ("sqrt.prime", "numeric") )
> is(x, "numeric")  ## similarly, inherits()
[1] TRUE
> x <- structure (1, class = c ("numeric", "character"))
> is(x, "numeric")
[1] TRUE
> is(x, "character")
[1] TRUE

Perhaps the looseness of the S3 system motivated the use of setOldClass() for 
anything more than assertion of simple relationships? At least in this context 
setOldClass() provides some type checking sanity

> setOldClass(c("character", "numeric"))
Error in setOldClass(c("character", "numeric")) :
  inconsistent old-style class information for "character"; the class is 
defined but does not extend "numeric" and is not valid as the data part
In addition: Warning message:
In .validDataPartClass(cl, where, dataPartClass) :
  more than one possible class for the data part: using "numeric" rather than 
"character"

Martin Morgan

On 9/24/20, 4:51 PM, "Abby Spurdle"  wrote:

Hi Martin,
Thankyou for your response.

I suspect that we're not going to agree on the main point.
Making it trivially simple (as say Java) to set slots to NULL.
So, I'll move on to the other points here.

***Note that cited text uses excerpts only.***

>   setClassUnion("character_OR_NULL", c("character", "NULL"))
>   A = setClass("A", slots = c(x = "character_OR_NULL"))

I think the above construct needs to be documented much more clearly.
i.e. In the introductory and details pages for S4 classes.
This is something that many people will want to do.
And BasicClasses or NULL-class, are not the most obvious place to
start looking, either.

Also, I'd recommend the S4 authors, go one step further.
Include character_OR_NULL, numeric_OR_NULL, etc, or something similar,
in S4's predefined basic classes.
Otherwise, contributed packages will (eventually) end up with hundreds
of copies of these.

> setClassUnion("maybeNumber", c("numeric", "logical"))
> every instance of numeric _is_ a maybeNumber, e.g.,
> > is(1, "maybeNumber")
> [1] TRUE

> which I think is consistent with the use of 'superclass'

Not quite.

x <- structure (sqrt (37), class = c ("sqrt.prime", "numeric") )
is (x, "numeric") #TRUE
is (x, "maybeNumber") #FALSE

So now, an object x, is a numeric but not a maybeNumber.
Perhaps a class union should be described as a partial imitation of a
superclass, for the purpose of making slots more flexible.


B.
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Is it possible to simply the use of NULL slots (or at least improve the help files)?

2020-09-24 Thread Martin Morgan
I did ?"NULL" at the command line and was led to ?"NULL-class" and the 
BasicClasses help page in the methods package.

getClass("NULL"), getClass("character") show that these objects are unrelated, 
so a class union is the way to define a class that is the union of these. The 
essence of the behavior you would like is

  setClassUnion("character_OR_NULL", c("character", "NULL"))
  .A = setClass("A", slots = c(x = "character_OR_NULL"))

with

> .A(x = NULL)
An object of class "A"
Slot "x":
NULL

> .A(x = month.abb)
An object of class "A"
Slot "x":
 [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"

> .A(x = 1:5)
Error in validObject(.Object) :
  invalid class "A" object: invalid object for slot "x" in class "A": got class 
"integer", should be or extend class "character_OR_NULL"

I understand there are situations where NULL is desired, perhaps to indicate 
'not yet initialized' as distinct from character(0) or NA_character_, but I want 
to mention those often-appropriate alternatives.
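A sketch of those alternatives, using an illustrative class "B" with a plain
"character" slot and no class union:

    .B <- setClass("B", slots = c(x = "character"))
    .B()                   # slot "x" defaults to character(0)
    .B(x = NA_character_)  # a 'missing' value without resorting to NULL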

With

  setClassUnion("maybeNumber", c("numeric", "logical"))

every instance of numeric _is_ a maybeNumber, e.g.,

> is(1, "maybeNumber")
[1] TRUE
> is(1L, "maybeNumber")
[1] TRUE
> is(numeric(), "maybeNumber")
[1] TRUE
> is(NA_integer_, "maybeNumber")
[1] TRUE

which I think is consistent with the use of 'superclass' on the setClassUnion 
help page.

Martin Morgan
 

On 9/23/20, 5:20 PM, "R-devel on behalf of Abby Spurdle" 
 wrote:

As far as I can tell, there's no trivial way to set arbitrary S4 slots to 
NULL.

Most of the online examples I can find, use setClassUnion and are
about 10 years old.
Which, in my opinion, is defective.
There's nothing "robust" about making something that should be
trivially simple, really complicated.

Maybe there is a simpler way, and I just haven't worked it out, yet.
But either way, could the documentation for the methods package be improved?
I can't find any obvious info on NULL slots:

Introduction
Classes
Classes_Details
setClass
slot

Again, maybe I missed it.
Even setClassUnion, which is what's used in the online examples,
doesn't contain a NULL slot example.

One more thing:
The help file for setClassUnion uses the term "superclass" incorrectly.

Its examples include the following:
setClassUnion("maybeNumber", c("numeric", "logical"))

If maybeNumber was the superclass of numeric, then every instance of
numeric would also be an instance of maybeNumber...

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Garbage collection of seemingly PROTECTed pairlist

2020-09-13 Thread Martin Morgan
I put your code into a file tmp.c and eliminated the need for a package by 
compiling this to a shared object

  R CMD SHLIB tmp.c

I'm then able to use a simple script 'tmp.R'

  dyn.load("/tmp/tmp.so")

  fullocate <- function(int_mat)
.Call("C_fullocate", int_mat)

  int_mat <- rbind(c(5L, 6L), c(7L, 10L), c(20L, 30L))

  while(TRUE)
res <- fullocate(int_mat)

to generate a segfault.

Looking at your code, it seemed like I could get towards a simpler reproducible 
example by eliminating most of the 'while' loop and then functions and code 
branches that are not used

  #include <Rinternals.h>
  
  SEXP C_int_mat_nth_row_nrnc(int *int_mat_int, int nr, int nc, int n) {
  SEXP out = PROTECT(Rf_allocVector(INTSXP, nc));
  int *out_int = INTEGER(out);
  for (int i = 0; i != nr; ++i) {
 out_int[i] = int_mat_int[n - 1 + i * nr];
  }
  UNPROTECT(1);
  return out;}
  
  SEXP C_fullocate(SEXP int_mat) {
  int nr = Rf_nrows(int_mat), *int_mat_int = INTEGER(int_mat);
  int row_num = 2;  // row_num will be 1-indexed
  SEXP prlst0cdr = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, 1));
  SEXP prlst = PROTECT(Rf_list1(prlst0cdr));
  SEXP row = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, row_num));
  Rf_PrintValue(prlst);  // This is where the error occurs
  UNPROTECT(3);
  
  return R_NilValue;
  }
  
my script still gives an error, but not a segfault, and the values printed 
sometimes differ between calls

...

[[1]]
[1] 5 6

.
[[1]]
NULL

...

Error in FUN(X[[i]], ...) :
  cannot coerce type 'NULL' to vector of type 'character'
Calls: message -> .makeMessage -> lapply
Execution halted

The differing values in particular, and the limited PROTECTion in the call and 
small allocations (hence limited need / opportunity for garbage collection), 
suggest that you're corrupting memory, rather than having a problem with 
garbage collection. Indeed,

SEXP prlst0cdr = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, 1));

allocates a vector of length 2 at

SEXP out = PROTECT(Rf_allocVector(INTSXP, nc));

but writes three elements (the 0th, 1st, and 2nd) at

  for (int i = 0; i != nr; ++i) {
  out_int[i] = int_mat_int[n - 1 + i * nr];
  }

Martin Morgan

On 9/11/20, 9:30 PM, "R-devel on behalf of Rory Nolan" 
 wrote:

I want to write an R function using R's C interface that takes a 2-column
matrix of increasing, non-overlapping integer intervals and returns a list
with those intervals plus some added intervals, such that there are no
gaps. For example, it should take the matrix rbind(c(5L, 6L), c(7L, 10L),
c(20L, 30L)) and return list(c(5L, 6L), c(7L, 10L), c(11L, 19L), c(20L,
30L)). Because the output is of variable length, I use a pairlist (because
it is growable) and then I call Rf_PairToVectorList() at the end to make it
into a regular list.

I'm getting a strange garbage collection error. My PROTECTed pairlist prlst
gets garbage collected away and causes a memory leak error when I try to
access it.

Here's my code.

#include <Rinternals.h>


SEXP C_int_mat_nth_row_nrnc(int *int_mat_int, int nr, int nc, int n) {
  SEXP out = PROTECT(Rf_allocVector(INTSXP, nc));
  int *out_int = INTEGER(out);
  if (n <= 0 | n > nr) {
for (int i = 0; i != nc; ++i) {
  out_int[i] = NA_INTEGER;
}
  } else {
for (int i = 0; i != nr; ++i) {
  out_int[i] = int_mat_int[n - 1 + i * nr];
}
  }
  UNPROTECT(1);
  return out;}

SEXP C_make_len2_int_vec(int first, int second) {
  SEXP out = PROTECT(Rf_allocVector(INTSXP, 2));
  int *out_int = INTEGER(out);
  out_int[0] = first;
  out_int[1] = second;
  UNPROTECT(1);
  return out;}

SEXP C_fullocate(SEXP int_mat) {
  int nr = Rf_nrows(int_mat), *int_mat_int = INTEGER(int_mat);
  int last, row_num;  // row_num will be 1-indexed
  SEXP prlst0cdr = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, 1));
  SEXP prlst = PROTECT(Rf_list1(prlst0cdr));
  SEXP prlst_tail = prlst;
  last = INTEGER(prlst0cdr)[1];
  row_num = 2;
  while (row_num <= nr) {
Rprintf("row_num: %i\n", row_num);
SEXP row = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, row_num));
Rf_PrintValue(prlst);  // This is where the error occurs
int *row_int = INTEGER(row);
if (row_int[0] == last + 1) {
  Rprintf("here1");
  SEXP next = PROTECT(Rf_list1(row));
  prlst_tail = SETCDR(prlst_tail, next);
  last = row_int[1];
  UNPROTECT(1);
  ++row_num;
} else {
  Rprintf("here2");
  SEXP next_car = PROTECT(C_make_len2_int_vec(last + 1, row_int[0] - 
1));
  SEXP next = PROTECT(Rf_list

Re: [Rd] lapply and vapply Primitive Documentation

2020-07-10 Thread Martin Morgan
Was hoping for an almost record old bug fix (older than some R users!), but 
apparently the documentation bug is only a decade old (maybe only older than 
some precious R users)

  
https://github.com/wch/r-source/blame/2118f1d0ff70c1ebd06148b6cb7659efe5ff4d99/src/library/base/man/lapply.Rd#L116

(I don't see lapply / vapply referenced as primitive in the original text 
changed by the commit).

Martin Morgan

On 7/10/20, 3:52 AM, "R-devel on behalf of Martin Maechler" 
 wrote:

>>>>> Cole Miller 
>>>>> on Thu, 9 Jul 2020 20:38:10 -0400 writes:

> The documentation of ?lapply includes:
>> lapply and vapply are primitive functions.

> However, both evaluate to FALSE in `is.primitive()`:

> is.primitive(vapply) #FALSE

> is.primitive(lapply) #FALSE

> It appears that they are not primitives and that the
> documentation might be outdated. Thank you for your time
> and work.

Thank you, Cole.
Indeed, they were primitive originally (but e.g. lapply() seems
to have become .Internal with
   r7885 | ripley | 2000-01-31 08:58:59 +0100 
i.e. about 4 weeks *before* release of R 1.0.0).

Changes made to both 'R-devel' and 'R-patched'.
Martin


> Cole Miller

> P.S. During research, my favorite `help()` is
> `?.Internal()`: "Only true R wizards should even consider
> using this function..." Thanks again!

;-)

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Build a R call at C level

2020-06-30 Thread Morgan Morgan
Sorry Dirk, I don't remember discussing this topic or alternatives with you
at all.
Have a nice day.

On Tue, 30 Jun 2020, 14:42 Morgan Morgan,  wrote:

> Thanks Jan and Tomas for the feedback.
> Answer from Jan is what I am looking for.
> Maybe I am not looking in the right place but it is not easy to understand
> how these LCONS, CONS, SETCDR, etc. work.
>
> Thank you
> Best regards
> Morgan
>
>
>
> On Tue, 30 Jun 2020, 12:36 Tomas Kalibera, 
> wrote:
>
>> On 6/30/20 1:06 PM, Jan Gorecki wrote:
>> > It is quite known that R documentation on R C api could be improved...
>>
>> Please see "5.11 Evaluating R expressions from C" from "Writing R
>> Extensions"
>>
>> Best
>> Tomas
>>
>> > Still R-package-devel mailing list should be preferred for this kind
>> > of questions.
>> > Not sure if that is the best way, but works.
>> >
>> > call_to_sum <- inline::cfunction(
>> >language = "C",
>> >sig = c(x = "SEXP"), body = "
>> >
>> > SEXP e = PROTECT(lang2(install(\"sum\"), x));
>> > SEXP r_true = PROTECT(CONS(ScalarLogical(1), R_NilValue));
>> > SETCDR(CDR(e), r_true);
>> > SET_TAG(CDDR(e), install(\"na.rm\"));
>> > Rf_PrintValue(e);
>> > SEXP ans = PROTECT(eval(e, R_GlobalEnv));
>> > UNPROTECT(3);
>> > return ans;
>> >
>> > ")
>> >
>> > call_to_sum(c(1L,NA,3L))
>> >
>> > On Tue, Jun 30, 2020 at 10:08 AM Morgan Morgan
>> >  wrote:
>> >> Hi All,
>> >>
>> >> I was reading the R extension manual section 5.11 ( Evaluating R
>> expression
>> >> from C) and I tried to build a simple call to the sum function. Please
>> see
>> >> below.
>> >>
>> >> call_to_sum <- inline::cfunction(
>> >>language = "C",
>> >>sig = c(x = "SEXP"), body = "
>> >>
>> >> SEXP e = PROTECT(lang2(install(\"sum\"), x));
>> >> SEXP ans = PROTECT(eval(e, R_GlobalEnv));
>> >> UNPROTECT(2);
>> >> return ans;
>> >>
>> >> ")
>> >>
>> >> call_to_sum(1:3)
>> >>
>> >> The above works. My question is how do I add the argument "na.rm=TRUE"
>> at C
>> >> level to the above call? I have tried various things based on what is
>> in
>> >> section 5.11 but I did not manage to get it to work.
>> >>
>> >> Thank you
>> >> Best regards
>> >>
>> >>  [[alternative HTML version deleted]]
>> >>
>> >> __
>> >> R-devel@r-project.org mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>> > __
>> > R-devel@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Build a R call at C level

2020-06-30 Thread Morgan Morgan
Thanks Jan and Tomas for the feedback.
Answer from Jan is what I am looking for.
Maybe I am not looking in the right place but it is not easy to understand
how these LCONS, CONS, SETCDR, etc. work.
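For reference, the call being constructed can be inspected from the R side; the
tagged pairlist that CONS / SETCDR / SET_TAG build corresponds to:

    e <- quote(sum(x, na.rm = TRUE))
    as.list(e)            # the symbol sum, the symbol x, and TRUE
    names(as.list(e))     # ""  ""  "na.rm"  -- the tag set with SET_TAG at C level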

Thank you
Best regards
Morgan



On Tue, 30 Jun 2020, 12:36 Tomas Kalibera,  wrote:

> On 6/30/20 1:06 PM, Jan Gorecki wrote:
> > It is quite known that R documentation on R C api could be improved...
>
> Please see "5.11 Evaluating R expressions from C" from "Writing R
> Extensions"
>
> Best
> Tomas
>
> > Still R-package-devel mailing list should be preferred for this kind
> > of questions.
> > Not sure if that is the best way, but works.
> >
> > call_to_sum <- inline::cfunction(
> >language = "C",
> >sig = c(x = "SEXP"), body = "
> >
> > SEXP e = PROTECT(lang2(install(\"sum\"), x));
> > SEXP r_true = PROTECT(CONS(ScalarLogical(1), R_NilValue));
> > SETCDR(CDR(e), r_true);
> > SET_TAG(CDDR(e), install(\"na.rm\"));
> > Rf_PrintValue(e);
> > SEXP ans = PROTECT(eval(e, R_GlobalEnv));
> > UNPROTECT(3);
> > return ans;
> >
> > ")
> >
> > call_to_sum(c(1L,NA,3L))
> >
> > On Tue, Jun 30, 2020 at 10:08 AM Morgan Morgan
> >  wrote:
> >> Hi All,
> >>
> >> I was reading the R extension manual section 5.11 ( Evaluating R
> expression
> >> from C) and I tried to build a simple call to the sum function. Please
> see
> >> below.
> >>
> >> call_to_sum <- inline::cfunction(
> >>language = "C",
> >>sig = c(x = "SEXP"), body = "
> >>
> >> SEXP e = PROTECT(lang2(install(\"sum\"), x));
> >> SEXP ans = PROTECT(eval(e, R_GlobalEnv));
> >> UNPROTECT(2);
> >> return ans;
> >>
> >> ")
> >>
> >> call_to_sum(1:3)
> >>
> >> The above works. My question is how do I add the argument "na.rm=TRUE"
> at C
> >> level to the above call? I have tried various things based on what is in
> >> section 5.11 but I did not manage to get it to work.
> >>
> >> Thank you
> >> Best regards
> >>
> >>  [[alternative HTML version deleted]]
> >>
> >> __
> >> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Build a R call at C level

2020-06-30 Thread Morgan Morgan
Hi All,

I was reading the R extension manual, section 5.11 (Evaluating R expressions
from C), and I tried to build a simple call to the sum function. Please see
below.

call_to_sum <- inline::cfunction(
  language = "C",
  sig = c(x = "SEXP"), body = "

SEXP e = PROTECT(lang2(install(\"sum\"), x));
SEXP ans = PROTECT(eval(e, R_GlobalEnv));
UNPROTECT(2);
return ans;

")

call_to_sum(1:3)

The above works. My question is how do I add the argument "na.rm=TRUE" at C
level to the above call? I have tried various things based on what is in
section 5.11 but I did not manage to get it to work.
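
For reference, here is a variant of the approach shown earlier in the thread
that builds the tagged call with lang3() instead of CONS()/SETCDR() (a
sketch only; the wrapper name is made up):

call_to_sum2 <- inline::cfunction(
  language = "C",
  sig = c(x = "SEXP"), body = "

SEXP e = PROTECT(lang3(install(\"sum\"), x, ScalarLogical(1)));
SET_TAG(CDDR(e), install(\"na.rm\"));  /* name the third element of the call */
SEXP ans = PROTECT(eval(e, R_GlobalEnv));
UNPROTECT(2);
return ans;

")

call_to_sum2(c(1L, NA, 3L))
## [1] 4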

Thank you
Best regards

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] subset data.frame at C level

2020-06-24 Thread Morgan Morgan
Thank you Jim for the feedback.

I actually implemented it the way I described it in my first email, and it
seems fast enough for me.
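
Roughly, a self-contained sketch of that approach (look the column up by
name, then VECTOR_ELT()), using inline as elsewhere in these threads; the
helper name is made up and character-encoding translation is ignored:

col_by_name <- inline::cfunction(
  language = "C",
  sig = c(df = "SEXP", nm = "SEXP"),
  includes = "#include <string.h>",
  body = "

const char *target = CHAR(STRING_ELT(nm, 0));
SEXP names = PROTECT(getAttrib(df, R_NamesSymbol));
for (R_xlen_t i = 0; i < xlength(names); i++) {
    if (strcmp(CHAR(STRING_ELT(names, i)), target) == 0) {
        UNPROTECT(1);
        return VECTOR_ELT(df, i);   /* the column, as .subset2() would give */
    }
}
UNPROTECT(1);
return R_NilValue;                  /* not found */

")

col_by_name(mtcars, "carb")  # same result as .subset2(mtcars, "carb")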

Just to give a bit of context, I will need it at some point in package kit.
I also implemented subsetting by row, which I actually need more, as I am
working on faster versions of the unique and duplicated functions. The
function unique is particularly slow for data.frames. So far I have got a
100x speedup.

Best regards
Morgan


On Tue, 23 Jun 2020, 21:11 Jim Hester,  wrote:

> It looks to me like internally .subset2 uses `get1index()`, but this
> function is declared in Defn.h, which AFAIK is not part of the exported R
> API.
>
>  Looking at the code for `get1index()` it looks like it just loops over
> the (translated) names, so I guess I just do that [0].
>
> [0]:
> https://github.com/r-devel/r-svn/blob/1ff1d4197495a6ee1e1d88348a03ff841fd27608/src/main/subscript.c#L226-L235
>
> On Wed, Jun 17, 2020 at 6:11 AM Morgan Morgan 
> wrote:
>
>> Hi,
>>
>> Hope you are well.
>>
>> I was wondering if there is a function at C level that is equivalent to
>> mtcars$carb or .subset2(mtcars, "carb").
>>
>> If I have the index of the column then the answer would be VECTOR_ELT(df,
>> asInteger(idx)) but I was wondering if there is a way to do it directly
>> from the name of the column without having to loop over column names to
>> find the index?
>>
>> Thank you
>> Best regards
>> Morgan
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] subset data.frame at C level

2020-06-17 Thread Morgan Morgan
Hi,

Hope you are well.

I was wondering if there is a function at C level that is equivalent to
mtcars$carb or .subset2(mtcars, "carb").

If I have the index of the column then the answer would be VECTOR_ELT(df,
asInteger(idx)) but I was wondering if there is a way to do it directly
from the name of the column without having to loop over column names to
find the index?

Thank you
Best regards
Morgan

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Precision of function mean,bug?

2020-05-21 Thread Morgan Morgan
Sorry, posting back to the list.
Thank you all.
Morgan

On Thu, 21 May 2020, 16:33 Henrik Bengtsson, 
wrote:

> Hi.
>
> Good point and a good example. Feel free to post to the list. The purpose
> of my reply wasn't to take away Peter's point but to emphasize that
> base::mean() does a two-pass scan over the elements to lower the impact of
> adding values with widely different magnitudes (a classical problem in
> numerical analysis). But I can see how it may look like that.
>
> Cheers,
>
> Henrik
>
>
> On Thu, May 21, 2020, 03:21 Morgan Morgan 
> wrote:
>
>> Thank you Henrik for the feedback.
>> Note that for idx=4 and refine = TRUE,  your equality b==c is FALSE. I
>> think that as Peter said == can't be trusted with FP.
>> His example is good. Here is an even more shocking one.
>> a=0.786546798
>> b=a+ 1e6 -1e6
>> a==b
>> # [1] FALSE
>>
>> Best regards
>> Morgan Jacob
>>
>> On Wed, 20 May 2020, 20:18 Henrik Bengtsson, 
>> wrote:
>>
>>> On Wed, May 20, 2020 at 11:10 AM brodie gaslam via R-devel
>>>  wrote:
>>> >
>>> >  > On Wednesday, May 20, 2020, 7:00:09 AM EDT, peter dalgaard <
>>> pda...@gmail.com> wrote:
>>> > >
>>> > > Expected, see FAQ 7.31.
>>> > >
>>> > > You just can't trust == on FP operations. Notice also
>>> >
>>> > Additionally, since you're implementing a "mean" function you are
>>> testing
>>> > against R's mean, you might want to consider that R uses a two-pass
>>> > calculation[1] to reduce floating point precision error.
>>>
>>> This one is important.
>>>
>>> FWIW, matrixStats::mean2() provides argument refine=TRUE/FALSE to
>>> calculate mean with and without this two-pass calculation;
>>>
>>> > a <- c(x[idx],y[idx],z[idx]) / 3
>>> > b <- mean(c(x[idx],y[idx],z[idx]))
>>> > b == a
>>> [1] FALSE
>>> > b - a
>>> [1] 2.220446e-16
>>>
>>> > c <- matrixStats::mean2(c(x[idx],y[idx],z[idx]))  ## default to
>>> refine=TRUE
>>> > b == c
>>> [1] TRUE
>>> > b - c
>>> [1] 0
>>>
>>> > d <- matrixStats::mean2(c(x[idx],y[idx],z[idx]), refine=FALSE)
>>> > a == d
>>> [1] TRUE
>>> > a - d
>>> [1] 0
>>> > c == d
>>> [1] FALSE
>>> > c - d
>>> [1] 2.220446e-16
>>>
>>> Not surprisingly, the two-pass higher-precision version (refine=TRUE)
>>> takes roughly twice as long as the one-pass quick version
>>> (refine=FALSE).
>>>
>>> /Henrik
>>>
>>> >
>>> > Best,
>>> >
>>> > Brodie.
>>> >
>>> > [1]
>>> https://github.com/wch/r-source/blob/tags/R-4-0-0/src/main/summary.c#L482
>>> >
>>> > > > a2=(z[idx]+x[idx]+y[idx])/3
>>> > > > a2==a
>>> > > [1] FALSE
>>> > > > a2==b
>>> > > [1] TRUE
>>> > >
>>> > > -pd
>>> > >
>>> > > > On 20 May 2020, at 12:40 , Morgan Morgan <
>>> morgan.email...@gmail.com> wrote:
>>> > > >
>>> > > > Hello R-dev,
>>> > > >
>>> > > > Yesterday, while I was testing the newly implemented function
>>> pmean in
>>> > > > package kit, I noticed a mismatch in the output of the below R
>>> expressions.
>>> > > >
>>> > > > set.seed(123)
>>> > > > n=1e3L
>>> > > > idx=5
>>> > > > x=rnorm(n)
>>> > > > y=rnorm(n)
>>> > > > z=rnorm(n)
>>> > > > a=(x[idx]+y[idx]+z[idx])/3
>>> > > > b=mean(c(x[idx],y[idx],z[idx]))
>>> > > > a==b
>>> > > > # [1] FALSE
>>> > > >
>>> > > > For idx= 1, 2, 3, 4 the last line is equal to TRUE. For 5, 6 and
>>> many
>>> > > > others the difference is small but still.
>>> > > > Is that expected or is it a bug?
>>> >
>>> > __
>>> > R-devel@r-project.org mailing list
>>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Precision of function mean,bug?

2020-05-20 Thread Morgan Morgan
Hello R-dev,

Yesterday, while I was testing the newly implemented function pmean in
package kit, I noticed a mismatch in the output of the below R expressions.

set.seed(123)
n=1e3L
idx=5
x=rnorm(n)
y=rnorm(n)
z=rnorm(n)
a=(x[idx]+y[idx]+z[idx])/3
b=mean(c(x[idx],y[idx],z[idx]))
a==b
# [1] FALSE

For idx= 1, 2, 3, 4 the last line is equal to TRUE. For 5, 6 and many
others the difference is small but still.
Is that expected or is it a bug?
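
(For reference, the replies above attribute the difference to mean() using a
two-pass algorithm rather than the plain sum()/length() expression. A rough
R sketch of the two approaches -- not bit-for-bit identical to the internal
code, which also accumulates in long double:)

naive_mean   <- function(x) sum(x) / length(x)
twopass_mean <- function(x) {
  m <- sum(x) / length(x)      # first pass: ordinary mean
  m + sum(x - m) / length(x)   # second pass: correct by the mean deviation
}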

Thank you
Best Regards
Morgan Jacob

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] psum/pprod

2020-05-16 Thread Morgan Morgan
Good morning All,

Just wanted to do quick follow-up on this thread:
https://r.789695.n4.nabble.com/There-is-pmin-and-pmax-each-taking-na-rm-how-about-psum-td4647841.html

For those (including the R-core team) of you who are interested in a C
implementation of psum and pprod there is one in the "kit" package (I am
the author) on CRAN.
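
For anyone unfamiliar with the proposed semantics, a rough base-R sketch of
what an element-wise ("parallel") sum means; the kit version is the C
implementation referred to above, and the helper name here is made up:

psum0 <- function(..., na.rm = FALSE)
  rowSums(cbind(...), na.rm = na.rm)

psum0(c(1, 2, NA), c(4, NA, NA), na.rm = TRUE)
## [1] 5 2 0   (note: an all-NA position gives 0 here, a corner case that a
##             real implementation may treat differently)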

I will continue working on the package in my spare time if I see that users
are missing basic functionalities not implemented in base R.

Have a great weekend.
Kind regards
Morgan Jacob

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] defining r audio connections

2020-05-06 Thread Martin Morgan
yep, you're right, after some initial clean-up and running with or without 
--as-cran R CMD check gives a NOTE

  *  checking compiled code
  File ‘socketeer/libs/socketeer.so’:
Found non-API calls to R: ‘R_GetConnection’,
   ‘R_new_custom_connection’
   
  Compiled code should not call non-API entry points in R.
   
  See 'Writing portable packages' in the 'Writing R Extensions' manual.

Connections in general seem more useful than ad-hoc functions, though perhaps 
for Frederick's use case Duncan's suggestion is sufficient. For non-CRAN 
packages I personally would implement a connection.

(I mistakenly thought this was a more specialized mailing list; I wouldn't have 
posted to R-devel on this topic otherwise)

Martin Morgan

On 5/6/20, 4:12 PM, "Gábor Csárdi"  wrote:

AFAIK that API is not allowed on CRAN. It triggers a NOTE or a
WARNING, and your package will not be published.

    Gabor

On Wed, May 6, 2020 at 9:04 PM Martin Morgan  
wrote:
>
> The public connection API is defined in
>
> https://github.com/wch/r-source/blob/trunk/src/include/R_ext/Connections.h
>
> I'm not sure of a good pedagogic example; people who want to write their 
own connections usually want to do so for complicated reasons!
>
> This is my own abandoned attempt 
https://github.com/mtmorgan/socketeer/blob/b0a1448191fe5f79a3f09d1f939e1e235a22cf11/src/connection.c#L169-L192
 where connection_local_client() is called from R and _connection_local() 
creates and populates the appropriate structure. Probably I have done things 
totally wrong (e.g., by not checking the version of the API, as advised in the 
header file!)
>
> Martin Morgan
>
> On 5/6/20, 2:26 PM, "R-devel on behalf of Duncan Murdoch" 
 wrote:
>
> On 06/05/2020 1:09 p.m., frede...@ofb.net wrote:
> > Dear R Devel,
> >
> > Since Linux moved away from using a file-system interface for 
audio, I think it is necessary to write special libraries to interface with 
audio hardware from various languages on Linux.
> >
> > In R, it seems like the appropriate datatype for a `snd_pcm_t` 
handle pointing to an open ALSA source or sink would be a "connection". 
Connection types are already defined in R for "file", "url", "pipe", "fifo", 
"socketConnection", etc.
> >
> > Is there a tutorial or an example package where a new type of 
connection is defined, so that I can see how to do this properly in a package?
> >
> > I can see from the R source that, for example, `do_gzfile` is 
defined in `connections.c` and referenced in `names.c`. However, I thought I 
should ask here first in case there is a better place to start, than trying to 
copy this code.
> >
> > I only want an object that I can use `readBin` and `writeBin` on, 
to read and write audio data using e.g. `snd_pcm_writei` which is part of the 
`alsa-lib` package.
>
> I don't think R supports user-defined connections, but probably 
writing
> readBin and writeBin equivalents specific to your library wouldn't be
> any harder than creating a connection.  For those, you will probably
> want to work with an "external pointer" (see Writing R Extensions).
> Rcpp probably has support for these if you're working in C++.
>
> Duncan Murdoch
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] defining r audio connections

2020-05-06 Thread Martin Morgan
The public connection API is defined in

https://github.com/wch/r-source/blob/trunk/src/include/R_ext/Connections.h

I'm not sure of a good pedagogic example; people who want to write their own 
connections usually want to do so for complicated reasons!

This is my own abandoned attempt 
https://github.com/mtmorgan/socketeer/blob/b0a1448191fe5f79a3f09d1f939e1e235a22cf11/src/connection.c#L169-L192
 where connection_local_client() is called from R and _connection_local() 
creates and populates the appropriate structure. Probably I have done things 
totally wrong (e.g., by not checking the version of the API, as advised in the 
header file!)

Martin Morgan

On 5/6/20, 2:26 PM, "R-devel on behalf of Duncan Murdoch" 
 wrote:

On 06/05/2020 1:09 p.m., frede...@ofb.net wrote:
> Dear R Devel,
> 
> Since Linux moved away from using a file-system interface for audio, I 
think it is necessary to write special libraries to interface with audio 
hardware from various languages on Linux.
> 
> In R, it seems like the appropriate datatype for a `snd_pcm_t` handle 
pointing to an open ALSA source or sink would be a "connection". Connection 
types are already defined in R for "file", "url", "pipe", "fifo", 
"socketConnection", etc.
> 
> Is there a tutorial or an example package where a new type of connection 
is defined, so that I can see how to do this properly in a package?
> 
> I can see from the R source that, for example, `do_gzfile` is defined in 
`connections.c` and referenced in `names.c`. However, I thought I should ask 
here first in case there is a better place to start, than trying to copy this 
code.
> 
> I only want an object that I can use `readBin` and `writeBin` on, to read 
and write audio data using e.g. `snd_pcm_writei` which is part of the 
`alsa-lib` package.

I don't think R supports user-defined connections, but probably writing 
readBin and writeBin equivalents specific to your library wouldn't be 
any harder than creating a connection.  For those, you will probably 
want to work with an "external pointer" (see Writing R Extensions). 
Rcpp probably has support for these if you're working in C++.

Duncan Murdoch

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Hash functions at C level

2020-05-03 Thread Morgan Morgan
Dear R-dev,

Hope you are all well.
I would like to know if there is a hash function available for the R C API?
I noticed that there are hash structures and functions defined in the file
"unique.c". These would definitely suit my needs; however, is there a way to
access them at C level?
Thank you for your time.
Best regards
Morgan

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Long vector support in data.frame

2020-01-23 Thread Morgan Morgan
Hi All,

Happy New Year!

I was wondering if there is a plan at some point to support long vectors in
data.frames?
I understand that it would need some internal changes to lift the current
limit.
If there is a plan, what is currently preventing it from happening? Is it
time, or resources? If so, is there a way for people willing to help to
contribute or assist the R-dev team? How?
I noticed that an increasing number of functions support long vectors
in base R. Are there more functions that need to support long vectors before
data.frames can have long vector support?

Thank you
Best regards
Morgan

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Aggregate function FR

2019-11-20 Thread Morgan Morgan
Hi,

I was wondering if it would be possible to add an argument to the aggregate
function to retain NA as a grouping category? (The default could be not to,
in order to avoid breaking existing code.) Please see the example below:

df = iris
df$Species[5] = NA
aggregate(`Petal.Width` ~ Species, df, sum) # does not include NA
aggregate(`Petal.Width` ~ addNA(Species), df, sum) # include NA

data.table and dplyr include NA by default.
Python pandas has an aggregate function inspired by base R aggregate. An
option has been added to include NA.

Thank you
Best regards
Morgan

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Questions on the R C API

2019-11-04 Thread Morgan Morgan
Thank you for your reply Jiefei.
I think in theory your solution should work. I'll have to give them a try.


On Mon, 4 Nov 2019 23:41 Wang Jiefei,  wrote:

> Hi Morgan,
>
> My solutions might not be the best one(I believe it's not), but it should
> work for your question.
>
> 1. Have you considered the Rf_duplicate function? If you want to change the
> value of `a` and reset it later, you have to have a duplicate somewhere
> for resetting it. Instead of changing the value of `a` directly, why not
> change the value of a duplicated `a`? Then you do not have to reset it.
>
> 2. I think a pairlist behaves like a linked list (I might be wrong here and
> please correct me if so). Therefore, there is no simple way to locate an
> element in a pairlist. As far as I know, R defines a set of
> convenience functions for you to access a limited number of elements. See
> below
>
> ```
> #define CAR(e) ((e)->u.listsxp.carval)
> #define CDR(e) ((e)->u.listsxp.cdrval)
> #define CAAR(e) CAR(CAR(e))
> #define CDAR(e) CDR(CAR(e))
> #define CADR(e) CAR(CDR(e))
> #define CDDR(e) CDR(CDR(e))
> #define CDDDR(e) CDR(CDR(CDR(e)))
> #define CADDR(e) CAR(CDR(CDR(e)))
> #define CADDDR(e) CAR(CDR(CDR(CDR(e))))
> #define CAD4R(e) CAR(CDR(CDR(CDR(CDR(e)))))
> ```
>
> You can use them to get the first few arguments from a pairlist. Another
> solution would be converting the pairlist into a list so that you can use
> the methods defined for a list to access any element. I do not know which C
> function can achieve that, but `as.list` at the R level should be able to do
> this job; you can evaluate an R function at C level and get the list
> result (by calling `Rf_eval`). I think this operation is relatively low
> cost because the list should only contain a set of pointers pointing to
> each element. There is no object duplication (again, I might be wrong here).
>


So there is no way to reset a pairlist to its first element?


> 3. You can get an unevaluated expression at the R level before you call the C
> function and pass it to your C function (by calling the `substitute` function).
> However, from my vague memory, the expression would eventually be evaluated
> at the C level even if you pass the expression to it. Therefore, I think you
> can create a list of unevaluated arguments before you enter the C function,
> so your C function can expect a list rather than a pairlist as its
> argument. This can solve both your second and third questions.
>

Correct me if I am wrong but does it mean that I will have to change "..."
to "list(...)" and use .Call instead of .External?

Also, does it mean that to avoid expressions being evaluated at the R level,
I have to use "list" or "substitute"? The function "switch" in R does not
use them but manages to achieve that.

switch(1, "a", stop("a"))
#[1] "a"

It is a primitive, but I don't understand how it manages to do that.

Best,
Morgan



> Best,
> Jiefei
>
>
> On Mon, Nov 4, 2019 at 2:41 PM Morgan Morgan 
> wrote:
>
>> Hi All,
>>
>> I have some questions regarding the R C API.
>>
>> Let's assume I have a function which is defined as follows:
>>
>> R file:
>>
>> myfunc <- function(a, b, ...) .External(Cfun, a, b, ...)
>>
>> C file:
>>
>> SEXP Cfun(SEXP args) {
>>   args = CDR(args);
>>   SEXP a = CAR(args); args = CDR(args);
>>   SEXP b = CAR(args); args = CDR(args);
>>   /* continue to do something with remaining arguments in "..." using the
>> same logic as above*/
>>
>>   return R_NilValue;
>> }
>>
>> 1/ Let's suppose that in my c function I change the value of a inside the
>> function but I want to reset it to what it was when I did SEXP a =
>> CAR(args); . How can I do that?
>>
>> 2/Is there a method to set "args" at a specific position so I can access a
>> specific value of my choice? If yes, do you have an simple example?
>>
>> 3/ Let's suppose now, I call the function in R. Is there a way to avoid
>> the
>> function to evaluate its arguments before going to the C call? Do I have
>> to
>> do it at the R level or can it be done at the C level?
>>
>> Thank you very much in advance.
>> Best regards
>> Morgan
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Questions on the R C API

2019-11-04 Thread Morgan Morgan
Hi All,

I have some questions regarding the R C API.

Let's assume I have a function which is defined as follows:

R file:

myfunc <- function(a, b, ...) .External(Cfun, a, b, ...)

C file:

SEXP Cfun(SEXP args) {
  args = CDR(args);
  SEXP a = CAR(args); args = CDR(args);
  SEXP b = CAR(args); args = CDR(args);
  /* continue to do something with remaining arguments in "..." using the
same logic as above*/

  return R_NilValue;
}

1/ Let's suppose that in my C function I change the value of a inside the
function but I want to reset it to what it was when I did SEXP a =
CAR(args); . How can I do that?

2/ Is there a method to set "args" at a specific position so I can access a
specific value of my choice? If yes, do you have a simple example?

3/ Let's suppose now that I call the function in R. Is there a way to avoid
the function evaluating its arguments before going to the C call? Do I have to
do it at the R level or can it be done at the C level?
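
(Drawing on the replies above, a sketch of how points 1/ and 2/ can be
handled: keep the head of the pairlist so it can be walked again, and
duplicate() a value before modifying it so the original stays untouched.
There is no random access into a pairlist, so reaching the i-th element
means following CDR() i times.)

SEXP Cfun(SEXP args) {
  SEXP args0 = args;                 /* remember the head of the pairlist */
  args = CDR(args);
  SEXP a = CAR(args); args = CDR(args);
  SEXP b = CAR(args); args = CDR(args);

  SEXP a2 = PROTECT(duplicate(a));   /* modify a2 freely; 'a' is unchanged */
  /* ... work on a2 and the remaining arguments in "..." ... */

  args = CDR(args0);                 /* "reset": back to the first argument */

  UNPROTECT(1);
  return R_NilValue;
}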

Thank you very much in advance.
Best regards
Morgan

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] New matrix function

2019-10-11 Thread Morgan Morgan
Basically the problem is to find the position of a submatrix inside a
larger matrix. Here are some links describing the problem:

https://stackoverflow.com/questions/10529278/fastest-way-to-find-a-m-x-n-submatrix-in-m-x-n-matrix

https://stackoverflow.com/questions/16750739/find-a-matrix-in-a-big-matrix
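
For concreteness, a naive R sketch of the search being discussed (not meant
to be fast, just to pin down the semantics):

find_submatrix <- function(big, small) {
  nr <- nrow(big) - nrow(small) + 1L
  nc <- ncol(big) - ncol(small) + 1L
  if (nr < 1L || nc < 1L) return(NULL)
  hits <- list()
  for (i in seq_len(nr))
    for (j in seq_len(nc)) {
      block <- big[i:(i + nrow(small) - 1L),
                   j:(j + ncol(small) - 1L), drop = FALSE]
      if (all(block == small))
        hits[[length(hits) + 1L]] <- c(row = i, col = j)
    }
  do.call(rbind, hits)   # one row per top-left corner of a match
}

big   <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), 3, 3, byrow = TRUE)
small <- matrix(c(5, 6, 8, 9), 2, 2, byrow = TRUE)
find_submatrix(big, small)   # matches at row 2, col 2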

Best
Morgan

On Fri, 11 Oct 2019 23:36 Gabor Grothendieck, 
wrote:

> The link you posted used the same inputs as in my example. If that is
> not what you meant maybe
> a different example is needed.
> Regards.
>
> On Fri, Oct 11, 2019 at 2:39 PM Pages, Herve  wrote:
> >
> > Has someone looked into the image processing area for this? That sounds
> > a little bit too high-level for base R to me (and I would be surprised
> > if any mainstream programming language had this kind of functionality
> > built-in).
> >
> > H.
> >
> > On 10/11/19 03:44, Morgan Morgan wrote:
> > > Hi All,
> > >
> > > I was looking for a function to find a small matrix inside a larger
> matrix
> > > in R similar to the one described in the following link:
> > >
> > >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.mathworks.com_matlabcentral_answers_194708-2Dindex-2Da-2Dsmall-2Dmatrix-2Din-2Da-2Dlarger-2Dmatrix&d=DwICAg&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=v96tqHMO3CLNBS7KTmdshM371i6W_v8_2H5bdVy_KHo&s=9Eu0WySIEzrWuYXFhwhHETpZQzi6hHLd84DZsbZsXYY&e=
> > >
> > > I couldn't find anything.
> > >
> > > The above function can be seen as a "generalisation" of the "which"
> > > function as well as the function described in the following post:
> > >
> > >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__coolbutuseless.github.io_2018_04_03_finding-2Da-2Dlength-2Dn-2Dneedle-2Din-2Da-2Dhaystack_&d=DwICAg&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=v96tqHMO3CLNBS7KTmdshM371i6W_v8_2H5bdVy_KHo&s=qZ3SJ8t8zEDA-em4WT7gBmN66qvvCKKKXRJunoF6P3k&e=
> > >
> > > Would be possible to add such a function to base R?
> > >
> > > I am happy to work with someone from the R core team (if you wish) and
> > > suggest an implementation in C.
> > >
> > > Thank you
> > > Best regards,
> > > Morgan
> > >
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-devel@r-project.org mailing list
> > >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_r-2Ddevel&d=DwICAg&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=v96tqHMO3CLNBS7KTmdshM371i6W_v8_2H5bdVy_KHo&s=tyVSs9EYVBd_dmVm1LSC23GhUzbBv8ULvtsveo-COoU&e=
> > >
> >
> > --
> > Hervé Pagès
> >
> > Program in Computational Biology
> > Division of Public Health Sciences
> > Fred Hutchinson Cancer Research Center
> > 1100 Fairview Ave. N, M1-B514
> > P.O. Box 19024
> > Seattle, WA 98109-1024
> >
> > E-mail: hpa...@fredhutch.org
> > Phone:  (206) 667-5791
> > Fax:(206) 667-1319
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] New matrix function

2019-10-11 Thread Morgan Morgan
Your answer makes much more sense to me.
I will probably end up adding the function to a package.
Some processes and decisions on how R is developed seem obscure to me.

Thank you
Morgan

On Fri, 11 Oct 2019 15:30 Avraham Adler,  wrote:

> It’s rather difficult. For example, the base R Kendall tau is written with
> the naive O(n^2). The much faster O(n log n) implementation was programmed
> and is in the pcaPP package. When I say much faster, I mean that my
> implementation in Excel VBA was faster than R for 10,000 or so pairs.
> R-Core decided not to implement that code, and instead made a note about
> the faster implementation living in pcaPP in the help for “cor”. See [1]
> for the 2012 discussion. My point is it’s really really difficult to get
> something in Base R. Develop it well, put it in a package, and you have
> basically the same result.
>
> Avi
>
> [1] https://stat.ethz.ch/pipermail/r-devel/2012-June/064351.html
>
> On Fri, Oct 11, 2019 at 9:55 AM Morgan Morgan 
> wrote:
>
>> How do you prove usefulness of a feature?
>> Do you have an example of a feature that has been added after proving to
>> be
>> useful in the package space first?
>>
>> Thank you,
>> Morgan
>>
>> On Fri, 11 Oct 2019 13:53 Michael Lawrence, 
>> wrote:
>>
>> > Thanks for this interesting suggestion, Morgan. While there is no strict
>> > criteria for base R inclusion, one criterion relevant in this case is
>> that
>> > the usefulness of a feature be proven in the package space first.
>> >
>> > Michael
>> >
>> >
>> > On Fri, Oct 11, 2019 at 5:19 AM Morgan Morgan <
>> morgan.email...@gmail.com>
>> > wrote:
>> >
>> >> On Fri, 11 Oct 2019 10:45 Duncan Murdoch, 
>> >> wrote:
>> >>
>> >> > On 11/10/2019 6:44 a.m., Morgan Morgan wrote:
>> >> > > Hi All,
>> >> > >
>> >> > > I was looking for a function to find a small matrix inside a larger
>> >> > matrix
>> >> > > in R similar to the one described in the following link:
>> >> > >
>> >> > >
>> >> >
>> >>
>> https://www.mathworks.com/matlabcentral/answers/194708-index-a-small-matrix-in-a-larger-matrix
>> >> > >
>> >> > > I couldn't find anything.
>> >> > >
>> >> > > The above function can be seen as a "generalisation" of the "which"
>> >> > > function as well as the function described in the following post:
>> >> > >
>> >> > >
>> >> >
>> >>
>> https://coolbutuseless.github.io/2018/04/03/finding-a-length-n-needle-in-a-haystack/
>> >> > >
>> >> > > Would be possible to add such a function to base R?
>> >> > >
>> >> > > I am happy to work with someone from the R core team (if you wish)
>> and
>> >> > > suggest an implementation in C.
>> >> >
>> >> > That seems like it would sometimes be a useful function, and maybe
>> >> > someone will point out a package that already contains it.  But if
>> not,
>> >> > why would it belong in base R?
>> >> >
>> >>
>> >> If someone has already implemented it, that would be great indeed. I think it
>> is
>> >> a
>> >> very general and basic function, hence base R could be a good place for
>> >> it?
>> >>
>> >> But this is probably not a good reason; maybe someone from the R core
>> team
>> >> can shed some light on how they decide whether or not to include a
>> >> function
>> >> in base R?
>> >>
>> >>
>> >> > Duncan Murdoch
>> >> >
>> >>
>> >> [[alternative HTML version deleted]]
>> >>
>> >> __
>> >> R-devel@r-project.org mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>> >>
>> >
>> >
>> > --
>> > Michael Lawrence
>> > Scientist, Bioinformatics and Computational Biology
>> > Genentech, A Member of the Roche Group
>> > Office +1 (650) 225-7760
>> > micha...@gene.com
>> >
>> > Join Genentech on LinkedIn | Twitter | Facebook | Instagram | YouTube
>> >
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> --
> Sent from Gmail Mobile
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] New matrix function

2019-10-11 Thread Morgan Morgan
I think you are confusing packages and functions here. Also, some of the R
Core packages that you mention contain functions that should probably be
replaced by better implementations from packages on CRAN.

Best regards
Morgan

On Fri, 11 Oct 2019 15:22 Joris Meys,  wrote:

>
>
> On Fri, Oct 11, 2019 at 3:55 PM Morgan Morgan 
> wrote:
>
>> How do you prove usefulness of a feature?
>> Do you have an example of a feature that has been added after proving to
>> be
>> useful in the package space first?
>>
>> Thank you,
>> Morgan
>>
>
> The parallel package (a base package like utils, stats, ...) was added as
> a drop-in replacement of the packages snow and multicore for parallel
> computing. That's one example, but sure there's more.
>
> Kind regards
> Joris
>
> --
> Joris Meys
> Statistical consultant
>
> Department of Data Analysis and Mathematical Modelling
> Ghent University
> Coupure Links 653, B-9000 Gent (Belgium)
>
> <https://maps.google.com/?q=Coupure+links+653,%C2%A0B-9000+Gent,%C2%A0Belgium&entry=gmail&source=g>
>
> ---
> Biowiskundedagen 2018-2019
> http://www.biowiskundedagen.ugent.be/
>
> ---
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] New matrix function

2019-10-11 Thread Morgan Morgan
How do you prove the usefulness of a feature?
Do you have an example of a feature that has been added after proving to be
useful in the package space first?

Thank you,
Morgan

On Fri, 11 Oct 2019 13:53 Michael Lawrence, 
wrote:

> Thanks for this interesting suggestion, Morgan. While there is no strict
> criteria for base R inclusion, one criterion relevant in this case is that
> the usefulness of a feature be proven in the package space first.
>
> Michael
>
>
> On Fri, Oct 11, 2019 at 5:19 AM Morgan Morgan 
> wrote:
>
>> On Fri, 11 Oct 2019 10:45 Duncan Murdoch, 
>> wrote:
>>
>> > On 11/10/2019 6:44 a.m., Morgan Morgan wrote:
>> > > Hi All,
>> > >
>> > > I was looking for a function to find a small matrix inside a larger
>> > matrix
>> > > in R similar to the one described in the following link:
>> > >
>> > >
>> >
>> https://www.mathworks.com/matlabcentral/answers/194708-index-a-small-matrix-in-a-larger-matrix
>> > >
>> > > I couldn't find anything.
>> > >
>> > > The above function can be seen as a "generalisation" of the "which"
>> > > function as well as the function described in the following post:
>> > >
>> > >
>> >
>> https://coolbutuseless.github.io/2018/04/03/finding-a-length-n-needle-in-a-haystack/
>> > >
>> > > Would be possible to add such a function to base R?
>> > >
>> > > I am happy to work with someone from the R core team (if you wish) and
>> > > suggest an implementation in C.
>> >
>> > That seems like it would sometimes be a useful function, and maybe
>> > someone will point out a package that already contains it.  But if not,
>> > why would it belong in base R?
>> >
>>
>> If someone has already implemented it, that would be great indeed. I think it is
>> a
>> very general and basic function, hence base R could be a good place for
>> it?
>>
>> But this is probably not a good reason; maybe someone from the R core team
>> can shed some light on how they decide whether or not to include a
>> function
>> in base R?
>>
>>
>> > Duncan Murdoch
>> >
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
>
> --
> Michael Lawrence
> Scientist, Bioinformatics and Computational Biology
> Genentech, A Member of the Roche Group
> Office +1 (650) 225-7760
> micha...@gene.com
>
> Join Genentech on LinkedIn | Twitter | Facebook | Instagram | YouTube
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] New matrix function

2019-10-11 Thread Morgan Morgan
On Fri, 11 Oct 2019 10:45 Duncan Murdoch,  wrote:

> On 11/10/2019 6:44 a.m., Morgan Morgan wrote:
> > Hi All,
> >
> > I was looking for a function to find a small matrix inside a larger
> matrix
> > in R similar to the one described in the following link:
> >
> >
> https://www.mathworks.com/matlabcentral/answers/194708-index-a-small-matrix-in-a-larger-matrix
> >
> > I couldn't find anything.
> >
> > The above function can be seen as a "generalisation" of the "which"
> > function as well as the function described in the following post:
> >
> >
> https://coolbutuseless.github.io/2018/04/03/finding-a-length-n-needle-in-a-haystack/
> >
> > Would be possible to add such a function to base R?
> >
> > I am happy to work with someone from the R core team (if you wish) and
> > suggest an implementation in C.
>
> That seems like it would sometimes be a useful function, and maybe
> someone will point out a package that already contains it.  But if not,
> why would it belong in base R?
>

If someone has already implemented it, that would be great indeed. I think it is a
very general and basic function, hence base R could be a good place for it?

But this is probably not a good reason; maybe someone from the R core team
can shed some light on how they decide whether or not to include a function
in base R?


> Duncan Murdoch
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] New matrix function

2019-10-11 Thread Morgan Morgan
Hi All,

I was looking for a function to find a small matrix inside a larger matrix
in R similar to the one described in the following link:

https://www.mathworks.com/matlabcentral/answers/194708-index-a-small-matrix-in-a-larger-matrix

I couldn't find anything.

The above function can be seen as a "generalisation" of the "which"
function as well as the function described in the following post:

https://coolbutuseless.github.io/2018/04/03/finding-a-length-n-needle-in-a-haystack/

Would be possible to add such a function to base R?

I am happy to work with someone from the R core team (if you wish) and
suggest an implementation in C.

Thank you
Best regards,
Morgan

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Evaluate part of an expression at C level

2019-09-27 Thread Morgan Morgan
Hi,

I am wondering if the below is possible?
Let's assume I have the following expression:

1:10 < 5

Is there a way at the R C API level to evaluate only the 5th element (i.e. 5
< 5) instead of evaluating the whole expression and then selecting the 5th
element of the logical vector?
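
(One workaround, shown here at the R level but possible with the same
pairlist surgery at C level via SETCADR()/lang3(): rewrite the unevaluated
call so that only the element of interest is passed to `<`. The left
operand is still evaluated, but the full vectorised comparison is avoided.)

e <- quote(1:10 < 5)
e[[2]] <- call("[", e[[2]], 5L)   # compare only the 5th element
eval(e)
## [1] FALSE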

Thank you
Best regards
Morgan

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Convert STRSXP or INTSXP to factor

2019-07-15 Thread Morgan Morgan
Hi,

Using the R C API, is there a way to convert a STRSXP or INTSXP to a
factor?

The idea would be to do in C something similar to the "factor" function
(example below):

> letters[1:5]
# [1] "a" "b" "c" "d" "e"

> factor(letters[1:5])
# [1] a b c d e
# Levels: a b c d e

There is the function setAttrib to set the levels of a SEXP; however, when
returned to R the object is of type character, not factor. Ideally, what I
would like to return from the C function is the same output as above when
the input is of type character.
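
The piece that is usually missing is the class attribute. A minimal sketch
(it assumes the integer codes are already valid 1-based indices into the
supplied levels, i.e. it does none of the matching/sorting that factor()
does; the wrapper name is made up):

as_factor_demo <- inline::cfunction(
  language = "C",
  sig = c(codes = "SEXP", levs = "SEXP"), body = "

SEXP ans = PROTECT(duplicate(codes));          /* 1-based integer codes */
setAttrib(ans, R_LevelsSymbol, levs);          /* character vector of levels */
setAttrib(ans, R_ClassSymbol, mkString(\"factor\"));
UNPROTECT(1);
return ans;

")

as_factor_demo(1:5, letters[1:5])
## [1] a b c d e
## Levels: a b c d e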

Please let me know if you need more information.
Thank you
Best regards
Morgan

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] R C API resize matrix

2019-06-17 Thread Morgan Morgan
Hi,

Is there a way to resize a matrix defined as follows:

SEXP a = PROTECT(allocMatrix(INTSXP, 10, 2));
int *pa = INTEGER(a);

To row = 5 and col = 1, or do I have to allocate a second matrix "b" with
pointer *pb and do a "for" loop to transfer the values of a to b?
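
(If allocate-and-copy is indeed the answer -- as far as I know a matrix
cannot be shrunk in place through the public API -- a minimal sketch,
relying on R's column-major storage:)

SEXP b = PROTECT(allocMatrix(INTSXP, 5, 1));
int *pb = INTEGER(b);
/* column-major: column 1 of 'a' is pa[0] .. pa[9], so its first 5 rows
   are simply pa[0] .. pa[4] */
for (int i = 0; i < 5; i++)
    pb[i] = pa[i];
/* ... use or return b ... */
UNPROTECT(1);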

Thank you
Best regards
Morgan

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Package inclusion in R core implementation

2019-03-01 Thread Morgan Morgan
Hi,

It sometimes happens that packages get included in R, like, for example,
the parallel package.

I was wondering if there is a process to decide whether or not to include a
package in the core implementation of R?

For example, why not include the Rcpp package, which has become for a lot of
users the main tool for extending R?

What is your view on the (not so well known) dotCall64 package, which is an
interesting alternative for extending R?

Thank you
Best regards,
Morgan

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Compiler + stopifnot bug

2019-01-03 Thread Martin Morgan
For what it's worth this also introduced

> df = data.frame(v = package_version("1.2"))
> rbind(df, df)$v
 [[1]]
 [1] 1 2

 [[2]]
 [1] 1 2

instead of

> rbind(df, df)$v
[1] '1.2' '1.2'

which shows up in Travis builds of Bioconductor packages

  https://stat.ethz.ch/pipermail/bioc-devel/2019-January/014506.html

and elsewhere

Martin Morgan

On 1/3/19, 7:05 PM, "R-devel on behalf of Duncan Murdoch" 
 wrote:

On 03/01/2019 3:37 p.m., Duncan Murdoch wrote:
> I see this too; by bisection, it seems to have first appeared in r72943.

Sorry, that was a typo.  I meant r75943.

Duncan Murdoch

> 
> Duncan Murdoch
> 
> On 03/01/2019 2:18 p.m., Iñaki Ucar wrote:
>> Hi,
>>
>> I found the following issue in r-devel (2019-01-02 r75945):
>>
>> `foo<-` <- function(x, value) {
>> bar(x) <- value * x
>> x
>> }
>>
>> `bar<-` <- function(x, value) {
>> stopifnot(all(value / x == 1))
>> x + value
>> }
>>
>> `foo<-` <- compiler::cmpfun(`foo<-`)
>> `bar<-` <- compiler::cmpfun(`bar<-`)
>>
>> x <- c(2, 2)
>> foo(x) <- 1
>> x # should be c(4, 4)
>> #> [1] 3 3
>>
>> If the functions are not compiled or the stopifnot call is removed,
>> the snippet works correctly. So it seems that something is messing
>> around with the references to "value" when the call to stopifnot gets
>> compiled, and the wrong "value" is modified. Note also that if "x <-
>> 2", then the result is correct, 4.
>>
>> Regards,
>>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] length of `...`

2018-05-03 Thread Martin Morgan

nargs() provides the number of arguments without evaluating them

> f = function(x, ..., y) nargs()
> f()
[1] 0
> f(a=1, b=2)
[1] 2
> f(1, a=1, b=2)
[1] 3
> f(x=1, a=1, b=2)
[1] 3
> f(stop())
[1] 1


On 05/03/2018 11:01 AM, William Dunlap via R-devel wrote:

In R-3.5.0 you can use ...length():
   > f <- function(..., n) ...length()
   > f(stop("one"), stop("two"), stop("three"), n=7)
   [1] 3

Prior to that substitute() is the way to go
   > g <- function(..., n) length(substitute(...()))
   > g(stop("one"), stop("two"), stop("three"), n=7)
   [1] 3

R-3.5.0 also has the ...elt(n) function, which returns
the evaluated n'th entry in ... , without evaluating the
other ... entries.
   > fn <- function(..., n) ...elt(n)
   > fn(stop("one"), 3*5, stop("three"), n=2)
   [1] 15

Prior to 3.5.0, eval the appropriate component of the output
of substitute() in the appropriate environment:
   > gn <- function(..., n) {
   +   nthExpr <- substitute(...())[[n]]
   +   eval(nthExpr, envir=parent.frame())
   + }
   > gn(stop("one"), environment(), stop("two"), n=2)
   




Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, May 3, 2018 at 7:29 AM, Dénes Tóth  wrote:


Hi,


In some cases the number of arguments passed as ... must be determined
inside a function, without evaluating the arguments themselves. I use the
following construct:

dotlength <- function(...) length(substitute(expression(...))) - 1L

# Usage (returns 3):
dotlength(1, 4, something = undefined)

How can I define a method for length() which could be called directly on
`...`? Or is there an intention to extend the base length() function to accept
ellipses?


Regards,
Denes

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




This email message may contain legally privileged and/or...{{dropped:2}}

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Martin Morgan


On 05/03/2018 05:48 AM, Joris Meys wrote:

Dear all,

I've been diving a bit deeper into this per request of Tomas Kalibra, and
found the following :

- the lock on the file appears only after trying to read it using oligo, so
that's not an R problem in itself. The problem is independent of external
packages.

- using Windows' fc utility and cygwin's cmp utility I found out that every
so often the download.file() function inserts an extra byte. There's no
real obvious pattern in how these bytes are added, but the file downloaded
using download.file() is actually larger (in this case by about 8 kb). The
file xxx_inR.CEL.gz is read in using:


I believe the difference between mode = "w" and "wb", and the reason this is
restricted to Windows downloads, is due to the difference in text-file
line endings: with mode = "w", download.file() (like many other
utilities outside R) writes "foo\n" as "foo\r\n". Obviously this
messes up binary files.


I guess in the CEL.gz file there are about 8k "\n" characters.
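
A quick way to check that guess against the manually downloaded copy (file
name taken from the listing below):

f <- "GSM907854_direct.CEL.gz"
raw_ok <- readBin(f, what = "raw", n = file.size(f))
sum(raw_ok == as.raw(0x0a))   # if the CRLF explanation is right, this should
                              # match the 8121-byte difference reported below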

Henrik's suggestion (default = "wb") would introduce the complementary 
problem -- text files would have incorrect line endings.


Martin





setwd("E:/Temp/genexpr/Compare")
id <- "GSM907854"
flink <- paste0("
https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSM907854&format=file&file=GSM907854%2ECEL%2Egz
")
fname <- paste0(id,"_inR.CEL.gz")
download.file(flink,
   destfile = fname)

The file xxx_direct.CEL.gz is downloaded from
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907854 (download link
at the bottom of the page).

Output of dir in CMD:

05/03/2018  11:02 AM 4,529,547 GSM907854_direct.CEL.gz
05/03/2018  11:17 AM 4,537,668 GSM907854_inR.CEL.gz

or from R :


diff(file.size(dir())) # contains both CEL files.

[1] 8121

Strangely enough I get the following message from download.file() :

Content type 'application/octet-stream' length 4529547 bytes (4.3 MB)
downloaded 4.3 MB

So the reported length is exactly the same as if I had downloaded the file
directly, but the file on disk itself is larger. So it seems
download.file() is adding bytes when saving the data to disk. This
behaviour is independent of antivirus and/or firewalls being turned on or off.

Also keep in mind that these are NOT standard gzipped files. These files
are a specific format for Affymetrix Human Gene 1.0 ST Arrays.

If I need to run other tests, please let me know.
Kind regards

Joris

On Wed, May 2, 2018 at 9:21 PM, Joris Meys  wrote:


Dear all,

I've noticed by trying to download gz files from here :
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907811

At the bottom one can download GSM907811.CEL.gz . If I download this
manually and try

oligo::read.celfiles("GSM907811.CEL.gz")

everything works fine. (oligo is a bioConductor package)

However, if I download using

download.file("https://www.ncbi.nlm.nih.gov/geo/download/
?acc=GSM907811&format=file&file=GSM907811%2ECEL%2Egz",
   destfile = "GSM907811.CEL.gz")

The file is downloaded, but oligo::read.celfiles() returns the following
error:

Error in checkChipTypes(filenames, verbose, "affymetrix", TRUE) :
   End of gz file reached unexpectedly. Perhaps this file is truncated.

Moreover, if I try to delete it after using download.file(), I get a
warning that permission is denied. I can only remove it using Windows file
explorer after I closed the R session, indicating that the connection is
still open. Yet, showConnections() doesn't show any open connections either.

Session info below. Note that I started from a completely fresh R session.
oligo is needed due to the specific file format of these gz files. They're
not standard tarred files.

Cheers
Joris

Session Info

-

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C

[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats4parallel  stats graphics  grDevices utils datasets
methods
[9] base

other attached packages:
  [1] pd.hugene.1.0.st.v1_3.14.1 DBI_0.8
oligo_1.44.0
  [4] Biobase_2.39.2 oligoClasses_1.42.0
RSQLite_2.1.0
  [7] Biostrings_2.48.0  XVector_0.19.9
IRanges_2.13.28
[10] S4Vectors_0.17.42  BiocGenerics_0.25.3

loaded via a namespace (and not attached):
  [1] Rcpp_0.12.16compiler_3.5.0
  [3] BiocInstaller_1.30.0GenomeInfoDb_1.15.5
  [5] bitops_1.0-6iterators_1.0.9
  [7] tools_3.5.0 zlibbioc_1.25.0
  [9] digest_0.6.15   bit_1.1-12
[11] memoise_1.1.0   preprocessCore_1.41.0
[13] lattice_0.20-35 ff_2.2-13
[15] pkgconfig_2.0.1 Matrix_1.2-14
[17] foreach_1.4.4   DelayedArray_0.5.31

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Martin Morgan



On 05/02/2018 03:21 PM, Joris Meys wrote:

Dear all,

I've noticed by trying to download gz files from here :
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907811

At the bottom one can download GSM907811.CEL.gz . If I download this
manually and try

oligo::read.celfiles("GSM907811.CEL.gz")

everything works fine. (oligo is a bioConductor package)

However, if I download using

download.file("
https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSM907811&format=file&file=GSM907811%2ECEL%2Egz
",
   destfile = "GSM907811.CEL.gz")


On Windows, the 'mode' argument to download.file() needs to be "wb"
(write binary) for binary files.
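
i.e., adapting the call quoted above, something like:

download.file("https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSM907811&format=file&file=GSM907811%2ECEL%2Egz",
              destfile = "GSM907811.CEL.gz", mode = "wb")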


Martin



The file is downloaded, but oligo::read.celfiles() returns the following
error:

Error in checkChipTypes(filenames, verbose, "affymetrix", TRUE) :
   End of gz file reached unexpectedly. Perhaps this file is truncated.

Moreover, if I try to delete it after using download.file(), I get a
warning that permission is denied. I can only remove it using Windows file
explorer after I closed the R session, indicating that the connection is
still open. Yet, showConnections() doesn't show any open connections either.

Session info below. Note that I started from a completely fresh R session.
oligo is needed due to the specific file format of these gz files. They're
not standard tarred files.

Cheers
Joris

Session Info
-

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats4parallel  stats graphics  grDevices utils datasets
methods
[9] base

other attached packages:
  [1] pd.hugene.1.0.st.v1_3.14.1 DBI_0.8
oligo_1.44.0
  [4] Biobase_2.39.2 oligoClasses_1.42.0
RSQLite_2.1.0
  [7] Biostrings_2.48.0  XVector_0.19.9
IRanges_2.13.28
[10] S4Vectors_0.17.42  BiocGenerics_0.25.3

loaded via a namespace (and not attached):
  [1] Rcpp_0.12.16compiler_3.5.0
  [3] BiocInstaller_1.30.0GenomeInfoDb_1.15.5
  [5] bitops_1.0-6iterators_1.0.9
  [7] tools_3.5.0 zlibbioc_1.25.0
  [9] digest_0.6.15   bit_1.1-12
[11] memoise_1.1.0   preprocessCore_1.41.0
[13] lattice_0.20-35 ff_2.2-13
[15] pkgconfig_2.0.1 Matrix_1.2-14
[17] foreach_1.4.4   DelayedArray_0.5.31
[19] yaml_2.1.18 GenomeInfoDbData_1.1.0
[21] affxparser_1.52.0   bit64_0.9-7
[23] grid_3.5.0  BiocParallel_1.13.3
[25] blob_1.1.1  codetools_0.2-15
[27] matrixStats_0.53.1  GenomicRanges_1.31.23
[29] splines_3.5.0   SummarizedExperiment_1.9.17
[31] RCurl_1.95-4.10 affyio_1.49.2





This email message may contain legally privileged and/or...{{dropped:2}}

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Why R should never move to git

2018-01-25 Thread Martin Morgan

On 01/25/2018 07:09 AM, Duncan Murdoch wrote:

On 25/01/2018 6:49 AM, Dirk Eddelbuettel wrote:


On 25 January 2018 at 06:20, Duncan Murdoch wrote:
| On 25/01/2018 2:57 AM, Iñaki Úcar wrote:
| > For what it's worth, this is my workflow:
| >
| > 1. Get a fork.
| > 2. From the master branch, create a new branch called 
fix-[something].

| > 3. Put together the stuff there, commit, push and open a PR.
| > 4. Checkout master and repeat from 2 to submit another patch.
| >
| > Sometimes, I forget the step of creating the new branch and I put my
| > fix on top of the master branch, which complicates things a bit. But
| > you can always rename your fork's master and pull it again from
| > upstream.
|
| I saw no way to follow your renaming suggestion.  Can you tell me the
| steps it would take?  Remember, there's already a PR from the master
| branch on my fork.  (This is for future reference; I already followed
| Gabor's more complicated instructions and have solved the immediate
| problem.)

1)  Via GUI: fork or clone at github so that you have URL to use in 2)


Github would not allow me to fork, because I already had a fork of the 
same repository.  I suppose I could have set up a new user and done it.


I don't know if cloning the original would have made a difference. I 
don't have permission to commit to the original, and the 
manipulateWidget maintainers wouldn't be able to see my private clone, 
so I don't see how I could create a PR that they could use.


Once again, let me repeat:  this should be an easy thing to do.  So far 
I'm pretty convinced that it's actually impossible to do it on the 
Github website without hacks like creating a new user.  It's not trivial 
but not that difficult for a git expert using command line git.


If R Core chose to switch the R sources to use git and used Github to 
host a copy, problems like mine would come up fairly regularly.  I don't 
think R Core would gain enough from the switch to compensate for the 
burden of dealing with these problems.


A different starting point gives R-core members write access to the 
R-core git, which is analogous to the current svn setup. A restricted 
set of commands are needed, mimicking svn


  git clone ...   # svn co
  git pull# svn up
  [...; git commit ...]
  git push ...# svn ci

Probably this would mature quickly into a better practice where new 
features / bug fixes are developed on a local branch.


A subset of R-core might participate in managing pull requests on a 
'read only' Github mirror. Incorporating mature patches would involve 
git, rather than the Github GUI. In one's local repository, create a new 
branch and pull from the repository making the request


  git checkout -b a-pull-request master
  git pull https://github.com/a-user/their.git their-branch

Check and modify, then merge locally and push to the R-core git

  ## identify standard / best practice for merging branches
  git checkout master
  git merge ... a-pull-request
  git push ...

Creating pull requests is a problem for the developer wanting to 
contribute to R, not for the R-core developer. As we've seen in this 
thread, R-core would not need to feel responsible for helping developers 
create pull requests.


Martin Morgan



Maybe Gitlab or some other front end would be better.

Duncan Murdoch



2)  Run
   git clone giturl
 to fetch local instance
3)  Run
   git checkout -b feature/new_thing_a
 (this is 2. above by Inaki)
4)  Edit, save, compile, test, revise, ... leading to 1 or more commits

5)  Run
   git push origin
 standard configuration should have remote branch follow local 
branch, I

 think the "long form" is
   git push --set-upstream origin feature/new_thing_a

6)  Run
   git checkout -
 or
   git checkout master
 and you are back in master. Now you can restart at my 3) above for
 branches b, c, d and create independent pull requests

I find it really useful to have a bash prompt that shows the branch:

 edd@rob:~$ cd git/rcpp
 edd@rob:~/git/rcpp(master)$ git checkout -b 
feature/new_branch_to_show

 Switched to a new branch 'feature/new_branch_to_show'
 edd@rob:~/git/rcpp(feature/new_branch_to_show)$ git checkout -
 Switched to branch 'master'
 Your branch is up-to-date with 'origin/master'.
 edd@rob:~/git/rcpp(master)$ git branch -d feature/new_branch_to_show
 Deleted branch feature/new_branch_to_show (was 5b25fe62).
 edd@rob:~/git/rcpp(master)$

There are a few tutorials out there about how to do it; I once got mine from
Karthik when we did a Software Carpentry workshop.  Happy to detail off-list;
it adds less than 10 lines to ~/.bashrc.

Dirk

|
| Duncan Murdoch
|
| > Iñaki
| >
| >
| >
| > 2018-01-25 0:17 GMT+01:00 Duncan Murdoch :
| >> Lately I've been doing so

Re: [Rd] How to address the following: CRAN packages not using Suggests conditionally

2018-01-22 Thread Martin Morgan

On 01/22/2018 08:40 AM, Ulrich Bodenhofer wrote:
Thanks a lot, Iñaki, this is a perfect solution! I already implemented 
it and it works great. I'll wait for 2 more days before I submit the 
revised package to CRAN - in order to give others a chance to comment on it.


It's very easy for 'pictures of code' (unevaluated code chunks in 
vignettes) to drift from the actual implementation. So I'd really 
encourage your conditional evaluation to be as narrow as possible -- 
during CRAN or even CRAN fedora checks. Certainly trying to use 
uninstalled Suggest'ed packages in vignettes should provide an error 
message that is informative to users. Presumably the developer or user 
intends actually to execute the code, and needs to struggle through 
whatever issues come up. I'm not sure whether my comments are consistent 
with Writing R Extensions or not.
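
For example, a knitr-style sketch of the narrow-conditioning idea (the chunk
option is spelled out in a comment; adapt to whichever vignette engine is
actually used):

has_kebabs <- requireNamespace("kebabs", quietly = TRUE)
## then gate each kebabs-dependent chunk with the chunk option: eval = has_kebabs
## so only those chunks are skipped when the suggested package is unavailable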


There is a fundamental tension between the CRAN and Bioconductor release 
models. The Bioconductor 'devel' package repositories and nightly builds 
are meant to be a place where new features and breaking changes can be 
introduced and problems resolved before being exposed to general users 
as a stable 'release' branch, once every six months. This means that the 
Bioconductor devel branch periodically (as it has recently, and I suspect will
over the next several days) contains considerable carnage that propagates to
CRAN devel builds, creating additional work for CRAN maintainers.


Martin Morgan
Bioconductor



Best regards,
Ulrich


On 01/22/2018 10:16 AM, Iñaki Úcar wrote:
Re-sending, since I forgot to include the list, sorry. I'm including 
r-package-devel too this time, as it seems more appropriate for this 
list.



On 22 Jan 2018 at 10:11, "Iñaki Úcar" <i.uca...@gmail.com> wrote:




    On 22 Jan 2018 at 8:12, "Ulrich Bodenhofer"
    <bodenho...@bioinf.jku.at> wrote:


    Dear colleagues, dear members of the R Core Team,

    This was an issue raised by Prof. Brian Ripley and sent
    privately to all developers of CRAN packages that suggest
    Bioconductor packages (see original message below). As
    mentioned in my message enclosed below, it was easy for me to
    fix the error in examples (new version not submitted to CRAN
    yet), but it might turn into a major effort for the warnings
    raised by the package vignette. Since I have not gotten any
    advice yet, I take the liberty to post it here on this list -
    hoping that we reach a conclusion here on how to deal with this
    matter.


    Just disable code chunk evaluation if suggested packages are
    missing (see [1]). As explained by Prof. Ripley, it will only
    affect Fedora checks on r-devel, i.e., your users will still see
    fully evaluated vignettes on CRAN.

    [1] https://www.enchufa2.es/archives/suggests-and-vignettes.html
    <https://www.enchufa2.es/archives/suggests-and-vignettes.html>

    Iñaki


    Thanks in advance for your kind assistance,
    Ulrich Bodenhofer



     Forwarded Message 
    Subject:        Re: CRAN packages not using Suggests 
conditionally

    Date:   Mon, 15 Jan 2018 08:44:40 +0100
    From:   Ulrich Bodenhofer <bodenho...@bioinf.jku.at>
    To:     Prof Brian Ripley <rip...@stats.ox.ac.uk>
    CC:     [...stripped for the sake of privacy ...]



    Dear Prof. Ripley,

    Thank you very much for bringing this important issue to my
    attention. I
    am the maintainer of the 'apcluster' package. My package 
refers to

    'Biostrings' in an example section of a help page (a quite
    insignificant
    one, by the way), which creates errors on some platforms. It
    also refers
    to 'kebabs' in the package vignette, which leads to warnings.

    I could fix the first, more severe, problem quite easily, (1)
    since it
    is relatively easy to wrap an entire examples section in a
    conditional,
    and (2), as I have mentioned, it is not a particularly
    important help page.
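
    (That kind of guard looks roughly like this -- a sketch only, not my
    actual change; the particular Biostrings call is just an illustration:)

        ## inside the help page's \examples{} section
        if (requireNamespace("Biostrings", quietly = TRUE)) {
            library(Biostrings)
            aa <- AAString("MARTIN")  # any small use of the Suggested package
            print(aa)
        }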

    Regarding the vignette, I want to ask for your advice now,
    since the
    situation appears more complicated to me. While it is, of
    course, only
    one code chunk that loads the 'kebabs' package, five more code
    chunks
    depend on the package (more specifically, the data objects
    created by a
    method implemented in the package) - with quite some text in
    between. So
    the handling of the conditional loading of the package would
    propagate
    to multiple code chunks and also affect the validity of the
    explanations
    in between. I would see the following options:

    1. Remove the entire section of the vignette. That would be a
    pity,
    sin

Re: [Rd] R CMD check warning about compiler warning flags

2017-12-21 Thread Martin Morgan

On 12/21/2017 01:02 PM, Winston Chang wrote:

On recent builds of R-devel, R CMD check gives a WARNING when some
compiler warning flags are detected, such as -Werror, because they are
non-portable. This appears to have been added in this commit:
https://github.com/wch/r-source/commit/2e80059


That is not the canonical R sources.


Yes, that is obvious. The main page for that repository says it is a
mirror of the R sources, right at the top. I know that because I put
the message there, and because I see it every time I visit the
repository. If you have a good way of pointing people to the changes
made in a commit with the canonical R sources, please let us know. I
and many others would be happy to use it.


In case 'pointing to' does not mean exclusively 'pointing a mouse at', 
'a good way' can include typing at the console and living with the 
merits and demerits of svn, and the question is not 
rhetorical (probably FALSE on all accounts, but one never knows...).


Check out or update the source (linux, mac, or Windows)

  svn co https://svn.r-project.org/R/trunk R-devel
  cd R-devel
  svn up

browse the commit history

  svn log | less

and review the change

  svn diff -c73909

Restrict by specifying a path

  svn diff -c73909 src/library/tools/R/check.R

(I don't think one gets finer resolution, other than referencing the 
line number in the diff)


View a range of revisions, e.g.,

  svn diff -r73908:73909

And find commits associated with lines of code

  svn annotate doc/manual/R-exts.texi | less

A quick Google search (svn diff visual display) led me to

  svn diff --diff-cmd meld -c73909

for my platform, which pops up the diffs in a visual context.

Martin Morgan




And your description seems wrong:
there is now an _optional_ check controlled by an environment variable,
primarily for CRAN checks.


The check is "optional", but not for packages submitted to CRAN.



I'm working on a package where these compiler warning flags are
present in a Makefile generated by a configure script -- that is, the
configure script detects whether the compiler supports these flags,
and if so, puts them in the Makefile. (The configure script is for a
third-party C library which is in a subdirectory of src/.)

Because the flags are added only if the system supports them, there
shouldn't be any worries about portability in practice.



Please read the explanation in the manual: there are serious concerns about
such flags which have bitten CRAN users several times.

To take your example, you cannot know what -Werror does on all compilers
(past, present or future) where it is supported (and -W flags do do
different things on different compilers).  On current gcc it does

-Werror
Make all warnings into errors.

and so its effect depends on what other flags are used (people typically use
-Wall, and most new versions of both gcc and clang add more warnings to
-Wall -- I read this week exactly such a discussion about the interaction of
-Werror with -Wtautological-constant-compare as part of -Wall in clang
trunk).


Is there a way to get R CMD check to not raise warnings in cases like
this? I know I could modify the C library's configure.ac (which is
used to generate the configure script) but I'd prefer to leave the
library's code untouched if possible.


You don't need to (and most likely should not) use the C[XX]FLAGS it
generates ... just use the flags which R passes to the package.


It turns out that there isn't even a risk of these compiler flags
being used -- I learned from one of my colleagues that the troublesome
compiler flags, like -Werror, never actually appear in the Makefile.
The configure script prints those compiler flags out when it
checks for them, but in the end it creates a Makefile with the CFLAGS
inherited from R. So there's no chance that the library would be
compiled using those flags (unless R passed them along).

His suggested workaround is to silence the output of the configure
script. That also hides some useful information, but it does work for
this issue.

-Winston

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel





__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Are Rprintf and REprintf thread-safe?

2017-11-21 Thread Martin Morgan

On 11/21/2017 04:12 PM, Winston Chang wrote:

Thanks - I'll find another way to send messages to the main thread for printing.


The CRAN synchronicity and Bioconductor BiocParallel packages provide 
inter-process locks that you could use to surround writes (instead of 
sending messages to the main thread); this is also easy enough to 
incorporate at the C level using the BH package as a source for the 
relevant boost headers.
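
A rough sketch of the first idea, assuming BiocParallel's inter-process
lock helpers (ipcid(), ipclock(), ipcunlock(), ipcremove()):

  library(BiocParallel)
  id <- ipcid()                          # lock id shared by all workers
  res <- bplapply(1:4, function(i, id) {
      BiocParallel::ipclock(id)          # serialize access to the output stream
      message("worker ", i, " reporting")
      BiocParallel::ipcunlock(id)
      i
  }, id = id)
  ipcremove(id)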


Martin



-Winston

On Tue, Nov 21, 2017 at 12:42 PM,   wrote:

On Tue, 21 Nov 2017, Winston Chang wrote:


Is it safe to call Rprintf and REprintf from a background thread? I'm
working on a package that makes calls to fprintf(stderr, ...) on a
background thread when errors happen, but when I run R CMD check, it
says:

  Compiled code should not call entry points which might terminate R nor
  write to stdout/stderr instead of to the console, nor the system RNG.

Is it safe to replace these calls with REprintf()?



Only if you enjoy race conditions or segfaults.

Rprintf and REprintf are not thread-safe.

Best,

luke




-Winston

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa  Phone: 319-335-3386
Department of Statistics andFax:   319-335-3017
Actuarial Science
241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel





__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [PATCH] Fix bad free in connections

2017-07-21 Thread Martin Morgan

On 07/20/2017 05:04 PM, Steve Grubb wrote:

Hello,

There are times when b points to buf which is a stack variable. This
leads to a bad free. The current test actually guarantees the stack
will try to get freed. Simplest to just drop the variable and directly
test if b should get freed.


Signed-off-by: Steve Grubb 


Index: src/main/connections.c
===
--- src/main/connections.c  (revision 72935)
+++ src/main/connections.c  (working copy)
@@ -421,7 +421,6 @@
  char buf[BUFSIZE], *b = buf;
  int res;
  const void *vmax = NULL; /* -Wall*/
-int usedVasprintf = FALSE;
  va_list aq;
  
  va_copy(aq, ap);

@@ -434,7 +433,7 @@
b = buf;
buf[BUFSIZE-1] = '\0';
warning(_("printing of extremely long output is truncated"));
-   } else usedVasprintf = TRUE;
+   }
  }
  #else
  if(res >= BUFSIZE) { /* res is the desired output length */
@@ -481,7 +480,7 @@
  } else
con->write(b, 1, res, con);
  if(vmax) vmaxset(vmax);
-if(usedVasprintf) free(b);
+if(b != buf) free(b);


The code can be exercised with

  z = paste(rep("a", 11000), collapse="")
  f = fifo("foo", "w+")
  writeLines(z, f)

If the macro HAVE_VASPRINTF is not defined, then b is the result of 
R_alloc(), and it is not appropriate to free(b).


If the macro is defined we go through

res = vasprintf(&b, format, ap);
if (res < 0) {
b = buf;
buf[BUFSIZE-1] = '\0';
warning(_("printing of extremely long output is truncated"));
} else usedVasprintf = TRUE;

b gets reallocated when

res = vasprintf(&b, format, ap);

is successful and res >= 0. usedVasprintf is then set to TRUE, and 
free(b) called.


It seems like the code is correct as written?

Martin Morgan (the real other Martin M*)


  return res;
  }

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel





__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [PATCH] Fix status in main

2017-07-21 Thread Martin Morgan

On 07/20/2017 05:31 PM, Steve Grubb wrote:

Hello,

This is a patch to fix what appears to be a simple typo. The warning says
"invalid status assuming 0", but then instead sets runLast to 0.

Signed-of-by: Steve Grubb 


fixed in 72938 / 39.

This seemed not to have consequences, since exit() reports NA & 0377 
(i.e., 0) and the incorrect assignment to runLast is immediately 
over-written by the correct value.


Martin Morgan



Index: src/main/main.c
===
--- src/main/main.c (revision 72935)
+++ src/main/main.c (working copy)
@@ -1341,7 +1341,7 @@
  status = asInteger(CADR(args));
  if (status == NA_INTEGER) {
warning(_("invalid 'status', 0 assumed"));
-   runLast = 0;
+   status = 0;
  }
  runLast = asLogical(CADDR(args));
  if (runLast == NA_LOGICAL) {

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel





__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [PATCH] Fix missing break

2017-07-21 Thread Martin Morgan

On 07/20/2017 05:02 PM, Steve Grubb wrote:

Hello,

There appears to be a break missing in the switch/case for the LISTSXP case.
If this is supposed to fall through, I'd suggest a comment so that others
know its by design.

Signed-off-by: Steve Grubb 


An example is

$ R --vanilla -e "pl = pairlist(1, 2); length(pl) = 1; pl"
> pl = pairlist(1, 2); length(pl) = 1; pl
Error in length(pl) = 1 :
  SET_VECTOR_ELT() can only be applied to a 'list', not a 'pairlist'
Execution halted

fixed in r72936 (R-devel) / 72937 (R-3-4-branch).

Martin Morgan



Index: src/main/builtin.c
===
--- src/main/builtin.c  (revision 72935)
+++ src/main/builtin.c  (working copy)
@@ -888,6 +888,7 @@
SETCAR(t, CAR(x));
SET_TAG(t, TAG(x));
}
+   break;
  case VECSXP:
for (i = 0; i < len; i++)
if (i < lenx) {

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel





__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Question about R development

2017-06-12 Thread Morgan
Thank you all for these explanations.
Kind regards,
Morgan

On 11 Jun 2017 02:47, "Duncan Murdoch"  wrote:

> On 10/06/2017 6:09 PM, Duncan Murdoch wrote:
>
>> On 10/06/2017 2:38 PM, Morgan wrote:
>>
>>> Hi,
>>>
>>> I had a question that might not seem obvious to me.
>>>
>>> I was wondering why there was no partnership between Microsoft, the R core
>>> team, and eventually other developers to improve R in one unified version
>>> instead of having different teams developing their own version of R.
>>>
>>
>> As far as I know, there's only one version of R currently being
>> developed.  Microsoft doesn't offer anything different; they just offer
>> a build of a slightly older version of base R, and a few packages that
>> are not in the base version.
>>
>
> Actually, I think my first sentence above is wrong.  Besides the base R
> that the core R team works on, there are a few other implementations of the
> language:  pqR, for instance.  But as others have said, the Microsoft
> product is simply a repackaging of the core R, so my second sentence is
> right.
>
> Duncan Murdoch
>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Question about R development

2017-06-10 Thread Morgan
Hi,

I had a question that might not seem obvious to me.

I was wondering why there was no partnership between Microsoft, the R core
team, and eventually other developers to improve R in one unified version
instead of having different teams developing their own version of R.

Is it because they don't want to team up? Is it because you don't want to? Any
particular reasons? Different philosophies?

Thank you
Kind regards
Morgan

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R libcurl does not recognize server certs

2017-03-27 Thread Martin Morgan

On 03/27/2017 03:09 PM, Roman, John wrote:

Dirk,
ive changed the subject given the nature of the present debugging.  Im aware i 
can extend extras from download.file to install.packages however
im curious to know why libcurl in the R invocation does not honor the CA bundle 
on my system.

how would I pass a CA bundle to install.packages?  the function has numerous 
arguments before the extras are taken.



A little shot-in-the-dark but on Linux I have

$ curl-config --ca
/etc/ssl/certs/ca-certificates.crt

and in R ?download.file I'm told (the documentation may read as 
Windows-specific, but I don't think that's the case)


 set environment variable 'CURL_CA_BUNDLE' to the path to a
 certificate bundle file, usually named 'ca-bundle.crt' or
 'curl-ca-bundle.crt'.  (This is normally done for a binary

So if I were having trouble I might say (or set the environment variable 
in some other way, e.g., as part of an alias to R)


> Sys.setenv(CURL_CA_BUNDLE="/etc/ssl/certs/ca-certificates.crt")
> download.file("https://...", tempfile())

Maybe with more info about your OS and R installation a more transparent 
solution would offer itself; I'd guess that the bundle location is 
inferred when R is built from source, and somehow there has been a 
disconnect between your R installation and certificate location, e.g., 
moving the certificate location after R installation.
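
(A more persistent variant of the same idea, if the environment-variable
route works for you: put the setting in ~/.Renviron so every session
picks it up.)

  ## in ~/.Renviron -- read at startup, before any download is attempted
  CURL_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt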


Martin Morgan


John Roman
Linux System Administrator
RAND Corporation
joro...@rand.org
X7302


From: Dirk Eddelbuettel [dirk.eddelbuet...@gmail.com] on behalf of Dirk 
Eddelbuettel [e...@debian.org]
Sent: Monday, March 27, 2017 11:33 AM
To: Roman, John
Cc: Dirk Eddelbuettel; R-devel@r-project.org
Subject: RE: [Rd] R fails to read repo index on NGINX

On 27 March 2017 at 18:27, Roman, John wrote:
| Thank you for your elaboration.  This issue is related to curl trusting a CA 
cert as its called by R.
| curl called from bash recognizes the system cert bundle for CA's, curl called 
from R does not.
|
| may I know how to trust the system certificate bundle from within R?

See 'help(download.file)' -- it's a little hidden but you can just make the
external curl (which, as you say, works in your particular circumstances) the
default for remote file access from R too.

Next time please try to be a little more specific with your questions and
their subject line.  Methinks nothing here has anything to do with the httpd
server you employ.

Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org


__



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] different compilers and mzR build fails

2016-12-21 Thread Martin Morgan

On 12/21/2016 01:56 PM, lejeczek wrote:


I do this on a vanilla-clean R installation, simply:

biocLite("mzR")

it pulls in some deps which compile fine; only mzR fails.
... meanwhile ...
I grabbed devtools and compiled the GitHub master - it still fails.
Should I attach the build log? One should not send attachments to the list,
I don't suppose?


My opinion is that the appropriate forum is the Bioconductor support 
site. I think you should EDIT your question on the Bioconductor support 
site to add the compiler output. If you feel like you can spot where 
things are going wrong, then edit it to include those parts; otherwise 
post the output in its entirety. The support site can mangle formatting, 
so I'd copy-and-paste the compiler output, and then select it and format 
it as 'code'.


If you feel that the current forum is more appropriate, then 
cut-and-paste the compiler output into an email message, avoiding 
attachments.


Martin



On 21/12/16 17:06, Martin Morgan wrote:

mzR is a Bioconductor package, so better to ask on the Bioconductor
support forum

  https://support.bioconductor.org

Oh, I see you did, and then the advice is to avoid cross-posting!

The missing .o files would have been produced in an earlier
compilation step; they likely failed in some way, so you need to
provide the complete compilation output.

Did you do this on a version of the package that did not have any
previous build artifacts (e.g., via biocLite() or from a fresh svn
checkout)?

Martin

On 12/21/2016 12:00 PM, lejeczek via R-devel wrote:

I'm not sure if I should bother your team with this; apologies in case
it's a bother.

I'm trying gcc 6.2.1 (from devtoolset-6) with R, everything seems to
work just fine, except for mzR.
Here is failed build:

g++ -m64 -shared -L/usr/lib64/R/lib -Wl,-z,relro -o mzR.so cramp.o
ramp_base64.o ramp.o RcppRamp.o RcppRampModule.o rnetCDF.o RcppPwiz.o
RcppPwizModule.o RcppIdent.o RcppIdentModule.o
./boost/system/src/error_code.o ./boost/regex/src/posix_api.o
./boost/regex/src/fileiter.o ./boost/regex/src/regex_raw_buffer.o
./boost/regex/src/cregex.o ./boost/regex/src/regex_debug.o
./boost/regex/src/instances.o ./boost/regex/src/icu.o
./boost/regex/src/usinstances.o ./boost/regex/src/regex.o
./boost/regex/src/wide_posix_api.o
./boost/regex/src/regex_traits_defaults.o ./boost/regex/src/winstances.o
./boost/regex/src/wc_regex_traits.o ./boost/regex/src/c_regex_traits.o
./boost/regex/src/cpp_regex_traits.o ./boost/regex/src/static_mutex.o
./boost/regex/src/w32_regex_traits.o ./boost/iostreams/src/zlib.o
./boost/iostreams/src/file_descriptor.o ./boost/thread/pthread/once.o
./boost/thread/pthread/thread.o ./boost/filesystem/src/operations.o
./boost/filesystem/src/path.o
./boost/filesystem/src/utf8_codecvt_facet.o ./boost/chrono/src/chrono.o
./boost/chrono/src/process_cpu_clocks.o
./boost/chrono/src/thread_clock.o ./pwiz/data/msdata/Version.o
./pwiz/data/common/MemoryIndex.o ./pwiz/data/common/CVTranslator.o
./pwiz/data/common/cv.o ./pwiz/data/common/ParamTypes.o
./pwiz/data/common/BinaryIndexStream.o ./pwiz/data/common/diff_std.o
./pwiz/data/common/Unimod.o ./pwiz/data/msdata/SpectrumList_MGF.o
./pwiz/data/msdata/DefaultReaderList.o
./pwiz/data/msdata/ChromatogramList_mzML.o ./pwiz/data/msdata/examples.o
./pwiz/data/msdata/Serializer_mzML.o ./pwiz/data/msdata/Serializer_MSn.o
./pwiz/data/msdata/Reader.o ./pwiz/data/msdata/Serializer_MGF.o
./pwiz/data/msdata/Serializer_mzXML.o
./pwiz/data/msdata/SpectrumList_mzML.o
./pwiz/data/msdata/SpectrumList_MSn.o
./pwiz/data/msdata/BinaryDataEncoder.o ./pwiz/data/msdata/Diff.o
./pwiz/data/msdata/MSData.o ./pwiz/data/msdata/References.o
./pwiz/data/msdata/SpectrumList_mzXML.o ./pwiz/data/msdata/IO.o
./pwiz/data/msdata/SpectrumList_BTDX.o ./pwiz/data/msdata/SpectrumInfo.o
./pwiz/data/msdata/RAMPAdapter.o ./pwiz/data/msdata/LegacyAdapter.o
./pwiz/data/msdata/SpectrumIterator.o ./pwiz/data/msdata/MSDataFile.o
./pwiz/data/msdata/MSNumpress.o ./pwiz/data/msdata/SpectrumListCache.o
./pwiz/data/msdata/Index_mzML.o
./pwiz/data/msdata/SpectrumWorkerThreads.o
./pwiz/data/identdata/IdentDataFile.o ./pwiz/data/identdata/IdentData.o
./pwiz/data/identdata/DefaultReaderList.o ./pwiz/data/identdata/Reader.o
./pwiz/data/identdata/Serializer_protXML.o
./pwiz/data/identdata/Serializer_pepXML.o
./pwiz/data/identdata/Serializer_mzid.o ./pwiz/data/identdata/IO.o
./pwiz/data/identdata/References.o ./pwiz/data/identdata/MascotReader.o
./pwiz/data/proteome/Modification.o ./pwiz/data/proteome/Digestion.o
./pwiz/data/proteome/Peptide.o ./pwiz/data/proteome/AminoAcid.o
./pwiz/utility/minimxml/XMLWriter.o ./pwiz/utility/minimxml/SAXParser.o
./pwiz/utility/chemistry/Chemistry.o
./pwiz/utility/chemistry/ChemistryData.o
./pwiz/utility/chemistry/MZTolerance.o ./pwiz/utility/misc/IntegerSet.o
./pwiz/utility/misc/Base64.o ./pwiz/utility/misc/IterationListener.o
./pwiz/utility/misc/MSIHandler.o ./pwiz/utility/misc/Filesystem.o
./pwiz

Re: [Rd] different compilers and mzR build fails

2016-12-21 Thread Martin Morgan
mzR is a Bioconductor package, so better to ask on the Bioconductor 
support forum


  https://support.bioconductor.org

Oh, I see you did, and then the advice is to avoid cross-posting!

The missing .o files would have been produced in an earlier compilation 
step; they likely failed in some way, so you need to provide the 
complete compilation output.


Did you do this on a version of the package that did not have any 
previous build artifacts (e.g., via biocLite() or from a fresh svn 
checkout)?


Martin

On 12/21/2016 12:00 PM, lejeczek via R-devel wrote:

I'm not sure if I should bother your team with this; apologies in case
it's a bother.

I'm trying gcc 6.2.1 (from devtoolset-6) with R, everything seems to
work just fine, except for mzR.
Here is failed build:

g++ -m64 -shared -L/usr/lib64/R/lib -Wl,-z,relro -o mzR.so cramp.o
ramp_base64.o ramp.o RcppRamp.o RcppRampModule.o rnetCDF.o RcppPwiz.o
RcppPwizModule.o RcppIdent.o RcppIdentModule.o
./boost/system/src/error_code.o ./boost/regex/src/posix_api.o
./boost/regex/src/fileiter.o ./boost/regex/src/regex_raw_buffer.o
./boost/regex/src/cregex.o ./boost/regex/src/regex_debug.o
./boost/regex/src/instances.o ./boost/regex/src/icu.o
./boost/regex/src/usinstances.o ./boost/regex/src/regex.o
./boost/regex/src/wide_posix_api.o
./boost/regex/src/regex_traits_defaults.o ./boost/regex/src/winstances.o
./boost/regex/src/wc_regex_traits.o ./boost/regex/src/c_regex_traits.o
./boost/regex/src/cpp_regex_traits.o ./boost/regex/src/static_mutex.o
./boost/regex/src/w32_regex_traits.o ./boost/iostreams/src/zlib.o
./boost/iostreams/src/file_descriptor.o ./boost/thread/pthread/once.o
./boost/thread/pthread/thread.o ./boost/filesystem/src/operations.o
./boost/filesystem/src/path.o
./boost/filesystem/src/utf8_codecvt_facet.o ./boost/chrono/src/chrono.o
./boost/chrono/src/process_cpu_clocks.o
./boost/chrono/src/thread_clock.o ./pwiz/data/msdata/Version.o
./pwiz/data/common/MemoryIndex.o ./pwiz/data/common/CVTranslator.o
./pwiz/data/common/cv.o ./pwiz/data/common/ParamTypes.o
./pwiz/data/common/BinaryIndexStream.o ./pwiz/data/common/diff_std.o
./pwiz/data/common/Unimod.o ./pwiz/data/msdata/SpectrumList_MGF.o
./pwiz/data/msdata/DefaultReaderList.o
./pwiz/data/msdata/ChromatogramList_mzML.o ./pwiz/data/msdata/examples.o
./pwiz/data/msdata/Serializer_mzML.o ./pwiz/data/msdata/Serializer_MSn.o
./pwiz/data/msdata/Reader.o ./pwiz/data/msdata/Serializer_MGF.o
./pwiz/data/msdata/Serializer_mzXML.o
./pwiz/data/msdata/SpectrumList_mzML.o
./pwiz/data/msdata/SpectrumList_MSn.o
./pwiz/data/msdata/BinaryDataEncoder.o ./pwiz/data/msdata/Diff.o
./pwiz/data/msdata/MSData.o ./pwiz/data/msdata/References.o
./pwiz/data/msdata/SpectrumList_mzXML.o ./pwiz/data/msdata/IO.o
./pwiz/data/msdata/SpectrumList_BTDX.o ./pwiz/data/msdata/SpectrumInfo.o
./pwiz/data/msdata/RAMPAdapter.o ./pwiz/data/msdata/LegacyAdapter.o
./pwiz/data/msdata/SpectrumIterator.o ./pwiz/data/msdata/MSDataFile.o
./pwiz/data/msdata/MSNumpress.o ./pwiz/data/msdata/SpectrumListCache.o
./pwiz/data/msdata/Index_mzML.o
./pwiz/data/msdata/SpectrumWorkerThreads.o
./pwiz/data/identdata/IdentDataFile.o ./pwiz/data/identdata/IdentData.o
./pwiz/data/identdata/DefaultReaderList.o ./pwiz/data/identdata/Reader.o
./pwiz/data/identdata/Serializer_protXML.o
./pwiz/data/identdata/Serializer_pepXML.o
./pwiz/data/identdata/Serializer_mzid.o ./pwiz/data/identdata/IO.o
./pwiz/data/identdata/References.o ./pwiz/data/identdata/MascotReader.o
./pwiz/data/proteome/Modification.o ./pwiz/data/proteome/Digestion.o
./pwiz/data/proteome/Peptide.o ./pwiz/data/proteome/AminoAcid.o
./pwiz/utility/minimxml/XMLWriter.o ./pwiz/utility/minimxml/SAXParser.o
./pwiz/utility/chemistry/Chemistry.o
./pwiz/utility/chemistry/ChemistryData.o
./pwiz/utility/chemistry/MZTolerance.o ./pwiz/utility/misc/IntegerSet.o
./pwiz/utility/misc/Base64.o ./pwiz/utility/misc/IterationListener.o
./pwiz/utility/misc/MSIHandler.o ./pwiz/utility/misc/Filesystem.o
./pwiz/utility/misc/TabReader.o
./pwiz/utility/misc/random_access_compressed_ifstream.o
./pwiz/utility/misc/SHA1.o ./pwiz/utility/misc/SHA1Calculator.o
./pwiz/utility/misc/sha1calc.o ./random_access_gzFile.o ./RcppExports.o
rampR.o R_init_mzR.o -lpthread -lnetcdf -L/usr/lib64/R/lib -lR
g++: error: cramp.o: No such file or directory
g++: error: ramp_base64.o: No such file or directory
g++: error: ramp.o: No such file or directory
g++: error: RcppRamp.o: No such file or directory
g++: error: RcppRampModule.o: No such file or directory
g++: error: rnetCDF.o: No such file or directory
g++: error: RcppPwiz.o: No such file or directory
g++: error: RcppPwizModule.o: No such file or directory
g++: error: RcppIdent.o: No such file or directory
g++: error: RcppIdentModule.o: No such file or directory
/usr/share/R/make/shlib.mk:6: recipe for target 'mzR.so' failed
make: *** [mzR.so] Error 1

It did compile with 5.2.x (from devtoolset-4) and worked fine.
I'm hoping you guys could confirm it is purely a compiler problem? Or
point me (not a real programme

Re: [Rd] methods(`|`) lists all functions?

2016-12-08 Thread Martin Morgan

On 12/08/2016 05:16 PM, frede...@ofb.net wrote:

Dear R-Devel,

I was attempting an exercise in Hadley Wickham's book "Advanced R". The
exercise is to find the generic with the greatest number of methods.

I found that 'methods(`|`)' produces a list of length 2506, in R
3.3.1. Similar behavior is found in 3.4.0. It seems to include all
functions and methods. I imagine something is being passed to "grep"
without being escaped.


Exactly; I've fixed this in r71763 (R-devel).
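
The unescaped-pattern issue in miniature (a sketch, not the actual code
used by methods()):

  grep("|", c("print.default", "format.Date"))
  ## [1] 1 2     -- as a regular expression, "|" matches every string
  grep("|", c("print.default", "format.Date"), fixed = TRUE)
  ## integer(0)  -- the literal character occurs in neither name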

Martin Morgan



I hope I didn't miss something in the documentation, and that I'm good
to report this as a bug. I can send it to Bugzilla if that's better.

By the way, how do I produce such a list of functions (or variables)
in a "normal" way? I used 'ls("package:base")' for the exercise,
because I saw this call used somewhere as an example, but I couldn't
find that "package:" syntax documented under ls()... Also found this
confusing:

> environmentName(globalenv())
[1] "R_GlobalEnv"
> ls("R_GlobalEnv")
Error in as.environment(pos) :
  no item called "R_GlobalEnv" on the search list

So I'm not sure if "package:base" is naming an environment, or if
there are different ways to name environments and ls() is using one
form while environmentName is returning another ... It might be good
to add some clarifying examples under "?ls".

Thanks,

Frederick

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel





__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] On implementing zero-overhead code reuse

2016-10-03 Thread Martin Morgan

On 10/03/2016 01:51 PM, Kynn Jones wrote:

Thank you all for your comments and suggestions.

@Frederik, my reason for mucking with environments is that I want to
minimize the number of names that import adds to my current
environment.  For instance, if module foo defines a function bar, I
want my client code to look like this:

  import("foo")
  foo$bar(1,2,3)

rather than

  import("foo")
  bar(1,2,3)

(Just a personal preference.)
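
(For illustration only, a minimal sketch of such an import() -- this is
hypothetical, not my actual implementation, and it assumes module "foo"
lives in ./foo.R:)

  import <- function(module, path = ".") {
      env <- new.env(parent = globalenv())
      sys.source(file.path(path, paste0(module, ".R")), envir = env)
      assign(module, env, envir = parent.frame())  # expose one name, e.g. `foo`
      invisible(env)
  }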

@Dirk, @Kasper, as I see it, the benefit of scripting languages like
Python, Perl, etc., is that they allow very quick development, with
minimal up-front cost.  Their main strength is precisely that one can,
without much difficulty, *immediately* start *programming
productively*, without having to worry at all about (to quote Dirk)
"repositories.  And package management.  And version control (at the
package level).  And ... byte compilation.  And associated
documentation.  And unit tests.  And continuous integration."

Of course, *eventually*, and for a fraction of one's total code base
(in my case, a *very small* fraction), one will want to worry about
all those things, but I see no point in burdening *all* my code with
all those concerns from the start.  Again, please keep in mind that
those concerns come into play for at most 5% of the code I write.

Also, I'd like to point out that the Python, Perl, etc. communities
are no less committed to all the concerns that Dirk listed (version
control, package management, documentation, testing, etc.) than the R
community is.  And yet, Python, Perl, etc. support the "zero-overhead"
model of code reuse.  There's no contradiction here.  Support for
"zero-overhead" code reuse does not preclude forms of code reuse with
more overhead.

One benefit the zero-overhead model is that the concerns of
documentation, testing, etc. can be addressed with varying degrees of
thoroughness, depending on the situation's demands.  (For example,
documentation that would be perfectly adequate for me as the author of
a function would not be adequate for the general user.)

This means that the transition from writing private code to writing
code that can be shared with the world can be made much more
gradually, according to the programmer's needs and means.

Currently, in the R world, the choice for programmers is much starker:
either stay writing little scripts that one sources from an
interactive session, or learn to implement packages.  There's too
little in-between.


I know it's flogging the same horse, but for the non-expert I create and 
attach a complete package


  devtools::create("myutils")
  library(myutils)

Of course it doesn't do anything, so I write my code by editing a plain 
text file myutils/R/foo.R to contain


  foo = function() "hello wirld"

then return to my still-running R session and install the updated 
package and use my new function


  devtools::install("myutils")
  foo()
  myutils::foo()  # same, but belt-and-suspenders

I notice my typo, update the file, and use the updated package

  devtools::install("myutils")
  foo()

The transition from here to a robust package can be gradual, updating 
the DESCRIPTION file, adding roxygen2 documentation, unit tests, using 
version control, etc... in a completely incremental way. At the end of 
it all, I'll still install and use my package with


  devtools::install("myutils")
  foo()

maybe graduating to

  devtools::install_github("mtmorgan/myutils")
  library(myutils)
  foo()

when it's time to share my work with the wirld.

Martin



Of course, from the point of view of someone who has already written
several packages, the barrier to writing a package may seem too small
to fret over, but adopting the expert's perspective is likely to
result in excluding the non-experts.

Best, kj


On Mon, Oct 3, 2016 at 12:06 PM, Kasper Daniel Hansen
 wrote:



On Mon, Oct 3, 2016 at 10:18 AM,  wrote:


Hi Kynn,

Thanks for expanding.

I wrote a function like yours when I first started using R. It's
basically the same up to your "new.env()" line, I don't do anything
with environmentns. I just called my function "mysource" and it's
essentially a "source with path". That allows me to find code I reuse
in standard locations.

I don't know why R does not have built-in support for such a thing.
You can get it in C compilers with CPATH, and as you say in Perl with
PERL5LIB, in Python, etc. Obviously when I use my "mysource" I have to
remember that my code is now not portable without copying over some
files from other locations in my home directory. However, as a
beginner I find this tool to be indispensable, as R lacks several
functions which I use regularly, and I'm not necessarily ready to
confront the challenges associated with creating a package.



I can pretty much guarantee that when you finally confront the "challenge"
of making your own package you'll realize (1) it is pretty easy if the
intention is only to use it yourself (and perhaps a couple of collaborators)
- by easy I mean I can make a pac

Re: [Rd] failed to assign RegisteredNativeSymbol for splitString

2016-07-18 Thread Martin Morgan

On 07/18/2016 03:45 PM, Andrew Piskorski wrote:

I saw a warning from R that I don't fully understand.  Here's one way
to reproduce it:

   $ /usr/local/pkg/R-3.2-branch-20160718/bin/R --version | head -n 3
   R version 3.2.5 Patched (2016-05-05 r70929) -- "Very, Very Secure Dishes"
   Copyright (C) 2016 The R Foundation for Statistical Computing
   Platform: x86_64-pc-linux-gnu/x86_64 (64-bit)

   $ /usr/local/pkg/R-3.2-branch-20160718/bin/R --vanilla --no-restore 
--no-save --silent
   > splitString <- function(...) { print("Test, do nothing") }
   > invisible(tools::toTitleCase)
   Warning message:
   failed to assign RegisteredNativeSymbol for splitString to splitString since 
splitString is already defined in the 'tools' namespace

Another way to trigger that warning is by loading the knitr package, e.g.:


or

  splitString = NULL; loadNamespace("tools")

Thanks, it's a bug fixed with

----
r70933 | morgan | 2016-07-18 16:35:39 -0400 (Mon, 18 Jul 2016) | 5 lines

assignNativeRoutines looks only in package namespace

- previously looked for symbols in inherited environments
- https://stat.ethz.ch/pipermail/r-devel/2016-July/072909.html






   > require("knitr")
   Loading required package: knitr
   Warning: failed to assign RegisteredNativeSymbol for splitString to 
splitString since splitString is already defined in the 'tools' namespace

The warning only happens the FIRST time I run any code that triggers it.
To get it to happen again, I need to restart R.

R 3.1.0 and all earlier versions do not throw that warning, because
they do not have any splitString C function (see below) at all.  R
3.2.5 does throw the warning, and I believe 3.3 and all later versions
of R do also (but I cannot currently test that on this machine).

In my case, normally I start R without "--vanilla", and load various
custom libraries of my own, one of which contained an R function
"splitString".  That gave the exact same symptoms as the simpler way
of reproducing the warning above.  In practice, I solved the problem
by renaming my "splitString" function to something else.  But I still
wonder what exactly was going on with that warning.

I noticed that the toTitleCase() R code calls .Call() with a bare
splitString identifier, no quotes around it:

   $ grep -n splitString R-3-[234]*/src/library/tools/R/utils.R
   R-3-2-branch/src/library/tools/R/utils.R:1988:xx <- .Call(splitString, x, 
' -/"()')
   R-3-3-branch/src/library/tools/R/utils.R:2074:xx <- .Call(splitString, x, 
' -/"()\n')
   R-3-4-trunk/src/library/tools/R/utils.R:2074:xx <- .Call(splitString, x, 
' -/"()\n')

   $ find R-3-4-trunk -name .svn -prune -o -type f -print0 | xargs -0 grep -n 
splitString
   R-3-4-trunk/src/library/tools/R/utils.R:2074:xx <- .Call(splitString, x, 
' -/"()\n')
   R-3-4-trunk/src/library/tools/src/text.c:264:SEXP splitString(SEXP string, 
SEXP delims)
   R-3-4-trunk/src/library/tools/src/tools.h:45:SEXP splitString(SEXP string, 
SEXP delims);
   R-3-4-trunk/src/library/tools/src/init.c:53:CALLDEF(splitString, 2),

Doing that is perfectly legal according to help(".Call"), and
interestingly, it apparently does NOT matter whether that code puts
quotes around the splitString or not - I tried it, and it made no
difference.

Is it generally the case that users MUST NOT define R functions with
the same names as "registered" C functions?  Will something break if
we do?





__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] download.file(method="libcurl") and GET vs. HEAD requests

2016-06-22 Thread Morgan, Martin

No, I don't think there is a way to avoid the HEAD request.
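
(The sort of workaround you mention might look roughly like this -- a
sketch of a hypothetical helper, relying on your observation below that
method = "curl" and method = "wget" still succeed:)

  download_get <- function(url, destfile) {
      ## prefer an external downloader, which issues a plain GET with no HEAD probe
      method <- if (nzchar(Sys.which("curl"))) "curl" else "wget"
      download.file(url, destfile, method = method)
  }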

From: Winston Chang 
Sent: Wednesday, June 22, 2016 12:01:39 PM
To: Morgan, Martin
Cc: R Devel List
Subject: Re: [Rd] download.file(method="libcurl") and GET vs. HEAD requests

Thanks for looking into it. Is there a way to avoid the HEAD request
in R 3.3.0? I'm asking because if there isn't, then I'll add a
workaround in a package I'm working on.

-Winston

On Tue, Jun 21, 2016 at 9:45 PM, Martin Morgan
 wrote:
> On 06/21/2016 09:35 PM, Winston Chang wrote:
>>
>> In R 3.2.4, if you ran download.file(method="libcurl"), it issues a
>> HTTP GET request for the file. However, in R 3.3.0, it issues a HTTP
>> HEAD request first, and then a GET requet. This can result in problems
>> when the web server gives an error for a HEAD request, even if the
>> file is available with a GET request.
>>
>> Is it possible to tell download.file to simply send a GET request,
>> without first sending a HEAD request?
>>
>>
>> In theory, web servers should give the same response for HEAD and GET
>> requests, except that for a HEAD request, it sends only headers, and
>> not the content. However, not all web servers do this for all files.
>> I've seen this problem come up in two different places.
>>
>> The first is from an issue that someone filed for the downloader
>> package. The following works in R 3.2.4, but in R 3.3.0, it fails with
>> a 404 (tested on a Mac):
>>options(internet.info=1) # Show verbose download info
>>url <-
>> "https://census.edina.ac.uk/ukborders/easy_download/prebuilt/shape/England_lad_2011_gen.zip";
>>   download.file(url, destfile = "out.zip", method="libcurl")
>>
>> In R 3.3.0, the download succeeds with method="wget", and
>> method="curl". It's only method="libcurl" that has problems.
>>
>>
>> The second place I've encountered a problem is in downloading attached
>> files from a GitHub release.
>>options(internet.info=1) # Show verbose download info
>>url <-
>> "https://github.com/wch/webshot/releases/download/v0.3/phantomjs-2.1.1-macosx.zip";
>>download.file(url, destfile = "out.zip")
>>
>> This one fails with a 403 Forbidden because it gets redirected to a
>> URL in Amazon S3, where a signature of the file is embedded in the
>> URL. However, the signature is computed with the request type (HEAD
>> vs. GET), and so the same URL doesn't work for both. (See
>> http://stackoverflow.com/a/20580036/412655)
>>
>> Any help would be appreciated!
>
>
> I think I introduced this, in
>
> 
> r69280 | morgan | 2015-09-03 06:24:49 -0400 (Thu, 03 Sep 2015) | 4 lines
>
> don't create empty file on 404 and similar errors
>
> - download.file(method="libcurl")
>
> 
>
> The idea was to test that the file can be downloaded before trying to
> download it; previously R would download the error page as though it were
> the content.
>
> I'll give this some thought.
>
> Martin Morgan
>
>
>> -Winston
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
>
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] download.file(method="libcurl") and GET vs. HEAD requests

2016-06-21 Thread Martin Morgan

On 06/21/2016 09:35 PM, Winston Chang wrote:

In R 3.2.4, if you ran download.file(method="libcurl"), it issues a
HTTP GET request for the file. However, in R 3.3.0, it issues a HTTP
HEAD request first, and then a GET requet. This can result in problems
when the web server gives an error for a HEAD request, even if the
file is available with a GET request.

Is it possible to tell download.file to simply send a GET request,
without first sending a HEAD request?


In theory, web servers should give the same response for HEAD and GET
requests, except that for a HEAD request, it sends only headers, and
not the content. However, not all web servers do this for all files.
I've seen this problem come up in two different places.

The first is from an issue that someone filed for the downloader
package. The following works in R 3.2.4, but in R 3.3.0, it fails with
a 404 (tested on a Mac):
   options(internet.info=1) # Show verbose download info
   url <- 
"https://census.edina.ac.uk/ukborders/easy_download/prebuilt/shape/England_lad_2011_gen.zip";
  download.file(url, destfile = "out.zip", method="libcurl")

In R 3.3.0, the download succeeds with method="wget", and
method="curl". It's only method="libcurl" that has problems.


The second place I've encountered a problem is in downloading attached
files from a GitHub release.
   options(internet.info=1) # Show verbose download info
   url <- 
"https://github.com/wch/webshot/releases/download/v0.3/phantomjs-2.1.1-macosx.zip";
   download.file(url, destfile = "out.zip")

This one fails with a 403 Forbidden because it gets redirected to a
URL in Amazon S3, where a signature of the file is embedded in the
URL. However, the signature is computed with the request type (HEAD
vs. GET), and so the same URL doesn't work for both. (See
http://stackoverflow.com/a/20580036/412655)

Any help would be appreciated!


I think I introduced this, in


r69280 | morgan | 2015-09-03 06:24:49 -0400 (Thu, 03 Sep 2015) | 4 lines

don't create empty file on 404 and similar errors

- download.file(method="libcurl")



The idea was to test that the file can be downloaded before trying to 
download it; previously R would download the error page as though it 
were the content.


I'll give this some thought.

Martin Morgan



-Winston

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel





__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Is it possible to increase MAX_NUM_DLLS in future R releases?

2016-05-04 Thread Martin Morgan



On 05/04/2016 05:15 AM, Prof Brian Ripley wrote:

On 04/05/2016 08:44, Martin Maechler wrote:

Qin Zhu 
 on Mon, 2 May 2016 16:19:44 -0400 writes:


 > Hi,
 > I’m working on a Shiny app for statistical analysis. I ran into
this "maximal number of DLLs reached" issue recently because my app
requires importing many other packages.

 > I’ve posted my question on stackoverflow
(http://stackoverflow.com/questions/36974206/r-maximal-number-of-dlls-reached
<http://stackoverflow.com/questions/36974206/r-maximal-number-of-dlls-reached>).


 > I’m just wondering is there any reason to set the maximal
number of DLLs to be 100, and is there any plan to increase it/not
hardcoding it in the future? It seems many people are also running
into this problem. I know I can work around this problem by modifying
the source, but since my package is going to be used by other people,
I don’t think this is a feasible solution.

 > Any suggestions would be appreciated. Thanks!
 > Qin

Increasing that number is of course "possible"... but it also
costs a bit (adding to the fixed memory footprint of R).


And not only that.  At the time this was done (and it was once 50) the
main cost was searching DLLs for symbols.  That is still an issue, and
few packages exclude their DLL from symbol search so if symbols have to
searched for a lot of DLLs will be searched.  (Registering all the
symbols needed in a package avoids a search, and nowadays by default
searches from a namespace are restricted to that namespace.)

See
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Registering-native-routines
for some further details about the search mechanism.
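
(In package terms, the registration being referred to amounts to roughly
this -- a sketch, with 'mypkg' a placeholder:)

  ## NAMESPACE
  useDynLib(mypkg, .registration = TRUE)
  ## together with R_registerRoutines() / R_useDynamicSymbols(dll, FALSE)
  ## in the package's init.c, so that native symbols are resolved directly
  ## rather than searched for across all loaded DLLs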


I did not set that limit, but I'm pretty sure it was also meant
as reminder for the useR to "clean up" a bit in her / his R
session, i.e., not load package namespaces unnecessarily. I
cannot yet imagine that you need > 100 packages | namespaces
loaded in your R session. OTOH, some packages nowadays have a
host of dependencies, so I agree that this at least may happen
accidentally more frequently than in the past.


I am not convinced that it is needed.  The OP says he imports many
packages, and I doubt that more than a few are required at any one time.
  Good practice is to load namespaces as required, using requireNamespace.


Extensive package dependencies in Bioconductor make it pretty easy to 
end up with dozens of packages attached or loaded. For instance


  library(GenomicFeatures)
  library(DESeq2)

> length(loadedNamespaces())
[1] 63
> length(getLoadedDLLs())
[1] 41

Qin's use case is a shiny app, presumably trying to provide relatively 
comprehensive access to a particular domain. Even if the app were to 
load / requireNamespace() (this requires considerable programming 
discipline to ensure that the namespace is available on all programming 
paths where it is used), it doesn't seem at all improbable that the user 
in an exploratory analysis would end up accessing dozens of packages 
with orthogonal dependencies. This is also the use case with Karl 
Forner's post 
https://stat.ethz.ch/pipermail/r-devel/2015-May/071104.html (adding 
library(crlmm) to the above gets us to 53 DLLs).
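
The requireNamespace() discipline being referred to looks roughly like
this in app code (a sketch; the function and package names here are just
illustrative):

  run_deseq <- function(dds, ...) {
      if (!requireNamespace("DESeq2", quietly = TRUE))
          stop("install 'DESeq2' to use this part of the app")
      DESeq2::DESeq(dds, ...)
  }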





The real solution of course would be a code improvement that
starts with a relatively small number of "DLLinfo" structures
(say 32), and then allocates more batches (of size say 32) if
needed.


The problem of course is that such code will rarely be exercised, and
people have made errors on the boundaries (here multiples of 32) many
times in the past.  (Note too that DLLs can be removed as well as added,
another point of coding errors.)


That argues for a simple increase in the maximum number of DLLs. This 
would enable some people to have very bulky applications that pay a 
performance cost (but the cost here is in small fractions of a 
second...) in terms of symbol look-up (and collision?), but would have 
no consequence for those of us with more sane use cases.


Martin Morgan




Patches to the R sources (development trunk in subversion at
https://svn.r-project.org/R/trunk/ ) are very welcome!

Martin Maechler
ETH Zurich  &  R Core Team








__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] vignette index

2016-03-04 Thread Morgan, Martin
Within R these are determined by \VignetteIndexEntry{}. I think you are 
referring to the order on the CRAN landing page for your package 
https://cran.r-project.org/web/packages/bst/index.html, and then the question 
is for a CRAN member.

In Rnw

\documentclass{article}

% \VignetteIndexEntry{01-Foo}

\begin{document}
01-Foo
\end{document}

or Rmd as

---
title: "Demo"
author: Ima Scientist 
vignette: >
  % \VignetteIndexEntry{01-Foo}
  % \VignetteEngine{knitr::rmarkdown}
---


From: R-devel  on behalf of Wang, Zhu 

Sent: Friday, March 4, 2016 11:18 AM
To: Duncan Murdoch; r-devel@r-project.org
Subject: Re: [Rd] vignette index

I think the online order of the vignette files is not based on vignette title or 
filename alphabetically. I am just curious: in what order are these vignette files 
displayed online, so that I can make changes accordingly?

Thanks,

Zhu

-Original Message-
From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
Sent: Friday, March 04, 2016 10:47 AM
To: Wang, Zhu; r-devel@r-project.org
Subject: Re: [Rd] vignette index

On 04/03/2016 9:44 AM, Wang, Zhu wrote:
> Dear helpers,
>
> I have multiple vignette files for a package, and I would like to have the 
> "right" order of these files when displayed online. For instance, see below:
>
> https://cran.r-project.org/web/packages/bst/index.html
>
> The order of vignette links on CRAN is different from what I hoped for:
>
> > vignette(package="bst")
> Vignettes in package 'bst':
>
> prosCancer Classification Using Mass
>  Spectrometry-based Proteomics Data (source,
>  pdf)
> static_khan Classification of Cancer Types Using Gene
>  Expression Data (Long) (source, pdf)
> khanClassification of Cancer Types Using Gene
>  Expression Data (Short) (source, pdf)
> static_mcl  Classification of UCI Machine Learning Datasets
>  (Long) (source, pdf)
> mcl Classification of UCI Machine Learning Datasets
>  (Short) (source, pdf)
>
> The package bst already has an index.html,  and I thought that should have 
> done the job, but apparently not. Any suggestions?
>

The index.html file should be used in the online help system, but
vignette() doesn't use that, it looks in the internal database of vignettes.  I 
don't think you can control the order in which it displays things.

This could conceivably be changed, but not by consulting your index.html file 
--- it is not required to follow a particular structure, so we can't find what 
order you want from it.  One more likely possibility would be to sort 
alphabetically in the current locale according to filename or vignette title.  
So then you could get what you want by naming your vignettes 1pros, 
2static_khan, etc.

It would also be possible to add a new \Vignette directive to affect 
collation order, but that seems like overkill.

Duncan Murdoch


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] as.vector in R-devel loaded 3/3/2016

2016-03-04 Thread Morgan, Martin
I see as below, where getGeneric and getMethod imply a different signature; the 
signature is mode="any" for both cases in R version 3.2.3 Patched (2016-01-28 
r70038). I don't know how to reproduce Jeff's error, though.

> library(Matrix)
> as.vector
function (x, mode = "any") 
.Internal(as.vector(x, mode))


> getGeneric("as.vector")
standardGeneric for "as.vector" defined from package "base"

function (x, mode) 
standardGeneric("as.vector")


Methods may be defined for arguments: x
Use  showMethods("as.vector")  for currently available ones.
> selectMethod("as.vector", "ANY")
Method Definition (Class "internalDispatchMethod"):

function (x, mode) 
.Internal(as.vector(x, mode))


Signatures:
x
target  "ANY"
defined "ANY"
> sessionInfo()
R Under development (unstable) (2016-02-27 r70232)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C  
 [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8   LC_NAME=C 
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] Matrix_1.2-4

loaded via a namespace (and not attached):
[1] grid_3.3.0  lattice_0.20-33



From: R-devel  on behalf of Martin Maechler 

Sent: Friday, March 4, 2016 6:05 AM
To: peter dalgaard
Cc: r-devel@r-project.org; Jeff Laake - NOAA Federal
Subject: Re: [Rd] as.vector in R-devel loaded 3/3/2016

> peter dalgaard 
> on Fri, 4 Mar 2016 09:21:48 +0100 writes:

> Er, until _what_ is fixed?
> I see no anomalies with the version in R-pre:

Indeed.

The problem ... I also have stumbled over ..
is that I'm sure Jeff is accidentally loading a different
version of 'Matrix' than the one that is part of R-devel.

Jeff, you must accidentally be loading a version of Matrix made with
R 3.2.x into R 3.3.0, and that will fail with the as.vector()
mismatch error message.

(and IIRC, you also get such an error message if you load a
 3.3.0-built version of Matrix into a non-3.3.0 version of R).


Martin

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Error in socketConnection(master, port = port, blocking = TRUE, open = "a+b", : cannot open the connection

2016-01-15 Thread Morgan, Martin
Arrange to make the ssh connection passwordless. Do this by copying your 
'public key' to the machine that you are trying to connect to. Google will be 
your friend in accomplishing this.

It might be that a firewall stands between you and the other machine, or that 
the other machine does not allow connections to port 11001. Either way, the 
direction toward a solution is to speak with your system administrator. If it 
is firewall, then they are unlikely to accommodate you; the strategy is to run 
your cluster exclusively on one side of the firewall.
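
A rough way to check whether the worker can reach the master's port at all,
outside of makePSOCKcluster (a sketch; the address 10.10.2.31 and port 11001
are placeholders). First, in an R session on the master, listen on the port:

  srv <- socketConnection(port = 11001, server = TRUE, blocking = TRUE)

then, on the worker, try to connect back to the master:

  con <- tryCatch(
      socketConnection("10.10.2.31", port = 11001, blocking = TRUE,
                       open = "a+b", timeout = 10),
      error = identity)
  if (inherits(con, "error")) {
      message("cannot connect: ", conditionMessage(con))  # firewall / routing?
  } else {
      message("port 11001 is reachable")
      close(con)
  }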

Martin Morgan

From: R-devel [r-devel-boun...@r-project.org] on behalf of Soumen Pal via 
R-devel [r-devel@r-project.org]
Sent: Friday, January 15, 2016 2:05 AM
To: r-devel@r-project.org
Subject: [Rd] Error in socketConnection(master, port = port, blocking = TRUE, 
open = "a+b", :cannot open the connection

Dear All

I have successfully created a cluster of four nodes using localhost on my local 
machine by executing the following command

> cl<-makePSOCKcluster(c(rep("localhost",4)),outfile='',homogeneous=FALSE,port=11001)
starting worker pid=4271 on localhost:11001 at 12:12:26.164
starting worker pid=4280 on localhost:11001 at 12:12:26.309
starting worker pid=4289 on localhost:11001 at 12:12:26.456
starting worker pid=4298 on localhost:11001 at 12:12:26.604
>
> stopCluster(cl)

Now I am trying to create a cluster of 2 nodes (one on my local machine and 
one on a remote machine) using the "makePSOCKcluster" command. Both machines 
have identical settings and are connected by SSH. The OS is Ubuntu 14.04 LTS 
and the R version is 3.2.1. I have executed the following command to create 
the cluster, but I get the following error message and the R session hangs.

cl<-makePSOCKcluster(c(rep("soumen@10.10.2.32",1)),outfile='',homogeneous=FALSE,port=11001)
soumen@10.10.2.32's password:
starting worker pid=2324 on soumen-HP-ProBook-440-G2:11001 at 12:11:59.349
Error in socketConnection(master, port = port, blocking = TRUE, open = "a+b",  :
  cannot open the connection
Calls:  ... doTryCatch -> recvData -> makeSOCKmaster -> 
socketConnection
In addition: Warning message:
In socketConnection(master, port = port, blocking = TRUE, open = "a+b",  :
  soumen-HP-ProBook-440-G2:11001 cannot be opened
Execution halted


My sessionInfo() is as follows

sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: Ubuntu 14.04.1 LTS

locale:
 [1] LC_CTYPE=en_IN   LC_NUMERIC=C LC_TIME=en_IN
 [4] LC_COLLATE=en_IN LC_MONETARY=en_INLC_MESSAGES=en_IN
 [7] LC_PAPER=en_IN   LC_NAME=CLC_ADDRESS=C
[10] LC_TELEPHONE=C   LC_MEASUREMENT=en_IN LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats graphics  grDevices utils datasets  methods
[8] base


I don't know how to solve this problem. Please help me to solve it.

Thanks

Soumen Pal

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] For integer vectors, `as(x, "numeric")` has no effect.

2015-12-11 Thread Morgan, Martin
From the Bioconductor side of things, the general feeling is that this is a 
step in the right direction and worth the broken packages. Martin Morgan

From: R-devel [r-devel-boun...@r-project.org] on behalf of Martin Maechler 
[maech...@stat.math.ethz.ch]
Sent: Friday, December 11, 2015 4:25 AM
To: John Chambers; r-devel@r-project.org; bioc-devel list; Benjamin Tyner
Cc: Martin Maechler
Subject: Re: [Rd] For integer vectors, `as(x, "numeric")` has no effect.

>>>>> Martin Maechler 
>>>>> on Tue, 8 Dec 2015 15:25:21 +0100 writes:

>>>>> John Chambers 
>>>>> on Mon, 7 Dec 2015 16:05:59 -0800 writes:

>> We do need an explicit method here, I think.
>> The issue is that as() uses methods for the generic function coerce() 
but cannot use inheritance in the usual way (if it did, you would be 
immediately back with no change, since "integer" inherits from "numeric").

>> Copying in the general method for coercing to "numeric" as an explicit 
method for "integer" gives the expected result:

>>> setMethod("coerce", c("integer", "numeric"), getMethod("coerce", 
c("ANY", "numeric")))
>> [1] "coerce"
>>> typeof(as(1L, "numeric"))
>> [1] "double"

>> Seems like a reasonable addition to the code, unless someone sees a 
problem.
>> John

> I guess that that some package checks (in CRAN + Bioc + ... -
> land) will break,
> but I still think we should add such a coercion to R.

> Martin

Hmm...  I've tried to add the above to R
and do notice that there are consequences that may be larger than
anticipated:

Here is example code:

   myN   <- setClass("myN",   contains="numeric")
   myNid <- setClass("myNid", contains="numeric", 
representation(id="character"))
   NN <-setClass("NN", representation(x="numeric"))

   (m1 <- myN  (1:3))
   (m2 <- myNid(1:3, id = "i3"))
   tools::assertError(NN (1:3))# in all R versions

   ## # current R  |  new R
   ## # ---|--
   class(getDataPart(m1)) # integer|  numeric
   class(getDataPart(m2)) # integer|  numeric


In other words, with the above setting, the traditional
gentleperson's agreement in S and R,

  __ "numeric" sometimes conveniently means "integer" or "double"  __

will be slightly less often used ... which of course may be a
very good thing.

However, it breaks strict back compatibility also in cases where
the previous behavior may have been preferable:
After all, integer vectors need only half the space of doubles.
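
A sketch of the storage difference in question:

   x <- rep(1L, 1e6)
   object.size(x)             ## ~4 MB stored as integer
   object.size(as.double(x))  ## ~8 MB stored as double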

Shall we still go ahead and apply this change to R-devel,
hoping that all package authors will be willing to update where necessary?

As this may affect the many hundreds of bioconductor packages
using S4 classes, I am -- exceptionally -- cross posting to the
bioc-devel list.

Martin Maechler


>> On Dec 7, 2015, at 3:37 PM, Benjamin Tyner  wrote:

>>> Perhaps it is not that surprising, given that
>>>
>>> > mode(1L)
>>> [1] "numeric"
>>>
>>> and
>>>
>>> > is.numeric(1L)
>>> [1] TRUE
>>>
>>> On the other hand, this is curious, to say the least:
>>>
>>> > is.double(as(1L, "double"))
>>> [1] FALSE
>>>
>>>> Here's the surprising behavior:
>>>>
>>>> x <- 1L
>>>> xx <- as(x, "numeric")
>>>> class(xx)
>>>> ## [1] "integer"
>>>>
>>>> It occurs because the call to `as(x, "numeric")` dispatches the coerce
>>>> S4 method for the signature `c("integer", "numeric")`, whose body is
>>>> copied in below.
>>>>
>>>> function (from, to = "numeric", strict = TRUE)
>>>> if (strict) {
>>>> class(from) <- "numeric"
>>>> from
>>>> } else from
>>>>
>>>> This in turn does nothing, even when strict=TRUE, because that
>>>> assignment to class "numeric" has no effect:
>>>>
>>>> x <- 10L
>>>> class(x) <- "numeric"
>>>> class(x)
>>>> [1] "integer"
>>>>
>>>> Is thi

Re: [Rd] Error generated by .Internal(nchar) disappears when debugging

2015-10-07 Thread Morgan, Martin


> -Original Message-
> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Cook,
> Malcolm
> Sent: Wednesday, October 07, 2015 3:52 PM
> To: 'Duncan Murdoch'; Matt Dowle; r-de...@stat.math.ethz.ch
> Subject: Re: [Rd] Error generated by .Internal(nchar) disappears when
> debugging
> 
> What other packages do you have loaded?  Perhaps a BioConductor one that
> loads S4Vectors that announces upon load:
> 
>   Creating a generic function for 'nchar' from package 'base' in package
> 'S4Vectors'

This was introduced as a way around the problem, where the declaration of a 
method was moved to the .onLoad hook

.onLoad <- function(libname, pkgname)
 setMethod("nchar", "Rle", .nchar_Rle)

instead of in the body of the package. The rationale was that the method is 
then created at run-time, when the generic is defined on the user's R, rather 
than at compile time, when the generic is defined on the build system's R. 
There was a subsequent independent report that this did not solve the problem, 
but we were not able to follow up on that.
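
A sketch of the two placements, for contrast (reusing the same .nchar_Rle
definition from S4Vectors):

  ## at package top level -- the original, problematic placement: evaluated
  ## when the package is installed/built, against the build machine's generic
  setMethod("nchar", "Rle", .nchar_Rle)

  ## in .onLoad -- the workaround above: evaluated when the package is loaded,
  ## against the generic as defined in the user's R
  .onLoad <- function(libname, pkgname)
      setMethod("nchar", "Rle", .nchar_Rle)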

This is only defined in the current release version of S4Vectors, which is the 
only version expected to straddle R versions with different signatures.

Martin Morgan

> 
> Maybe a red herring...
> 
> ~Malcolm
> 
>  > -Original Message-
>  > From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of
>  > Duncan Murdoch
>  > Sent: Monday, October 05, 2015 6:57 PM
>  > To: Matt Dowle ; r-de...@stat.math.ethz.ch
>  > Subject: Re: [Rd] Error generated by .Internal(nchar) disappears when
>  > debugging
>  >
>  > On 05/10/2015 7:24 PM, Matt Dowle wrote:
>  > > Joris Meys  gmail.com> writes:
>  > >
>  > >>
>  > >> Hi all,
>  > >>
>  > >> I have a puzzling problem related to nchar. In R 3.2.1, the internal
>  > >> nchar gained an extra argument (see
>  > >> https://stat.ethz.ch/pipermail/r-announce/2015/000586.html)
>  > >>
>  > >> I've been testing code using the package copula, and at home I'm
>  > >> still running R 3.2.0 (I know, I know...). When trying the following
>  > >> code, I got an error:
>  > >>
>  > >>> library(copula)
>  > >>> fgmCopula(0.8)
>  > >> Error in substr(sc[i], 2, nchar(sc[i]) - 1) :
>  > >>   4 arguments passed to .Internal(nchar) which requires 3
>  > >>
>  > >> Cheers
>  > >> Joris
>  > >
>  > >
>  > > I'm seeing a similar problem. IIUC, the Windows binary .zip from CRAN
>  > > of any package using base::nchar is affected. Could someone check my
>  > > answer here is correct please :
>  > > http://stackoverflow.com/a/32959306/403310
>  >
>  > Nobody has posted a simple reproducible example here, so it's kind of
>  > hard to say.
>  >
>  > I would have guessed that a change to the internal signature of the C code
>  > underlying nchar() wouldn't have any effect on a package that called the R
>  > nchar() function.
>  >
>  > When I put together my own example (a tiny package containing a function
>  > calling nchar(), built to .zip using R 3.2.2, installed into R 3.2.0), it
>  > confirmed my guess.
>  >
>  > On the other hand, if some package is calling the .Internal function
>  > directly, I'd expect that to break.  Packages shouldn't do that.
>  >
>  > So I'd say there's been no evidence posted of a problem in R here, though
>  > there may be problems in some of the packages involved.  I'd welcome an
>  > example that provided some usable evidence.
>  >
>  > Duncan Murdoch
>  >
>  > __
>  > R-devel@r-project.org mailing list
>  > https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Issues with libcurl + HTTP status codes (eg. 403, 404)

2015-08-27 Thread Martin Morgan
R-devel r69197 returns appropriate errors for the cases below; I know of a few 
rough edges


- ftp error codes are not reported correctly
- download.file creates destfile before discovering that http fails, leaving an 
empty file on disk


and am happy to hear of more.

Martin

On 08/27/2015 08:46 AM, Jeroen Ooms wrote:

On Thu, Aug 27, 2015 at 5:16 PM, Martin Maechler
 wrote:

Probably I'm confused now...
Both R-patched and R-devel give an error (after a *long* wait!)
for
download.file("https://someserver.com/mydata.csv";, "mydata.csv")

So that problem is I think  solved now.


I'm sorry for the confusion, this was a hypothetical example.
Connection failures are different from http status errors. Below some
real examples of servers returning http errors. For each example the
"internal" method correctly raises an R error, whereas the "libcurl"
method does not.

# File not found (404)
download.file("http://httpbin.org/data.csv";, "data.csv", method = "internal")
download.file("http://httpbin.org/data.csv";, "data.csv", method = "libcurl")
readLines(url("http://httpbin.org/data.csv";, method = "internal"))
readLines(url("http://httpbin.org/data.csv";, method = "libcurl"))

# Unauthorized (401)
download.file("https://httpbin.org/basic-auth/user/passwd";,
"data.csv", method = "internal")
download.file("https://httpbin.org/basic-auth/user/passwd";,
"data.csv", method = "libcurl")
readLines(url("https://httpbin.org/basic-auth/user/passwd";, method =
"internal"))
readLines(url("https://httpbin.org/basic-auth/user/passwd";, method = "libcurl"))




--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Issues with libcurl + HTTP status codes (eg. 403, 404)

2015-08-27 Thread Martin Morgan

On 08/27/2015 08:16 AM, Martin Maechler wrote:

"DM" == Duncan Murdoch 
 on Wed, 26 Aug 2015 19:07:23 -0400 writes:


 DM> On 26/08/2015 6:04 PM, Jeroen Ooms wrote:
 >> On Tue, Aug 25, 2015 at 10:33 PM, Martin Morgan 
 wrote:
 >>>
 >>> actually I don't know that it does -- it addresses the symptom but I 
think there should be an error from libcurl on the 403 / 404 rather than from read.dcf 
on error page...
 >>
 >> Indeed, the only correct behavior is to turn the protocol error code
 >> into an R exception. When the server returns a status code >= 400, it
 >> indicates that the request was unsuccessful and the response body does
 >> not contain the content the client had requested, but should instead
 >> be interpreted as an error message/page. Ignoring this fact and
 >> proceeding with parsing the body as usual is incorrect and leads to
 >> all kind of strange errors downstream.

 DM> Yes.  I haven't been following this long thread.  Is it only in 
R-devel,
 DM> or is this happening in 3.2.2 or R-patched?

 DM> If the latter, please submit a bug report.  If it is only R-devel,
 DM> please just be patient.  When R-devel becomes R-alpha next year, if the
 DM> bug still exists, please report it.

 DM> Duncan Murdoch

Probably I'm confused now...
Both R-patched and R-devel give an error (after a *long* wait!)
for
download.file("https://someserver.com/mydata.csv";, "mydata.csv")

So that problem is I think  solved now.
Ideally, it would be nice to set the *timeout* as an R function
argument ourselves, though.

Kevin Ushey's original problem however is still in R-patched and
R-devel:

ap <- available.packages("http://www.stats.ox.ac.uk/pub/RWin";, method="libcurl")
ap

giving


ap <- available.packages("http://www.stats.ox.ac.uk/pub/RWin";, 
method="libcurl")Warning: unable to access index for repository 
http://www.stats.ox.ac.uk/pub/RWin:

   Line starting '
ap

  Package Version Priority Depends Imports LinkingTo Suggests Enhances 
License License_is_FOSS License_restricts_use OS_type Archs
  MD5sum NeedsCompilation File Repository




and the resulting 'ap' is the same as, e.g., with the default
method which also gives a warning and then an empty list (well
"data.frame") of packages.


I don't see a big problem with the above.
It would be better if the warning did not contain the extra
"Line starting '

In Kevin's original post, he was using an earlier version of R, and the code in 
available.packages was returning an error.


The code had been updated (by me) in the version that you are using to return a 
warning, which was the original design and intention (to convert errors during 
repository queries into warnings, so other repositories could be queried; this 
was Kevin's original point).


The fix I provided does not address the underlying problem, which is that

  download.file("http://www.stats.ox.ac.uk/pub/RWin/PACKAGES.gz";,
fl <- tempfile(), method="libcurl")

actually downloads the error file, without throwing an error

>   download.file("http://www.stats.ox.ac.uk/pub/RWin/PACKAGES.gz";,   fl <- 
tempfile(), method="libcurl")

trying URL 'http://www.stats.ox.ac.uk/pub/RWin/PACKAGES.gz'
Content type 'text/html; charset=iso-8859-1' length 302 bytes
==
downloaded 302 bytes

> cat(paste(readLines(fl), collapse="\n"))


404 Not Found

Not Found
The requested URL /pub/RWin/PACKAGES.gz was not found on this server.

Apache/2.2.22 (Debian) Server at www.stats.ox.ac.uk Port 80
>


I do have a patch for this, which I will share off-list before committing.
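
Not the patch, but a rough client-side guard one can use in the meantime -- a
sketch using curlGetHeaders() (available in recent R), with the URL from above:

  url <- "http://www.stats.ox.ac.uk/pub/RWin/PACKAGES.gz"
  status <- attr(curlGetHeaders(url), "status")
  if (status >= 400L)
      stop("server returned HTTP status ", status)
  download.file(url, tempfile(), method = "libcurl")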

Martin
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Issues with libcurl + HTTP status codes (eg. 403, 404)

2015-08-25 Thread Martin Morgan

On 08/25/2015 01:30 PM, Kevin Ushey wrote:

Hi Martin,

Indeed it does (and I should have confirmed myself with R-patched and R-devel
before posting...)


actually I don't know that it does -- it addresses the symptom but I think there 
should be an error from libcurl on the 403 / 404 rather than from read.dcf on 
error page...


Martin




Thanks, and sorry for the noise.
Kevin


On Tue, Aug 25, 2015, 13:11 Martin Morgan <mtmor...@fredhutch.org> wrote:

On 08/25/2015 12:54 PM, Kevin Ushey wrote:
 > Hi all,
 >
 > The following fails for me (on OS X, although I imagine it's the same
 > on other platforms using libcurl):
 >
 >  options(download.file.method = "libcurl")
 >  options(repos = c(CRAN = "https://cran.rstudio.com/", CRANextra =
> "http://www.stats.ox.ac.uk/pub/RWin"))
 >  install.packages("lattice") ## could be any package
 >
 > gives me:
 >
 >  > options(download.file.method = "libcurl")
 >  > options(repos = c(CRAN = "https://cran.rstudio.com/", CRANextra
> = "http://www.stats.ox.ac.uk/pub/RWin"))
 >  > install.packages("lattice") ## could be any package
 >  Installing package into ‘/Users/kevinushey/Library/R/3.2/library’
 >  (as ‘lib’ is unspecified)
 >  Error: Line starting '
 > This seems to come from a call to `available.packages()` to a URL that
 > doesn't exist on the server (likely when querying PACKAGES on the
 > CRANextra repo)
 >
 > Eg.
 >
 >  > URL <- "http://www.stats.ox.ac.uk/pub/RWin";
 >  > available.packages(URL, method = "internal")
 >  Warning: unable to access index for repository
 > http://www.stats.ox.ac.uk/pub/RWin
 >   Package Version Priority Depends Imports LinkingTo Suggests
 > Enhances License License_is_FOSS
 >  License_restricts_use OS_type Archs MD5sum NeedsCompilation
 > File Repository
 >  > available.packages(URL, method = "libcurl")
 >  Error: Line starting '
 > It looks like libcurl downloads and retrieves the 403 page itself,
 > rather than reporting that it was actually forbidden, e.g.:
 >
 >  >

download.file("http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/mavericks/contrib/3.2/PACKAGES.gz";,
 > tempfile(), method = "libcurl")
 >  trying URL

'http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/mavericks/contrib/3.2/PACKAGES.gz'
 >  Content type 'text/html; charset=iso-8859-1' length 339 bytes
 >  ==
 >  downloaded 339 bytes
 >
 > Using `method = "internal"` gives an error related to the inability to
 > access that URL due to the HTTP status 403.
 >
 > The overarching issue here is that package installation shouldn't fail
 > even if libcurl fails to access one of the repositories set.
 >

With

  > R.version.string
[1] "R version 3.2.2 Patched (2015-08-25 r69179)"

the behavior is to warn with an indication of the repository for which the
problem occurs

  > URL <- "http://www.stats.ox.ac.uk/pub/RWin";
  > available.packages(URL, method="libcurl")
Warning: unable to access index for repository
http://www.stats.ox.ac.uk/pub/RWin:
Line starting '
  > available.packages(URL, method="internal")
Warning: unable to access index for repository
http://www.stats.ox.ac.uk/pub/RWin:
cannot open URL 'http://www.stats.ox.ac.uk/pub/RWin/PACKAGES'
   Package Version Priority Depends Imports LinkingTo Suggests Enhances
   License License_is_FOSS License_restricts_use OS_type Archs MD5sum
   NeedsCompilation File Repository

Does that work for you / address the problem?

Martin

 >> sessionInfo()
 > R version 3.2.2 (2015-08-14)
 > Platform: x86_64-apple-darwin13.4.0 (64-bit)
 > Running under: OS X 10.10.4 (Yosemite)
 >
 > locale:
 > [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
 >
 > attached base packages:
 > [1] stats graphics  grDevices utils datasets  methods   base
 >
 > other attached packages:
 > [1] testthat_0.8.1.0.99  knitr_1.11   devtools_1.5.0.9001
 > [4] BiocInstaller_1.15.5
 >
 > loaded via a namespace (and not attached):
 >   [1] httr_1.0.0 R6_2.0.0.9000  tools_3.2.2parallel_3.2.2
whisker_0.3-2
 >   [6] RCurl_1

Re: [Rd] Issues with libcurl + HTTP status codes (eg. 403, 404)

2015-08-25 Thread Martin Morgan

On 08/25/2015 12:54 PM, Kevin Ushey wrote:

Hi all,

The following fails for me (on OS X, although I imagine it's the same
on other platforms using libcurl):

 options(download.file.method = "libcurl")
 options(repos = c(CRAN = "https://cran.rstudio.com/";, CRANextra =
"http://www.stats.ox.ac.uk/pub/RWin";))
 install.packages("lattice") ## could be any package

gives me:

 > options(download.file.method = "libcurl")
 > options(repos = c(CRAN = "https://cran.rstudio.com/";, CRANextra
= "http://www.stats.ox.ac.uk/pub/RWin";))
 > install.packages("lattice") ## coudl be any package
 Installing package into ‘/Users/kevinushey/Library/R/3.2/library’
 (as ‘lib’ is unspecified)
 Error: Line starting '
 > URL <- "http://www.stats.ox.ac.uk/pub/RWin"
 > available.packages(URL, method = "internal")
 Warning: unable to access index for repository
http://www.stats.ox.ac.uk/pub/RWin
  Package Version Priority Depends Imports LinkingTo Suggests
Enhances License License_is_FOSS
 License_restricts_use OS_type Archs MD5sum NeedsCompilation
File Repository
 > available.packages(URL, method = "libcurl")
 Error: Line starting ' 
download.file("http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/mavericks/contrib/3.2/PACKAGES.gz";,
tempfile(), method = "libcurl")
 trying URL 
'http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/mavericks/contrib/3.2/PACKAGES.gz'
 Content type 'text/html; charset=iso-8859-1' length 339 bytes
 ==
 downloaded 339 bytes

Using `method = "internal"` gives an error related to the inability to
access that URL due to the HTTP status 403.

The overarching issue here is that package installation shouldn't fail
even if libcurl fails to access one of the repositories set.



With

> R.version.string
[1] "R version 3.2.2 Patched (2015-08-25 r69179)"

the behavior is to warn with an indication of the repository for which the 
problem occurs


> URL <- "http://www.stats.ox.ac.uk/pub/RWin";
> available.packages(URL, method="libcurl")
Warning: unable to access index for repository 
http://www.stats.ox.ac.uk/pub/RWin:
  Line starting '
> available.packages(URL, method="internal")
Warning: unable to access index for repository 
http://www.stats.ox.ac.uk/pub/RWin:
  cannot open URL 'http://www.stats.ox.ac.uk/pub/RWin/PACKAGES'
 Package Version Priority Depends Imports LinkingTo Suggests Enhances
 License License_is_FOSS License_restricts_use OS_type Archs MD5sum
 NeedsCompilation File Repository

Does that work for you / address the problem?

Martin


sessionInfo()

R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.4 (Yosemite)

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] testthat_0.8.1.0.99  knitr_1.11   devtools_1.5.0.9001
[4] BiocInstaller_1.15.5

loaded via a namespace (and not attached):
  [1] httr_1.0.0 R6_2.0.0.9000  tools_3.2.2parallel_3.2.2 whisker_0.3-2
  [6] RCurl_1.95-4.1 memoise_0.2.1  stringr_0.6.2  digest_0.6.4   evaluate_0.7.2

Thanks,
Kevin

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] download.file() on ftp URL fails in windows with default download method

2015-08-16 Thread Martin Morgan
In r69089 (R-devel) and 69090 (R-3-2-branch) the "wininet" ftp download method 
tries EPSV / PASV first. Success requires that the client (user) be able to 
open outgoing unprivileged ports, which will usually be the case. Proxies and 
so on should be handled by the OS and virtualization layer. Reports to the 
contrary welcome... Martin
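
A minimal way to exercise the code paths from R (a sketch; the curl variant
mirrors Dan's command-line workaround quoted below, and "myhostname.com" is a
placeholder for the client's externally visible hostname or IP):

  url <- "ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/All/GCF_01405.13.assembly.txt"
  download.file(url, tempfile(), method = "wininet")  # patched passive-first path, Windows
  download.file(url, tempfile(), method = "curl",     # active mode via command-line curl
                extra = "--ftp-port myhostname.com")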

- Original Message -
> Hi David,
> 
> - Original Message -
> > From: "David Smith" 
> > To: "Dan Tenenbaum" , "Uwe Ligges"
> > , "Elliot Waingold"
> > 
> > Cc: "R-devel@r-project.org" 
> > Sent: Wednesday, August 12, 2015 12:42:39 PM
> > Subject: RE: [Rd] download.file() on ftp URL fails in windows with
> > default download method
> > 
> > We were also able to reproduce the issue on Windows Server 2012. If
> > there's anything we can do to help please let me know; Elliot
> > Waingold (CC'd here) can provide access to the VM we used for
> > testing if that's of any help.
> > 
> 
> Thanks!
> 
> I have just been looking at this issue with Martin Morgan. We found
> that if we "or" the additional flag INTERNET_FLAG_PASSIVE on line
> 1012 of src/modules/internet/internet.c (R-3.2 branch, last changed
> in r68393)
> that the ftp connection works.
> 
> Further investigation reveals that in a passive ftp connection,
> certain ports on the client need to be open.
> This machine is in the Amazon cloud so it was easy to open the ports.
> But we still have a problem and I believe it's that the wrong IP
> address is being sent to the server (on an AWS machine, the machine
> thinks of itself as having one IP address, but that is a private
> address that is valid inside AWS only).
> 
> Here's a curl command line that gets around this by sending the
> correct address (or hostname):
> 
> curl --ftp-port myhostname.com
> ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/All/GCF_01405.13.assembly.txt
> 
> Curl normally uses passive mode which is why it works, but the
> --ftp-port switch tells it to use active mode with the specified ip
> address or hostname.
> 
> So I'm not sure where we go from here. One easy fix is just to add
> the INTERNET_FLAG_PASSIVE flag as described above. Another would be
> to first check if active mode works, and if not, use passive mode.
> 
> Dan
> 
> 
> > # David Smith
> > 
> > --
> > David M Smith 
> > R Community Lead, Revolution Analytics (a Microsoft company)
> > Tel: +1 (312) 9205766 (Chicago IL, USA)
> > Twitter: @revodavid | Blog:  http://blog.revolutionanalytics.com
> > We are hiring engineers for Revolution R and Azure Machine
> > Learning.
> > 
> > -Original Message-
> > From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of
> > Dan
> > Tenenbaum
> > Sent: Tuesday, August 11, 2015 09:51
> > To: Uwe Ligges 
> > Cc: R-devel@r-project.org
> > Subject: Re: [Rd] download.file() on ftp URL fails in windows with
> > default download method
> > 
> > 
> > 
> > - Original Message -
> > > From: "Dan Tenenbaum" 
> > > To: "Uwe Ligges" 
> > > Cc: "R-devel@r-project.org" 
> > > Sent: Saturday, August 8, 2015 4:02:54 PM
> > > Subject: Re: [Rd] download.file() on ftp URL fails in windows
> > > with
> > > default download method
> > > 
> > > 
> > > 
> > > - Original Message -
> > > > From: "Uwe Ligges" 
> > > > To: "Dan Tenenbaum" ,
> > > > "R-devel@r-project.org" 
> > > > Sent: Saturday, August 8, 2015 3:57:34 PM
> > > > Subject: Re: [Rd] download.file() on ftp URL fails in windows
> > > > with
> > > > default download method
> > > > 
> > > > 
> > > > 
> > > > On 08.08.2015 01:11, Dan Tenenbaum wrote:
> > > > > Hi,
> > > > >
> > > > >> url <-
> > > > >> "ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/All/GCF_01405.13.assembly.txt";
> > > > >> download.file(url, tempfile())
> > > > > trying URL
> > > > > 'ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/All/GCF_01405.13.assembly.txt'
> > > > > Error in download.file(url, tempfile()) :
> > > > >cannot open URL
> > > > >
> > > > > 'ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/All/GCF_01405.13.assembly.txt'
> > &g

Re: [Rd] List S3 methods and defining packages

2015-07-07 Thread Martin Morgan

On 07/07/2015 02:05 AM, Renaud Gaujoux wrote:

Hi,

from the man page ?methods, I expected to be able to build pairs
(class,package) for a given S3 method, e.g., print, using

attr(methods(print), 'info').

However all the methods, except the ones defined in base or S4
methods, get the 'from' value "registered S3method for print", instead
of the actual package name (see below for the first rows).

Is this normal behaviour? If so, is there a way to get what I want: a
character vector mapping class to package (ideally in loading order,
but this I can re-order from search()).


It's the way it has always been, so normal in that sense.

There could be two meanings of 'from' -- the namespace in which the generic to 
which the method belongs is defined, and the namespace in which the method is 
defined. I think the former is what you're interested in, but the latter likely 
what methods() might be modified return.


For your use case, maybe something like

.S3methodsInNamespace <- function(envir, pattern) {
mtable <- get(".__S3MethodsTable__.", envir = asNamespace(envir))
methods <- ls(mtable, pattern = pattern)
env <- vapply(methods, function(x) {
environmentName(environment(get(x, mtable)))
}, character(1))
setNames(names(env), unname(env))
}


followed by

  nmspc = loadedNamespaces()
  lapply(setNames(nmspc, nmspc), .S3methodsInNamespace, "^plot.")

which reveals the different meanings of 'from', e.g.,

> lapply(setNames(nmspc, nmspc), .S3methodsInNamespace, "^plot.")["graphics"]
$graphics
               stats             graphics                stats
          "plot.acf"    "plot.data.frame" "plot.decomposed.ts"
            graphics               stats                stats
      "plot.default"    "plot.dendrogram"       "plot.density"
               stats             graphics             graphics
         "plot.ecdf"        "plot.factor"       "plot.formula"
            graphics               stats             graphics
     "plot.function"        "plot.hclust"     "plot.histogram"
               stats                stats                stats
  "plot.HoltWinters"        "plot.isoreg"            "plot.lm"
               stats                stats                stats
    "plot.medpolish"           "plot.mlm"           "plot.ppr"
               stats                stats                stats
       "plot.prcomp"      "plot.princomp"   "plot.profile.nls"
            graphics               stats                stats
       "plot.raster"          "plot.spec"       "plot.stepfun"
               stats             graphics                stats
          "plot.stl"         "plot.table"            "plot.ts"
               stats                stats
     "plot.tskernel"      "plot.TukeyHSD"

Also this is for loaded, rather than attached, namespaces.

Martin Morgan


Thank you.

Bests,
Renaud

  visible
from generic  isS4
print.abbrev   FALSE registered
S3method for print   print FALSE
print.acf  FALSE registered
S3method for print   print FALSE
print.AES  FALSE registered
S3method for print   print FALSE
print.agnesFALSE registered
S3method for print   print FALSE
print.anovaFALSE registered
S3method for print   print FALSE
print.AnovaFALSE registered
S3method for print   print FALSE
print.anova.loglm  FALSE registered
S3method for print   print FALSE
print,ANY-methodTRUE
base   print  TRUE
print.aov  FALSE registered
S3method for print   print FALSE

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] S4 inheritance and old class

2015-05-28 Thread Martin Morgan

On 05/28/2015 02:49 AM, Julien Idé wrote:

Hey everyone,

I would like to develop a package using S4 classes.
I have to define several S4 classes that inherit from each other as
follows:

# A <- B <- C <- D

I would also like to define .DollarNames methods for these classes so, if I
have understood correctly, I also have to define an old class as follows:

# AOld <- A <- B <- C <- D

setOldClass(Classes = "AOld")

setClass(
   Class = "A",
   contains = "AOld",
   slots = list(A = "character")
)

.DollarNames.A <- function(x, pattern)
   grep(pattern, slotNames(x), value = TRUE)


Instead of setOldClass, define a $ method on A

setMethod("$", "A", function(x, name) slot(x, name))

And then

  a = new("A")
  a$
  d = new("D")
  d$
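
If tab completion on `$` is also wanted, the .DollarNames S3 method from your
example can sit alongside the "$" method, without setOldClass (a sketch):

   .DollarNames.A <- function(x, pattern = "")
       grep(pattern, slotNames(x), value = TRUE)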

I don't know about the setOldClass problem; it seems like a bug.

Martin Morgan



setClass(
   Class = "B",
   contains = "A",
   slots = list(B = "character"),
   validity = function(object){
 cat("Testing an object of class '", class(object),
 "'' with valitity function of class 'B'", sep = "")
 cat("Validity test for class 'B': ", object@A, sep = "")
 return(TRUE)
   }
)

setClass(
   Class = "C",
   contains = c("B"),
   slots = list(C = "character"),
   validity = function(object){
 cat("Testing an object of class '", class(object),
 "'' with valitity function of class 'C'", sep = "")
 cat("Validity test for class 'C': ", object@A, sep = "")
 return(TRUE)
   }
)

setClass(
   Class = "D",
   contains = "C",
   slots = list(D = "character"),
   validity = function(object){
 cat("Testing an object of class '", class(object),
 "'' with valitity function of class 'D'", sep = "")
 cat("Validity test for class 'D': ", object@A, sep = "")
 return(TRUE)
   }
)

My problem is that when I try to create an object of class "D" and test its
validity

validObject(new("D"))

it seems that at some point the object is coerced to an object of class
"AOld" and tested by the validity function of class "B". What am I missing
here?

Julien

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] example fails during R CMD CHECK but works interactively?

2015-05-15 Thread Martin Morgan

On 05/15/2015 05:05 AM, Charles Determan wrote:

Does anyone else have any thoughts about troubleshooting the R CMD check
environment?


In the pkg.Rcheck directory there is a file pkg-Ex.R.

LANGUAGE=en _R_CHECK_INTERNALS2_=1 $(R_HOME)/bin/R --vanilla < pkg-Ex.R

followed by the usual strategy of bisecting the file into smaller chunks that 
still reproduce the example.


(this is based on my parsing of the complicated source, most relevant at

  https://github.com/wch/r-source/blob/trunk/src/library/tools/R/check.R#L2467

and

  https://github.com/wch/r-source/blob/trunk/src/library/tools/R/check.R#L36

)

Martin



Charles

On Wed, May 13, 2015 at 1:57 PM, Charles Determan 
wrote:


Thank you Dan but it isn't my tests that are failing (all of them pass
without problem) but one of the examples from the inst/examples directory.
I did try, however, to start R with the environmental variables as you
suggest but it had no effect on my tests.

Charles

On Wed, May 13, 2015 at 1:51 PM, Dan Tenenbaum 
wrote:




- Original Message -

From: "Charles Determan" 
To: r-devel@r-project.org
Sent: Wednesday, May 13, 2015 11:31:36 AM
Subject: [Rd] example fails during R CMD CHECK but works interactively?

Greetings,

I am collaborating on developing the bigmemory package and have run into
a strange problem when we run R CMD CHECK.  For some reason that
isn't
clear to us one of the examples crashes stating:

Error:  memory could not be allocated for instance of type big.matrix

You can see the output on the Travis CI page at
https://travis-ci.org/kaneplusplus/bigmemory where the error starts
at line
1035.  This is completely reproducible when running
devtools::check(args='--as-cran') locally.  The part that is
confusing is
that the calls work perfectly when called interactively.

Hadley comments on the 'check' page of his R packages website (
http://r-pkgs.had.co.nz/check.html) regarding test failing following
R CMD
check:

Occasionally you may have a problem where the tests pass when run
interactively with devtools::test(), but fail when in R CMD check.
This
usually indicates that you’ve made a faulty assumption about the
testing
environment, and it’s often hard to figure it out.

Any thoughts on how to troubleshoot this problem?  I have no idea
what
assumption we could have made.


Note that R CMD check runs R with environment variables set as follows
(at least on my system; you can check $R_HOME/bin/check to see what it does
on yours):

  R_DEFAULT_PACKAGES= LC_COLLATE=C

So try starting R like this:

  R_DEFAULT_PACKAGES= LC_COLLATE=C  R

And see if that reproduces the test failure. The locale setting could
affect tests of sort order, and the default package setting could
potentially affect other things.

Dan





Regards,
Charles

   [[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel








[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Creating a vignette which depends on a non-distributable file

2015-05-14 Thread Martin Morgan

On 05/14/2015 04:33 PM, Henrik Bengtsson wrote:

On May 14, 2015 15:04, "January Weiner"  wrote:


Dear all,

I am writing a vignette that requires a file which I am not allowed to
distribute, but which the user can easily download manually. Moreover, it
is not possible to download this file automatically from R: downloading
requires a (free) registration that seems to work only through a browser.
(I'm talking here about the MSigDB from the Broad Institute,
http://www.broadinstitute.org/gsea/msigdb/index.jsp).

In the vignette, I tell the user to download the file and then show how it
can be parsed and used in R. Thus, I can compile the vignette only if this
file is present in the vignettes/ directory of the package. However, it
would then get included in the package -- which I am not allowed to do.

What should I do?

(1) finding an alternative to MSigDB is not a solution -- there simply is
no alternative.
(2) I could enter the code (and the results) in a verbatim environment
instead of using Sweave. This has obvious drawbacks (for one thing, it
would look incosistent).


use the chunk argument eval=FALSE instead of placing the code in a verbatim 
environment. See ?RweaveLatex if you're compiling a PDF vignette from Rnw or the 
knitr documentation for (much nicer for users of your vignette, in my opinion) 
Rmd vignettes processed to HTML.


A common pattern is to process chunks 1, 2, 3, 4, and then there is a 'leap of 
faith' in chunk 5 (with eval=FALSE) and a second chunk (maybe with echo=FALSE, 
eval=TRUE) that reads the _result_ that would have been produced by chunk 5 from 
a serialized instance into the R session for processing in chunks 6, 7, 8...
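
A minimal sketch of that pattern with Sweave/Rnw chunks (the helper function,
file names, and package name are placeholders; the same eval/echo options work
for knitr chunks in an Rmd vignette):

  <<parse-msigdb, eval=FALSE>>=
  ## shown to the reader, never run on the build machine; the user is told
  ## how to download msigdb.xml themselves
  sets <- parseMsigdb("msigdb.xml")
  @

  <<load-precomputed, echo=FALSE>>=
  ## run at build time instead, from a small serialized result shipped
  ## with the package
  sets <- readRDS(system.file("extdata", "msigdb_subset.rds",
                              package = "myPkg"))
  @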


Also very often while it might make sense to analyse an entire data set as part 
of a typical work flow, for illustrative purposes a much smaller subset or 
simulated data might be relevant; again a strategy would be to illustrate the 
problematic steps with simulated data, and then resume the narrative with the 
analyzed full data.


A secondary consideration may be that if your package _requires_ MSigDB to 
function, then it can't be automatically tested by repository build machines -- 
you'll want to have unit tests or other approaches to ensure that 'bit rot' does 
not set in without you being aware of it.


If this is a Bioconductor package, then it's appropriate to ask on the 
Bioconductor devel mailing list.


  http://bioconductor.org/developers/

http://bioconductor.org/packages/BiocStyle/ might be your friend for producing 
stylish vignettes.


Martin


(3) I could build vignette outside of the package and put it into the
inst/doc directory. This also has obvious drawbacks.
(4) Leaving this example out defies the purpose of my package.

I am tending towards solution (2). What do you think?


Not clear how big of a static piece you're taking about, but maybe you
could set it up such that you use (2) as a fallback, i.e. have the vignette
include a static/pre-generated piece (which is clearly marked as such) only
if the external dependency is not available.

Just a thought

Henrik



Kind regards,

j.



--
 January Weiner --

 [[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] S4 method dispatch sometimes leads to incorrect when object loaded from file?

2015-05-12 Thread Martin Morgan

On 05/12/2015 05:31 AM, Martin Maechler wrote:

Martin Morgan 
 on Mon, 11 May 2015 10:18:07 -0700 writes:


 > On 05/10/2015 08:19 AM, Martin Morgan wrote:
 >> Loading an S4 object from a file without first loading the library 
sometimes (?,
 >> the example below and actual example involves a virtual base class and 
the show
 >> generic) leads to incorrect dispatch (to the base class method).

"Of course", this is not as desired.

Other code automatically does try and typically succeed to load the package
(yes "package" ! ;-)) when 'needed', right,  so  show() is an
exception here, no ?


I added dim() methods, which also misbehave (differently)

  setMethod("dim", "A", function(x) "A-dim")
  setMethod("dim", "B", function(x) "B-dim")

~/tmp$ R --vanilla --slave -e "load('b.Rda'); dim(b)"
Loading required package: PkgA
NULL
~/tmp$ R --vanilla --slave -e "require('PkgA'); load('b.Rda'); dim(b)"
[1] "B-dim"

but sort of auto-heal (versus show, which is corrupted)

~/tmp$ R --vanilla --slave -e "load('b.Rda'); dim(b); dim(b)"
Loading required package: PkgA
NULL
[1] "B-dim"
~/tmp$ R --vanilla --slave -e "load('b.Rda'); b; b"
Loading required package: PkgA
A
A




 >> The attached package reproduces the problem. It has

 > The package was attached but stripped; a version is at

 > https://github.com/mtmorgan/PkgA

 > FWIW the sent mail was a multi-part MIME with the header on the package 
part

 > Content-Type: application/gzip;
 > name="PkgA.tar.gz"
 > Content-Transfer-Encoding: base64
 > Content-Disposition: attachment;
 > filename="PkgA.tar.gz"

 > From http://www.r-project.org/mail.html#instructions "we allow 
application/pdf,
 > application/postscript, and image/png (and x-tar and gzip on R-devel)" 
so I
 > thought that this mime type would not be stripped?

You were alright in your assumptions -- but unfortunately, the
accepted type has been  application/x-gzip instead of .../gzip.
I now *have* added the 2nd one as well.

Sorry for that.
The other Martin M..

 > Martin Morgan

 >>
 >> setClass("A")
 >> setClass("B", contains="A")
 >> setMethod("show", "A", function(object) cat("A\n"))
 >> setMethod("show", "B", function(object) cat("B\n"))
 >>
 >> with NAMESPACE
 >>
 >> import(methods)
 >> exportClasses(A, B)
 >> exportMethods(show)
 >>
 >> This creates the object and illustrated expected behavior
 >>
 >> ~/tmp$ R --vanilla --slave -e "library(PkgA); b = new('B'); save(b,
 >> file='b.Rda'); b"
 >> B
 >>
 >> Loading PkgA before the object leads to correct dispatch
 >>
 >> ~/tmp$ R --vanilla --slave -e "library(PkgA); load(file='b.Rda'); b"
 >> B
 >>
 >> but loading the object without first loading PkgA leads to dispatch to
 >> show,A-method.
 >>
 >> ~/tmp$ R --vanilla --slave -e "load(file='b.Rda'); b"
 >> Loading required package: PkgA
 >> A
 >>
 >> Martin Morgan


 > --
 > Computational Biology / Fred Hutchinson Cancer Research Center
 > 1100 Fairview Ave. N.
 > PO Box 19024 Seattle, WA 98109

 > Location: Arnold Building M1 B861
 > Phone: (206) 667-2793

 > __
 > R-devel@r-project.org mailing list
 > https://stat.ethz.ch/mailman/listinfo/r-devel




--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] S4 method dispatch sometimes leads to incorrect when object loaded from file?

2015-05-11 Thread Martin Morgan

On 05/10/2015 08:19 AM, Martin Morgan wrote:

Loading an S4 object from a file without first loading the library sometimes (?,
the example below and actual example involves a virtual base class and the show
generic) leads to incorrect dispatch (to the base class method).

The attached package reproduces the problem. It has


The package was attached but stripped; a version is at

  https://github.com/mtmorgan/PkgA

FWIW the sent mail was a multi-part MIME with the header on the package part

Content-Type: application/gzip;
 name="PkgA.tar.gz"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="PkgA.tar.gz"

From http://www.r-project.org/mail.html#instructions "we allow application/pdf, 
application/postscript, and image/png (and x-tar and gzip on R-devel)" so I 
thought that this mime type would not be stripped?


Martin Morgan



setClass("A")
setClass("B", contains="A")
setMethod("show", "A", function(object) cat("A\n"))
setMethod("show", "B", function(object) cat("B\n"))

with NAMESPACE

import(methods)
exportClasses(A, B)
exportMethods(show)

This creates the object and illustrated expected behavior

   ~/tmp$ R --vanilla --slave -e "library(PkgA); b = new('B'); save(b,
file='b.Rda'); b"
   B

Loading PkgA before the object leads to correct dispatch

   ~/tmp$ R --vanilla --slave -e "library(PkgA); load(file='b.Rda'); b"
   B

but loading the object without first loading PkgA leads to dispatch to
show,A-method.

   ~/tmp$ R --vanilla --slave -e "load(file='b.Rda'); b"
   Loading required package: PkgA
   A

Martin Morgan



--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] S4 method dispatch sometimes leads to incorrect when object loaded from file?

2015-05-10 Thread Martin Morgan
Loading an S4 object from a file without first loading the library sometimes (?, 
the example below and actual example involves a virtual base class and the show 
generic) leads to incorrect dispatch (to the base class method).


The attached package reproduces the problem. It has

setClass("A")
setClass("B", contains="A")
setMethod("show", "A", function(object) cat("A\n"))
setMethod("show", "B", function(object) cat("B\n"))

with NAMESPACE

import(methods)
exportClasses(A, B)
exportMethods(show)

This creates the object and illustrated expected behavior

  ~/tmp$ R --vanilla --slave -e "library(PkgA); b = new('B'); save(b, 
file='b.Rda'); b"

  B

Loading PkgA before the object leads to correct dispatch

  ~/tmp$ R --vanilla --slave -e "library(PkgA); load(file='b.Rda'); b"
  B

but loading the object without first loading PkgA leads to dispatch to 
show,A-method.


  ~/tmp$ R --vanilla --slave -e "load(file='b.Rda'); b"
  Loading required package: PkgA
  A

Martin Morgan
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R CMD check and missing imports from base packages

2015-04-29 Thread Martin Morgan

On 04/28/2015 01:04 PM, Gábor Csárdi wrote:

When a symbol in a package is resolved, R looks into the package's
environment, and then into the package's imports environment. Then, if the
symbol is still not resolved, it looks into the base package. So far so
good.

If still not found, it follows the 'search()' path, starting with the
global environment and then all attached packages, finishing with base and
recommended packages.

This can be a problem if a package uses a function from a base package, but
it does not formally import it via the NAMESPACE file. If another package
on the search path also defines a function with the same name, then this
second function will be called.

E.g. if package 'ggplot2' uses 'stats::density()', and package 'igraph'
also defines 'density()', and 'igraph' is on the search path, then
'ggplot2' will call 'igraph::density()' instead of 'stats::density()'.


stats::density() is an S3 generic, so igraph would define an S3 method, right? 
And in general a developer would avoid masking a function in a base package, so 
as not to require the user to distinguish between stats::density() and 
igraph::density(). Maybe the example is not meant literally.


Being able to easily flag non-imported, non-base symbols would definitely 
improve the robustness of package code, even if not helping the end user 
disambiguate duplicate symbols.
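
For completeness, the explicit import that sidesteps the search-path lookup is
a one-line NAMESPACE directive (sketch):

  importFrom(stats, density)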


Martin Morgan



I think that for a better solution, either
1) the search path should not be used at all to resolve symbols in
packages, or
2) only base packages should be searched.

I realize that this is something that is not easy to change, especially 1)
would break a lot of packages. But maybe at least 'R CMD check' could
report these cases. Currently it reports missing imports for non-base
packages only. Is it reasonable to have a NOTE for missing imports from
base packages as well?

[As usual, please fix me if I am missing or misunderstood something.]

Thank you, Best,
Gabor

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-devel does not update the C++ returned variables

2015-03-02 Thread Martin Morgan

On 03/02/2015 11:39 AM, Dirk Eddelbuettel wrote:


On 2 March 2015 at 16:37, Martin Maechler wrote:
|
| > On 2 March 2015 at 09:09, Duncan Murdoch wrote:
| > | I generally recommend that people use Rcpp, which hides a lot of the
| > | details.  It will generate your .Call calls for you, and generate the
| > | C++ code that receives them; you just need to think about the real
| > | problem, not the interface.  It has its own learning curve, but I think
| > | it is easier than using the low-level code that you need to work with 
.Call.
|
| > Thanks for that vote, and I second that.
|
| > And these days the learning is a lot flatter than it was a decade ago:
|
| > R> Rcpp::cppFunction("NumericVector doubleThis(NumericVector x) { return(2*x); 
}")
| > R> doubleThis(c(1,2,3,21,-4))
| > [1]  2  4  6 42 -8
| > R>
|
| > That defined, compiled, loaded and run/illustrated a simple function.
|
| > Dirk
|
| Indeed impressive,  ... and it also works with integer vectors
| something also not 100% trivial when working with compiled code.
|
| When testing that, I've went a step further:

As you may know, int can be 'casted up' to double which is what happens
here.  So in what follows you _always_ create a copy from an int vector to a
numeric vector.

For pure int, use eg

 Rcpp::cppFunction("IntegerVector doubleThis(IntegeerVector x) { return(2*x); 
}")

and rename the function names as needed to have two defined concurrently.


avoiding duplication, harmless in the doubleThis() case, comes at some 
considerable hazard in general


> Rcpp::cppFunction("IntegerVector incrThisAndThat(IntegerVector x) { x[0] += 
1; return x; }")

> x = y = 1:5
> incrThisAndThat(x)
[1] 2 2 3 4 5
> x
[1] 2 2 3 4 5
> y
[1] 2 2 3 4 5

(how often this happens in the now relatively large number of user-contributed 
packages using Rcpp?). It seems like 'one-liners' should really encourage 
something safer (sometimes at the expense of 'speed'),


  Rcpp::cppFunction("IntegerVector doubleThis(const IntegerVector x) { return x 
* 2; }")


  Rcpp::cppFunction("std::vector incrThis(std::vector x) { x[0] += 1; 
return x; }")


or that Rcpp should become more careful (i.e., should not allow!) modifying 
arguments with NAMED != 0.
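
A sketch of the defensive idiom with Rcpp types -- an explicit clone() so the
caller's vector is never touched:

  Rcpp::cppFunction("IntegerVector incrCopy(IntegerVector x) {
      IntegerVector y = clone(x);   // deep copy; caller unaffected
      y[0] += 1;
      return y; }")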


Martin (Morgan)



Dirk

|
| ## now "test":
| require(microbenchmark)
| i <- 1:10
| (mb <- microbenchmark(doubleThis(i), i*2, 2*i, i*2L, 2L*i, i+i, times=2^12))
| ## Lynne (i7; FC 20), R Under development ... (2015-03-02 r67924):
| ## Unit: nanoseconds
| ##   expr min  lq  mean median   uq   max neval cld
| ##  doubleThis(i) 762 985 1319.5974   1124 1338 17831  4096   b
| ##  i * 2 124 151  258.4419164  221 4  4096  a
| ##  2 * i 127 154  266.4707169  216 20213  4096  a
| ## i * 2L 143 164  250.6057181  234 16863  4096  a
| ## 2L * i 144 177  269.5015193  237 16119  4096  a
| ##  i + i 152 183  272.6179199  243 10434  4096  a
|
| plot(mb, log="y", notch=TRUE)
| ## hmm, looks like even the simple arithm. differ slightly ...
| ##
| ## ==> zoom in:
| plot(mb, log="y", notch=TRUE, ylim = c(150,300))
|
| dev.copy(png, file="mbenchm-doubling.png")
| dev.off() # [ <- why do I need this here for png ??? ]
| ##--> see the appended *png graphic
|
| Those who've learnt EDA or otherwise about boxplot notches, will
| know that they provide somewhat informal but robust pairwise tests on
| approximate 5% level.
| From these, one *could* - possibly wrongly - conclude that
| 'i * 2' is significantly faster than both 'i * 2L' and also
| 'i + i'  which I find astonishing, given that  i is integer here...
|
| Probably no reason for deep thoughts here, but if someone is
| enticed, this maybe slightly interesting to read.
|
| Martin Maechler, ETH Zurich
|
| [DELETED ATTACHMENT mbenchm-doubling.png, PNG image]




--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] vapply definition question

2014-12-16 Thread Martin Morgan

On 12/16/2014 08:20 PM, Mick Jordan wrote:

vapply <- function(X, FUN, FUN.VALUE, ...,  USE.NAMES = TRUE)
{
 FUN <- match.fun(FUN)
 if(!is.vector(X) || is.object(X)) X <- as.list(X)
 .Internal(vapply(X, FUN, FUN.VALUE, USE.NAMES))
}

This is an implementor question. Basically, what happened to the '...' args in
the call to the .Internal? cf lapply:, where the ... is passed.

lapply <- function (X, FUN, ...)
{
 FUN <- match.fun(FUN)
 ## internal code handles all vector types, including expressions
 ## However, it would be OK to have attributes which is.vector
 ## disallows.
 if(!is.vector(X) || is.object(X)) X <- as.list(X)
 ##TODO
 ## Note ... is not passed down.  Rather the internal code
 ## evaluates FUN(X[i], ...) in the frame of this function
 .Internal(lapply(X, FUN, ...))
}

Now both of these functions work when extra arguments are passed, so evidently
the implementation can function whether the .Internal "call" contains the ... or
not. I found other cases, notably in S3 generic methods where the ... is not
passed down.


Hi Mick --

You can see that the source code doesn't contain '...' in the final line

~/src/R-devel/src/library/base/R$ svn annotate lapply.R | grep Internal\(l
 38631 ripley .Internal(lapply(X, FUN))

and that it's been there for a long time (I'd guess 'forever')

~/src/R-devel/src/library/base/R$ svn log -r38631

r38631 | ripley | 2006-07-17 14:30:55 -0700 (Mon, 17 Jul 2006) | 2 lines

another attempt at a faster lapply



so I guess you're looking at a modified version of the function... The 
implementation detail is in the comment -- FUN(X[i], ...) is evaluated in the 
frame of lapply.
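
A quick check that ... still reaches FUN even though it is not part of the
.Internal() call (a sketch):

  vapply(1:3, function(x, k) x + k, numeric(1), k = 10)
  ## [1] 11 12 13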


Martin Morgan



So, essentially, my question is whether the vapply code "should" be changed or
whether a .Internal implementation should always assume an implicit ...
regardless of the code, if the semantics requires it.

Thanks
Mick



Re: [Rd] R string comparisons may vary with platform (plain text)

2014-11-23 Thread Martin Morgan


For many scientific applications one is really dealing with ASCII characters and 
LC_COLLATE="C", even if the user is running in non-C locales. What robust 
approaches (if any?) are available to write code that sorts in a 
locale-independent way? The Note in ?Sys.setlocale is not overly optimistic 
about setting the locale within a session.
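One sketch of the kind of workaround I have in mind, with the caveat from
that Note clearly applying (with_c_collate() is just an illustrative helper,
not an established idiom):

    with_c_collate <- function(expr) {
        old <- Sys.getlocale("LC_COLLATE")
        on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)
        Sys.setlocale("LC_COLLATE", "C")
        expr                               # forced only after the switch
    }

    x <- c("a", "B", "A", "b")
    sort(x)                                # locale-dependent, e.g. "a" "A" "b" "B"
    with_c_collate(sort(x))                # byte order: "A" "B" "a" "b"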


Martin Morgan

On 11/23/2014 03:44 AM, Prof Brian Ripley wrote:

On 23/11/2014 09:39, peter dalgaard wrote:



On 23 Nov 2014, at 01:05 , Henrik Bengtsson  wrote:

On Sat, Nov 22, 2014 at 12:42 PM, Duncan Murdoch
 wrote:

On 22/11/2014, 2:59 PM, Stuart Ambler wrote:

A colleague's R program behaved differently when I ran it, and we thought
we traced it probably to different results from string comparisons as
below, with different R versions.  However the platforms also differed.  A
friend ran it on a few machines and found that the comparison behavior
didn't correlate with R version, but rather with platform.

I wonder if you've seen this.  If it's not some setting I'm unaware of,
maybe someone should look into it.  Sorry I haven't taken the time to read
the source code myself.


Looks like a collation order issue.  See ?Comparison.


With the oddity that both platforms use what look like similar locales:

LC_COLLATE=en_US.UTF-8
LC_COLLATE=en_US.utf8


It's the sort of thing that I've tried to wrap my mind around multiple times
and failed, but have a look at

http://stackoverflow.com/questions/19967555/postgres-collation-differences-osx-v-ubuntu


which seems to be essentially the same issue, just for Postgres. If you have
the stamina, also look into the python question that it links to.

As I understand it, there are two potential reasons: Either the two platforms
are not using the same collation table for en_US, or at least one of them is
not fully implementing the Unicode Collation Algorithm.


And I have seen both with R.  At the very least, check if ICU is being used
(capabilities("ICU") in current R, maybe not in some of the obsolete versions
seen in this thread).

As a further possibility, there are choices in the UCA (in R, see
?icuSetCollate) and ICU can be compiled with different default choices.  It is
not clear to me what (if any) difference ICU versions make, but in R-devel
extSoftVersion() reports that.



In general, collation is a minefield: Some languages have the same letters in
different order (e.g. Estonian with Z between S and T); accented characters
sort with the unaccented counterpart in some languages but as separate
characters in others; some locales sort ABab, others AaBb, yet others aAbB;
sometimes punctuation is ignored, sometimes not; sometimes multiple characters
count as one, etc.


As ?Comparison has long said.







Re: [Rd] Changing style for the Sweave vignettes

2014-11-13 Thread Martin Morgan

On 11/13/2014 03:09 AM, January Weiner wrote:

As a user, I am always annoyed beyond measure that Sweave vignettes
precede the code with a command line prompt. It makes running examples
by simply copying the commands from the vignette to the console a
pain. I know the idea is to make clear what is a command and what is
output, but I'd rather mark the output with some kind of prefix
instead.

Is there any other solution possible / allowed in vignettes? I would
much prefer to make my vignettes easier to use for people like me.


Vignettes do not need to be generated by Sweave, nor rendered as PDF documents. My current
favorite (e.g., recent course material at 
http://bioconductor.org/help/course-materials/ which uses styling from the 
BiocStyle package 
http://bioconductor.org/packages/release/bioc/html/BiocStyle.html) uses the 
knitr package (see http://yihui.name/knitr/) to produce HTML vignettes (knitr 
will also process Rnw files to pdf with perhaps more appealing styling, see, 
e.g.,  http://bit.ly/117OLVl for an example of PDF output).


The mechanics are discussed in Writing R Extensions (RShowDoc('R-exts')), 
section 1.4.2 Non-Sweave vignettes. There are three steps involved: specifying a 
\VignetteEngine in the vignette itself, specifying VignetteBuilder: field in the 
DESCRIPTION file, and including the package providing the engine (knitr, in my 
case) in the Suggests: field of the DESCRIPTION file.
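A minimal sketch of those three pieces for an HTML vignette (file name and
index entry are placeholders):

    In vignettes/intro.Rmd:

        <!--
        %\VignetteEngine{knitr::knitr}
        %\VignetteIndexEntry{An Introduction}
        -->

    In DESCRIPTION:

        VignetteBuilder: knitr
        Suggests: knitr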


Brian mentioned processing the vignette to its underlying code; see
?browseVignettes and ?vignette for installed packages, and ?Stangle in R and R 
CMD Stangle for extracting the R code from stand-alone vignettes to .R files.


Martin Morgan



Kind regards,

j.






Re: [Rd] How to maintain memory in R extension

2014-11-12 Thread Martin Morgan

On 11/12/2014 05:36 AM, Zheng Da wrote:

Hello,

I wrote a system to perform data analysis in C++. Now I am integrating
it to R. I need to allocate memory for my own C++ data structures,
which can't be represented by any R data structures. I create a global
hashtable to keep a reference to the C++ data structures. Whenever I
allocate one, I register it in the hashtable and return its key to the
R code. So later on, the R code can access the C++ data structures
with their keys.

The problem is how to perform garbage collection on the C++ data
structures. Once an R object that contains the key is garbage
collected, the R code can no longer access the corresponding C++ data
structure, so I need to deallocate it. Is there any way that the C++
code can get notification when an R object gets garbage collected? If
not, what is the usual way to manage memory in R extensions?


register a finalizer that runs when there are no longer references to the R 
object, see ?reg.finalizer or the interface to R and C finalizers in 
Rinternals.h. If you return more than one reference to a key, then of course 
you'll have to manage these in your own C++ code.
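For what it's worth, a pure-R sketch of the shape of this (at the C level one
would attach the finalizer to an external pointer with R_RegisterCFinalizerEx;
make_handle() and the key-based lookup are only illustrative):

    make_handle <- function(key) {
        h <- new.env(parent = emptyenv())
        h$key <- key
        reg.finalizer(h, function(e) {
            ## here the package would call into C++ to free the structure for e$key
            message("releasing C++ object with key ", e$key)
        }, onexit = TRUE)
        h
    }

    h <- make_handle(42L)
    rm(h); invisible(gc())                 # finalizer runs once the handle is unreachable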


Martin Morgan



Thanks,
Da

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel






[Rd] package vignettes build in the same R process?

2014-11-01 Thread Martin Morgan
If I understand correctly, all vignettes in a package are built in the same R 
process. Global options, loaded packages, etc., in an earlier vignette persist 
in later vignettes. This can introduce user confusion (e.g., when a later 
vignette builds successfully because a package is require()'ed in an earlier 
vignette, but not the current one), difficult-to-identify bugs (e.g., when
a setting in an earlier vignette influences calculation in a later vignette),
and misleading information about reproducibility (e.g., when the sessionInfo() 
of a later vignette reflects packages used in earlier vignettes).


I believe the relevant code is at

src/library/tools/R/Vignettes.R:505

    output <- tryCatch({
        ## FIXME: run this in a separate process
        engine$weave(file, quiet = quiet)
        setwd(startdir)
        find_vignette_product(name, by = "weave", engine = engine)
    }, error = function(e) {
        stop(gettextf("processing vignette '%s' failed with diagnostics:\n%s",
                      file, conditionMessage(e)),
             domain = NA, call. = FALSE)
    })

Is building each vignette in a separate process a reasonable feature request?
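Roughly, something like the following is what the FIXME seems to anticipate
(a sketch only; weave_in_fresh_process() is a made-up helper, and a
knitr-based vignette would substitute the engine's weave function for
Sweave()):

    weave_in_fresh_process <- function(file) {
        code <- sprintf("Sweave(%s)", deparse(file))
        system2(file.path(R.home("bin"), "Rscript"),
                c("--vanilla", "-e", shQuote(code)))
    }

    weave_in_fresh_process("myVignette.Rnw")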

Martin


Re: [Rd] Options that are local to the package that sets them

2014-10-31 Thread Martin Morgan

On 10/31/2014 05:55 PM, Gábor Csárdi wrote:

On Fri, Oct 31, 2014 at 8:16 PM, William Dunlap  wrote:

You can put the following 3 objects, an environment and 2 functions
that access it, in any package that needs some package-specific
storage (say your pkgB1 and pkgB2).
.pkgLocalStorage <- new.env(parent = emptyenv())
assignInPkgLocalStorage <- function(name, object) {
.pkgLocalStorage[[name]] <- object
}
getFromPkgLocalStorage <- function(name) {
.pkgLocalStorage[[name]]
}
Leave the environment private and export the functions.  Then a user can
use them as
pkgB1::assignInPkgLocalStorage("myPallete", makeAPallete(1,2,3))
pkgB2::assignInPkgLocalStorage("myPallete", makeAPallete(5,6,7))
pkgB1::getFromPkgLocalStorage("myPallete") # get the 1,2,3 pallete


I am trying to avoid requiring pkgBn to do this kind of magic. I just
want it to call function(s) from pkgA. But maybe something like this
would work. In pkgBn:

my_palettes <- pkgA::palette_factory()

and my_palettes is a function or an environment that has the API
functions to modify my_palettes itself (via closure if it is a
function), e.g.

my_palettes$add_palette(...)
my_palettes$get_palette(...)

or if it is a function, then

my_palettes(add(...), ...)
my_palettes(get(...), ...)

etc.

This would work, right? I'll try it in a minute.
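A minimal sketch of this factory idea (function names hypothetical); each
calling package holds its own private store via a closure:

    palette_factory <- function() {
        store <- new.env(parent = emptyenv())
        list(
            add_palette = function(name, colors) assign(name, colors, envir = store),
            get_palette = function(name) get(name, envir = store, inherits = FALSE)
        )
    }

    ## in pkgB1:
    ## my_palettes <- pkgA::palette_factory()
    ## my_palettes$add_palette("foo", c("red", "green"))
    ## my_palettes$get_palette("foo")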


You'll need pkgA to know that pkgB1's invocation is to use pkgB1's
parameters, i.e., to couple state (parameters) with function, which is what
a class with methods provides. So a solution is to use an S4 or reference
class and generator to encapsulate state and dispatch to appropriate
functions, e.g.,


  .Plotter <- setRefClass("Plotter",
      fields=list(palette="character"),
      methods=list(
          update=function(palette) {
              .self$palette <- palette
          },
          plot=function(...) {
              graphics::plot(..., col=.self$palette)
          }))

  APlotter <- function(palette=c("red", "green", "blue"))
      .Plotter(palette=palette)

PkgB1, 2 would then

  plt = APlotter()
  plt$plot(mpg ~ disp, mtcars)
  plt$update(c("blue", "green"))
  plt$plot(mpg ~ disp, mtcars)

or

  .S4Plotter <- setClass("S4Plotter", representation(palette="character"))
  S4Plotter <- function(palette=c("red", "blue", "green"))
      .S4Plotter(palette=palette)
  s4plot <- function(x, ...) graphics::plot(..., col=x@palette)

(make s4plot a generic with method for class S4Plotter to enforce type).

Seems like this interface could be generated automatically in .onLoad() of pkgA, 
especially if adopting a naming convention of some sort.


Martin



Gabor



If only one of pkgB1 and pkgB2 is loaded you can leave off the pkgBn::.

A package writer can always leave off the pkgBn:: as well.

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Fri, Oct 31, 2014 at 4:34 PM, Gábor Csárdi  wrote:

Dear All,

I am trying to do the following, and could use some hints.

Suppose I have a package called pkgA. pkgA exposes an API that
includes setting some options, e.g. pkgA works with color palettes,
and the user of the package can define new palettes. pkgA provides an
API to manipulate these palettes, including defining them.

pkgA is intended to be used in other packages, e.g. in pkgB1 and
pkgB2. Now suppose pkgB1 and pkgB2 both set new palettes using pkgA.
They might set palettes with the same name; of course, they do not
know about each other.

My question is, is there a straightforward way to implement pkgA's
API, such that pkgB1 and pkgB2 do not interfere? In other words, if
pkgB1 and pkgB2 both define a palette 'foo', but they define it
differently, each should see her own version of it.

I guess this requires that I put something (a function?) in both
pkgB1's and pkgB2's package namespace. As I see it, this can only
happen when pkgA's API is called from pkgB1 (and pkgB2).

So at this time I could just walk up the call tree and put the palette
definition in the first environment that is not pkgA's. This looks
somewhat messy, and I am probably missing some caveats.

Is there a better way? I have a feeling that this is already supported
somehow, I just can't find out how.

Thanks, Best Regards,
Gabor



Re: [Rd] mpi.h errors on Mavericks packages

2014-10-03 Thread Martin Morgan

On 10/03/2014 04:58 PM, Martin Morgan wrote:

On 10/03/2014 04:17 PM, Daniel Fuka wrote:

Dear mac folks,

I have started porting a large legacy toolset, maintained on Windows
and heavily MPI-laden, so it can be used across platforms in R... so I
am building a package out of it. On this note, I am noticing that
almost all of the MPI-dependent packages do not compile on the CRAN
repositories, with the basic issue that it appears they cannot find
MPI installed:

configure: error: "Cannot find mpi.h header file"




sorry for the noise! you're after mpi and not openMP. Arrgh Martin


Hi Dan -- not a mac folk, or particularly expert on the subject, but have you
looked at section 1.2.1.1 of RShowDoc("R-exts")? The basic idea is

a) check for compiler support via a src/Makevars file that might be like

PKG_CFLAGS = $(SHLIB_OPENMP_CFLAGS)
PKG_LIBS = $(SHLIB_OPENMP_CFLAGS)

b) conditionally include mpi header files and execute mpi code with

#ifdef SUPPORT_OPENMP
#include <omp.h>
#endif

and similarly for #pragma's and other mpi-isms littered through your code?
Likely this gets quite tedious for projects making extensive use of openMP.

Martin




I do not see any chatter about mpi issues in the lists since the
inception of mavericks.. and possibly this question should go to
Simon.. but in case I missed a discussion, or if anyone has any
suggestions on how to proceed, or what might be missing from the Rmpi,
npRmpi, etc. packages for compilation on Mavericks, it would be
greatly appreciated if you could let me know.. and maybe I can help
fix the other packages as well.

Thanks for any help or pointers to guide me!
dan



Re: [Rd] mpi.h errors on Mavericks packages

2014-10-03 Thread Martin Morgan

On 10/03/2014 04:17 PM, Daniel Fuka wrote:

Dear mac folks,

I have started porting a large legacy toolset, maintained on Windows
and heavily MPI-laden, so it can be used across platforms in R... so I
am building a package out of it. On this note, I am noticing that
almost all of the MPI-dependent packages do not compile on the CRAN
repositories, with the basic issue that it appears they cannot find
MPI installed:

configure: error: "Cannot find mpi.h header file"


Hi Dan -- not a mac folk, or particularly expert on the subject, but have you 
looked at section 1.2.1.1 of RShowDoc("R-exts")? The basic idea is


a) check for compiler support via a src/Makevars file that might be like

PKG_CFLAGS = $(SHLIB_OPENMP_CFLAGS)
PKG_LIBS = $(SHLIB_OPENMP_CFLAGS)

b) conditionally include mpi header files and execute mpi code with

#ifdef SUPPORT_OPENMP
#include <omp.h>
#endif

and similarly for #pragma's and other mpi-isms littered through your code? 
Likely this gets quite tedious for projects making extensive use of openMP.


Martin




I do not see any chatter about mpi issues in the lists since the
inception of mavericks.. and possibly this question should go to
Simon.. but in case I missed a discussion, or if anyone has any
suggestions on how to proceed, or what might be missing from the Rmpi,
npRmpi, etc. packages for compilation on Mavericks, it would be
greatly appreciated if you could let me know.. and maybe I can help
fix the other packages as well.

Thanks for any help or pointers to guide me!
dan



[Rd] install.packages misleads about package availability?

2014-09-10 Thread Martin Morgan
In the context of installing a Bioconductor package using our biocLite() 
function, install.packages() warns


> install.packages("RUVSeq", repos="http://bioconductor.org/packages/2.14/bioc")
Installing package into 
'/home/mtmorgan/R/x86_64-unknown-linux-gnu-library/3.1-2.14'
(as 'lib' is unspecified)
Warning message:
package 'RUVSeq' is not available (for R version 3.1.1 Patched)

but really the problem is that the package is not available at the specified 
repository (it is available, for the same version of R, in the Bioc devel 
repository http://bioconductor.org/packages/3.0/bioc).


I can see the value of identifying the R version, and see that mentioning 
something about 'specified repositories' would not necessarily be helpful. Also, 
since the message is translated and our user base is international, it is
difficult for the biocLite() script to catch and process.


Is there a revised wording that could be employed to more accurately convey the 
reason for the failure, or is this an opportunity to use the condition system?


Index: src/library/utils/R/packages2.R
===================================================================
--- src/library/utils/R/packages2.R (revision 66562)
+++ src/library/utils/R/packages2.R (working copy)
@@ -46,12 +46,12 @@
     p0 <- unique(pkgs)
     miss <-  !p0 %in% row.names(available)
     if(sum(miss)) {
-        warning(sprintf(ngettext(sum(miss),
-                                 "package %s is not available (for %s)",
-                                 "packages %s are not available (for %s)"),
-                        paste(sQuote(p0[miss]), collapse=", "),
-                        sub(" *\\(.*","", R.version.string)),
-                domain = NA, call. = FALSE)
+        txt <- ngettext(sum(miss), "package %s is not available (for %s)",
+                        "packages %s are not available (for %s)")
+        msg <- simpleWarning(sprintf(txt, paste(sQuote(p0[miss]), collapse=", "),
+                             sub(" *\\(.*","", R.version.string)))
+        class(msg) <- c("packageNotAvailable", class(msg))
+        warning(msg)
         if (sum(miss) == 1L &&
             !is.na(w <- match(tolower(p0[miss]),
                               tolower(row.names(available))))) {
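With a classed condition like the one sketched in the patch, a caller such
as biocLite() could react without parsing translated text, e.g. (handler
body illustrative only):

    withCallingHandlers(
        install.packages("RUVSeq",
                         repos = "http://bioconductor.org/packages/2.14/bioc"),
        packageNotAvailable = function(w) {
            message("not available in the specified repositories; ",
                    "consider the Bioconductor devel repository")
            invokeRestart("muffleWarning")
        }
    )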



Re: [Rd] Re R CMD check checking in development version of R

2014-08-28 Thread Martin Morgan

On 08/28/2014 05:52 AM, Hadley Wickham wrote:

I'd say: Depends is a historical artefact from ye old days before
package namespaces. Apart from depending on a specific version of R,
you should basically never use depends.  (The one exception is, as
mentioned in R-exts, if you're writing something like latticeExtras
that doesn't make sense unless lattice is already loaded).


Keeping this nuance in mind when discussing Depends vs Imports is
important so as to not suggest that there isn't any reason to use Depends
any longer.


A common case in Bioconductor is that a package defines a class and methods
intended for the user; this requires the package to be on the search path
(else the user wouldn't be able to do anything with the returned object). A
class and supporting methods can represent significant infrastructure, so
that it makes sense to separate these in distinct packages. It is not
uncommon to find 3-5 or more packages in the Depends: field of derived
packages for this reason.


For that scenario, is it reasonable to say that every package in
depends must also be in imports?


Important to pay attention to capitalization here. A package listed in Depends: 
_never_ needs to be listed in Imports:, but will often be import()'ed (in one 
way or another) in the NAMESPACE. Some would argue that listing a package in 
Depends: and Imports: in this case clarifies intent -- provides functionality 
available to the user, and important for the package itself. Others (such as R 
CMD check) view the replication as redundancy.


I think one can imagine scenarios where a package in the Depends: field does
not actually have anything import()'ed, e.g., PkgA defines a class, PkgB
provides some special functionality that returns an instance of that class,
and PkgC uses PkgB's special functionality without ever manipulating the
object of PkgA.
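Schematically, for that PkgC scenario (field contents abridged;
specialFunctionality is a placeholder name):

    PkgC/DESCRIPTION:
        Depends: PkgA
        Imports: PkgB

    PkgC/NAMESPACE:
        importFrom(PkgB, specialFunctionality)
        ## nothing is import()'ed from PkgA, although it is in Depends: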


Martin Morgan



Hadley






Re: [Rd] Re R CMD check checking in development version of R

2014-08-28 Thread Martin Morgan

On 08/27/2014 08:33 PM, Gavin Simpson wrote:

On Aug 27, 2014 5:24 PM, "Hadley Wickham"


I'd say: Depends is a historical artefact from ye old days before
package namespaces. Apart from depending on a specific version of R,
you should basically never use depends.  (The one exception is, as
mentioned in R-exts, if you're writing something like latticeExtras
that doesn't make sense unless lattice is already loaded).


Keeping this nuance in mind when discussing Depends vs Imports is
important so as to not suggest that there isn't any reason to use Depends
any longer.



A common case in Bioconductor is that a package defines a class and methods 
intended for the user; this requires the package to be on the search path (else 
the user wouldn't be able to do anything with the returned object). A class and 
supporting methods can represent significant infrastructure, so that it makes 
sense to separate these in distinct packages. It is not uncommon to find 3-5 or 
more packages in the Depends: field of derived packages for this reason.


Martin



I am in full agreement that its use should be limited to exceptional
situations, and have modified my packages accordingly.

Cheers,

G


This check (whilst having found some things I should have imported and
didn't - which is a good thing!) seems to be circumventing the intention
of having something in Depends. Is Depends going to go away?


I don't think it's going to go away anytime soon, but you should
consider it to be largely deprecated and you should avoid it wherever
possible.


(And really you shouldn't have any packages in depends, they should
all be in imports)


I disagree with *any*; having say vegan loaded when one is using analogue
is a design decision, as the latter borrows heavily from and builds upon
vegan.

In general I have moved packages that didn't need to be in Depends into
Imports; in the version I am currently doing final tweaks on before it
goes to CRAN I have removed all but vegan from Depends.


I think that is a reasonable use case for depends. Here's the exact
text from R-exts: "Field ‘Depends’ should nowadays be used rarely,
only for packages which are intended to be put on the search path to
make their facilities available to the end user (and not to the
package itself): for example it makes sense that a user of package
latticeExtra would want the functions of package lattice made
available."

Personally I avoid even this use, requiring users of my packages to be
explicit about exactly what packages are on the search path.  You are
of course welcome to your own approach, but I think you'll find it
will become more and more difficult to maintain in time. I recommend
that you bite the bullet now.

Put another way, packages should be extremely conservative about
global side effects (and modifying the search path is such a
side-effect)

Hadley

--
http://had.co.nz/




[Rd] Should a package that indirectly Suggests: a vignette engine pass R CMD check?

2014-06-14 Thread Martin Morgan
A package uses VignetteEngine: knitr; the package itself does not Suggests: 
knitr, but it Suggests: BiocStyle which in turn Suggests: knitr. Nonetheless, R 
CMD check fails indicating that a package required for checking is not declared. 
Is it really the intention that the original package duplicate Suggests: knitr?


This is only with a recent R. In detail, with

$ Rdev --version|head -3
R Under development (unstable) (2014-06-14 r65947) -- "Unsuffered Consequences"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)

trying to  check the Bioconductor genefilter package leads to

$ Rdev --vanilla CMD check genefilter_1.47.5.tar.gz
* using log directory ‘/home/mtmorgan/b/Rpacks/genefilter.Rcheck’
* using R Under development (unstable) (2014-06-13 r65941)
* using platform: x86_64-unknown-linux-gnu (64-bit)
* using session charset: UTF-8
* checking for file ‘genefilter/DESCRIPTION’ ... OK
* this is package ‘genefilter’ version ‘1.47.5’
* checking package namespace information ... OK
* checking package dependencies ... ERROR
VignetteBuilder package not declared: ‘knitr’

See the information on DESCRIPTION files in the chapter ‘Creating R
packages’ of the ‘Writing R Extensions’ manual.

I interpret this to mean that knitr should be mentioned in Suggests: or other 
dependency field. The package does not Suggests: knitr, but it does Suggests: 
BiocStyle, which itself Suggests: knitr. The author knows that they are using 
the BiocStyle package for their vignette, and the BiocStyle package suggests the 
appropriate builder.
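Schematically (DESCRIPTION contents abridged):

    genefilter/DESCRIPTION:
        Suggests: BiocStyle, ...
        VignetteBuilder: knitr

    BiocStyle/DESCRIPTION:
        Suggests: knitr, ...

Presumably the ERROR goes away only if genefilter itself lists knitr in
Suggests: (or another dependency field), i.e., duplicates the declaration.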


Martin Morgan


Re: [Rd] R CMD check for the R code from vignettes

2014-05-31 Thread Martin Morgan

On 05/31/2014 03:52 PM, Yihui Xie wrote:

Note the test has been done once in weave, since R CMD check will try
to rebuild vignettes. The problem is whether the related tools in R
should change their tangle utilities so we can **repeat** the test,
and it seems the answer is "no" in my eyes.

Regards,
Yihui
--
Yihui Xie 
Web: http://yihui.name


On Sat, May 31, 2014 at 4:54 PM, Gabriel Becker  wrote:




On Fri, May 30, 2014 at 9:22 PM, Yihui Xie  wrote:


Hi Kevin,


I tend to adopt Henrik's idea, i.e., to provide vignette
engines that just ignore tangle. At the moment, it seems R CMD check


It is very useful, pedagogically and when reproducing analyses, to be able to 
source() the tangled .R code into an R session, analogous to running example 
code with example(). The documentation for ?Stangle does read


 (Code inside '\Sexpr{}' statements is ignored by 'Stangle'.)

So my 'vote' (recognizing that I don't have one of those) is to incorporate 
\Sexpr{} expressions into the tangled code, or to continue to flag use of Sexpr 
with side effects as errors (indirectly, by source()ing the tangled code), 
rather than writing engines that ignore tangle.
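For instance, for a package's installed vignette (names are placeholders
only):

    rnw <- system.file("doc", "myVignette.Rnw", package = "myPackage")
    Stangle(rnw)                         # writes myVignette.R in the working directory
    source("myVignette.R", echo = TRUE)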


It is very valuable to all parties to write a vignette with code that is fully 
evaluated; otherwise, it is too easy for bit rot to seep in, or to 'fake' it in 
a way that seems innocent but is misleading.


Martin Morgan


is comfortable with vignettes that do not have corresponding R
scripts, and I hope these R scripts will not become mandatory in the
future.



I'm not sure this is the right approach. This would essentially make the
test optional based on decisions by the package author. I'm not arguing in
favor of this particular test, but if package authors are able to turn a
test off then the test loses quite a bit of its value.

I think that R CMD check has done a great deal for the R community by
presenting a uniform, minimum "barrier to entry" for R packages. Allowing
package developers to alter the tests it does (other than the obvious case
of their own unit tests) would remove that.

That having been said, it seems to me that tangle-like utilities should have
the option of extracting inline code, and that during R CMD check that
option should *always* be turned on.  That would solve the problem in
question while retaining the test would it not?

~G




Re: [Rd] citEntry handling of encoded URLs

2014-05-23 Thread Martin Morgan

On 05/23/2014 05:35 AM, Achim Zeileis wrote:

On Thu, 22 May 2014, Martin Morgan wrote:


The following citEntry includes a url with %3A and other encodings

citEntry(entry="article",
title = "Software for Computing and Annotating Genomic Ranges",
author = personList( as.person("Michael Lawrence" )),
year = 2013,
journal = "{PLoS} Computational Biology",
volume = "9",
issue = "8",
doi = "10.1371/journal.pcbi.1003118",
url =
"http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118",

textVersion = "Lawrence M..." )

Evaluating this as R code doesn't parse correctly and generates a warning


The citEntry (or bibentry) itself is parsed without problem. Some printing
styles cause the warning, specifically when the Rd parser is used for
formatting. Depending on how you want to print it, the warning doesn't occur
though. Using bibentry() directly, we can do:

b <- bibentry("Article",
   title = "Software for Computing and Annotating Genomic Ranges",
   author = "Michael Lawrence and others",
   year = "2013",
   journal = "PLoS Computational Biology",
   volume = "9",
   number = "8",
   doi = "10.1371/journal.pcbi.1003118",
   url =
"http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118",
   textVersion = "Lawrence M et al. (2013) ..."
)

Then the default

print(b)

issues a warning because the Rd parser thinks that the % are comments. However,

print(b, style = "BibTeX")
print(b, style = "citation")

don't issue warnings and also produce output that one might expect.


Thanks for clarifying. For what it's worth, I was aiming for

print(b, style="html")


A work-around is, apparently, to quote the %, \\%3A etc., but is this the
intention?


In that case the default print(b) yields the desired output without warning but
print(b, style = "BibTeX") or print(b, style = "citation") are possibly not in
the desired format. I'm not sure though how the different BibTeX style files
actually handle the URLs. I think some .bst files handle the "url" field
verbatim (i.e., don't need escaping) while others treat it as text (i.e., need
escaping). Personally, I would hence avoid the problem and only use the DOI URL
here as this will be robust across BibTeX styles.
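For instance, keeping only the DOI-based URL sidesteps the escaping question
entirely (a variation on the entry above):

    b2 <- bibentry("Article",
        title   = "Software for Computing and Annotating Genomic Ranges",
        author  = "Michael Lawrence and others",
        journal = "PLoS Computational Biology",
        year    = "2013", volume = "9", number = "8",
        doi     = "10.1371/journal.pcbi.1003118",
        url     = "http://dx.doi.org/10.1371/journal.pcbi.1003118")

    print(b2)                          # no Rd parser warning
    print(b2, style = "BibTeX")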

Nevertheless it is not ideal that there is a discrepancy between the different
printing styles. I think currently this can only be avoided if custom macros are
employed. But Duncan might be able to say more about this. A similar situation
occurs if you use commands that are not part of the Rd markup, e.g.

n01 <- bibentry("Misc", title = "The $\\mathcal{N}(0, 1)$ Distribution",
   author = "Foo Bar", year = "2014")
print(n01) # warning
print(n01, style = "BibTeX") # ok


Also, citEntry points to bibentry, which points to *Entry Fields*, but the
'url' tag is not mentioned there, even though url appears in the examples;
if the list of supported tags is not easy to enumerate, perhaps some
insight can be provided at this point as to how the supported tags are
determined?


This follows the BibTeX conventions. Thus, you can use any tag that you wish to
use and it will depend on the style whether it is displayed or not. The only
restriction is that certain bibtypes require certain fields, e.g., an "Article"
has to specify: author, title, journal, year. But beyond that you can add any
additional field. For example, in your bibentry above you used the "issue" field
which is ignored by most BibTeX styles. My adaptation uses the "number" field
instead which is processed by most standard BibTeX styles.

The default print(..., style = "text") uses a bibstyle that is modeled after
jss.bst, the BibTeX style employed by the Journal of Statistical Software. But
you could plug in other .bibstyle arguments, e.g. one that processes the "issue"
field etc.

Hope that helps,


Yes, that helps a lot, thanks,

Martin


Z


Thanks

Martin Morgan


[Rd] citEntry handling of encoded URLs

2014-05-22 Thread Martin Morgan

The following citEntry includes a url with %3A and other encodings

citEntry(entry="article",
 title = "Software for Computing and Annotating Genomic Ranges",
 author = personList( as.person("Michael Lawrence" )),
 year = 2013,
 journal = "{PLoS} Computational Biology",
 volume = "9",
 issue = "8",
 doi = "10.1371/journal.pcbi.1003118",
 url =
"http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118",

 textVersion = "Lawrence M..." )

Evaluating this as R code doesn't parse correctly and generates a warning

Lawrence M (2013). “Software for Computing and Annotating Genomic
Ranges.” _PLoS Computational Biology_, *9*. <URL:
http://dx.doi.org/10.1371/journal.pcbi.1003118>, <URL:
http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118}.>
Warning message:
In parse_Rd(Rd, encoding = encoding, fragment = fragment, ...) :
  :5: unexpected END_OF_INPUT '
'

A work-around is, apparently, to quote the %, \\%3A etc., but is this the 
intention?


Also, citEntry points to bibentry, which points to *Entry Fields*, but the
'url' tag is not mentioned there, even though url appears in the examples;
if the list of supported tags is not easy to enumerate, perhaps some
insight can be provided at this point as to how the supported tags are
determined?


Thanks

Martin Morgan