[Rd] eapply duplicates elements unnecessarily

2014-02-09 Thread Martin Morgan

eapply duplicates the elements of the environment it is being applied to

> env = new.env(); x = 1; tracemem(x)
[1] "<0x1758cd18>"
> env[["x"]] = x
> xx <- eapply(env, length)
tracemem[0x1758cd18 -> 0x1758cbc8]: eapply

but duplication seems unnecessary. I think this is because of 'duplicate' in 
FrameValues (envir.c:2402). It's hard to tell what contract FrameValues is 
living up to, but is INCREMENT_NAMED() sufficient? (the PROTECT on value in
line 2398 also seems unnecessary -- it must already be protected?)


Martin
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Question re: NA, NaNs in R

2014-02-09 Thread Prof Brian Ripley

There is one NA but multiple NaNs.

And please re-read 'man memcmp': your cast is wrong.

On 10/02/2014 06:52, Kevin Ushey wrote:

Hi R-devel,

I have a question about the differentiation between NA and NaN values
as implemented in R. In arithmetic.c, we have

int R_IsNA(double x)
{
    if (isnan(x)) {
        ieee_double y;
        y.value = x;
        return (y.word[lw] == 1954);
    }
    return 0;
}

ieee_double is just used for type punning so we can check the final
bits and see if they're equal to 1954; if they are, x is NA, if
they're not, x is NaN (as defined for R_IsNaN).

My question is -- I can see a substantial increase in speed (on my
computer, in certain cases) if I replace this check with

int R_IsNA(double x)
{
    return memcmp(
        (char*)(&x),
        (char*)(&NA_REAL),
        sizeof(double)
    ) == 0;
}

IIUC, there is only one bit pattern used to encode R NA values, so
this should be safe. But I would like to be sure:

Is there any guarantee that the different functions in R would return
NA as identical to the bit pattern defined for NA_REAL, for a given
architecture? Similarly for NaN value(s) and R_NaN?

My guess is that it is possible some functions used internally by R
might encode NaN values differently; ie, setting the lower word to a
value different than 1954 (hence being NaN, but potentially not
identical to R_NaN), or perhaps this is architecture-dependent.
However, NA should be one specific bit pattern (?). And, I wonder if
there is any guarantee that the different functions used in R would
return an NaN value as identical to R_NaN (which appears to be the
'IEEE NaN')?

(interested parties can see + run a simple benchmark from the gist at
https://gist.github.com/kevinushey/8911432)

Thanks,
Kevin


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



[Rd] Question re: NA, NaNs in R

2014-02-09 Thread Kevin Ushey
Hi R-devel,

I have a question about the differentiation between NA and NaN values
as implemented in R. In arithmetic.c, we have

int R_IsNA(double x)
{
    if (isnan(x)) {
        ieee_double y;
        y.value = x;
        return (y.word[lw] == 1954);
    }
    return 0;
}

ieee_double is just used for type punning so we can check the final
bits and see if they're equal to 1954; if they are, x is NA, if
they're not, x is NaN (as defined for R_IsNaN).

My question is -- I can see a substantial increase in speed (on my
computer, in certain cases) if I replace this check with

int R_IsNA(double x)
{
    return memcmp(
        (char*)(&x),
        (char*)(&NA_REAL),
        sizeof(double)
    ) == 0;
}

IIUC, there is only one bit pattern used to encode R NA values, so
this should be safe. But I would like to be sure:

Is there any guarantee that the different functions in R would return
NA as identical to the bit pattern defined for NA_REAL, for a given
architecture? Similarly for NaN value(s) and R_NaN?

My guess is that it is possible some functions used internally by R
might encode NaN values differently; ie, setting the lower word to a
value different than 1954 (hence being NaN, but potentially not
identical to R_NaN), or perhaps this is architecture-dependent.
However, NA should be one specific bit pattern (?). And, I wonder if
there is any guarantee that the different functions used in R would
return an NaN value as identical to R_NaN (which appears to be the
'IEEE NaN')?

(interested parties can see + run a simple benchmark from the gist at
https://gist.github.com/kevinushey/8911432)

Thanks,
Kevin



Re: [Rd] suggestion for "sets" tools upgrade

2014-02-09 Thread Carl Witthoft
No, I wasn't, so thanks for pointing me to that package. I think I'll
still post my new toys, because "sets" fails the "obvious for dummies"
test. I looked at the documentation, and while it's clearly a very
powerful collection of functions, it's not obvious exactly what the
family of "cset_*" tools does or doesn't do.
The stuff I'm writing is, I hope :-), much easier to learn and remember
for the casual or occasional user. De gustibus non est disputandum applies
here.


Carl


On 2/9/14 4:38 AM, Kurt Hornik wrote:

Carl Witthoft writes:



Thanks to Duncan and all who responded.
I agree that the algebraic set rules do not allow for indistinguishable
elements; I must have been deeply immersed in quantum fermions when I
wrote "strictly" rather than "less" in front of "algebraic style".



I'll clean up my code (so that intersect() remains symmetric, among
other things), and submit it as a separate package to CRAN.



Carl


Btw, you are aware of the "sets" package on CRAN (which does the
multisets I think you were asking for)?

Best
-k


--

Sent from a parallel universe almost, but not entirely,
nothing at all like this one.



Re: [Rd] suggested addition to 'install.packages' help file

2014-02-09 Thread Prof Brian Ripley

On 06/02/2014 18:25, Patrick Burns wrote:

I suggest that there be an additional sentence
in the explanation for the 'repos' argument in
the help file for 'install.packages':

If the repository is on a local drive, then the
string should begin with \code{file:}, e.g.,
\code{"file:J:/Rrepos"}.


Perhaps I'm missing some subtlety, but it makes
things work in my case.


You are (more than one).  The help says

   repos: character vector, the base URL(s) of the repositories to use,

A file:// scheme *is* a URL, but how to specify it correctly is 
described under ?url.  'Local drive' is not a portable concept, but a 
repository on a file system mounted on the same machine may also be 
accessible via a http:// scheme.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
