Re: [Rd] patch about compile R with clang

2010-02-22 Thread Prof Brian Ripley
configure is a generated file, and so should not be edited directly. 
You have not told us what version of R these patches were against, but 
it looks to me as if wchar.h is included already in current R 
(R-patched/R-devel) -- certainly in the second case before wctype.h.
(It really should not be needed according to POSIX, but it was on 
MinGW-w64.   Also, headers are an issue not just for a compiler but an 
OS, and you have not told us that either.)


So can you please clarify what version of R, what OS, and what changes 
you think might be needed to m4/R.m4 in the R-devel version of R?


On Mon, 22 Feb 2010, Gong Yu wrote:

> clang is a compiler (http://clang.llvm.org); it is a faster and better C
> compiler than gcc. Yesterday I used clang and gfortran to compile R.


Hmm, it claims to be 'faster and better', but past reports on Mac OS X 
(it ships with Snow Leopard) suggested those claims to be exaggerated.
(It did not create as fast an R, although it compiled faster, and its 
error messages were markedly worse than other compilers despite claims 
to the contrary.)



> The only two changes to the source code are:
>
> 1. The configure file (when configure tests for wctype.h, gcc can compile
> the test, but clang needs both wchar.h and wctype.h to be included), so
> this is the patch:
> --- /r/configure
> +++ /myr/configure
> @@ -39172,6 +39172,7 @@
>  cat >>conftest.$ac_ext <<_ACEOF
>  /* end confdefs.h.  */
>  $ac_includes_default
> +#include <wchar.h>
>  #include <$ac_header>
>  _ACEOF
>  rm -f conftest.$ac_objext
> @@ -39480,6 +39481,7 @@
>  cat confdefs.h >>conftest.$ac_ext
>  cat >>conftest.$ac_ext <<_ACEOF
>  /* end confdefs.h.  */
> +#include <wchar.h>
>  #include <wctype.h>
>
>  #ifdef F77_DUMMY_MAIN
>
>
> 2. Edit tre-match-approx.c, changing the following lines
> #define __USE_STRING_INLINES
> #undef __NO_INLINE__
> to
> //#define __USE_STRING_INLINES
> //#undef __NO_INLINE__
> because clang reports errors in string.h ("fields must have a constant
> size: 'variable length array in structure' extension will never be
> supported").


Please use C comments not C++ ones: we prefer but do not require C99.

At least on my version of Linux (Fedora 12), these optimizations are 
only supposed to be used with 'GNU CC', and are inside a test for 
__GNUC__ >= 2.  So if clang is using them, this is a bug in clang (we 
have seen similar things with the Intel CC masquerading as GCC). 
Your OS may differ, of course.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Best style to organize code, namespaces

2010-02-22 Thread Mark.Bravington
Ben--

FWIW my general take on this is:

 - Namespaces solve the collision issue.

 - Style 2 tends to make for unreadably long code inside Foo, unless the 
subfunctions are really short.

 - Style 3 is too hard to work with

 - So I usually use a variant on style 1:

### Style 4 (mlocal-style) ### 
Foo <-  function(x) { 
   initialize.Foo()
}

initialize.Foo <- function( nlocal=sys.parent()) mlocal({
  
})

The 'mlocal' call means that code in the body of 'initialize.Foo' executes 
directly in the environment of 'Foo', or wherever it's called from-- it doesn't 
get its own private environment, and automatically reads/writes/creates 
variables in 'Foo'. However, you can still pass parameters that are private to 
'initialize.Foo', though you may not need any. The 'debug' package will handle 
'mlocal' functions without any trouble. One downside might be that you can't 
(or shouldn't) call 'initialize.Foo' directly. Another is if your sub-function 
creates a lot of junk variables that you really don't want in 'Foo'-- obviously 
that's exactly what you want from an initialization function, but not 
necessarily in general.

 - Sometimes (style 5) I define the subfunctions externally to 'Foo' but not as 
'mlocal's, and then inside 'Foo' I do

subf <- subf
environment( subf) <- environment()

just as if I'd inserted the definition of 'subf' into 'Foo'. This is like style 
2, but keeps the 'Foo' code short, and lets me set up debugging externally.

 - If you use style 2, you can still automatically set up the 'debug' package's 
debugging on 'Subf' by:

mtrace( Foo)
bp( fname='Foo', 1, FALSE) # don't stop at line 1
# set the breakpoint in 'Subf', and then carry on in 'Foo' without stopping
bp( fname='Foo', 2, { mtrace( Subf); FALSE})

You won't have to intervene manually when 'Foo' runs. However, this may slow 
down 'Foo' itself, and does require you to know a line number after the 
definition of 'Subf'.
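
Style 5 above can be sketched in base R roughly as follows. This is only an illustration under made-up names ('Foo', 'subf' and 'offset' are hypothetical, not taken from any real code): the helper lives at top level, and 'Foo' re-homes a local copy of it so that its free variables resolve among the locals of 'Foo'.

```r
# Helper defined at top level, so debugging can be set up on it externally.
# It refers to 'offset', which it does not define itself.
subf <- function(blah) blah + offset

Foo <- function(x) {
  offset <- 10 * x
  # Take a local copy of the helper and re-home it in Foo's frame, so that
  # free variables such as 'offset' are looked up among Foo's locals.
  subf <- subf
  environment(subf) <- environment()
  subf(1)
}

Foo(2)
```

Only the local copy is changed; the top-level 'subf' keeps its original environment.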

No doubt there are many other approaches...

Mark

-- 
Mark Bravington
CSIRO Mathematical & Information Sciences
Marine Laboratory
Castray Esplanade
Hobart 7001
TAS

ph (+61) 3 6232 5118
fax (+61) 3 6232 5012
mob (+61) 438 315 623


Re: [Rd] Best style to organize code, namespaces

2010-02-22 Thread Gabor Grothendieck
As you mention ease of debugging basically precludes subfunctions so
style 1 is left.

Functions can be nested in environments rather than in other functions
and this will allow debugging to still occur.
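
Without any add-on package, the idea of keeping helpers in an environment rather than inside another function might look like the sketch below. All names here ('Foo_env', 'x_scale', 'subf', 'subg', 'main') are invented for the illustration.

```r
# An environment that plays the role of a small private module.
Foo_env <- new.env()

# Everything assigned inside local() lands in Foo_env, and the functions
# defined there can see each other and 'x_scale'.
local({
  x_scale <- 10
  subf <- function(blah) blah * x_scale
  subg <- function(bar) subf(bar) + 1
  main <- function(x) subg(x)
}, envir = Foo_env)

# Public entry point.
Foo <- function(x) Foo_env$main(x)

Foo(2)
Foo_env$subf(3)   # helpers are reachable from outside, e.g. for unit tests
```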

The proto package makes it particularly convenient to nest
functions in environments, giving an analog to #3 while still allowing
debugging.  See http://r-proto.googlecode.com

> library(proto)
> # p is proto object with variable a and method f
> p <- proto(a = 1, f = function(., x = 1) .$a <- .$a + 1)
> with(p, debug(f))
> p$f()
debugging in: get("f", env = p, inherits = TRUE)(p, ...)
debug: .$a <- .$a + 1
Browse[2]>
exiting from: get("f", env = p, inherits = TRUE)(p, ...)
[1] 2
> p$a
[1] 2


Re: [Rd] Best style to organize code, namespaces

2010-02-22 Thread Duncan Murdoch

On 22/02/2010 9:49 PM, Ben wrote:

> Hi all,
>
> I'm hoping someone could tell me what best practices are as far as
> keeping programs organized in R.  In most languages, I like to keep
> things organized by writing small functions.  So, suppose I want to
> write a function that would require helper functions or would just be
> too big to write in one piece.  Below are three ways to do this:
>
>
> ### Style 1 (C-style) ###
> Foo <- function(x) {
>
> }
> Foo.subf <- function(x, blah) {
>
> }
> Foo.subg <- function(x, bar) {
>
> }
>
> ### Style 2 (Lispish?) ###
> Foo <- function(x) {
>   Subf <- function(blah) {
>
>   }
>   Subg <- function(bar) {
>
>   }
>
> }
>
> ### Style 3 (Object-Oriented) ###
> Foo <- function(x) {
>   Subf <- function(blah) {
>
>   }
>   Subg <- function(bar) {
>
>   }
>   Main <- function() {
>
>   }
>   return(list(Subf=Subf, Subg=Subg, Main=Main))
> }
> ### End examples
>
> Which of these ways is best?  Style 2 seems at first to be the most
> natural in R, but I found there are some major drawbacks.  First, it
> is hard to debug.  For instance, if I want to debug Subf, I need to
> first "debug(Foo)" and then while Foo is debugging, type
> "debug(Subf)".


You can use setBreakpoint to set a breakpoint in the nested functions, 
and it will exist in all invocations of Foo (which each create new 
instances of the nested functions).  debug() is not the only debugging tool.


> Another big limitation is that I can't write
> test-cases (e.g. using RUnit) for Subf and Subg because they aren't
> visible in any way at the global level.
>
> For these reasons, style 1 seems to be better than style 2, if less
> elegant.  However, style 1 can get awkward because any parameters
> passed to the main function are not visible to the others.  In the
> above case, the value of "x" must be passed to Foo.subf and Foo.subg
> explicitly.  Also there is no enforcement of code isolation
> (i.e. anyone can call Foo.subf).
>
> Style 3 is more explicitly object oriented.  It has the advantage of
> style 2 in that you don't need to pass x around, and the advantage of
> style 1 in that you can still write tests and easily debug the
> subfunctions.  However to actually call the main function you have to
> type "Foo(x)$Main()" instead of "Foo(x)", or else write a wrapper
> function for this.  Either way there is more typing.
>
> So anyway, what is the best way to handle this?  R does not seem to
> have a good way of managing namespaces or avoiding collisions, like a
> module system or explicit object-orientation.


Packages are self-contained modules.  You don't get collisions between 
names of locals between packages, and if they export the same name, 
other packages can explicitly select which export to use.
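
A small illustration of selecting an export explicitly, using only packages that ship with R. The masking function below is made up for the example; 'stats::sd' is the real export.

```r
# A local definition that masks the name 'sd' on the search path.
sd <- function(x) "my own sd"

masked   <- sd(c(1, 2, 3))          # finds the local definition first
explicit <- stats::sd(c(1, 2, 3))   # explicitly selects the export from 'stats'

print(masked)
print(explicit)
```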


> How should we get
> around this limitation?  I've looked at sample R code in the
> distribution and elsewhere, but so far it's been pretty
> disappointing---most people seem to write very long, hard to
> understand functions.


I would normally use a mixture of styles 1 and 2.  Use style 2 for 
functions that really do need access to Foo locals, and use style 1 for 
self-contained functions.
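
One way such a mixture might look, as a sketch with invented names ('Foo.clamp' and 'rescale' are hypothetical helpers, not from any real code):

```r
# Style 1: self-contained helper, needs nothing from Foo, testable on its own.
Foo.clamp <- function(v, lo, hi) pmin(pmax(v, lo), hi)

Foo <- function(x, lo = 0, hi = 1) {
  # Style 2: nested helper, because it needs Foo's locals 'lo' and 'hi'.
  rescale <- function(v) (v - lo) / (hi - lo)
  rescale(Foo.clamp(x, lo, hi))
}

Foo(c(-5, 5, 15), lo = 0, hi = 10)
```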


Duncan Murdoch



[Rd] Best style to organize code, namespaces

2010-02-22 Thread Ben
Hi all,

I'm hoping someone could tell me what best practices are as far as
keeping programs organized in R.  In most languages, I like to keep
things organized by writing small functions.  So, suppose I want to
write a function that would require helper functions or would just be
too big to write in one piece.  Below are three ways to do this:


### Style 1 (C-style) ###
Foo <- function(x) {
  
}
Foo.subf <- function(x, blah) {
  
}
Foo.subg <- function(x, bar) {
  
}

### Style 2 (Lispish?) ##
Foo <- function(x) {
  Subf <- function(blah) {

  }
  Subg <- function(bar) {

  }
  
}

### Style 3 (Object-Oriented) ###
Foo <- function(x) {
  Subf <- function(blah) {

  }
  Subg <- function(bar) {

  }
  Main <- function() {

  }
  return(list(Subf=Subf, Subg=Subg, Main=Main))
}
### End examples 

Which of these ways is best?  Style 2 seems at first to be the most
natural in R, but I found there are some major drawbacks.  First, it
is hard to debug.  For instance, if I want to debug Subf, I need to
first "debug(Foo)" and then while Foo is debugging, type
"debug(Subf)".  Another big limitation is that I can't write
test-cases (e.g. using RUnit) for Subf and Subg because they aren't
visible in any way at the global level.
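
To make the visibility point concrete, here is a tiny Style 2 function with a made-up body. The nested helper sees "x" for free, but nothing outside Foo can reach it:

```r
Foo <- function(x) {
  Subf <- function(blah) blah + x   # sees 'x' without it being passed
  Subf(1)
}

Foo(2)
exists("Subf")   # asks whether 'Subf' can be found from the top level
```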

For these reasons, style 1 seems to be better than style 2, if less
elegant.  However, style 1 can get awkward because any parameters
passed to the main function are not visible to the others.  In the
above case, the value of "x" must be passed to Foo.subf and Foo.subg
explicitly.  Also there is no enforcement of code isolation
(i.e. anyone can call Foo.subf).

Style 3 is more explicitly object oriented.  It has the advantage of
style 2 in that you don't need to pass x around, and the advantage of
style 1 in that you can still write tests and easily debug the
subfunctions.  However to actually call the main function you have to
type "Foo(x)$Main()" instead of "Foo(x)", or else write a wrapper
function for this.  Either way there is more typing.
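
The wrapper mentioned for style 3 could be sketched like this. The constructor name "Foo.obj" and the function bodies are invented for the illustration:

```r
# Constructor in the spirit of Style 3: returns the closures in a list.
Foo.obj <- function(x) {
  Subf <- function(blah) blah + x
  Subg <- function(bar) bar * x
  Main <- function() Subf(1) + Subg(2)
  list(Subf = Subf, Subg = Subg, Main = Main)
}

# Thin wrapper so callers can keep writing Foo(x).
Foo <- function(x) Foo.obj(x)$Main()

Foo(3)                # same as Foo.obj(3)$Main()
Foo.obj(3)$Subf(10)   # helpers stay reachable for tests
```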

So anyway, what is the best way to handle this?  R does not seem to
have a good way of managing namespaces or avoiding collisions, like a
module system or explicit object-orientation.  How should we get
around this limitation?  I've looked at sample R code in the
distribution and elsewhere, but so far it's been pretty
disappointing---most people seem to write very long, hard to
understand functions.

Thanks for any advice!

-- 
Ben



[Rd] patch about compile R with clang

2010-02-22 Thread Gong Yu
clang is a compiler (http://clang.llvm.org); it is a faster and better C compiler
than gcc. Yesterday I used clang and gfortran to compile R.
The only two changes to the source code are:

1. The configure file (when configure tests for wctype.h, gcc can compile the
test, but clang needs both wchar.h and wctype.h to be included), so this is the patch:
--- /r/configure
+++ /myr/configure
@@ -39172,6 +39172,7 @@
 cat >>conftest.$ac_ext <<_ACEOF
 /* end confdefs.h.  */
 $ac_includes_default
+#include <wchar.h>
 #include <$ac_header>
 _ACEOF
 rm -f conftest.$ac_objext
@@ -39480,6 +39481,7 @@
 cat confdefs.h >>conftest.$ac_ext
 cat >>conftest.$ac_ext <<_ACEOF
 /* end confdefs.h.  */
+#include <wchar.h>
 #include <wctype.h>
 
 #ifdef F77_DUMMY_MAIN


2. Edit tre-match-approx.c, changing the following lines
#define __USE_STRING_INLINES
#undef __NO_INLINE__
to
//#define __USE_STRING_INLINES
//#undef __NO_INLINE__
because clang reports errors in string.h ("fields must have a constant size:
'variable length array in structure' extension will never be supported").



[Rd] grid unit bug? (PR#14220)

2010-02-22 Thread gunter . berton
The following seems to me to be at least a perverse trap, if not an outright
bug:

> is.numeric(unit(1,"npc"))
[1] TRUE
> is.numeric(1*unit(1,"npc"))
[1] FALSE
> is.numeric(unit(0,"npc") +unit(1,"npc"))
[1] FALSE

...etc.
i.e. is.numeric() appears to be TRUE for class "unit" but FALSE for class
("unit.arithmetic", "unit"). Seems to me it ought to be the same for both.
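
For code that only needs to know whether something is a grid unit, a check that does not depend on what is.numeric() reports might look like the sketch below. It uses grid's own is.unit(); the variable names are made up, and what is.numeric() returns for each object may differ between R versions.

```r
library(grid)

u1 <- unit(1, "npc")
u2 <- 1 * unit(1, "npc")
u3 <- unit(0, "npc") + unit(1, "npc")

# is.unit() asks "is this a grid unit?" directly, rather than relying on
# what is.numeric() happens to report for each unit class.
sapply(list(u1, u2, u3), is.unit)
```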


Bert Gunter
Genentech Nonclinical Biostatistics

(FWIW, I think grid graphics is brilliant!)

This was R version 2.11.0dev for Windows btw (not that it makes a
difference):

sessionInfo()

R version 2.11.0 Under development (unstable) (2010-02-15 r51142)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
 [1] datasets  splines   grid      tcltk     stats     graphics  grDevices
 [8] utils     methods   base

other attached packages:
[1] TinnR_1.0.3     R2HTML_1.59-1   Hmisc_3.7-0     survival_2.35-8
[5] svSocket_0.9-48 lattice_0.18-3  MASS_7.3-5

loaded via a namespace (and not attached):
[1] cluster_1.12.1 svMisc_0.9-56



[Rd] Compiling R on Linux with SunStudio 12.1: "wide-character type" problems

2010-02-22 Thread rt
I am trying to compile R on Linux using SunStudio. Configure flags are
mostly as suggested in the R install guide.

CC=/opt/sun/sunstudio12.1/bin/suncc
CFLAGS="-g -xc99 -xlibmil -xlibmieee"
MAIN_CFLAGS=-g
SHLIB_CFLAGS=-g
CPPFLAGS="-I. -I/opt/sun/sunstudio12.1/prod/include
-I/opt/sun/sunstudio12.1/prod/include/cc"
CPPFLAGS+="-I/opt/sun/sunstudio12.1/prod/include/cc/sys
-I/usr/local/include"
F77=/opt/sun/sunstudio12.1/bin/sunf95
FFLAGS="-g -O -libmil "
SAFE_FFLAGS="-g -libmil"
CPICFLAGS=-Kpic
FPICFLAGS=-Kpic
SHLIB_LDFLAGS=-shared
LDFLAGS=-L/opt/sun/sunstudio12.1/lib/386
CXX=/opt/sun/sunstudio12.1/bin/sunCC
CXXFLAGS="-g -xlibmil -xlibmieee"
CXXPICFLAGS=-Kpic
SHLIB_CXXLDFLAGS="-G -lCstd"
FC=/opt/sun/sunstudio12.1/bin/sunf95
FCFLAGS=$FFLAGS
FCPICFLAGS=-Kpic
MAKE=dmake

R install guide also indicates that: "The OS needs to have enough support
for wide-character types: this is checked at configuration. Specifically,
the C99 functionality of headers wchar.h and wctype.h, types wctrans_t and
mbstate_t and functions mbrtowc, mbstowcs, wcrtomb, wcscoll, wcstombs,
wctrans, wctype, and iswctype."
Configure stops with the following error message:

checking iconv.h usability... yes
checking iconv.h presence... yes
checking for iconv.h... yes
checking for iconv... in libiconv
checking whether iconv accepts "UTF-8", "latin1" and "UCS-"... yes
checking for iconvlist... yes
checking wchar.h usability... yes
checking wchar.h presence... yes
checking for wchar.h... yes
checking wctype.h usability... yes
checking wctype.h presence... yes
checking for wctype.h... yes
checking whether mbrtowc exists and is declared... yes
checking whether wcrtomb exists and is declared... yes
checking whether wcscoll exists and is declared... yes
checking whether wcsftime exists and is declared... yes
checking whether wcstod exists and is declared... yes
checking whether mbstowcs exists and is declared... yes
checking whether wcstombs exists and is declared... yes
checking whether wctrans exists and is declared... no
checking whether iswblank exists and is declared... no
checking whether wctype exists and is declared... no
checking whether iswctype exists and is declared... no
configure: error: Support for MBCS locales is required.

Relevant parts of config.log are as follows:

configure:39472: checking whether iswctype exists and is declared
configure:39510: /opt/sun/sunstudio12.1/bin/suncc -o conftest -g -xc99
-xlibmil -xlibmieee -m32  -I. -I/opt/sun/sunstudio12.1/prod/include
-I/opt/sun/sunstudio12.1/prod/include/cc-I/opt/sun/sunstudio12.1/prod/include/cc/sys
-I/usr/local/include  -L/opt/sun/sunstudio12.1/lib/386 -L/usr/local/lib
conftest.c -ldl -lm  -liconv >&5
"/usr/include/wctype.h", line 112: syntax error before or at: __wc
"/usr/include/wctype.h", line 195: syntax error before or at: towlower
"/usr/include/wctype.h", line 302: syntax error before or at: towupper_l
"/usr/include/wctype.h", line 302: syntax error before or at: __wc
"/usr/include/wctype.h", line 310: syntax error before or at: towctrans_l
"/usr/include/wctype.h", line 310: syntax error before or at: __wc
cc: acomp failed for conftest.c
configure:39516: $? = 1
configure: failed program was:
| /* confdefs.h.  */
| #define PACKAGE_NAME "R"


| #include <wctype.h>
|
| #ifdef F77_DUMMY_MAIN
|
| #  ifdef __cplusplus
|  extern "C"
| #  endif
|int F77_DUMMY_MAIN() { return 1; }
|
| #endif
| int
| main ()
| {
| #ifndef iswctype
|   char *p = (char *) iswctype;
| #endif
|
|   ;
|   return 0;
| }
configure:39534: result: no
configure:39710: error: Support for MBCS locales is required.

I am not sure if this is a Linux issue or if it is a SunStudio issue.  Has
anybody tried to compile R on Linux using SunStudio?

Thanks in advance,

Russ



Re: [Rd] Where does install.R go when R gets compiled? Or, how to experiment with changes to install.R?

2010-02-22 Thread Uwe Ligges



On 20.02.2010 00:04, Paul Johnson wrote:
> On Thu, Feb 18, 2010 at 2:20 PM, Jens Elkner wrote:
>> On Thu, Feb 18, 2010 at 11:33:14AM -0600, Paul Johnson wrote:
>>> I'm pursuing an experiment to make RPM files for R packages
>>> on-the-fly. Any time I install an R package successfully, I want to
>>> wrap up those files in an RPM.   Basically, the idea is to "hack" an
>>> option similar to --build for R CMD INSTALL.
>>
>> Hmm, why not take the easy way:
>>
>> clean_dst $PROTO
>> cd $TMPBUILD
>> mkdir -p $PROTO/R/library
>> $R_HOME/bin/R CMD INSTALL -l $PROTO/R/library $TMPBUILD
>
> Yes, I've been there, done that.
>
> I have to administer this on 60 servers in a cluster.  I don't want to
> rebuild all packages on all systems. If I can figure a way to create
> RPM for them, I can script the RPM installs and then I'm sure all the
> systems are identical.
>
> In the worst case scenario, I just have to copy the library tree from
> one machine to another.  But the RPM approach has a bit more built-in
> error checking.
>
> pj



Paul,

Besides the parts already answered: in general I'd try to read from some 
network space, so that I'd have to make only one installation. That is also 
useful for easier upgrades and for parallel execution, where you need 
identical software on all nodes.
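
On the R side, pointing every node at one shared library tree might look like the sketch below. The directory is a stand-in (a temporary directory is used so the snippet runs anywhere); on a cluster it would be a path on the shared network storage, and the name 'shared_lib' is made up.

```r
# Stand-in for a directory on shared network storage (hypothetical; a
# temporary directory is used here so the sketch runs anywhere).
shared_lib <- file.path(tempdir(), "shared-R-library")
dir.create(shared_lib, showWarnings = FALSE, recursive = TRUE)

# Put the shared library first, so packages installed there once are found
# by every node that mounts it.
.libPaths(c(shared_lib, .libPaths()))
.libPaths()[1]
```

Packages would then be installed once with something like 'R CMD INSTALL -l' pointing at that directory.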


Best,
Uwe



[Rd] shash in unique.c

2010-02-22 Thread Matthew Dowle

Looking at shash in unique.c, from R-2.10.1  I'm wondering if it makes sense 
to hash the pointer itself rather than the string it points to?
In other words could the SEXP pointer be cast to unsigned int and the usual 
scatter be called on that as if it were integer?

shash would look like a slightly modified version of ihash like this :

static int shash(SEXP x, int indx, HashData *d)
{
if (STRING_ELT(x,indx) == NA_STRING) return 0;
return scatter((unsigned int) STRING_ELT(x, indx), d);
}

rather than its current form which appears to hash the string it points to :

static int shash(SEXP x, int indx, HashData *d)
{
unsigned int k;
const char *p;
if(d->useUTF8)
 p = translateCharUTF8(STRING_ELT(x, indx));
else
 p = translateChar(STRING_ELT(x, indx));
k = 0;
while (*p++)
 k = 11 * k + *p; /* was 8 but 11 isn't a power of 2 */
return scatter(k, d);
}

Looking at sequal, below, and reading its comments, if the pointers are 
equal it doesn't look at the strings they point to, which led to the 
question above.

static int sequal(SEXP x, int i, SEXP y, int j)
{
if (i < 0 || j < 0) return 0;
/* Two strings which have the same address must be the same,
   so avoid looking at the contents */
if (STRING_ELT(x, i) == STRING_ELT(y, j)) return 1;
/* Then if either is NA the other cannot be */
/* Once all CHARSXPs are cached, Seql will handle this */
if (STRING_ELT(x, i) == NA_STRING || STRING_ELT(y, j) == NA_STRING)
 return 0;
return Seql(STRING_ELT(x, i), STRING_ELT(y, j));
}

Matthew



[Rd] scale(x, center=FALSE) (PR#14219)

2010-02-22 Thread mrizzo
Full_Name: Maria Rizzo
Version: 2.10.1 (2009-12-14) 
OS: Windows XP SP3
Submission from: (NULL) (72.241.75.222)


platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          10.1
year           2009
month          12
day            14
svn rev        50720
language       R
version.string R version 2.10.1 (2009-12-14)

scale returns incorrect values when center=FALSE and scale=TRUE.

When center=FALSE and scale=TRUE, the "scale" used is not the square root of the
sample variance; instead, the "scale" attribute is equal to sqrt(sum(x^2)/(n-1)).

Example:

x <- runif(10)
n <- length(x)

scaled <- scale(x, center=FALSE, scale=TRUE)
scaled
s.bad <- attr(scaled, "scale")
s.bad  #wrong
sd(x)  #correct

#compute the sd as if data has already been centered
#that is, compute the variance as sum(x^2)/(n-1)

sqrt(sum(x^2)/(n-1))
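
For reference, the help page for scale() in current versions of R describes this quantity as the root mean square, used when center=FALSE. A sketch of the difference, and of one way to divide by the sample standard deviation without centering, follows; the data vector and variable names are made up for the example.

```r
x <- c(1, 2, 3, 4)
n <- length(x)

# With center = FALSE, scale = TRUE divides by the root mean square ...
scaled <- scale(x, center = FALSE, scale = TRUE)
rms    <- sqrt(sum(x^2) / (n - 1))
all.equal(attr(scaled, "scaled:scale"), rms)

# ... so to divide by the sample standard deviation without centering,
# pass it explicitly as the 'scale' argument.
scaled_sd <- scale(x, center = FALSE, scale = sd(x))
all.equal(as.vector(scaled_sd), x / sd(x))
```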
