Re: [R] Importing huge XML-Files

2007-09-02 Thread Martin Morgan
Hi Alex --

Try xmlTreeParse with useInternalNodes=TRUE. This will likely allow
you to read the document in easily, even with limited memory (33MB
does not sound too 'huge').

To extract just the node sets you're interested in, use getNodeSet or
xpathApply. This is easier (with a little extra learning, e.g.,
http://www.w3.org/TR/xpath20/#abbrev) and more memory efficient than
creating a DOM parser. Something like (WARNING: not tested!!)

> doc <- xmlTreeParse("doc.xml", useInternalNodes=TRUE)
> res <- getNodeSet(doc, "//MRType1")

will return (pointers to) all the MRType1 nodes. You could then use
sapply and xmlValue, or more directly

> dates <- xpathApply(doc, "//MRType1/date", xmlValue)

and the like to extract specific values.  The examples and help page
for xpathApply are useful to get you going.

Watch out for memory use, and the need to 'free' the original 'doc'
and perhaps other intermediate objects.
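
A minimal sketch of the approach above, under stated assumptions: the
inline sample document and the child element names 'time' and 'value'
are made up for illustration (only MRType1 and date appear in this
thread), so adjust the XPath expressions to the real file. It parses
with internal nodes, pulls each field with one XPath query, assembles
a data frame, and then frees the document.

```r
library(XML)

# Hypothetical sample; the real file's element names may differ.
xml_text <- '<root>
  <measurement>
    <MRType1><date>21.04.2005</date><time>10:00</time><value>2,33</value></MRType1>
    <MRType1><date>22.04.2005</date><time>11:00</time><value>2,41</value></MRType1>
    <MRType2><date>21.04.2005</date></MRType2>
  </measurement>
</root>'

# Keep the tree in C-level (internal) nodes rather than R lists.
doc <- xmlTreeParse(xml_text, asText = TRUE, useInternalNodes = TRUE)

# Extract the text of one child element from every MRType1 node.
field <- function(name) {
  xpathSApply(doc, paste0("//measurement/MRType1/", name), xmlValue)
}

res <- data.frame(
  date  = as.Date(field("date"), format = "%d.%m.%Y"),
  time  = field("time"),
  # The sample values use a decimal comma, so convert before as.numeric().
  value = as.numeric(sub(",", ".", field("value"), fixed = TRUE)),
  stringsAsFactors = FALSE
)

# Release the C-level document once the values are copied into R.
free(doc)
```

This assumes every MRType1 node has each child exactly once; if some
children can be missing, the columns would no longer line up and it is
safer to loop over the result of getNodeSet instead.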

Martin

Alexander Heidrich <[EMAIL PROTECTED]> writes:

> Dear all,
>
> for my diploma thesis I have to import huge XML-Files into R for  
> statistical processing - huge means a size about 33 MB.
>
> I'm using the XML-Package version 1.9
>
> Since reading the complete file into R via xmlTreeParse doesn't
> work or is too slow, I'm trying to use xmlEventParse, but I got
> completely stuck.
>
> I have many different types of nodes
>
> + 
>
> - 
>   - 
>- 
> - 
>  - 
>   - 
>  21.04.2005
>  10:00
>  1
>  
>  2,33
>  
> 
>
>  - 
>   - 
>  21.04.2005
>  10:00
>  1
>  
>  2,33
>  
> 
>
>   ...
> + 
> + 
>
> I only need the measurement/MRType1 nodes - how can I do this?  
> Currently I am trying the following code:
>
> xmlEventParse("/input.xml", list(startElement=xtract.startElement,  
> text=xtract.text), useTagName=TRUE, addContext = FALSE)
>
> xtract.startElement <- function(name,attr){
>   startElement.name <<- c(startElement.name,name)
>   }
>
> xtract.text <- function(text) {
>   startElement.value <<- c(startElement.value,text)
>   }
>
> This only gives me two lists, one with all the node names (even the
> ones I don't need) and one with the values (also including the
> ones I don't need), but I can't put things together this way.
>
> What I want is:
>
> No.   Date   Time   Plotcode   collarcode   value   depth
> 1     ...    ...    ...        ...          ...     ...
> 2     ...    ...    ...        ...          ...     ...
>
> Any help is really really appreciated. I tried the whole week,
> starting with xmlTreeParse, which works fine for files with 200
> entries, but for files with 5 entries it keeps crashing my Core 2
> Duo, 2.4 GHz machine.
>
> Thanks so much in advance! If you need any further information, code  
> snippets or XML file details please do not hesitate to mail!
>
> Alex
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R and Web Applications

2007-08-30 Thread Martin Morgan
Hi Chris --

RWebServices provides a way to expose R functionality as (SOAP-based)
web services.

http://bioconductor.org/packages/2.0/bioc/html/RWebServices.html

This might be more than you are looking for, and has not been tested
on MacOS (which seems to be your platform).

Martin

"Chris Parkin" <[EMAIL PROTECTED]> writes:

> Hello,
>
> I'm curious to know how people are calling R from web applications (I've
> been looking for Perl but I'm open to other languages).  After doing a
> search, I came across the R package "RSPerl", but I'm having difficulties
> getting it installed (on Mac OSX).  I believe the problem probably has to do
> with changes in R since the package release.  Below you will see where the
> installation process comes to an end.  Does anyone have any suggestions, or
> perhaps a direction to point me in?
>
> Thanks in advance for your insight!
>
> Chris
>
> * Installing to library '/Library/Frameworks/R.framework/Resources/library'
> * Installing *source* package 'RSPerl' ...
> checking for perl... /usr/bin/perl
> No support for any of the Perl modules from calling Perl from R.
> *
>
>Set PERL5LIB to
> /Library/Frameworks/R.framework/Versions/2.5/Resources/library/RSPerl/perl
>
> *
> Testing: -F/Library/Frameworks/R.framework/.. -framework R
> Using '/usr/bin/perl' as the perl executable
> Perl modules (no):
> Adding R package to list of Perl modules to enable callbacks to R from Perl
> Creating the C code for dynamically loading modules with native code for
> Perl:  R
> modules:   R; linking:
> checking for gcc... gcc
> checking for C compiler default output file name... a.out
> checking whether the C compiler works... yes
> checking whether we are cross compiling... no
> checking for suffix of executables...
> checking for suffix of object files... o
> checking whether we are using the GNU C compiler... yes
> checking whether gcc accepts -g... yes
> checking for gcc option to accept ISO C89... none needed
> Support R in Perl: yes
> configure: creating ./config.status
> config.status: creating src/Makevars
> config.status: creating inst/scripts/RSPerl.csh
> config.status: creating inst/scripts/RSPerl.bsh
> config.status: creating src/RinPerlMakefile
> config.status: creating src/Makefile.PL
> config.status: creating cleanup
> config.status: creating src/R.pm
> config.status: creating R/perl5lib.R
> making target all in RinPerlMakefile
> RinPerlMakefile:5: /Library/Frameworks/R.framework/Resources/etc/Makeconf:
> No such file or directory
> make: *** No rule to make target
> `/Library/Frameworks/R.framework/Resources/etc/Makeconf'.  Stop.
>


Re: [R] Rmpi and x86

2007-08-28 Thread Martin Morgan
Edna --

I'll keep this on the list, so that others will learn, and others will
correct me when I give bad advice!

> relocation R_X86_64_32 against `lam_mpi_comm_world' can not be used
> when making a shared object; recompile with -fPIC

This likely means that your LAM was not built with the --enable-shared
configure option, as documented in the installation guide (the linker
is picking up the static archive libmpi.a, which was apparently
compiled without -fPIC). It might also mean that R was not configured
with --enable-R-shlib; the message is opaque to me.

I believe others on the list have only had success with specific
versions of LAM/MPI, so that would be the next place to look (after
sorting out the shared library issue).

Martin

"Edna Bell" <[EMAIL PROTECTED]> writes:

> Here is what happens:
> Note: lam-7.1.4
>
> linux-tw9c:/home/bell/Desktop/R-2.5.1/bin # ./R CMD INSTALL --clean
> Rmpi_0.5-3.tar.gz
> * Installing to library '/home/bell/Desktop/R-2.5.1/library'
> * Installing *source* package 'Rmpi' ...
> checking for gcc... gcc
> checking for C compiler default output... a.out
> checking whether the C compiler works... yes
> checking whether we are cross compiling... no
> checking for suffix of executables...
> checking for suffix of object files... o
> checking whether we are using the GNU C compiler... yes
> checking whether gcc accepts -g... yes
> checking for gcc option to accept ANSI C... none needed
> checking how to run the C preprocessor... gcc -E
> checking for egrep... grep -E
> checking for ANSI C header files... yes
> checking for sys/types.h... yes
> checking for sys/stat.h... yes
> checking for stdlib.h... yes
> checking for string.h... yes
> checking for memory.h... yes
> checking for strings.h... yes
> checking for inttypes.h... yes
> checking for stdint.h... yes
> checking for unistd.h... yes
> checking mpi.h usability... yes
> checking mpi.h presence... yes
> checking for mpi.h... yes
> Try to find libmpi or libmpich ...
> checking for main in -lmpi... yes
> Try to find liblam ...
> checking for main in -llam... yes
> checking for openpty in -lutil... yes
> checking for main in -lpthread... yes
> configure: creating ./config.status
> config.status: creating src/Makevars
> ** libs
> gcc -std=gnu99 -I/home/bell/Desktop/R-2.5.1/include
> -I/home/hodgesse/Desktop/R-2.5.1/include -DPACKAGE_NAME=\"\"
> -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\"
> -DPACKAGE_BUGREPORT=\"\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1
> -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1
> -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1
> -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1   -DMPI2 -I/usr/local/include
> -fpic  -g -O2 -c conversion.c -o conversion.o
> gcc -std=gnu99 -I/home/bell/Desktop/R-2.5.1/include
> -I/home/bell/Desktop/R-2.5.1/include -DPACKAGE_NAME=\"\"
> -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\"
> -DPACKAGE_BUGREPORT=\"\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1
> -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1
> -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1
> -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1   -DMPI2 -I/usr/local/include
> -fpic  -g -O2 -c internal.c -o internal.o
> gcc -std=gnu99 -I/home/bell/Desktop/R-2.5.1/include
> -I/home/bell/Desktop/R-2.5.1/include -DPACKAGE_NAME=\"\"
> -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\"
> -DPACKAGE_BUGREPORT=\"\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1
> -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1
> -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1
> -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1   -DMPI2 -I/usr/local/include
> -fpic  -g -O2 -c RegQuery.c -o RegQuery.o
> gcc -std=gnu99 -I/home/bell/Desktop/R-2.5.1/include
> -I/home/bell/Desktop/R-2.5.1/include -DPACKAGE_NAME=\"\"
> -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\"
> -DPACKAGE_BUGREPORT=\"\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1
> -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1
> -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1
> -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1   -DMPI2 -I/usr/local/include
> -fpic  -g -O2 -c Rmpi.c -o Rmpi.o
> gcc -std=gnu99 -shared -L/usr/local/lib64 -o Rmpi.so conversion.o
> internal.o RegQuery.o Rmpi.o -lmpi -llam -lutil -lpthread
> /usr/lib64/gcc/x86_64-suse-linux/4.1.0/../../../../x86_64-suse-linux/bin/ld:
> /usr/lib64/gcc/x86_64-suse-linux/4.1.0/../../../../lib64/libmpi.a(infoset.o):
> relocation R_X86_64_32 against `lam_mpi_comm_world' can not be used
> when making a shared object; recompile with -fPIC
> /usr/lib64/gcc/x86_64-suse-linux/

Re: [R] Rmpi and x86

2007-08-28 Thread Martin Morgan
Hi Edna --

I have Rmpi 0.5-3 under R 2.5.1 with LAM 7.1.2 installed on an x86_64
SuSE 10.0. 

I installed (as a regular user, to my own disc space) LAM and ran
through some basic checks (lamboot / lamhalt, checking that I could
compile the demo programs)

After downloading Rmpi_0.5-3.tar.gz, I did

CC=mpicc R CMD INSTALL --clean Rmpi_0.5-3.tar.gz

the configure script of Rmpi found libmpi and liblam in my LAMHOME,
and also -lutil and -lpthread. Source files compiled without any
issues, and the package installed. If I have not issued a lamboot
command, inside R,

> library(Rmpi)

loads the library and indicates that it is starting lam. If I have
already issued lamboot, then library(Rmpi) loads as expected.

Where do things go wrong for you?

Martin

"Edna Bell" <[EMAIL PROTECTED]> writes:

> Dear R Gurus:
>
> Is there a problem with Rmpi on x86 with SUSE 10.1, please?
>
> I've tried everything and it still won't load.
>
> Has anyone else dealt with this please?
>
> Thanks,
> Edna Bell


Re: [R] Possible memory leak with R v.2.5.0

2007-08-16 Thread Martin Morgan
 ratios=get.global("ratios"),  var.norm=T,
>  r.sig=get.global( "r.sig" ),
>  allow.anticor=get.global( "allow.anticor" )
>  ) {
>cat( "phw dbg msg\n")
>cluster <<- cluster
>opt <- match.arg( opt )
>rows <- cluster$rows
>cols <- cluster$cols
>if ( opt == "rows" ) {
>  cat( "phw dbg msg: if opt == rows\n" )
>  r <- ratios[ rows, cols ]
>  r.all <- ratios[ genes, cols ]
>  avg.rows <- apply( r, 2, mean, na.rm=T ) ##median )
>  rm( r )  # phw added 8/9/07
>  gc( reset=TRUE ) # phw added 8/9/07
>  devs <- apply( r.all, 1, "-", avg.rows )
>  if ( !allow.anticor ) rm( r.all, avg.rows )  # phw added 8/9/07
>  gc( reset=TRUE ) #  phw added 8/9/07
>  cat( "phw dbg msg: finished calc'ing avg.rows & devs\n" )
>  ## This is what we'd use from the deHoon paper (bioinformatics/bth927)
>  ##sd.rows <- apply( r, 2, sd )
>  ##devs <- devs * devs
>  ##sd.rows <- sd.rows * sd.rows
>  ##sds <- apply( devs, 2, "/", sd.rows )
>  ##sds <- apply( sds, 2, sum )
>  ##return( log10( sds ) )
>  ## This is faster and nearly equivalent
>  vars <- apply( devs, 2, var, na.rm=T )
>  rm( devs )
>  gc( reset=TRUE ) #  phw added 8/9/07
>  test <- log10( vars ) #  phw added 8/9/07
>  rm( vars ) #  phw added 8/9/07
>  gc( reset=TRUE ) #  phw added 8/9/07
>  vars <- log10( test ) #  phw added 8/9/07
>  rm( test ) #  phw added 8/9/07
>  gc( reset=TRUE ) #  phw added 8/9/07
>  #vars <- log10( vars )
>  cat( "phw dbg msg: finished calc'ing vars (\n" )
>  ## HOW TO ALLOW FOR ANTICOR??? Here's how:
>  if ( allow.anticor ) {
>cat( "phw dbg msg: allow.anticor==T\n" )
>## Get variance against the inverse of the mean profile
>devs.2 <- apply( r.all, 1, "-", -avg.rows )
>gc( reset=TRUE ) #  phw added 8/9/07
>vars.2 <- apply( devs.2, 2, var, na.rm=T )
>rm( devs.2 )
>gc( reset=TRUE ) #  phw added 8/9/07
>vars.2 <- log10( vars.2 )
>gc( reset=TRUE ) #  phw added 8/9/07
>## For each gene take the min of variance or anti-cor variance
>vars <- cbind( vars, vars.2 )
>rm( vars.2 )
>gc( reset=TRUE ) #  phw added 8/9/07
>vars <- apply( vars, 1, min )
>gc( reset=TRUE ) #  phw added 8/9/07
>  }
>  ## Normalize the values by the variance over the rows in the cluster
>  if ( var.norm ) {
>cat( "phw dbg msg: var.norm == T \n")
>vars <- vars - mean( vars[ rows ], na.rm=T )
>tmp.sd <- sd( vars[ rows ], na.rm=T )
>if ( !is.na( tmp.sd ) && tmp.sd != 0 ) vars <- vars / ( tmp.sd + r.sig )
>  }
>  gc( reset=TRUE ) #  phw added 8/9/07
>  return( vars )
>} else {
>  cat( "phw dbg msg: else\n" )
>  r.all <- ratios[ rows, ]
>  ## Mean-normalized variance
>  vars <- log10( apply( r.all, 2, var, na.rm=T ) / abs( apply( r.all, 2, mean, na.rm=T ) ) )
>  names( vars ) <- colnames( ratios )
>  ## Normalize the values by the variance over the rows in the cluster
>  if ( var.norm ) {
>vars <- vars - mean( vars[ cluster$cols ], na.rm=T )
>tmp.sd <- sd( vars[ cluster$cols ], na.rm=T )
>if ( !is.na( tmp.sd ) && tmp.sd != 0 ) vars <- vars / ( tmp.sd + r.sig )
>  }
>  return( vars )
>}
>  },


Re: [R] getting lapply() to work for a new class

2007-08-15 Thread Martin Morgan
Pijus,

My 2 cents, which might be post-hoc rationalization.

If your class 'is' a list, with additional information, then

setClass("Test", contains="list")

and you're ready to go.

On the other hand, if your class 'has' a list, along with other
information, create an accessor

setGeneric("test", function(object) standardGeneric("test"))
setMethod("test",
  signature=signature(object="Test"),
  function(object) slot(object, "test"))

obj <- new("Test", test=list(1,2,3))
lapply(test(obj), print)

(You could define a generic on 'lapply' and a method that dispatches
to your class, but that implies that Test 'is' a list).
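
The two designs can be put side by side in a short sketch. The class
names TestIs and TestHas and the extra 'label' slot are hypothetical,
chosen only so that both variants can live in one session; the 'has'
variant also needs a class definition with a 'test' slot, which the
snippet above leaves implicit.

```r
library(methods)

## Variant 1: the class 'is' a list, so lapply() works on it directly.
setClass("TestIs", contains = "list")
is_obj <- new("TestIs", list(1, 2, 3))
doubled <- lapply(is_obj, function(x) x * 2)

## Variant 2: the class 'has' a list in a slot, reached via an accessor.
setClass("TestHas", representation(test = "list", label = "character"))
setGeneric("test", function(object) standardGeneric("test"))
setMethod("test", "TestHas", function(object) object@test)

has_obj <- new("TestHas", test = list(1, 2, 3), label = "demo")
squared <- lapply(test(has_obj), function(x) x^2)
```

In the second variant lapply() only ever sees an ordinary list, so no
coercion methods or namespace tricks are needed.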

Martin


"Pijus Virketis" <[EMAIL PROTECTED]> writes:

> Thank you.
>
> When I tried to set as.list() in baseenv(), I learned that its bindings
> are locked. Does this mean that the thing to do is just to write my own
> "lapply", which does the coercion using my "private" as.list(), and then
> invokes the base lapply()?
>
> -P
>
> -Original Message-
> From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] 
> Sent: Wednesday, August 15, 2007 5:18 PM
>
>> As far as I can tell, lapply() needs the class to be coercible to a 
>> list. Even after I define as.list() and as.vector(x, mode="list") 
>> methods, though, I still get an "Error in as.vector(x, "list") : 
>> cannot coerce to vector". What am I doing wrong?
>
> Not considering namespaces.  Setting an S4 method for as.list() creates
> an object called as.list in your workspace, but the lapply function uses
> the as.list in the base namespace.  That's the whole point of
> namespaces: to protect code against redefining functions.


Re: [R] initalizing and checking validity of S4 classes

2007-07-24 Thread Martin Morgan
Hi Michal --

Add validObject to your initialize method:

> setMethod("initialize", "someclass",
+ function(.Object, v=numeric(0), l=character(0))
+ {
+ # strip the vector names
+ cv <- v
+ cl <- l
+ names(cv) <- NULL
+ names(cl) <- NULL
+ rval <- .Object
+ rval@v <- cv
+ rval@l <- cl
+ validObject(rval)
+ rval
+ } )
[1] "initialize"
> new("someclass", v=1:2, l=letters[1:3])
Error in validObject(rval) : 
  invalid class "someclass" object: lengths dont match

Note that without initialize, we have:

> setClass("A",
+  representation=representation(
+v="numeric", l="character"))
[1] "A"
> setValidity("A", f)

Slots:
  
Name:  v l
Class:   numeric character
> new("A", v=1:2, l=letters[1:3])
Error in validObject(.Object) : 
  invalid class "A" object: lengths dont match
 
Here are two interpretations of this. (1) using 'initialize' means
that you are taking control of the initialization process, and hence
know when you need to call validObject. (2) Responsibility for object
validity is ambiguous -- does it belong with 'new', 'initialize', or a
'constructor' that the programmer might write? This is particularly
problematic with R's copy semantics, where creating transiently
invalid objects seems to be almost necessary (e.g., callNextMethod()
in 'initialize' might initialize the inherited slots of the object,
but the object itself is of the derived class and could well be
invalid after the base class has finished with initialize).
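
One way to sidestep the question, sketched here rather than offered
as the definitive pattern, is to let the default initialize method
fill the slots via callNextMethod(); the default method runs
validObject whenever slot values are supplied. The class name Paired
is hypothetical and used only for this illustration.

```r
library(methods)

setClass("Paired",
         representation(v = "numeric", l = "character"),
         validity = function(object) {
           if (length(object@v) != length(object@l))
             "lengths dont match"
           else TRUE
         })

setMethod("initialize", "Paired",
          function(.Object, ..., v = numeric(0), l = character(0)) {
            ## Strip the vector names, then hand the slots to the default
            ## method, which assigns them and calls validObject().
            callNextMethod(.Object, ..., v = unname(v), l = unname(l))
          })

## Equal lengths: the object is created and the names are stripped.
ok <- new("Paired", v = c(a = 1, b = 2), l = c("x", "y"))

## Unequal lengths: the validity failure surfaces as an error.
bad <- tryCatch(new("Paired", v = 1:2, l = letters[1:3]),
                error = function(e) conditionMessage(e))
```

Here 'bad' holds the error message instead of an object, so the
validity function did run even though initialize was overridden.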

Martin

"Bojanowski, M.J.  (Michal)" <[EMAIL PROTECTED]> writes:

> Dear useRs and wizaRds,
>
> I am currently developing a set of functions using S4 classes. On the way I 
> encountered the problem exemplified with the code below. For some reason the 
> 'validity' method does not seem to work, i.e. does not check for errors in 
> the specification of the slots of the defined class. Any hints?
>
> My understanding of the whole S4 system was that validity checks are made 
> *after* class initialization. Is that correct?
>
> Thanks a lot in advance!
>
> PS. Session info:
>
> R version 2.5.1 (2007-06-27) 
> i386-pc-mingw32 
>
> locale:
> LC_COLLATE=Polish_Poland.1250;LC_CTYPE=Polish_Poland.1250;LC_MONETARY=Polish_Poland.1250;LC_NUMERIC=C;LC_TIME=Polish_Poland.1250
>
> attached base packages:
> [1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods"  
> [7] "base" 
>> 
>
>
>
>
> -b-e-g-i-n--r--c-o-d-e-
>
> setClass( "someclass", representation(v="numeric", l="character"),
> prototype( v=numeric(0),
>   l=character(0) )
> )
>
> setMethod("initialize", "someclass",
> function(.Object, v=numeric(0), l=character(0))
> {
> # strip the vector names
> cv <- v
> cl <- l
> names(cv) <- NULL
> names(cl) <- NULL
> rval <- .Object
> rval@v <- cv
> rval@l <- cl
> rval
> } )
>
> # at this point this should be OK
> o <- new("someclass", v=1:2, l=letters[1:3])
> o
>
> # check validity
> f <- function(object)
> {
> rval <- NULL
> if( length(object@v) != length(object@l) )
>   rval <- c( rval, "lengths dont match")
> if( is.null(rval) ) return(TRUE)
> else return(rval)
> }
>  
> # this should return error description
> f(o)
>
>
> # define the validity method
> setValidity( "someclass", f)
>
> # this should return an error
> new("someclass", v=1:2, l=letters[1:3])
>
> # but it doesn't...
>
> -e-n-d--r--c-o-d-e-
>
>
>
>
> 
> Michal Bojanowski
> ICS / Department of Sociology
> Utrecht University
> Heidelberglaan 2; 3584 CS Utrecht
> Room 1428
> m.j.bojanowski at uu dot nl
> http://www.fss.uu.nl/soc/bojanowski/


Re: [R] Rmpi installation error

2007-07-23 Thread Martin Morgan
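Sunitha -- I think Rmpi finds an MPICH1 installation, rather than
MPICH2: the headers come from /usr/local/mpich-1.2.7p1/include, and
the identifiers the compiler reports as undeclared (MPI_UNIVERSE_SIZE,
MPI_Comm_spawn and friends) were introduced with MPI-2. Martin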

"Sunithak" <[EMAIL PROTECTED]> writes:

> Hello everybody,
> I am trying to install Rmpi on 64-bit HP-Proliant server. I am facing the
> following error:
>
>
> * Installing *source* package 'Rmpi' ...
> Try to find mpi.h ...
> Found in /usr/local/mpich-1.2.7p1/include
> Try to find libmpi or libmpich ...
> Found libmpich in /usr/local/mpich-1.2.7p1/lib
> Try to find liblam ...
> checking for main in -llam... no
> liblam not found. Probably not LAM-MPI
> checking for openpty in -lutil... no
> checking for main in -lpthread... no
> configure: creating ./config.status
> config.status: creating src/Makevars
> ** libs
> gcc -I/usr/local/lib64/R/include -I/usr/local/lib64/R/include -DPACKAGE_NAME
> =\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" -D
> PACKAGE_BUGREPORT=\"\"  -I/usr/local/mpich-1.2.7p1/include -DMPI2 -I/usr/loc
> al/include   -fpic  -g -O2 -std=gnu99 -c conversion.c -o conversion.o
> gcc -I/usr/local/lib64/R/include -I/usr/local/lib64/R/include -DPACKAGE_NAME
> =\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" -D
> PACKAGE_BUGREPORT=\"\"  -I/usr/local/mpich-1.2.7p1/include -DMPI2 -I/usr/loc
> al/include   -fpic  -g -O2 -std=gnu99 -c internal.c -o internal.o
> gcc -I/usr/local/lib64/R/include -I/usr/local/lib64/R/include -DPACKAGE_NAME
> =\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" -D
> PACKAGE_BUGREPORT=\"\"  -I/usr/local/mpich-1.2.7p1/include -DMPI2 -I/usr/loc
> al/include   -fpic  -g -O2 -std=gnu99 -c RegQuery.c -o RegQuery.o
> gcc -I/usr/local/lib64/R/include -I/usr/local/lib64/R/include -DPACKAGE_NAME
> =\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" -D
> PACKAGE_BUGREPORT=\"\"  -I/usr/local/mpich-1.2.7p1/include -DMPI2 -I/usr/loc
> al/include   -fpic  -g -O2 -std=gnu99 -c Rmpi.c -o Rmpi.o
> Rmpi.c: In function `mpi_universe_size':
> Rmpi.c:84: warning: implicit declaration of function `MPI_Comm_get_attr'
> Rmpi.c:84: error: `MPI_UNIVERSE_SIZE' undeclared (first use in this
> function)
> Rmpi.c:84: error: (Each undeclared identifier is reported only once
> Rmpi.c:84: error: for each function it appears in.)
> Rmpi.c: In function `mpi_comm_spawn':
> Rmpi.c:873: warning: implicit declaration of function `MPI_Comm_spawn'
> Rmpi.c:873: error: `MPI_ARGV_NULL' undeclared (first use in this function)
> Rmpi.c: In function `mpi_comm_get_parent':
> Rmpi.c:897: warning: implicit declaration of function `MPI_Comm_get_parent'
> Rmpi.c: In function `mpi_comm_disconnect':
> Rmpi.c:910: warning: implicit declaration of function `MPI_Comm_disconnect'
> make: *** [Rmpi.o] Error 1
> chmod: cannot access `/usr/local/lib64/R/library/Rmpi/libs/*': No such file
> or directory
> ERROR: compilation failed for package 'Rmpi'
> ** Removing '/usr/local/lib64/R/library/Rmpi'
>
>
> can any of you please suggest as to what needs to be done?
>
> Thanks for your time
> regards
> Sunitha Manjari
> Bioinformatics Team
> C-DAC, Pune-411007
> Maharashtra
> INDIA


Re: [R] how to revert to an older limma version?

2007-07-08 Thread Martin Morgan
"Maya Bercovich" <[EMAIL PROTECTED]> writes:

> Dear Sirs,

Please post to the Bioconductor list (see http://bioconductor.org for
instructions)

> How can I revert to an older limma version?
> Typing "install.packages("limma")"  in R gives a list of mirrors. How
> can I install the version I want after I obtain and untar the file (e.g,
> limma_2.9.1.tar.gz)?
> I am running R 2.5.0 on a Linux machine (CentOS 5).  When using limma it
> will not go past the read.maimages command. 
> I get this error:
> Error in readGenericHeader(fullname, columns = columns, sep = sep) :
>  Specified column headings not found in file
>  In addition: Warning message:
> input string 1 is invalid in this locale in: grep(pattern, x,
> ignore.case, extended, value, fixed, useBytes)

Please provide a short section of code that shows how you invoke the
function. It is very hard to tell what the cause of your problem is
without this.

read.maimages has a 'columns' argument. Do you supply it? If so, does
it contain the correct column names for the file type you are reading?
For instance, is the capitalization correct? The help page for
read.maimages provides some guidance.

> I was told by a colleague that this may be due to my limma version. 
> I try to use limma 2.10.5 and he uses 2.9.1

Specific Bioconductor package versions work with specific R
versions. Instead of using install.packages, use

> source("http://bioconductor.org/biocLite.R")
> biocLite("limma")

to get the right version for your R. For R 2.5.0, limma 2.10.5 is the
correct version. The limma author is very responsive to bug reports, so
seek additional help and if necessary report bugs rather than revert
to previous versions.

> Could this be the reason?

Please always provide the output of the command

> sessionInfo()

to provide a concise summary of your system.

> Thanks in advance
> Maya


Re: [R] Lookups in R

2007-07-04 Thread Martin Morgan
Michael,

A hash provides constant-time access, though the resulting perl-esque
data structures (a hash of lists, e.g.) are not convenient for other
manipulations

> n_accts <- 10^3
> n_trans <- 10^4
> t <- list()
> t$amt <- runif(n_trans)
> t$acct <- as.character(round(runif(n_trans, 1, n_accts)))
> 
> uhash <- new.env(hash=TRUE, parent=emptyenv(), size=n_accts)
> ## keys, presumably account ids
> for (acct in as.character(1:n_accts)) uhash[[acct]] <- list(amt=0, n=0)
> 
> system.time(for (i in seq_along(t$amt)) {
+ acct <- t$acct[i]
+ x <- uhash[[acct]]
+ uhash[[acct]] <- list(amt=x$amt + t$amt[i], n=x$n + 1)
+ })
   user  system elapsed 
  0.264   0.000   0.262 
> udf <- data.frame(amt=0, n=rep(0L, n_accts),
+   row.names=as.character(1:n_accts))
> system.time(for (i in seq_along(t$amt)) {
+ idx <- row.names(udf)==t$acct[i]
+ udf[idx, ] <- c(udf[idx,"amt"], udf[idx, "n"]) + c(t$amt[i], 1)
+ })
   user  system elapsed 
 18.398   0.000  18.394 
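
For comparison, the vectorized route Peter suggests in the quoted
message below avoids the per-transaction lookup altogether. This is a
small sketch with made-up data: the 'users' and 'transactions' tables
and their column names are assumptions for illustration, and users
without any transaction are handled by fixing the factor levels.

```r
## Hypothetical toy data standing in for the real tables.
users <- data.frame(id = as.character(1:5), amt = 0,
                    stringsAsFactors = FALSE)
transactions <- data.frame(
  userid = c("2", "4", "2", "5", "4", "2"),
  amount = c(10, 1, 5, 2.5, 3, 0.5),
  stringsAsFactors = FALSE
)

## One pass over the transactions: sum the amounts per user id.
## Fixing the levels keeps users with no transactions in the result.
tbl <- with(transactions,
            tapply(amount, factor(userid, levels = users$id), sum))
tbl[is.na(tbl)] <- 0  # users without transactions come back as NA

## Look the totals up by name and update the users table in one step.
users$amt <- users$amt + as.vector(tbl[users$id])
```

The totals are computed once per user rather than once per
transaction, which is usually what matters at 1M rows.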

Peter Dalgaard <[EMAIL PROTECTED]> writes:

> mfrumin wrote:
>> Hey all; I'm a beginner++ user of R, trying to use it to do some processing
>> of data sets of over 1M rows, and running into a snafu.  imagine that my
>> input is a huge table of transactions, each linked to a specif user id.  as
>> I run through the transactions, I need to update a separate table for the
>> users, but I am finding that the traditional ways of doing a table lookup
>> are way too slow to support this kind of operation.
>>
>> i.e:
>>
>> for(i in 1:100) {
>>userid = transactions$userid[i];
>>amt = transactions$amounts[i];
>>users[users$id == userid,'amt'] += amt;
>> }
>>
>> I assume this is a linear lookup through the users table (in which there are
>> 10's of thousands of rows), when really what I need is O(constant time), or
>> at worst O(log(# users)).
>>
>> is there any way to manage a list of ID's (be they numeric, string, etc) and
>> have them efficiently mapped to some other table index?
>>
>> I see the CRAN package for SQLite hashes, but that seems to be going a bit
>> too far.
>>   
> Sometimes you need a bit of lateral thinking. I suspect that you could 
> do it like this:
>
> tbl <- with(transactions, tapply(amount, userid, sum))
> users$amt <- users$amt + tbl[users$id]
>
> one catch is that there could be users with no transactions, in which 
> case you may need to replace userid by factor(userid, levels=users$id). 
> None of this is tested, of course.


Re: [R] Loading problem with XML_1.9

2007-06-28 Thread Martin Morgan
Weijun --

If memory is a problem, you might try using the 'handlers' argument of
xmlTreeParse. This provides access to each node as it is processed, so
that you can, for instance, choose to ignore nodes, or save only
numeric values, or ... I'm not sure whether the entire document is
read into a C 'external pointer', or whether the savings is just in
the R representation of the document.

Also, depending on how you use the resulting document, you might want
to watch out for the memory leak mentioned in
http://www.omegahat.org/RSXML/Changes

Martin

Luo Weijun <[EMAIL PROTECTED]> writes:

> Hello all,
> I have loading problem with XML_1.9 under 64 bit
> R2.3.1, which I got from http://R.research.att.com/.
> XML_1.9 works fine under 32 bit R2.5.0. I thought that
> could be installation problem, and I tried
> install.packages or biocLite, every time the package
> installed fine, except some warning messages below:
> ld64 warning: in /usr/lib/libxml2.dylib, file does not
> contain requested architecture
> ld64 warning: in /usr/lib/libz.dylib, file does not
> contain requested architecture
> ld64 warning: in /usr/lib/libiconv.dylib, file does
> not contain requested architecture
> ld64 warning: in /usr/lib/libz.dylib, file does not
> contain requested architecture
> ld64 warning: in /usr/lib/libxml2.dylib, file does not
> contain requested architecture
>
> Here is the error messages I got, when XML is loaded:
>> library(XML)
> Error in dyn.load(x, as.logical(local),
> as.logical(now)) : 
> unable to load shared library
> '/usr/local/lib64/R/library/XML/libs/XML.so':
>   dlopen(/usr/local/lib64/R/library/XML/libs/XML.so,
> 6): Symbol not found: _xmlMemDisplay
>   Referenced from:
> /usr/local/lib64/R/library/XML/libs/XML.so
>   Expected in: flat namespace
> Error: .onLoad failed in 'loadNamespace' for 'XML'
> Error: package/namespace load failed for 'XML'
>
> I understand that it has been pointed out that
> Sys.getenv("PATH") needs to be revised in the file
> XML/R/zzz.R, but I can't even find that file under
> XML/R/ directory. Does anybody have any idea what
> might be the problem, and how to solve it? Thanks a
> lot!
> BTW, the reason I need to use R64 is that I have
> memory limitation issue with R 32 bit version when I
> load some very large XML trees. 
>
> Session information
>> sessionInfo()
> Version 2.3.1 Patched (2006-06-27 r38447) 
> powerpc64-apple-darwin8.7.0 
>
> attached base packages:
> [1] "methods"   "stats" "graphics"  "grDevices"
> "utils" "datasets" 
> [7] "base" 
>
> Weijun

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



[R] CORRECTION: Re: BioC2007, August 6-7 ** early registration ends July 1 **

2007-06-27 Thread Martin Morgan
BioC2007: Where Software and Biology Connect
August 6-7 in Seattle, WA, USA

The BioC2007 developer-focused meeting is on August 8, the day *after*
the main conference.

In addition, I neglected to mention the poster session on Monday
evening.

Best,

Martin Morgan
-- 
Bioconductor / Computational Biology
http://bioconductor.org


Martin Morgan <[EMAIL PROTECTED]> writes:

> BioC2007: Where Software and Biology Connect
> August 6-7 in Seattle, WA, USA
>
> ** Early registration ends July 1 **
>
> This conference highlights developments in and beyond Bioconductor. A
> major goal is to discuss the use and design of software for analyzing
> high throughput biological data.
>
> Format: 
>   * Scientific talks on both days from 8:30-12:00 
>   * Expanded hands-on lab sessions on both afternoons 1:00-5:30
>   * Technology overviews and social hour in the evening
>
> Get more information and register for the conference at
>
> https://secure.bioconductor.org/BioC2007/
>
> A developer focused meeting is on August 5th. For (preliminary)
> details, please visit:
>
> http://wiki.fhcrc.org/bioc/Seattle_Dev_Meeting_2007
>
> See you in August!
>
> Martin Morgan
> -- 
> Bioconductor / Computational Biology
> http://bioconductor.org

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



[R] BioC2007, August 6-7 ** early registration ends July 1 **

2007-06-26 Thread Martin Morgan
BioC2007: Where Software and Biology Connect
August 6-7 in Seattle, WA, USA

** Early registration ends July 1 **

This conference highlights developments in and beyond Bioconductor. A
major goal is to discuss the use and design of software for analyzing
high throughput biological data.

Format: 
  * Scientific talks on both days from 8:30-12:00 
  * Expanded hands-on lab sessions on both afternoons 1:00-5:30
  * Technology overviews and social hour in the evening

Get more information and register for the conference at

https://secure.bioconductor.org/BioC2007/

A developer focused meeting is on August 5th. For (preliminary)
details, please visit:

http://wiki.fhcrc.org/bioc/Seattle_Dev_Meeting_2007

See you in August!

Martin Morgan
-- 
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] a string to enviroment or function

2007-06-25 Thread Martin Morgan
Weiwei

See ?library and the character.only argument.

> f <- function(x) library(x)
> f("hgu95av2")
Error in library(x) : there is no package called 'x'
> f <- function(x) library(x, character.only=TRUE)
> f("hgu95av2")
> search()
 [1] ".GlobalEnv"        "package:hgu95av2"  "package:stats"
 [4] "package:graphics"  "package:grDevices" "package:utils"
 [7] "package:datasets"  "package:methods"   "Autoloads"
[10] "package:base"

Also

> g <- function(x) as.list(get(paste(x, "ENTREZID", sep="")))
> ll <- g("hgu95av2")
> length(ll)
[1] 12625

and finally reverseSplit in Biobase and revmap in AnnotationDbi might
be helpful (though AnnotationDbi is only available with R-devel and
revmap seems not to be documented).

> res <- revmap(hgu95av2ENTREZID)
> hgu95av2ENTREZID[["1190_at"]]
[1] 5800
> res[["5800"]]
[1] "1190_at"  "32199_at"
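
For anyone without the annotation packages installed, the same two ideas can be tried with base R only. This is a minimal sketch; the helper names load_by_name and get_by_name are made up for illustration, and 'tools' and 'month.name' merely stand in for the annotation package and its ENTREZID object.

```r
# Attach a package whose name is held in a character variable.
load_by_name <- function(pkg) library(pkg, character.only = TRUE)
load_by_name("tools")
attached <- "package:tools" %in% search()

# Build an object name from pieces and fetch the object with get().
get_by_name <- function(prefix, suffix) get(paste(prefix, suffix, sep = ""))
months <- get_by_name("month", ".name")   # same object as month.name
print(attached)
print(head(months, 3))
```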


Martin

"Weiwei Shi" <[EMAIL PROTECTED]> writes:

> then how to do this
>
> f1 <- function(mylab){
>   library(mylab)
>   ...
> }
>
> it seems that if you call
> library("hgu133a") # which is file
> # but
> library(mylab) # even you pass "hgu133a" as parameter, it still
> complains about "mylab" does not exist. It seems that it consider
> mylab as package instead of its value.
>
>
> On 6/25/07, jim holtman <[EMAIL PROTECTED]> wrote:
>> I think that you might want:
>>
>> t0 <- (paste("hgu133a", "ENTREZID", sep=""))
>> xx <- as.list(get(t0)) # make it work like xx<-as.list(hgu133aENTREZID)
>>
>>
>>
>>
>> On 6/25/07, Weiwei Shi <[EMAIL PROTECTED]> wrote:
>> >
>> > Hi,
>> >
>> > I am wondering how to make a function Fun to make the following work:
>> >
>> > t0 <- (paste("hgu133a", "ENTREZID", sep=""))
>> > xx <- as.list(Fun(t0)) # make it work like xx<-as.list(hgu133aENTREZID)
>> >
>> > thanks,
>> > --
>> > Weiwei Shi, Ph.D
>> > Research Scientist
>> > GeneGO, Inc.
>> >
>> > "Did you always know?"
>> > "No, I did not. But I believed..."
>> > ---Matrix III
>> >
>> >
>>
>>
>>
>> --
>> Jim Holtman
>> Cincinnati, OH
>> +1 513 646 9390
>>
>> What is the problem you are trying to solve?
>
>
> -- 
> Weiwei Shi, Ph.D
> Research Scientist
> GeneGO, Inc.
>
> "Did you always know?"
> "No, I did not. But I believed..."
> ---Matrix III

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] extract index during execution of sapply

2007-06-22 Thread Martin Morgan
Christian,

A favorite of mine is to use lexical scope and a 'factory' model:

> fun_factory <- function() {
+     i <- 0               # 'state' variable(s), unique to each fun_factory
+     function(x) {        # fun_factory return value; used as sapply FUN
+         i <<- i + 1      # <<- assignment finds i
+         x^i              # return value of sapply FUN
+     }
+ }
> 
> sapply(1:10, fun_factory())
 [1]           1           4          27         256        3125       46656
 [7]      823543    16777216   387420489 10000000000
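
Two points about the factory may be worth spelling out with a sketch (the variable names f, g and res are just for illustration). Each call to fun_factory() creates its own counter, and when all that is needed is the position, iterating over the indices directly is a simpler alternative to keeping state.

```r
fun_factory <- function() {
    i <- 0
    function(x) {
        i <<- i + 1   # updates the i in the enclosing environment
        x^i
    }
}

# Each factory call has an independent counter.
f <- fun_factory()
g <- fun_factory()
f_vals <- c(f(2), f(2))   # counter of f is now 2
g_val <- g(2)             # counter of g starts from 0 again

# Alternative without state: iterate over the indices themselves.
mylist <- list(a = 3, b = 6, c = 9)
res <- sapply(seq_along(mylist), function(i) mylist[[i]]^i)
print(f_vals)
print(g_val)
print(res)
```

The closure version relies on sapply calling FUN on the elements in order, which holds for sapply but would not be safe with a parallel apply.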


Christian Bieli <[EMAIL PROTECTED]> writes:

> Hi there
> During execution of sapply I want to extract the number of times the 
> function given to sapply has been executed. I came up with:
>
> mylist <- list(a=3,b=6,c=9)
> sapply(mylist,function(x)as.numeric(gsub("[^0-9]","",deparse(substitute(x)))))
>
> This works fine, but looks quite ugly. I'm sure that there's a more 
> elegant way to do this.
>
> Any suggestion?
>
> Christian

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] Rmpi and rsprng for Windows

2007-06-19 Thread Martin Morgan
Eric -

http://www.stats.uwo.ca/faculty/yu/Rmpi/

(the Rmpi package author page) has helpful Windows instructions.

Martin

"Eric Ferreira" <[EMAIL PROTECTED]> writes:

> Dear f_R_iends,
>
> I'm new on parallel programming and trying to use a machine with Windows to
> access a linux computer cluster. I could install the 'snow' package, but not
> 'Rmpi' nor 'rsprng'.
>
> Some tips for installing such packages for Windows R ?
>
> All the best,
>
> -- 
> Barba
> Departamento de Ciências Exatas
> Universidade Federal de Lavras
> Minas Gerais - Brasil

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] determining a parent function name

2007-05-31 Thread Martin Morgan
Hi sundar --

maybe

> myerr <- function(err) err$call
> foo <- function() stop()
> tryCatch({ foo() }, error=myerr)
foo()

suggests a way to catch errors without having to change existing code
or re-invent stop?
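
A slightly fuller sketch of the same idea, wrapped in a helper so it can be reused. The name caller_of is made up here; conditionCall() is the accessor for the 'call' element used above.

```r
# Evaluate an expression; if it signals an error, return the call
# in which the error was raised instead of stopping.
caller_of <- function(expr) {
    tryCatch(expr, error = function(err) conditionCall(err))
}

foo <- function() stop("boom")
cl <- caller_of(foo())
print(cl)                  # the call that raised the error
fname <- deparse(cl[[1]])  # just the function name, as a string
print(fname)
```

Note that stop(..., call. = FALSE) suppresses the call, in which case conditionCall() returns NULL.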

Martin


Sundar Dorai-Raj <[EMAIL PROTECTED]> writes:

> Hi, Vladimir,
>
> Sorry, didn't see this reply. .Traceback <- NULL doesn't work because of 
> the warning in ?traceback.
>
> Warning:
>
>   It is undocumented where '.Traceback' is stored nor that it is
>   visible, and this is subject to change.  Prior to R 2.4.0 it was
>   stored in the workspace, but no longer.
>
> Thanks,
>
> --sundar
>
> Vladimir Eremeev said the following on 5/31/2007 5:10 AM:
>> 
>> 
>> Vladimir Eremeev wrote:
>>> Does
>>>   tail(capture.output(traceback()),n=1)
>>> do what you want?
>>>
>>> that is 
>>>
>> 
>> Hmmm... Seems, no...
>> 
>> Having the earlier error() definition and
>> 
>> bar<-function() error("asdasdf")
>> ft<-function() bar()
>> 
>> 
>> 
>>> ft()
>> 
>> I get in the tcl/tk window:
>> 
>> Error in bar(): asdasdf
>> 
>>> bar()
>> 
>> I get in the tcl/tk window:
>> 
>> Error in ft(): asdasdf
>> 
>>> I get in the tcl/tk window:
>> 
>> Error in bar(): asdasdf
>> 
>> Some kind of the stack flushing is needed.
>> .Traceback<-NULL did not help
>

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] Question about setReplaceMethod

2007-05-24 Thread Martin Morgan
[EMAIL PROTECTED] writes:

> Sorry Martin. I had that line in my actual code. It did not work. I
> looked around all the google but it doesn't seem to have answer to
> this. Any ideas? Thank you.

A more complete solution -- remember to return 'this' throughout (R is
'pass by value' rather than pass by reference, so changes inside
functions are only changes to the _local_ definition). Some of this is
only guessing at your intention...

setClass("Foo",
         representation(x="data.frame", y="character"))
setGeneric("setX<-",
           function(this, value) standardGeneric("setX<-"))
setReplaceMethod("setX",
                 signature=signature("Foo", "data.frame"),
                 function(this, value) {
                     this@x <- value
                     this
                 })

setGeneric("generateFrame",
           function(this) standardGeneric("generateFrame"))

## generateFrame seems like it is not meant to be a replacement method
setMethod("generateFrame",
          signature=signature(this="Foo"),
          function(this) {
              frame <- data.frame(x=letters[1:5], y=letters[5:1])
              setX(this) <- frame # modifies this@x
              this # return 'this', an object of class Foo
          })

foo <- function() {
    objFoo <- new("Foo", x=data.frame(NULL), y="")
    cat("objFoo (after new)\n")
    print(objFoo)
    ## now assign 'frame' the results of generateFrame, i.e., an
    ## object of class 'Foo'. objFoo does not change
    frame <- generateFrame(objFoo)
    cat("frame:\n")
    print(frame)
    ## change the value of objFoo@x
    setX(objFoo) <- data.frame(x=LETTERS[1:5], y=LETTERS[5:1])
    cat("objFoo (after setX):\n")
    print(objFoo)
    ## what to return?? maybe just 'ok', and lose all our changes!
    "ok"
}
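
The pass-by-value point can be checked in a few lines. This is a stripped-down sketch rather than the code above: the helper name modify and the one-column data frame are made up for illustration.

```r
library(methods)

setClass("Foo", representation(x = "data.frame", y = "character"))
setGeneric("setX<-", function(this, value) standardGeneric("setX<-"))
setReplaceMethod("setX", signature("Foo", "data.frame"),
                 function(this, value) {
                     this@x <- value
                     this              # return the modified object
                 })

# A function that modifies its argument only changes a local copy ...
modify <- function(o) {
    setX(o) <- data.frame(a = 1:3)
    o
}

obj <- new("Foo", x = data.frame(), y = "")
modify(obj)                   # result discarded, obj is unchanged
n_before <- nrow(obj@x)
obj <- modify(obj)            # ... so the caller has to keep the result
n_after <- nrow(obj@x)
print(c(n_before, n_after))
```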


>
>  
>
> - adschai
> - Original Message -
> From: Martin Morgan @FHCRC.ORG>
> Date: Thursday, May 24, 2007 9:35 pm
> Subject: Re: [R] Question about setReplaceMethod
> To: [EMAIL PROTECTED]
> Cc: r-help@stat.math.ethz.ch
>> Hi Adschai --
>>
>> You'll want to return the value whose slot you have modified:
>>
>> setReplaceMethod("setX", "foo",
>> function(this,value) {
>> this@x <- value
>> this # add this line
>> })
>>
>> Martin
>>
>> [EMAIL PROTECTED] writes:
>>
>> > Hi
>> >
>> > I have the code like I show below. The problem here is that I
>> have a
>> > setReplacementMethod to set the value of my class slot. However,
>> > this function doesn't work when I call it within another function
>> > definition (declared by setMethod) of the same class. I do not
>> > understand this behavior that much. I'm wondering how to make this
>> > work? Any help would be really appreciated. Thank you.
>> >
>> > setClass("foo",
>> > representation(x="data.frame", y="character"))
>> > setGeneric("setX<-", function(this, value),
>> standardGeneric("setX<-"))
>> > setReplaceMethod("setX", "foo",
>> > function(this,value) {
>> > this@x <- value
>> > })
>> > setGeneric("generateFrame", function(this),
>> standardGeneric("generateFrame"))>
>> setReplaceMethod("generateFrame", "foo",
>> > function(this) {
>> > frame <- read.csv(file="myfile.csv", header=T) # read some
>> input file
>> > this@x <- frame # this doesn't replace the value for me
>> > setX(this) <- frame # this doesn't replace the value for me
>> > frame # instead I have to return the frame object
>> > })
>> > foo <- function(x,y) {
>> > objFoo <- new("foo", x=data.frame(NULL), y="")
>> > frame <- generateFrame(objFoo) # after this point, nothing got
>> assigned to objFoo@x
>> > setX(objFoo) <- frame # this will work (why do I have to
>> duplicate this??)
>> > }
>> > - adschai
>> >
>>
>> --
>> Martin Morgan
>> Bioconductor / Computational Biology
>> http://bioconductor.org
>>

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] Question about setReplaceMethod

2007-05-24 Thread Martin Morgan
Hi Adschai --

You'll want to return the value whose slot you have modified:

setReplaceMethod("setX", "foo",
                 function(this, value) {
                     this@x <- value
                     this   # add this line
                 })

Martin

[EMAIL PROTECTED] writes:

> Hi 
>  
> I have the code like I show below. The problem here is that I have a
> setReplacementMethod to set the value of my class slot. However,
> this function doesn't work when I call it within another function
> definition (declared by setMethod) of the same class. I do not
> understand this behavior that much. I'm wondering how to make this
> work? Any help would be really appreciated. Thank you.
>  
> setClass("foo", 
> representation(x="data.frame", y="character"))
> setGeneric("setX<-", function(this, value), standardGeneric("setX<-"))
> setReplaceMethod("setX", "foo",
> function(this,value) {
> this@x <- value
> })
> setGeneric("generateFrame", function(this), standardGeneric("generateFrame"))
> setReplaceMethod("generateFrame", "foo",
> function(this) {
> frame <- read.csv(file="myfile.csv", header=T) # read some input file
> this@x <- frame # this doesn't replace the value for me
> setX(this) <- frame # this doesn't replace the value for me
> frame # instead I have to return the frame object
> })
> foo <- function(x,y) {
> objFoo <- new("foo", x=data.frame(NULL), y="")
> frame <- generateFrame(objFoo) # after this point, nothing got assigned to objFoo@x
> setX(objFoo) <- frame # this will work (why do I have to duplicate this??)
> }
> - adschai

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] error message

2007-05-22 Thread Martin Morgan
Hi Karen --

This sounds like a Bioconductor question, and should be sent to the
Bioconductor list.

http://www.bioconductor.org/docs/mailList.html

Likely the complaint is about RMySQL being too old, rather than R. The
idea of 'reinstalling folders' doesn't sound like a good strategy for
updating packages; for Bioconductor see

http://www.bioconductor.org/docs/install-howto.html

likely

> source("http://www.bioconductor.org/biocLite.R")
> biocLite("exonmap")

does the trick. If not and the problem seems to be RMySQL, then try

> biocLite("RMySQL")

or in a more robust way update all of your currently installed
packages with

> library("Biobase")
> update.packages(repos=biocReposList())

Finally, please provide a more informative subject line and the output
of

> sessionInfo()

so that the community can get a better understanding of the platform
and packages you're using, and hence the source of your problems.

Best,

Martin

karen power <[EMAIL PROTECTED]> writes:

> Hi, 
>
> I am trying to install the package exonmap and RMySQL however I keep
> getting the following error:
>
> "Error in library(pkg, character.only = TRUE) : 
> 'RMySQL' is not a valid package -- installed < 2.0.0?"
>
> I have R version 2.4.1 so I know its not a version issue. I deleted and
> reinstalled the folders again and the same thing happened. Has anyone
> any ideas?
>
> Thanks, 
>
> Karen

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] "welcome" message upon loading data

2007-05-22 Thread Martin Morgan
Hi Simon --

?data indicates that the first file looked for is mydata.R, and then
mydata.RData and so on. So add a file mydata.R to your data directory that
contains R code to print a message and then loads the data. This hack
is used in Bioconductor, e.g., the Biobase package

http://bioconductor.org/packages/2.0/bioc/html/Biobase.html

to indicate that some data sets are deprecated.
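
As a rough sketch, a data/mydata.R along these lines might do it. The file name, the message wording and the contents of the data frame are all placeholders, and creating the object inline (rather than loading it from another file) is an assumption made to keep the example self-contained. data() sources the script and loads whatever objects it creates.

```r
## data/mydata.R (hypothetical file name)
## Sourced by data(mydata); objects created here are what the user gets.
message("Please do not use these data in any publication ",
        "without permission of the authors")

## Placeholder contents; a real package would build its data set here.
mydata <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.7))
```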

Martin

simon bond <[EMAIL PROTECTED]> writes:

> Dear R-help,
>
> I'm building a package which will contain a data set. I was wondering if it's 
> possible to make a message appear on the console whenever a user loads the 
> data. So the console would look like
>
>>data(mydata)
> "Please do not use these data in any publication without permission of the 
> authors"
>>
>
> Would  this message be within the terms of GPL? 
>
> Looking at the ?data page, it seems "packageIQR"  might be the way forward, 
> but I couldn't find any further information on this.
>
>
> Thanks
>
>
> Simon Bond.
>
>

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] Unit Testing Frameworks: summary and brief discussion

2007-05-09 Thread Martin Morgan
Oops, taking a look at the unit tests in RUnit, I see that specifying
'where=.GlobalEnv' is what I had been missing.

testCreateClass <- function() {
    setClass("A", contains="numeric", where=.GlobalEnv)
    a <- new("A")
    checkTrue(validObject(a))
    removeClass("A", where=.GlobalEnv)
    checkException(new("A"))
}

Executing test function testCreateClass  ...  done successfully.

RUNIT TEST PROTOCOL -- Wed May  9 11:11:27 2007 
*** 
Number of test functions: 1 
Number of errors: 0 
Number of failures: 0 

Sorry for the noise. Martin

Martin Morgan <[EMAIL PROTECTED]> writes:

> [EMAIL PROTECTED] writes:
>
>> [EMAIL PROTECTED] wrote:
>> [...]
>>> = From Seth Falcon:
>>>   1. At last check, you cannot create classes in unit test code and
>>>  this makes it difficult to test some types of functionality.  I'm
>>>  really not sure to what extent this is RUnit's fault as opposed
>>>  to limitation of the S4 implementation in R.
>>
>> I'd be very interested to hear what problems you experienced. If you 
>> have any example ready I'd be happy to take a look at it.
>> So far we have not observed (severe) problems to create S4 classes and 
>> test them in unit test code. We actually use RUnit mainly on S4 classes 
>> and methods. There are even some very simple checks in RUnits own test 
>> cases which create and use S4 classes. For example in tests/runitRunit.r
>> in the source package.
>
> RUnit has been great for me, helping to develop a more rigorous
> programming approach and gaining confidence that my refactoring
> doesn't (unintentionally) break the established contract.
>
> One of the strengths of unit tests -- reproducible and expressible in
> the way that language sometimes is not:
>
> testCreateClass <- function() {
> setClass("A", contains="numeric")
> checkTrue(TRUE)
> }
>
>
> RUNIT TEST PROTOCOL -- Wed May  9 10:36:53 2007 
> *** 
> Number of test functions: 1 
> Number of errors: 1 
> Number of failures: 0 
>
>  
> 1 Test Suite : 
> CreateClass_test - 1 test function, 1 error, 0 failures
> ERROR in testCreateClass: Error in assign(mname, def, where) : cannot add 
> bindings to a locked environment
>
>> sessionInfo()
> R version 2.6.0 Under development (unstable) (2007-05-07 r41468) 
> x86_64-unknown-linux-gnu 
>
> locale:
> LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] "tools" "stats" "graphics"  "grDevices" "utils" "datasets" 
> [7] "methods"   "base" 
>
> other attached packages:
>RUnit
> "0.4.15"
>
> -- 
> Martin Morgan
> Bioconductor / Computational Biology
> http://bioconductor.org

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] Unit Testing Frameworks: summary and brief discussion

2007-05-09 Thread Martin Morgan
[EMAIL PROTECTED] writes:

> [EMAIL PROTECTED] wrote:
> [...]
>> = From Seth Falcon:
>>   1. At last check, you cannot create classes in unit test code and
>>  this makes it difficult to test some types of functionality.  I'm
>>  really not sure to what extent this is RUnit's fault as opposed
>>  to limitation of the S4 implementation in R.
>
> I'd be very interested to hear what problems you experienced. If you 
> have any example ready I'd be happy to take a look at it.
> So far we have not observed (severe) problems to create S4 classes and 
> test them in unit test code. We actually use RUnit mainly on S4 classes 
> and methods. There are even some very simple checks in RUnits own test 
> cases which create and use S4 classes. For example in tests/runitRunit.r
> in the source package.

RUnit has been great for me, helping to develop a more rigorous
programming approach and gaining confidence that my refactoring
doesn't (unintentionally) break the established contract.

One of the strengths of unit tests -- reproducible and expressible in
the way that language sometimes is not:

testCreateClass <- function() {
    setClass("A", contains="numeric")
    checkTrue(TRUE)
}


RUNIT TEST PROTOCOL -- Wed May  9 10:36:53 2007 
*** 
Number of test functions: 1 
Number of errors: 1 
Number of failures: 0 

 
1 Test Suite : 
CreateClass_test - 1 test function, 1 error, 0 failures
ERROR in testCreateClass: Error in assign(mname, def, where) : cannot add 
bindings to a locked environment

> sessionInfo()
R version 2.6.0 Under development (unstable) (2007-05-07 r41468) 
x86_64-unknown-linux-gnu 

locale:
LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C

attached base packages:
[1] "tools" "stats" "graphics"  "grDevices" "utils" "datasets" 
[7] "methods"   "base" 

other attached packages:
   RUnit
"0.4.15"

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] Problem installing Rmpi with lam on SGI SLES9

2007-04-25 Thread Martin Morgan
Hendrik,

Are you starting the lam daemons before starting R?

% lamboot

You might need to specify a 'hosts' argument to lamboot. The default
way Rmpi calls lamboot is with no arguments, and this might simply
create a single lam daemon.

I don't usually use papply, but glancing at its code suggests that it
does require(Rmpi) and then decides based on the result of
mpi.comm.size what to do. So to debug, load Rmpi and try
mpi.comm.size. As a work-around, I think it should be possible to

> library(Rmpi)
> mpi.spawn.Rslaves(nslaves=64) # maybe a little bold, initially!

before making your first papply call.

Hope that helps

Martin

"Hendrik Fuß" <[EMAIL PROTECTED]> writes:

> On 24/04/07, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
>> On Tue, 24 Apr 2007, Hendrik Fuß wrote:
>>
>> > Hi,
>> >
>> > I've been trying here to install Rmpi on an SGI IA-64 machine with 64
>> > processors, running SuSE Linux Enterprise Server 9, R 2.4.0 and
>> > lam-mpi 7.1.3. While I've read of similar problems on this list, I
>> > think I've got an entirely new set of error messages to contribute
>> > (see below). I'm not sure what the actual error is and what the @gprel
>> > relocation message is about. Any help greatly appreciated.
>>
>> I don't know for sure, but on many 64-bit OSes you cannot link code from
>> static libraries into dynamic shared libraries, and that seems to be the
>> case with ia64 Linux.  Almost certainly you need to re-compile LAM with
>> -fPIC flags.
>
> Yes, thanks a million, this solved the problem.
>
> While Rmpi now works, there is another issue connected with this one:
> I'm trying to use the papply package. However, when I do:
>
>> library(papply)
>> papply(list(1:10, 1:15, 1:20), sum)
> 1 slaves are spawned successfully. 0 failed.
> master (rank 0, comm 1) of size 2 is running on: behemoth
> slave1 (rank 1, comm 1) of size 2 is running on: behemoth
> [1] "Running serial version of papply\n"
>
> Papply only spawns one slave and then decides to run the serial
> version instead. I'm not sure how to tell it to use all the 64
> processors available.
>
> Hendrik
>
> -- 
> Hendrik Fuß
> PhD student
> Systems Biology Research Group
>
> University of Ulster, School of Biomedical Sciences
> Cromore Road, Coleraine, BT52 1SA, Northern Ireland
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] R as a server on Linux

2007-04-24 Thread Martin Morgan
RWebServices

http://wiki.fhcrc.org/caBioc/

offers a more structured approach to this -- map R functions and data
classes to their Java representation, expose Java as a web service,
service requests using persistent R workers.

Martin

Markus Loecher <[EMAIL PROTECTED]> writes:

> Hi,
> I am trying to avoid the somewhat costly startup overhead of launching 
> a separate R executable for each "client" request on Linux.
> My current architecture is such that my Java client explicitly calls 
> R in batch mode and passes it certain parameters. The initial 
> startup takes almost 10 seconds because R has to load a bunch of 
> libraries as well as a moderately large, previously created workspace.
> I am thinking that it would be so much more efficient to instead have 
> R act as a server and fork off a thread for each client query. Is 
> that possible at all?
>
> Thanks!
> Markus
>

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] Error in clusterApply(): recursive default argument reference

2007-04-24 Thread Martin Morgan
Hi Nicolas --

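I think your code is assuming that all nodes have access to the same
set of variables. One solution is to write in a more completely
'functional' style, passing everything the worker function needs as
arguments (and supplying cl explicitly in the call, since a default of
cl=cl refers to itself and produces the 'recursive default argument
reference' error)

parNullDistribIntersection <- function(g1, g2, perm=1000, cl) {
    ## 'cl' has no default: 'cl=cl' would refer to itself when the
    ## argument is not supplied by the caller
    n1 <- nodes(g1)
    parSapply(cl, 1:perm,
              function(i, g1, g2, n1) {
                  nodes(g1) <- sample(n1)
                  numEdges(intersection(g1, g2))
              },
              g1, g2, n1)
}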

Another possibility is to ensure that the relevant variables are
exported to the cluster before parSapply. With both of these you'll
want to pay some attention to the costs of communicating the data to
the nodes, which could easily be overwhelming (Rmpi's version of
parSapply is smarter at doing the data transfer -- log n time rather
than linear time, where n is the number of nodes -- and more flexible
in distributing the work).
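
The second possibility, exporting variables before the parallel call,
might look like the sketch below. It is only an illustration under
stated assumptions: it uses the 'parallel' package that ships with
current R (which took over snow's interface) rather than snow itself,
and 'offset' is a made-up variable standing in for whatever the worker
function needs.

```r
library(parallel)  # assumption: base R's 'parallel', snow-style API

cl <- makeCluster(2)     # two local worker processes
offset <- 10             # hypothetical variable needed on the workers

## copy 'offset' from the master's global environment to each worker
clusterExport(cl, "offset")

## the worker function can now refer to 'offset' as a free variable
res <- parSapply(cl, 1:4, function(i) i + offset)
stopCluster(cl)
res
```

Exporting is convenient, but every exported object is sent to every
worker, so the communication cost mentioned above still applies.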

Hope that helps,

Martin


nicolas bertin <[EMAIL PROTECTED]> writes:

> Hi,
>
> I want to compute a distribution of the intersection of a graph and
> 'randomized' graphs induced by the permutations of node labels (to
> preserve the graph topology).
> Since I'll have many permutations to perform, I was thinking of using
> the snow package and in particular "parSapply" to divide the work
> between my 4 CPUs.
> But I get the following error message :
>
> Error in clusterApply(cl, splitList(x, length(cl)), lapply, fun, ...) : 
> recursive default argument reference
>
> What am I doing wrong?
>
> Some details about my platform and R version :
> ---
> $platform
> [1] "x86_64-redhat-linux-gnu"
> $version.string
> [1] "R version 2.4.1 (2006-12-18)"
>
>
> Below is the snippet of code generating the error message :
> ---
> ### libraries ###
> library(graph)
> library(snow)
>
> ### functions ###
> nullDistribIntersection <- function(g1, g2, perm=1000)
> {
>   n1 <- nodes(g1)
>   sapply(1:perm,
>          function(i)
>          {
>            nodes(g1) <- sample(n1)
>            numEdges(intersection(g1,g2))
>          }
>          )
> }
>
>
> parNullDistribIntersection <- function(g1, g2, perm=1000, cl=cl)
> {
>   n1 <- nodes(g1)
>   parSapply(cl, 1:perm,
>             function(i)
>             {
>               nodes(g1) <- sample(n1)
>               numEdges(intersection(g1,g2))
>             }
>             )
> }
>
>
> ## initialize the graph and the rand seed
> set.seed(123)
> g <- randomEGraph(LETTERS[1:15], edges = 100)
>
> ## compute a distribution of the intersection of the graph 
> ## and a 'randomized' graph induced by the permutations of
> ## node labels (to preserve the graph topology)
> cat("1CPU sys time:",
>     system.time(
>       dist <- nullDistribIntersection(g,g)
>     ),
>     "\n"
> )
>
> cl <- makeCluster(c("localhost", "localhost"), type = "SOCK")
> cat("Cluster sys time:",
>     system.time(
>       dist <- parNullDistribIntersection(g,g)
>     ),
>     "\n"
> )
> --------
>
> Thanks in advance.
>
> Nicolas Bertin 
> GSC / RIKEN Yokohama
>

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] Compiling C codes in Windows

2007-04-02 Thread Martin Morgan
Tong Wang <[EMAIL PROTECTED]> writes:

> Hi All,
[...]
> After I followed all the instructions in " Building R for Windows" 
[...]
>  Anyway, I assumed that I need to run this command in Cygwin

No, do this from a DOS shell. The usual problems are either missing
components from the installation steps outlined in Building R for
Windows, or an incorrect PATH variable so that windows or cygwin
versions of programs are used rather than MinGW / Rtools.

Also, this question belongs on the R-devel list, where those doing
development work are more likely to see and respond to your
questions.

Hope that helps,

Martin Morgan
>
> $ make all recommended
> make: ./Rpwd.exe: Command not found
> make[1]: ./Rpwd.exe: Command not found
> make --no-print-directory -C front-ends Rpwd
> make -C ../../include -f Makefile.win version
> make Rpwd.exe
> gcc  -O3 -Wall -pedantic -I../../include  -c rpwd.c -o rpwd.o
> rpwd.c:22:20: direct.h: No such file or directory
> rpwd.c: In function `main':
> rpwd.c:42: warning: implicit declaration of function `chdir'
> rpwd.c:45: warning: implicit declaration of function `getcwd'
> make[3]: *** [rpwd.o] Error 1
> make[2]: *** [Rpwd] Error 2
> make[1]: *** [front-ends/Rpwd.exe] Error 2
> make: *** [all] Error 2 
>
>
> Can I get some help with my questions, and some suggestions concerning the 
> best way to solve the whole problem?
>
> Thanks a lot for your help.
>
> tong
>

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] snow parLapply standard output

2007-03-27 Thread Martin Morgan
Frank -- Perhaps what you want to do is

summarize <- function(lst) {
    lapply(lst, function(elt) {
        if (elt$status=="ok") elt$value
        else NA
    })
}

## cl is the snow cluster, e.g., as returned by makeCluster()
summarize(parLapply(cl, tasks, function(elt) {
    # fancy calculation, then
    list(status="ok", value=...)
}))

i.e., return status along with value from the nodes, and
'post-process' the result. Perhaps status is set by capture.output on
the node. Maybe you're hoping to change how later tasks are evaluated
based on results of earlier tasks; but this makes the lapply
sequential rather than parallel.
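
A runnable version of this idea might look like the sketch below. It
is an illustration only: it assumes the 'parallel' package shipped
with current R instead of snow over MPI/LAM, and the tasks, the
square-root 'calculation' and the 'log' element are all made up for
the example.

```r
library(parallel)  # assumption: snow-style API from base R

cl <- makeCluster(2)
tasks <- list(4, -1, 9)   # hypothetical inputs

results <- parLapply(cl, tasks, function(elt) {
    value <- NULL
    ## capture.output collects what cat() prints on the worker
    log <- capture.output(
        value <- tryCatch({
            cat("working on", elt, "\n")
            if (elt < 0) stop("negative input")
            sqrt(elt)
        }, error = function(e) NULL))
    list(status = if (is.null(value)) "failed" else "ok",
         value = value, log = log)
})
stopCluster(cl)

## post-process on the master: keep values of successful tasks only
summarize <- function(lst)
    lapply(lst, function(elt) if (elt$status == "ok") elt$value else NA)

values <- unlist(summarize(results))
```

The captured text travels back with the result, so it only becomes
visible on the master once the task has finished; it is not a live
progress report.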

Martin

"Frank Preiswerk" <[EMAIL PROTECTED]> writes:

> I am slightly confused by the way the standard output is redirected in a R
> snow cluster environment.
> I am using parLapply from the snow package to execute a function on my
> MPI/LAM cluster. How can I redirect standard output (produced using "cat")
> from this function back to the terminal where I invoked it? I intend to
> transmit some status information in advance to the final result of the
> function. I investigated the chain of functions called by parLapply and it
> seems that snow is designed to just retrieve the final result of the
> computation.
>
> Thanks in advance for any hints,
> Frank Preiswerk
>

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] RMySQL on win32

2007-03-12 Thread Martin Morgan
Pete Cap <[EMAIL PROTECTED]> writes:

> List,
>
> I am still unable to compile RMySQL on XP and would appreciate any
> assistance anyone could provide.
>
> I know that setting up RMySQL on win32 is not easy.  The
> installation instructions are supposedly contained in
> ../src/README.win.  They give instructions on creating a file,
> libmysql.a, which I was able to do successfully.
>
> Here is where the instructions basically break down.  The creation
> of this file all occurs in \MySQL\..\lib\opt, after which the reader
> is instructed to copy libmysql.a to ..\lib\opt (huh?  You mean,
> where it already is?).  Then the reader is instructed to build the
> binaries with Rcmd build --binary RMySQL.
>
> From the windows command shell, the result is:
>
> C:\>Rcmd build --binary RMySQL
> * checking for file 'RMySQL/DESCRIPTION' ... OK
> * preparing 'RMySQL':
> * checking DESCRIPTION meta-information ...'sh' is not recognized as an 
> internal  or external command, operable program or batch file.
>  OK
> * cleaning src

> 'sh' is not recognized as an internal or external command, operable program 
> or batch file.
> Error: cannot open file 'c:/TEMP/Rout381268676' for reading

> Apparently R is trying to call some shell script (from the windows
> prompt??) so I attempted this in cygwin.  Results:

Is this because you do not have the R tool chain required for building
R packages on Windows correctly installed? See the R Installation and
Administration Guide

http://cran.r-project.org/doc/manuals/R-admin.html

section 3 (Installing R under Windows); also perhaps

http://wiki.fhcrc.org/bioc/HowTo/Build_R_on_Windows

You'll eventually want Rtools\bin ahead of any cygwin paths in your
PATH variable, so that

c:\> which sh

('which' is a cygwin command; cygwin is not technically required to
install R packages on Windows) points to the R version of sh.

I seem to remember having some trouble finding reimp (used by
configure.win), with the link here

http://www.mingw.org/mingwfaq.shtml

providing some help -- you'll want reimp available in the bin
directory of MinGW; this might not be relevant.

Martin

> Can't locate R/Dcf.pm in @INC (@INC contains: c 
> \PROGRA~1\R\R-24~1.1\share\perl; /usr/lib/perl5/5.8/cygwin /usr/lib/perl5/5.8 
> /usr/lib/perl5/site_perl/5.8/cygwin /usr/lib/perl5/site_perl/5.8 
> /usr/lib/perl5/site_perl/5.8/cygwin /usr/lib/perl5/site_perl/5.8 
> /usr/lib/perl5/vendor_perl/5.8/cygwin /usr/lib/perl5/vendor_perl/5.8 
> /usr/lib/perl5/vendor_perl/5.8/cygwin /usr/lib/perl5/vendor_perl/5.8 .) at 
> c:\PROGRA~1\R\R-24~1.1/bin/build line 29.
> BEGIN failed--compilation aborted at c:\PROGRA~1\R\R-24~1.1/bin/build line 29.
>
> Dcf.pm is actually located in C:\Program Files\R\R-2.4.1\share\perl\R.  I 
> wonder if the variable @INC is simply incorrect (it's looking under R-24~1.1, 
> not sure if the truncated value is actually correct) but I have no idea in 
> which file it may be located.
>
> Anyone have any ideas?
> I have installed mingw-utils while attempting to get this up and running, if 
> it matters.
>
> If there is a simply better solution that I should try, I would appreciate 
> hearing about it as well.  All I really need to do at this point is send 
> select and join queries to the local server--perhaps I should just install 
> RSQLite from CRAN?
>
> Thanks in advance,
>
> Pete
>
>  

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] How to utilise dual cores and multi-processors on WinXP

2007-03-06 Thread Martin Morgan
[EMAIL PROTECTED] writes:

> Hello,
>
> I have a question that I was wondering if anyone had a fairly
> straightforward answer to: what is the quickest and easiest way to
> take advantage of the extra cores / processors that are now
> commonplace on modern machines? And how do I do that in Windows?

> I realise that this is a complex question that is not answered easily,
> so let me refine it some more. The type of scripts that I'm dealing
> with are well suited to parallelisation - often they involve mapping
> out parameter space by changing a single parameter and then re-running
> the simulation 10 (or n) times, and then bringing all the results back
> together at the end for analysis. If I can distribute the runs over
> all the processors available in my machine, I'm going to roughly halve
> the run time. The question is, how to do this?
>
> I've looked at many of the packages in this area: rmpi, snow, snowFT,
> rpvm, and taskPR - these all seem to have the functionality that I
> want, but don't exist for windows. The best solution is to switch to
> Linux, but unfortunately that's not an option.

Rmpi runs on windows (see http://www.stats.uwo.ca/faculty/yu/Rmpi/).

You'll end up modifying your code, probably using one of the many
parLapply-like functions (from Rmpi; comparable functions in snow and
the package papply) to do 'lapply' but spread over the different
compute processors. This is likely to require some thought, as for
instance the data transmission costs can overwhelm any speedup and the
FUN argument to the lapply-like functions should probably reference
only local variables. The classic first attempt performs the
equivalent of 1000 bootstraps on each node, rather than dividing the
1000 replicates amongst nodes (which is actually quite hard to do).
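
One way of dividing the replicates, rather than repeating all of them
on each node, is sketched below. This is only an illustration under
stated assumptions: it uses the 'parallel' package shipped with
current R (snow-style functions) rather than Rmpi, and the data vector
and the helper boot_chunk are made up for the example.

```r
library(parallel)  # assumption: snow-style API from base R

cl <- makeCluster(2)
clusterSetRNGStream(cl, 123)   # separate random streams per worker

dat <- c(2.1, 3.4, 1.9, 5.0, 4.2)   # hypothetical data
n_total <- 1000

## one chunk of replicate indices per worker; together they cover 1:1000
chunks <- clusterSplit(cl, seq_len(n_total))

## uses only its arguments, so nothing needs to exist on the workers
boot_chunk <- function(idx, dat) {
    vapply(idx, function(i) mean(sample(dat, replace = TRUE)),
           numeric(1))
}

boot_means <- unlist(parLapply(cl, chunks, boot_chunk, dat = dat))
stopCluster(cl)
length(boot_means)
```

The data are passed as an argument, in line with the advice that FUN
should reference only local variables, and are sent once per chunk
rather than once per replicate.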

In principle I think you might also be able to use a parallelized
LAPACK, following the general instruction of the R Installation and
Administration guide. I have not done this. It would likely represent
a challenge, and would benefit (perhaps) the code that uses the LAPACK
linear algebra routines.

> Another option is to divide the task in half from the beginning, spawn
> two "slave" instances of R (e.g. via Rcmd), let them run, and then
> collate the results at the end. But how exactly to do this and how to
> know when they're done?

The Bioconductor package Biobase has a function Aggregate that might
be fun to explore; I don't think it receives much use.

> Can anyone recommend a nice solution? I'm sure that I'm not the only
> one who'd love to double their computational speed...
>
> Cheers,
>
> Mark
>

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] eval(parse(text vs. get when accessing a function

2007-01-07 Thread Martin Morgan
Hi Ramon,

It seems like a naming convention (f.xxx) and eval(parse(...)) are
standing in for objects (of class 'GeneSelector', say, representing a
function with a particular form and doing a particular operation) and
dispatch (a function 'geneConverter' might handle a converter of class
'GeneSelector' one way, user supplied ad-hoc functions more carefully;
inside geneConverter the only real concern is that the converter
argument is in fact a callable function).

eval(parse(...)) brings scoping rules to the fore as an explicit
programming concern; here scope is implicit, but that's probably better
-- R will get its own rules right.

Martin

Here's an S4 sketch:

setClass("GeneSelector",
         contains="function",
         representation=representation(description="character"),
         validity=function(object) {
             msg <- NULL
             argNames <- names(formals(object))
             if (argNames[1]!="x")
                 msg <- c(msg, "\n  GeneSelector requires a first argument named 'x'")
             if (!"..." %in% argNames)
                 msg <- c(msg, "\n  GeneSelector requires '...' in its signature")
             if (0==length(object@description))
                 msg <- c(msg, "\n  Please describe your GeneSelector")
             if (is.null(msg)) TRUE else msg
         })

setGeneric("geneConverter",
   function(converter, x, ...) standardGeneric("geneConverter"),
   signature=c("converter"))

setMethod("geneConverter",
  signature(converter="GeneSelector"),
  function(converter, x, ...) {
  ## important stuff here
  converter(x, ...)
  })

setMethod("geneConverter",
  signature(converter="function"),
  function(converter, x, ...) {
  message("ad-hoc converter; hope it works!")
  converter(x, ...)
  })

and then...

> c1 <- new("GeneSelector",
+   function(x, ...) prod(x, ...),
+   description="Product of x")
> 
> c2 <- new("GeneSelector",
+   function(x, ...) sum(x, ...),
+   description="Sum of x")
> 
> geneConverter(c1, 1:4)
[1] 24
> geneConverter(c2, 1:4)
[1] 10
> geneConverter(mean, 1:4)
ad-hoc converter; hope it works!
[1] 2.5
> 
> cvterr <- new("GeneSelector", function(y) {})
Error in validObject(.Object) : invalid class "GeneSelector" object: 1: 
  GeneSelector requires a first argument named 'x'
invalid class "GeneSelector" object: 2: 
  GeneSelector requires '...' in its signature
invalid class "GeneSelector" object: 3: 
  Please describe your GeneSelector
> xxx <- 10
> geneConverter(xxx, 1:4)
Error in function (classes, fdef, mtable)  : 
unable to find an inherited method for function "geneConverter", for 
signature "numeric"


"Ramon Diaz-Uriarte" <[EMAIL PROTECTED]> writes:

> Dear Greg,
>
>
> On 1/5/07, Greg Snow <[EMAIL PROTECTED]> wrote:
>> Ramon,
>>
>> I prefer to use the list method for this type of thing, here are a couple of 
>> reasons why (maybe you are more organized than me and would never do some of 
>> the stupid things that I have, so these don't apply to you, but you can see 
>> that the general suggestion applies to some of the rest of us).
>>
>
>
> Those suggestions do apply to me of course (no claim to being
> organized nor beyond idiocy here). And actually the suggestions on
> this thread are being very useful. I think, though, that I was not
> very clear on the context and my examples were too dumbed down. So
> I'll try to give more detail (nothing here is secret, I am just trying
> not to bore people).
>
> The code is part of a web-based application, so there is no
> interactive user. The R code is passed the arguments (and optional
> user functions) from the CGI.
>
> There is one "core" function (call it cvFunct) that, among other
> things, does cross-validation. So this is one way to do things:
>
> cvFunct <- function(whatever, genefiltertype, whateverelse) {
>   internalGeneSelect <- eval(parse(text = paste("geneSelect",
>  genefiltertype, sep = ".")))
>
>   ## do things calling internalGeneSelect,
> }
>
> and now define all possible functions as
>
> geneSelect.Fratio <- function(x, y, z) { }   ## something
> geneSelect.Wilcoxon <- function(x, y, z) { } ## something else
>
> If I want more geneSelect functions, adding them is simple. And I can
> even allow the user to pass her/his own functions, with the only
> restriction that it takes three args, x, y, z, and that the function
> is to be called: "geneSelect." and a user chosen string. (Yes, I need
> to make sure no calls to "system", etc, are in the user code, etc,
> etc, but that is another issue).
>
> The general idea is not new of course. For instance, in package
> "e1071", a somewhat similar thing is done in function "tune", and
> David Meyer there uses "do.call". However, tune is a lot more general
> than what I had in mind. For instance, "tune" deals with arbitrary
> functions, with arbitrary num

Re: [R] Memory problem on a linux cluster using a large data set [Broadcast]

2006-12-21 Thread Martin Morgan
Section 8 of the Installation and Administration guide says that on
64-bit architectures the 'size of a block of memory allocated is
limited to 2^32-1 (8 GB) bytes'.

The wording 'a block of memory' here is important, because this sets a
limit on a single allocation rather than the memory consumed by an R
session. The size of the allocation of the original poster was
something like 300,000 SNPs x 1000 individuals x 8 bytes (depending on
representation, I guess) = about 2.3 GB so there is still some room
for even larger data.
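
The back-of-the-envelope calculation can be reproduced as below. This
is a sketch only: it takes the 320,000 x 1000 dimensions from the
original post and assumes one value per cell with no overhead.

```r
n_snps <- 320000   # rows reported by the original poster
n_ind  <- 1000     # columns

## storage for one copy, in GiB (2^30 bytes)
gib_double  <- n_snps * n_ind * 8 / 2^30   # numeric, 8 bytes per value
gib_integer <- n_snps * n_ind * 4 / 2^30   # integer, 4 bytes per value

round(c(double = gib_double, integer = gib_integer), 2)
```

Either way a single copy fits under the per-allocation limit, but a
few intermediate copies made during an analysis add up quickly.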

Obviously it's important to think carefully about how the statistical
analysis of such a large volume of data will proceed, and be
interpreted.

Martin

Thomas Lumley <[EMAIL PROTECTED]> writes:

> On Thu, 21 Dec 2006, Iris Kolder wrote:
>
>> Thank you all for your help!
>>
>> So with all your suggestions we will try to run it on a computer with a 
>> 64 bits processor. But I've been told that the new R versions all work 
>> on a 32bits processor. I read in other posts that only the old R 
>> versions were capable of larger data sets and were running under 64 bit 
>> processors. I also read that they are adapting the new R version for 64 
>> bits processors again so does anyone know if there is a version available 
>> that we could use?
>
> Huh?  R 2.4.x runs perfectly happily accessing large memory under Linux on 
> 64bit processors (and Solaris, and probably others). I think it even works 
> on Mac OS X now.
>
> For example:
>> x<-rnorm(1e9)
>> gc()
>   used   (Mb) gc trigger   (Mb)   max used   (Mb)
> Ncells 222881   12.0 467875   25.0 35   18.7
> Vcells 1000115046 7630.3 1000475743 7633.1 1000115558 7630.3
>
>
>  -thomas
>
> Thomas Lumley Assoc. Professor, Biostatistics
> [EMAIL PROTECTED] University of Washington, Seattle
>

-- 
Martin T. Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] Memory problem on a linux cluster using a large data set

2006-12-18 Thread Martin Morgan
Iris --

I hope the following helps; I think you have too much data for a
32-bit machine.

Martin

Iris Kolder <[EMAIL PROTECTED]> writes:

> Hello,
>  
> I have a large data set 320.000 rows and 1000 columns. All the data
> has the values 0,1,2.

It seems like a single copy of this data set will be at least a couple
of gigabytes; I think you'll have access to only 4 GB on a 32-bit
machine (see section 8 of the R Installation and Administration guide),
and R will probably end up, even in the best of situations, making at
least a couple of copies of your data. Probably you'll need a 64-bit
machine, or figure out algorithms that work on chunks of data.

> on a linux cluster with R version R 2.1.0.  which operates on a 32

This is quite old, and in general it seems like R has become more
sensitive to big-data issues and tracking down unnecessary memory
copying.

> “cannot allocate vector size 1240 kb”. I’ve searched through

Use traceback() or options(error=recover) to figure out where this is
actually occurring.

> SNP <- read.table("file.txt", header=FALSE, sep="")# read in data file

This makes a data.frame, and data frames have several aspects (e.g.,
automatic creation of row names on sub-setting) that can be problematic
in terms of memory use. Probably better to use a matrix, for which:

 'read.table' is not the right tool for reading large matrices,
 especially those with many columns: it is designed to read _data
 frames_ which may have columns of very different classes. Use
 'scan' instead.

(from the help page for read.table). I'm not sure of the details of
the algorithms you'll invoke, but it might be a false economy to try
to get scan to read in 'small' versions (e.g., integer, rather than
numeric) of the data -- the algorithms might insist on numeric data,
and then make a copy during coercion from your small version to
numeric.

> SNP$total.NAs = rowSums(is.na(SN # calculate the number of NA per row and adds a colum with total Na's

This adds a column to the data.frame or matrix, probably causing at
least one copy of the entire data. Create a separate vector instead,
even though this unties the coordination between columns that a data
frame provides.
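
Taken together, the suggestions so far (scan into a matrix, keep the
NA counts in a separate vector) might be sketched as follows. This is
an illustration only: a tiny temporary file stands in for file.txt,
the number of columns is assumed to be known in advance, and the
cut-off of 2 replaces the 46 used on the real data.

```r
## a tiny made-up file standing in for "file.txt"
path <- tempfile(fileext = ".txt")
writeLines(c("0 1 2 9", "9 9 0 1", "1 1 0 0"), path)

n_col <- 4   # assumed known in advance

## scan returns one long vector; shape it into a matrix row by row
snp <- matrix(scan(path, what = integer(), quiet = TRUE),
              ncol = n_col, byrow = TRUE)

snp[snp == 9L] <- NA                # recode missing values
n_missing <- rowSums(is.na(snp))    # separate vector, not a new column
snp <- snp[n_missing < 2, , drop = FALSE]
```

As noted above, integer storage may not survive an algorithm that
insists on numeric data, so the saving is not guaranteed.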

> SNP  = t(as.matrix(SNP))  # transpose rows and columns

This will also probably trigger a copy.

> snp.na<-SNP

R might be clever enough to figure out that this simple assignment
does not trigger a copy. But it probably means that any subsequent
modification of snp.na or SNP *will* trigger a copy, so avoid the
assignment if possible.

> snp.roughfix<-na.roughfix(snp.na) 
> fSNP<-factor(snp.roughfix[, 1])  # Assigns factor to case control status
>  
> snp.narf<- randomForest(snp.roughfix[,-1], fSNP, na.action=na.roughfix, 
> ntree=500, mtry=10, importance=TRUE, keep.forest=FALSE, do.trace=100)

Now you're entirely in the hands of the randomForest. If memory
problems occur here, perhaps you'll have gained enough experience to
point the package maintainer to the problem and suggest a possible
solution.

> set it should be able to cope with that amount. Perhaps someone has
> tried this before in R or is Fortran a better choice? I added my R

If you mean a pure Fortran solution, including coding the random
forest algorithm, then of course you have complete control over memory
management. You'd still likely be limited to addressing 4 GB of
memory. 


> I wrote a script to remove all the rows with more than 46 missing
> values. This works perfect on a smaller dataset. But the problem
> arises when I try to run it on the larger data set I get an error
> “cannot allocate vector size 1240 kb”. I’ve searched through
> previous posts and found out that it might be because i’m running it
> on a linux cluster with R version R 2.1.0.  which operates on a 32
> bit processor. But I could not find a solution for this problem. The
> cluster is a really fast one and should be able to cope with these
> large amounts of data the systems configuration are Speed: 3.4 GHz,
> memory 4GByte. Is there a way to change the settings or processor
> under R? I want to run the function Random Forest on my large data
> set it should be able to cope with that amount. Perhaps someone has
> tried this before in R or is Fortran a better choice? I added my R
> script down below.
>  
> Best regards,
>  
> Iris Kolder
>  
> SNP <- read.table("file.txt", header=FALSE, sep="")# read in data file
> SNP[SNP==9]<-NA   # change missing values from a 9 to a NA
> SNP$total.NAs = rowSums(is.na(SN # calculate the number of NA per row and adds a colum with total Na's
> SNP = SNP[ SNP$total.NAs < 46,  ] # create a subset with no more than 5%(46) NA's
> SNP$total.NAs=NULL  # remove added column with sum of NA's
> SNP  = t(as.matrix(SNP))  # transpose rows and columns
> set.

Re: [R] S4 'properties' - creating 'slot' functions?

2006-12-12 Thread Martin Morgan
Hi Tariq

There was some discussion of this topic a while ago on the R-devel
newsgroup; the bottom line is that there is no consensus, and a certain
amount of resistance to making R conform to the implementation of other
languages (and so not exploiting R's unique strengths).

https://stat.ethz.ch/pipermail/r-devel/2006-September/042854.html
https://stat.ethz.ch/pipermail/r-devel/2006-September/042864.html

You could implement something fancier (e.g., a method for "$" or "[["
or "@" or a suite of generics "get_x"), but I kind of like (because
'slot' is a function with a well-defined purpose, so why not make it a
generic?)

setGeneric("slot")

setMethod("slot",
          signature=signature(object="A", name="character"),
          function(object, name)
              ## do anything, e.g., restrict access to specific slots
              ## alternatively, dispatch on name with 'switch'
              if (name %in% slotNames(class(object))) callNextMethod()
              else stop("slot '", name, "' undefined"))

setGeneric("slot<-",
           signature=c("object", "name", "value"))

setReplaceMethod("slot",
                 signature(object="A", name="character", value="ANY"),
                 function(object, name, check=TRUE, value)
                     if (name %in% slotNames(class(object))) callNextMethod()
                     else stop("slot '", name, "' cannot be assigned"))

and then

> setClass("A",
+  representation=representation(x="numeric"))
[1] "A"
> a <- new("A", x=1:5)
> slot(a, "y")
Error in slot(a, "y") : slot 'y' undefined
> slot(a, "x") <- 5:1
> slot(a, "x")
[1] 5 4 3 2 1

A class could be defined to implement the dispatch, so other classes
just have to inherit from that.
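
For the 'computed property' case in the original question, the plain
accessor-generic alternative might be sketched as below. This is an
illustration only; the class Rect and the generic area are made up
for the example.

```r
library(methods)

## hypothetical class with two stored slots
setClass("Rect", representation(width = "numeric", height = "numeric"))

## 'area' is not stored; it is computed from the slots on each call
setGeneric("area", function(object) standardGeneric("area"))
setMethod("area", "Rect",
          function(object) object@width * object@height)

r <- new("Rect", width = 2, height = 3)
area_before <- area(r)
r@width <- 10
area_after <- area(r)   # reflects the updated slot
```

Because the value is recomputed on demand, it cannot go stale when
another slot changes, and the method signature ties area to the class.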

Discussion may be veering toward the R-devel newsgroup.

Martin

"Tariq Khan" <[EMAIL PROTECTED]> writes:

> Dear R users!
>
> Several languages like C# or VB allow one to create properties; typically
> these are under 'get', 'set' tags. My question is this really, what are
> similar or comparable strategies in R, and what might be preferred? I have a
> couple of situations where I want a certain computation to belong to a
> class, but I do not really want to separate it out to a stand-alone
> function.
>
> Here are a couple of options I see possible in R, and if I use the last
> described option (see option C below), is there a way to access the object
> with something like in C# or C++ where one would probably use the 'this'
> keyword or the 'me' keyword in VB from within the property?:
>
> A) [EMAIL PROTECTED] is static and must be handled carefully if it depends
> on other slots of the class. If one of the slots changes, then because this
> slot is static (containing only the value), it will not change to reflect
> the update.
>
> B) [EMAIL PROTECTED] is deprecated, and special accessor functions are used.
> The bio group seems to favor this option. So one would use:
> Property(MyObject). The drawback I see with this is that there is no strong
> relationship between Property and MyObject; they are not explicitly related
> via a class structure, and in some way this defeats the point of class
> structures.
>
> C) Let [EMAIL PROTECTED] be a function, so in order to fetch the slot object
> we would use [EMAIL PROTECTED](); the question here is can I use something
> like a 'this' keyword to access other slots of the object without having to
> pass the whole class as a parameter to this slot function, ie the
> round-about, clumsy way would have to look like [EMAIL PROTECTED](MyObject).
> Ideally, the pseudo-code I have in mind might look like this for an object
> with two slots (A and B)
>
> [EMAIL PROTECTED] = function() return([EMAIL PROTECTED])
>
>
> Thoughts and considerations would be very interesting.
>
> -Tariq
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Martin T. Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] tests for NULL objects

2006-11-29 Thread Martin Morgan
Bert Gunter <[EMAIL PROTECTED]> writes:

> Merely convention.

There's a similar thread in the R-devel newsgroup about
'prod(numeric(0)) surprise' started by Ben Bolker, 09 Jan 2006,
(https://stat.ethz.ch/pipermail/r-devel/2006-January/036041.html) and
the post by Andy Liaw
(https://stat.ethz.ch/pipermail/r-devel/2006-January/036057.html)
proved particularly helpful to me (mathematical reasons, in addition
to convention):

> From: Liaw, Andy 
> Date: Mon 09 Jan 2006 - 18:27:41 GMT


> If you haven't seen this in your math courses, perhaps this would help:

> http://en.wikipedia.org/wiki/Empty_set

> which says, in part:

> Operations on the empty set

> Operations performed on the empty set (as a set of things to be
> operated upon) can also be confusing. (Such operations are nullary
> operations.) For example, the sum of the elements of the empty set
> is zero, but the product of the elements of the empty set is one
> (see empty product). This may seem odd, since there are no elements
> of the empty set, so how could it matter whether they are added or
> multiplied (since "they" do not exist)? Ultimately, the results of
> these operations say more about the operation in question than about
> the empty set. For instance, notice that zero is the identity
> element for addition, and one is the identity element for
> multiplication.

> Andy 

'any' is like 'sum', 'all' is like 'prod'.
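
A small sketch of that analogy, using only base R: each reduction over an
empty vector returns the identity element of its operation (0 for '+',
1 for '*', FALSE for 'or', TRUE for 'and').

```r
emptyNum <- numeric(0)
emptyLgl <- NULL == 2          # comparison with NULL gives logical(0)

sumEmpty  <- sum(emptyNum)     # identity for addition
prodEmpty <- prod(emptyNum)    # identity for multiplication
anyEmpty  <- any(emptyLgl)     # identity for logical 'or'
allEmpty  <- all(emptyLgl)     # identity for logical 'and'

## The same values fall out of an explicit fold that starts at the
## identity element and has nothing to combine it with.
foldOr  <- Reduce(`|`, emptyLgl, FALSE)
foldAnd <- Reduce(`&`, emptyLgl, TRUE)
```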

Martin

> NULL == 2  <==> logical(0), that is, a logical vector of length 0. It makes
> sense (at least to me) that any(logical(0)) is FALSE, since no elements of
> the vector are TRUE. all(logical(0)) is TRUE since no elements of the vector
> are FALSE.
>
> I think these are reasonable and fairly standard conventions, but even if
> you disagree, they are certainly not worth making a fuss over and certainly
> cannot be changed without breaking a lot of code, I'm sure. 
>
> Bert Gunter
> Nonclinical Statistics
> 7-7374
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Benilton Carvalho
> Sent: Wednesday, November 29, 2006 2:21 PM
> To: R-Mailingliste
> Subject: [R] tests for NULL objects
>
> Hi Everyone,
>
> After searching the subject and not being successful, I was wondering  
> if any of you could explain to me the idea behind the following fact:
>
> all(NULL == 2)  ## TRUE
> any(NULL == 2) ## FALSE
>
> Thanks a lot,
>
> Benilton
>
> --
> Benilton Carvalho
> PhD Candidate
> Department of Biostatistics
> Johns Hopkins University
>

-- 
Martin T. Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] data storage/cubes and pointers in R

2006-11-09 Thread Martin Morgan
In case the other replies aren't to your liking, and you want to write
something yourself...

Piet van Remortel <[EMAIL PROTECTED]> writes:

[snip]

> Also considering implementing a similar setup myself, I started  
> wondering about the possibility of use references (or "pointers"  
> aargh) to dataframes and store them in a list etc.   Separate lists

My own experimentation with this is to create an S4 'View' class that
indexes / subsets / accesses small parts of the 'big' data, with the
actual data treated essentially as 'read-only' or otherwise abstracted
out of memory. Something along the lines of

setClass("ViewSet",
 representation=representation(
   data="environment", # environments are reference-like
   idx="list" # 1 element per dimension, or something more clever
   ))

setMethod("initialize",
  signature(.Object="ViewSet"),
  function(.Object, ...) {
  env <- new.env()
  ## get the big data: arguments to "new" / SQL query / ???
  ## assign big data to env (e.g., see below) then
  [EMAIL PROTECTED] <- env
  ## set up idx
  ## ...
  .Object
  })

setMethod("[",
  signature(x="ViewSet"),
  function (x, i, j, ..., drop = TRUE) {
  ## adjust [EMAIL PROTECTED], maybe querying [EMAIL PROTECTED] for help
  })

setReplaceMethod("[",
 signature(x="ViewSet"),
 function (x, i, j, ..., value) {
 ## adjust [EMAIL PROTECTED], j, ...
 ## return x, i.e., a ViewSet -- bigData not changed / copied
 x
  })

> can then represent different 'views' on the shared instance  
> dataframes etc.   I have no knowledge if that is even possible in R,  
> and if that is even the smart way to do it.  If someone could provide  
> some help, that would be great.
>
> Other option is of course to link to MySQL and do all data handling  
> in that way.  Also considering that.

or do both, i.e., write ViewSqlSet to 'contain' ViewSet, etc.

> Any thoughts/hints would be appreciated !

Probably you could implement the same ideas in the less intimidating
S3 way, using e.g., a list with

makeView <- function(data) {
## e.g., 'data' a named list of commonly-sized elements, in or out
## of memory -- details depend on needs
env <- new.env()
for (elt in names(data)) env[[elt]] <- data[[elt]]
## initialize index
idx <- list(rows=1:nrow(data[[1]]), cols=1:ncol(data[[1]]))
lst <- list(env=env, idx=idx)
class(lst) <- "View"
lst
}

"[.View" <- function (x, i, j, ..., drop = TRUE) {
## x will be like lst from above, use i, j, etc to subset
## adjust and then return idx, e.g.,...
x$idx$rows <- x$idx$rows[i]
x
}

getData <- function(x, ...) UseMethod("getData")

getData.View <- function(x, ...) {
## return list of subsetted elements
res <- with(x,
lapply(ls(env), function(elt) env[[elt]][idx$rows, idx$cols]))
names(res) <- ls(x$env)
res
}

and then...

> bigView <- makeView(list(df=data.frame(x=1:100, y=100:1),
+ m=matrix(1:200, ncol=2)))
> smallView <- bigView[5:1,]
> getData(smallView) ## copies, but only the 'small' data
$df
  x   y
5 5  96
4 4  97
3 3  98
2 2  99
1 1 100

$m
     [,1] [,2]
[1,]    5  105
[2,]    4  104
[3,]    3  103
[4,]    2  102
[5,]    1  101

Obviously a hack, but perhaps it gets you going...
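
The reason the views stay cheap can be sketched in isolation. This is an
illustration with made-up names ('viewA', 'viewB'), not part of the code
above: copying a list that holds an environment copies only the reference,
so both views keep pointing at the same data.

```r
env <- new.env()
env$big <- 1:10                      # stands in for the 'big' data

viewA <- list(env=env, rows=1:10)
viewB <- viewA                       # copies the list, not the environment
viewB$rows <- 1:3                    # only the small index differs

sameEnv <- identical(viewA$env, viewB$env)

## A change made through one reference is visible through the other.
env$big[1] <- 99L
seenFromB <- viewB$env$big[1]
```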

> thanks,
>
> Piet
>
>
>
> --
> Dr. P. van Remortel
> Intelligent Systems Lab
> Dept. of Mathematics and Computer Science
> University of Antwerp
> Belgium
> http://www.islab.ua.ac.be
> +32 3 265 33 57 (secr.)
>
>
>   [[alternative HTML version deleted]]
>

-- 
Martin T. Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] Rmpi performance

2006-10-13 Thread Martin Morgan
clusterCall invokes the same function on all three nodes. You have
basically discovered the communication costs of performing the
calculation in parallel.

You'll get the easiest gains from snow (and other parallel packages in
R) with 'embarrassingly parallel' problems, where the same algorithm is
applied to different data sets / slices of data. For performance gains
from a single call to op_mat, you'd have to do some serious parallel
algorithm development to distribute the data and computations
effectively.
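
A rough sketch of the 'embarrassingly parallel' case. It assumes the
'parallel' package that ships with current R (it provides makeCluster and
parLapply in the style of snow) and two local worker processes instead of
the MPI cluster from the question; the small matrices are made up for
illustration.

```r
library(parallel)

## Same computation as in the question, returning a named vector.
op_mat <- function(mat) {
    inv <- solve(mat)
    c(det=det(inv), tr=sum(diag(inv)))
}

## Four independent inputs: here 2 * identity matrices of size 1 to 4.
mats <- lapply(1:4, function(i) diag(i) * 2)

cl <- makeCluster(2)                 # assumption: two local workers
res <- parLapply(cl, mats, op_mat)   # each worker gets different matrices
stopCluster(cl)
```

With clusterCall every worker would repeat the same calculation on the same
matrix; with parLapply the list of inputs is split across the workers, which
is where a speed-up can come from once each task is large enough to outweigh
the communication cost.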

Hope that helps,

Martin

Michela Cameletti <[EMAIL PROTECTED]> writes:

> Dear R users,
> we are trying to do some parallel computing using library(snow).
> In particular we have a cluster with 3 nodes
>
>>cl <- makeCluster(3, type = "MPI")
> 3 slaves are spawned successfully. 0 failed.
>
>
> and we want to compute the function op_mat (see below) first with the 
> master and then with the cluster using system.time for checking the 
> computational performance.
>
> op_mat = function(mat) {
>
> +   inv = solve(mat)
> +   det_inv = det(inv)
> +   tr_inv  = sum(diag(inv))
> +   return(list(c(det=det_inv,tr=tr_inv)))
> + }
>
>>nn = 3000
>>XX = matrix(rnorm(nn*nn),nn,nn)
> # with the master
>> system.time(op_mat(XX))
> [1] 42.283  1.883 44.168  0.000  0.000
> # with the cluster
>> system.time(clusterCall(cl,op_mat,XX))
> [1] 11.523 12.612 71.562  0.000  0.000
>
> You can see that using the master it takes 44.168 seconds for computing 
> the function on matrix XX while it takes 71.562 seconds (more time!!!) 
> with the cluster. Can you give us some advice in order to understand why 
> the cluster is slower than the master?
> Thank you very much in advance,
> bye
> Michela  and Marco
> Ps: we have a gigabit ethernet between the master and the nodes
>

-- 
Martin T. Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] Reserve and biobase

2006-09-05 Thread Martin Morgan
Wild speculation on my part, but I wonder if the 'graphics window'
that pops up is the 'welcome' message in Biobase? This is invoked by a
call to 'message' in the .onAttach function...

.onAttach <- function(libname, pkgname) {
   message(paste("\nWelcome to Bioconductor\n",
   [etc]

If so,

suppressMessages(library(Biobase))

might help. It would be great to hear confirmation of this as the
problem. If not, a completely streamlined example (no extra packages
loaded / actions taken) would aid in debugging.
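
A self-contained sketch of what suppressMessages does, using a made-up
function 'noisyLoad' in place of library(Biobase): anything emitted through
'message' is silenced, while the value of the expression is still returned.

```r
## Hypothetical stand-in for a package start-up hook that calls message().
noisyLoad <- function() {
    message("\nWelcome to a hypothetical package\n")
    invisible("loaded")
}

seen <- character()
res <- withCallingHandlers(
    suppressMessages(noisyLoad()),
    ## this outer handler is never reached: the message is muffled first
    message=function(m) seen <<- c(seen, conditionMessage(m)))
```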

Martin
-- 
Bioconductor

Robert Gentleman <[EMAIL PROTECTED]> writes:

> You should ask questions about Bioconductor software on the Bioconductor 
> list, you might also read the posting guide and provide version numbers 
> for all packages you are using (the output of sessionInfo is useful).
>
> Biobase does not do any plotting, so where ever the action is coming 
> from it is unlikely to be Biobase that is doing it. I don't use Rserve 
> so I cannot comment on how that might interact with other software. I 
> have no idea what you are trying to do, but somehow you seem to be doing 
> a lot more than just loading Biobase - so why not try just doing that 
> and leave out all the other code opening devices (and why open two 
> postscript devices and then close them?).
>
> Robert
>
>
> saeedeh maleki wrote:
>> Hi 
>>
>>   I am using Rserve for R 2.3.1. Every time after I load the Biobase
>>   library, a new graphics window frame pops up. Does anyone know
>>   how I can avoid it?
>>
>>   Best
>>   Saeede 
>>
>>   class testReserve {
>> public static void main(String[] args) {
>> RServeConnection rsCon = null;
>> Rconnection c = null;
>> Process proc = null;
>>   try {
>>   Runtime rt = Runtime.getRuntime();
>>   proc = rt.exec(generalMetaData.rserveDir);
>>   try {
>> c = new Rconnection();
>> c.eval("library(grDevices)");
>> //c.eval("graphics.off()");
>> c.eval("postscript()");
>> //load library
>> c.eval("library(tools)");
>> System.out.println(" load library tools");
>> c.eval(" postscript('foo2.ps')");
>> c.eval(" library(Biobase)");
>> c.eval("graphics.off()");
>> System.out.println(" load library Biobase");
>>
>> }
>>   catch (RSrvException ex1) {
>> System.out.println(ex1.getMessage());
>>   }
>>   }
>> catch (Exception e) {
>>   System.out.print("cannot run rserve");
>> }
>> //end of testing
>> }
>> }
>>
>> 
>>  
>> -
>> 
>>  [[alternative HTML version deleted]]
>> 
>> 
>
> -- 
> Robert Gentleman, PhD
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M2-B876
> PO Box 19024
> Seattle, Washington 98109-1024
> 206-667-7700
> [EMAIL PROTECTED]
>



Re: [R] Need help with difficulty loading page www.bioconductor.org

2006-08-25 Thread Martin Morgan
I didn't see a response to this, and am not an expert, but wanted to
point you in the right direction.

You should ask this question on the bioconductor mailing list.

The site is very much alive at this end (though I'm very close to it,
physically), so you'll need to provide some more information about the
problems you're experiencing. At a minimum it would help to include
the output of the R command

sessionInfo()

If you're using Windows you might also look at mailing list threads
dealing with server proxies. Searchable archives are at

http://dir.gmane.org/gmane.science.biology.informatics.conductor

this link

http://article.gmane.org/gmane.science.biology.informatics.conductor/2152/match=internet2

and searching for internet2 might be a good start.

Martin


Debashis Bhattacharya <[EMAIL PROTECTED]> writes:

> The page is either too busy, or there is something seriously wrong with 
> access to this page.
>
> Most of the time, trying to reach www.bioconductor.org results in 
> failure. Only once in a
> blue moon, do I get through.
>
> In fact, thus far, I have not been able to install bioconductor, since 
> the first source(...)
> command from the R command window -- following instruction on 
> www.bioconductor.org
> page, that I did manage to reach, one time -- has failed, every time.
>
> Please help.
>
>
>
> Debashis Bhattacharya.
>



Re: [R] list of interdependent functions

2006-06-20 Thread Martin Morgan
Here's another way:

makeSolver <- function() {
  f1 <- function(x, K) K - x
  f2 <- function(x, r, K) r * x * f1(x, K)
  function() f1(3,4) + f2(1,2,3)
}

solverB <- makeSolver()

solverB()

makeSolver (implicitly) creates an environment, installs f1 and f2
into it, and then returns a function that gets assigned to
solverB. Calling solverB invokes the function in makeSolver, so f1
and f2 are visible to it. The principle is similar to the 'bank
account' example in section 10.7 of 'An Introduction to R', available
in the manuals section of CRAN.
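
To make the environment visible, here is the same makeSolver again together
with a look at what the returned function carries with it (a sketch; the
inspection calls are my addition):

```r
makeSolver <- function() {
  f1 <- function(x, K) K - x
  f2 <- function(x, r, K) r * x * f1(x, K)
  function() f1(3,4) + f2(1,2,3)
}

solverB <- makeSolver()

## f1 and f2 live in the environment created by the call to makeSolver,
## which is the enclosing environment of the returned function.
helpers <- sort(ls(environment(solverB)))

## f1(3,4) is 1 and f2(1,2,3) is 2 * 1 * (3 - 1) = 4
result <- solverB()
```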

A down side is that the signature of solverB is not obvious; I'm not
sure how the documentation facilities (R CMD check) cope with this.

Nice poster at useR!

Martin

Thomas Petzoldt <[EMAIL PROTECTED]> writes:

> Hello,
>
> I discussed the following problem on the great useR conference with
> several people and wonder if someone of you knows a more elegant (or
> more common ?) solution than the one below.
>
> The problem:
> 
>
> I have several sets of interrelated functions which should be compared.
> The functions themselves have different structure, application-specific
> names (for readability) and they should be exchangeable. I want to avoid
> to construct a generic for every new function, but the functions should
> be aggregated together in a common data structure (e.g. list "eq" or an
> S4 instance) *and* it should be able for them to "see" and call each
> other with too many $ or @. These functions are used in another function
> (called "solver" here) which may be used to prepare something before the
> call to f2.
>
> The example and a possible solution (which uses an environment
> manipulating function "putInEnv()") is given below.
>
> Thanks a lot
>
> Thomas
>
>
> ##==
> ## An example
> ##==
>
> ## a small list of functions
> eq <- list(
>   f1 = function(x, K) K - x,
>   f2 = function(x, r, K) r * x * f1(x, K)
> )
>
> ## a solver function which calls them
> solverB <- function(eq) {
>   eq <- putInEnv(eq, environment()) # that's the trick
>   f1(3,4) + f2(1,2,3)
>  }
>
> ## and the final call (e.g. from the command line)
> solverB(eq)
>
>
> ##==
> ## One possible solution. Is there a better one???
> ##==
>
>
> putInEnv <- function(eq, e) {
>   ## clone, very important to avoid "interferences"!!!
>   eq <- as.list(unlist(eq))
>   lapply(eq, "environment<-", e)
>   nn <- names(eq)
>   for (i in 1:length(eq)) {
> assign(nn[i], eq[[i]], envir = e)
>   }
>   eq
>  }
>



Re: [R] Fastest way to do HWE.exact test on 100K SNP data?

2006-06-05 Thread Martin Morgan
Anna --

This might be a little faster ...

all.geno.df <- data.frame(all.geno) # exploit shared structure
set.seed(123)   # once only
for (iter in 1:1000) {
permut <- sample(affSt)
control <- all.geno.df[permut==1,]
case <- all.geno.df[permut==2,]
pvalControls <- unlist(sapply(control, HWE.exact)["p.value",])
pvalCases <- unlist(sapply(case, HWE.exact)["p.value",])
# presumably do something with these, before next iteration
}

but probably most of the time is being spent in HWE.exact so these
cosmetic changes won't help much.

Looking at the code for HWE.exact, it looks like it generates a table
based on numbers of alleles, and then uses this table to determine the
probability. Any time two SNPs have the same number of alleles, the
same table gets generated. It seems like this will happen a lot. So one
strategy is to only calculate the table once, 'remember' it, and then
the next time a value is needed from that table look it up rather than
calculate it.

This is a bad strategy if most SNPs have different numbers of alleles
(because then the tables get calculated, and a lot of space and some
time is used to store results that are never accessed again).

Here is the code (not tested extensively, so please use with care!):

HWE.exact.lookup <- function() {
    ## lookup table shared by all calls to the function returned below
    HWE.lookup.tbl <- new.env(hash=TRUE)
    dhwe2 <- function(n11, n12, n22) {
        ## from body of HWE.exact
        f <- function(x) lgamma(x + 1)
        n <- n11 + n12 + n22
        n1 <- 2 * n11 + n12
        n2 <- 2 * n22 + n12
        exp(log(2) * (n12) + f(n) - f(n11) - f(n12) - f(n22) -
            f(2 * n) + f(n1) + f(n2))
    }
    function (x) {
        ## adapted from HWE.exact
        if (!is.genotype(x))
            stop("x must be of class 'genotype' or 'haplotype'")
        nallele <- length(na.omit(allele.names(x)))
        if (nallele != 2)
            stop("Exact HWE test can only be computed for 2 markers with 2 alleles")
        allele.tab <- table(factor(allele(x, 1), levels = allele.names(x)),
                            factor(allele(x, 2), levels = allele.names(x)))
        n11 <- allele.tab[1, 1]
        n12 <- allele.tab[1, 2] + allele.tab[2, 1]
        n22 <- allele.tab[2, 2]
        n1 <- 2 * n11 + n12
        n2 <- 2 * n22 + n12

        ## the table depends only on the allele counts, so use them as key
        nm <- paste(n1,n2,sep=".")
        if (!exists(nm, envir=HWE.lookup.tbl)) { # 'remember'
            x12 <- seq(n1%%2, min(n1, n2), 2)
            x11 <- (n1 - x12)/2
            x22 <- (n2 - x12)/2
            dist <- data.frame(n11 = x11, n12 = x12, n22 = x22,
                               density = dhwe2(x11, x12, x22))
            dist <- dist[order(dist$density),]
            dist$density <- cumsum(dist$density)
            assign(nm,dist, envir=HWE.lookup.tbl)
        }

        dist <- get(nm, HWE.lookup.tbl)
        dist$density[dist$n11 == n11 & dist$n12 == n12 & dist$n22 == n22]
    }
}

calcHWE <- HWE.exact.lookup()   # create a new lookup table & function
all.geno.df <- data.frame(all.geno) # exploit shared structure
set.seed(123)   # once only
for (iter in 1:1000) {
permut <- sample(affSt)
control <- all.geno.df[permut==1,]
case <- all.geno.df[permut==2,]
pvalControls <- sapply(control, calcHWE)
pvalCases <- sapply(case, calcHWE)
}
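
The 'remember' step on its own, as a generic sketch with made-up names
('makeMemo', 'slowTable'); the stand-in calculation is just a binomial
coefficient, not the HWE table:

```r
## Wrap a two-argument function so each distinct (n1, n2) pair is
## computed once and afterwards looked up in an environment.
makeMemo <- function(FUN) {
    cache <- new.env(hash=TRUE)
    function(n1, n2) {
        key <- paste(n1, n2, sep=".")
        if (!exists(key, envir=cache, inherits=FALSE))
            assign(key, FUN(n1, n2), envir=cache)
        get(key, envir=cache, inherits=FALSE)
    }
}

nCalls <- 0
slowTable <- function(n1, n2) {
    nCalls <<- nCalls + 1            # count real calculations
    choose(n1 + n2, n1)
}

fastTable <- makeMemo(slowTable)
first  <- fastTable(3, 2)            # calculated
second <- fastTable(3, 2)            # looked up
other  <- fastTable(4, 1)            # new key, calculated
```

After these three calls the underlying function has run only twice; the
trade-off is the memory held by the cache for keys that never recur.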

Let me know how it goes if you adopt this approach!

Martin

Anna Pluzhnikov <[EMAIL PROTECTED]> writes:

> Hi everyone,
>
> I'm using the function 'HWE.exact' of 'genetics' package to compute p-values
> of the HWE test. My data set consists of ~600 subjects (cases and controls)
> typed at ~ 10K SNP markers; the test is applied separately to cases and
> controls. The genotypes are stored in a list of 'genotype' objects, all.geno,
> and p-values are calculated inside the loop over all SNP markers.
>
> I wish to repeat this procedure multiple times (~1000) permuting the cases and
> controls (affection status). It seems straightforward to implement it like
> this:
>
> #
>
> for (iter in 1:1000) {
>   set.seed(iter)
> # get the permuted affection status
>   permut <- sample(affSt)
>   for (j in 1:nSNPs) {
> test <- tapply(all.geno[[j]], permut, HWE.exact)
> pvalControls[j] <- test$"1"$p.value
> pvalCases[j] <- test$"2"$p.value
>   }
> }
>
> ##
>
> The problem is that it takes ~1 min/iteration (on AMD Opteron 252 processor
> running Linux). 
>
> Is there a faster/more efficient way to do this? 
>
> Thanks,
> -- 
> Anna Pluzhnikov, PhD
> Section of Genetic Medicine
> Department of Medicine
> The University of Chicago
>
>
> -
> This email is intended only for the use of the individual or...{{dropped}}
>

Re: [R] parallel computing

2006-05-25 Thread Martin Morgan
Hi Moreno --

snow provides an easy interface to simple parallel types of
calculations (e.g., lapply in parallel). I quickly wanted to have more
direct control over how parallel computations were calculated, and
have been using Rmpi. Though in principle snow and Rmpi are 'easy' to
use, I found that they actually require a certain amount of
understanding about R objects and evaluation, and the underlying
communication library (MPI, or PVM).

Hope that helps,

Martin

"[EMAIL PROTECTED]" <[EMAIL PROTECTED]> writes:

> Dear R users,
>
> I have access to a Sun cluster with multiple processors , a lot of
> RAM and with RedHat installed.  I want to take advantage of its
> power for a R routine very time consuming.
>
> Which package do I have to use? I know there are snow, snowFT and
> other packages. Which is the best for my purpose?  Does anyone have
> experience with this?
>
> Thanks in advance.
>
> Moreno
>



Re: [R] (sans objet)

2006-05-08 Thread Martin Morgan
I guess that you are on Windows, and that you need to install the
tools required to build R packages. The instructions for doing this
are at

http://www.r-project.org/

and then visiting the "Manuals" and "R Installation and
Administration" links. See especially section 3 and appendix F.  This
can be difficult.

You will likely face additional difficulties even after establishing
the tools required for building R packages; I do not think this will
be an 'easy' process.

Martin

miniar mansouri <[EMAIL PROTECTED]> writes:

> When I run the code : R CMD INSTALL -c SJava_0.69-0.tar.gz 
> i have:
>  
> "gzip:stdin:invalid compressed data--crc error
> tar:child returned status 1
> cannot untar the package"
>  what i  do?
>   [[alternative HTML version deleted]]
>



Re: [R] How to a handle an error in a loop

2006-05-06 Thread Martin Morgan
Here's another approach -- 'wrap' the problematic function in a
safety-wrapper, so that it always returns some kind of value.

Here's the wrapper

safetyWrapper <- function(FUN, .badValue=NA)
  function(...) tryCatch(FUN(...), error=function(err) .badValue)

and the problematic function, stopping with probability 'prob':
 
sometimeStopFunc <- function(i, prob) {
  if (runif(1) < prob) stop("stopped")
  else i
}

Here's the problematic function in action (using sapply to produce
more compact output for email purposes)

> sapply(1:20, sometimeStopFunc, prob=.2)
Error in FUN(X[[2]], ...) : stopped

We can wrap the troublesome function in our wrapper

> neverStopFunc <- safetyWrapper(sometimeStopFunc)

and then obtain results 

> sapply(1:20, neverStopFunc, prob=.2)
 [1]  1  2  3 NA  5  6  7 NA  9 NA NA 12 NA NA NA 16 17 NA 19 20

these could be filtered...

> res <- sapply(1:20, neverStopFunc, prob=.2)
> res[!is.na(res)]
 [1]  1  2  3  4  5  6  7  8 10 11 12 13 14 15 16 18 19 20

or a different type of result used to signal failure...

> neverStopFunc <- safetyWrapper(sometimeStopFunc, .badValue=-1)
> sapply(1:20, neverStopFunc, prob=.2)
 [1]  1  2 -1  4  5  6  7  8 -1 10 11 -1 -1 -1 -1 16 17 -1 19 20

Maybe it's possible to use restarts to do the calculation over again?
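
One possible sketch of the restart idea, with made-up names ('withRetry',
'tryAgain', 'flaky') and not tested beyond this toy case: a calling handler
invokes a restart that abandons the failed attempt, and a loop starts the
calculation again until it succeeds or the attempts run out.

```r
withRetry <- function(FUN, maxTries=5) {
    tries <- 0
    repeat {
        tries <- tries + 1
        value <- NULL
        ok <- withRestarts({
            value <- withCallingHandlers(
                FUN(),
                error=function(err) {
                    ## jump out of the failed attempt; on the last
                    ## attempt do nothing, so the error propagates
                    if (tries < maxTries) invokeRestart("tryAgain")
                })
            TRUE
        }, tryAgain=function() FALSE)
        if (ok) return(value)
    }
}

## Toy function that fails on its first two calls.
attempts <- 0
flaky <- function() {
    attempts <<- attempts + 1
    if (attempts < 3) stop("transient failure")
    "done"
}

result <- withRetry(flaky)
```

This only makes sense when a repeated call can give a different outcome, as
with the random 'sometimeStopFunc' above; a deterministic failure would just
fail maxTries times.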

Martin


"Farrel Buchinsky" <[EMAIL PROTECTED]> writes:

> "Berton Gunter" <[EMAIL PROTECTED]> wrote in message 
> news:[EMAIL PROTECTED]
>> ?try
>>
>> as in
>>
>> result<- try (some R expression...)
>> if (inherits(result,'try-error')) ...do something
>> else ...do something else
>
> No heaven on earth yet.
>
> how would I incorporate this kind of functionality into
> Resultdt<-lapply(PGWide[,240:389], tdt)
>
> everything would have to be built into the tdt spot in the above statement.
> How does one get the if...else in there? Does one have to do that as one 
> would program a function or could one write the if...else right into 
> "Resultdt<-lapply(PGWide[,240:389], tdt)"
>
> This works
>> for (few in c(9,10,11,12,243,20)) if 
>> (inherits(try(tdt(PGWide[,few])),'try-error')) print("messed up") else 
>> print("works")
> [1] "works"
> [1] "works"
> [1] "works"
> [1] "works"
> Error in rep.default(1, nrow(U)) : rep() incorrect type for second argument
> In addition: Warning messages:
> 1: 1 misinheritances in: phase.resolve(g.cs, g.mr, g.fr, as.allele.pair = 
> TRUE, allow.ambiguous = (parent ==
> 2: 2 misinheritances in: phase.resolve(g.cs, g.mr, g.fr, as.allele.pair = 
> TRUE, allow.ambiguous = (parent ==
> 3: 2 misinheritances in: phase.resolve(g.cs, g.mr, g.fr, as.allele.pair = 
> TRUE, allow.ambiguous = (parent ==
> 4: 4 misinheritances in: phase.resolve(g.cs, g.mr, g.fr, as.allele.pair = 
> TRUE, allow.ambiguous = (parent ==
> [1] "messed up"
> [1] "works"
> Warning message:
> 1 misinheritances in: phase.resolve(g.cs, g.mr, g.fr, as.allele.pair = TRUE, 
> allow.ambiguous = (parent ==
>
> BUT THIS DOES NOT
>
> lapply(PGWide[,c(9,10,11,12,,243,20)], if (inherits(try(tdt),'try-error') 
> print("messed up") else print("works"))
> Error: syntax error in "lapply(PGWide[,c(9,10,11,12,,243,20)], if 
> (inherits(try(tdt),'try-error') print"
>
> Any idea why...can it be that one cannot have multiple commands on one line
>> p=7 f=8
> Error: syntax error in "p=7 f"
>
> in the lapply, how would R know that I was sending the list to tdt?
>
>
> -- 
> Farrel Buchinsky, MD
> Pediatric Otolaryngologist
> Allegheny General Hospital
> Pittsburgh, PA
>



Re: [R] R et Java

2006-05-06 Thread Martin Morgan
Perhaps

http://www.omegahat.org/RSJava/FAQ.html:

Problem:
Trying the example for calling R from Java via

   SJava/scripts/RJava --example --gui-none

(in /library) throws an exception

Solution:
Reinstall SJava using the -c switch, i.e.

  R CMD INSTALL -c SJava_0.62-6.tar.gz

Regards,

Martin

miniar mansouri <[EMAIL PROTECTED]> writes:

> Hello,
>  
> I am developing a Java application and I want to integrate R code into my 
> program. The test program I wrote is:
>  
>  
> import org.omegahat.R.Java.*;
>  
>  public class REvalSample {
> public static void main(String [] args) {
> String [] rargs = {"--slave", "--vanilla"};
>  
> System.out.println("Sample program to call R engine from Java");
> ROmegahatInterpreter interp = new 
> ROmegahatInterpreter(ROmegahatInterpreter.fixArgs(rargs),false);
>  
> REvaluator e = new REvaluator();
>  
>  Object val = e.eval("demo()");
> val = e.eval("x * 2.0");
>  
> if (val != null) {
> double[] objects = (double[])val;
> for (int i=0; i System.err.println("("+i+") " + objects[i]);}
>  
>  
> }
> }
>  }
>  
>  
> When I compile this program I get the following error messages:
>  
>  
> Loading RInterpreter library
> java.lang.UnsatisfiedLinkError: no RInterpreter in java.library.path
> at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1491)
> at java.lang.Runtime.loadLibrary0(Runtime.java:788)
> at java.lang.System.loadLibrary(System.java:834)
> at 
> org.omegahat.R.Java.ROmegahatInterpreter.(ROmegahatInterpreter.java:34)
> at rjava.REvalSample.main(REvalSample.java:11)
> Exception in thread "main".
> Note that I am working with JBuilderX and R 2.3.0; I have added the 
> SJava and rjava packages to R, but the problem is still there.
>  
> Please help me, it is very urgent.
>  
>  
>  
>  
>  
>   [[alternative HTML version deleted]]
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] "a"+"b"

2006-05-02 Thread Martin Morgan
(apologies if this appears twice; I had some mail posting issues)

I think %any-symbol% defines a binary operator, so

> "%+%" <- function(x,y) paste(x,y,sep="")
> "a" %+% "b"
[1] "ab"
> 

or for more fun

> setGeneric("%+%", function(x,y) standardGeneric("%+%"))
[1] "%+%"
> setMethod("%+%", signature(x="character", y="character"), function(x,y)
+ paste(x,y,sep=""))
[1] "%+%"
> setMethod("%+%", signature(x="numeric", y="numeric"), function(x,y) x+y)
[1] "%+%"
> "a" %+% "b"
[1] "ab"
> 1 %+% 2
[1] 3
 

Martin

Matthew Dowle <[EMAIL PROTECTED]> writes:

> Thanks for the information.  By 'not easily' it sounds like it is possible,
> but how?  I'm happy to allow 'errors' like "2"+"3"="23", is that what an
> erroneous use could be?
>
> I tried the following, given the info in your reply, which works.  But is
> there a way to allow "a"+"b"="ab"?
>> x = "a"
>> `+.character` <- function(e1, e2) paste(e1,e2, sep="")
>> attr(x,"class")="character"
>> x+x
> [1] "aa"
>>
>
>> -Original Message-
>> From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] 
>> Sent: 02 May 2006 12:03
>> To: Matthew Dowle
>> Cc: 'r-help@stat.math.ethz.ch'
>> Subject: Re: [R] "a"+"b"
>> 
>> 
>> Not easily.  There's a comment in ?Ops that for efficiency the group 
>> generics only dispatch on objects with a "class" attribute: 
>> otherwise you 
>> could use an S3 method like
>> 
>> `+.character` <- function(e1, e2) paste(e1,e2, sep="")
>> 
>> S4 methods are built on top of S3 methods as far as internal 
>> dispatch is 
>> concerned, so the same comment will apply there AFAICS.
>> 
>> I would think that the intention was also to positively 
>> discourage messing 
>> with the basics of R, as if you were able to do this 
>> erroneous uses would 
>> likely not get caught.
>> 
>> On Tue, 2 May 2006, Matthew Dowle wrote:
>> 
>> >
>> > Hi,
>> >
>> > Is there a way to define "+" so that "a"+"b" returns "ab" ?
>> >
>> >> setMethod("+",c("character","character"),paste)
>> > Error in setMethod("+", c("character", "character"), paste) :
>> >the method for function '+' and signature e1="character", 
>> > e2="character" is sealed and cannot be re-defined
>> >> "a"+"b"
>> > Error in "a" + "b" : non-numeric argument to binary operator
>> >
>> > Thanks!
>> >
>> >
>> >
>> >[[alternative HTML version deleted]]
>> >
>> 
>> -- 
>> Brian D. Ripley,  [EMAIL PROTECTED]
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford, Tel:  +44 1865 272861 (self)
>> 1 South Parks Road, +44 1865 272866 (PA)
>> Oxford OX1 3TG, UKFax:  +44 1865 272595
>>
>


Re: [R] Parallel computing with the snow package: external file I/O possible?

2006-04-21 Thread Martin Morgan
"Waichler, Scott R" <[EMAIL PROTECTED]> writes:

> genoud()) should be independent.  However, in the code below, the
> directory created by each node has the same random number in its name.

This is probably because 'random' numbers are generated starting from
a 'seed', the seed is determined (by default) from the system time,
and the system time on the two nodes is identical. The same seed is
being used, hence the same random number sequence.

See http://www.stat.uiowa.edu/~luke/R/cluster/cluster.html
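
A minimal sketch of the seed effect, using base R only. The seed values
and the helper draw() are made up for illustration; on a snow cluster a
call such as clusterApply(cl, seeds, set.seed) would be one way to hand
each node its own seed, though adjacent seeds are not a guarantee of
statistically independent streams.

```r
# Two "workers" that start from the same seed draw the same number.
draw <- function(seed) {
  set.seed(seed)
  sample(1:1e6, 1)
}
same <- c(draw(42), draw(42))

# Giving each worker its own seed (a hypothetical base seed plus the
# worker index) starts each one from a different point.
base.seed <- 20060421
seeds <- base.seed + 1:2
distinct <- sapply(seeds, draw)
```

The point is only that identical seeds reproduce identical draws, which
is what the identical file names suggest happened on the two nodes.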

Martin

> I was expecting the contents of fun() or fn() to be independent from all
> other executions of the same function.  What am I missing here?
>
> #  Begin code
> library(snow)
> setDefaultClusterOptions(outfile="/tmp/cluster1")
> setDefaultClusterOptions(master="moab")
> cl <- makeCluster(c("moab", "escalante"), type="SOCK")
>
> # Define base pathname for output from my.test()
> base.dir <- "~"
>
> # Define a function that is called by clusterCall()
> my.test <- function(base.dir) {
>   this.host <- as.character(system("hostname", intern=T))
>   this.rnd <- sample(1:1e6, 1)
>   test.file <- paste(sep="", base.dir, this.host, "_", this.rnd)
>   file.create(test.file)
> }  # end my.test()
>
> clusterCall(cl, my.test, base.dir)
> stopCluster(cl)
> #  End code 
>
> For example, the files "moab_65835" and "escalante_65835" are created.
>
> Regards,
> Scott Waichler
> Pacific Northwest National Laboratory
> [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Mpirun with R CMD scripts

2006-04-06 Thread Martin Morgan
Your description was a bit hard for me to follow. I think what you're
trying to do is something like

TestRmpi.R
--
library(Rmpi)
if (0==mpi.comm.rank(comm=0)) {
  # 'manager', e.g., cat("manager\n")
} else {
  # 'worker', e.g., cat(mpi.comm.rank(comm=0),"worker\n")
}
mpi.quit()

TestRmpi.sh
---
R --slave < $1

and from the command line

mpirun -np 3 TestRmpi.sh TestRmpi.R

If instead you try

mpirun -np 3 R CMD BATCH TestRmpi.R

then each node executes R CMD BATCH. As a consequence, each node
writes to a file TestRmpi.Rout, and you end up with an undetermined
outcome (because the different nodes are all trying to write to the
same file).

If your file were

WRONGTestSnow.R
---
library(snow)
cl <- makeCluster(3,type="MPI")
clusterCall(cl,function() Sys.info()[c("nodename","machine")])
stopCluster(cl)

then you would execute makeCluster on each node, for a total of 9
instances of R. Probably not what you want.

I found that snow was useful for interactive and exploratory work, but
that Rmpi (or rpvm) provides the level of control that you really need
for more substantial use of parallel resources. Recent versions of
Rmpi have mpi.parLapply and related functions, which are like the
high-level functions in snow.

Hope that helps,

Martin

"Srividya Valivarthi" <[EMAIL PROTECTED]> writes:

> Hi,
>
>I am working on a 64-bit rocks cluster and am relatively new to the
> R package. I am trying to get Snow working with R and Rmpi and have
> run into the following issue. R is able to load the Rmpi and snow
> libraries and is able to run simple commands both interactively and
> batch as follows:
>
> ---
> [EMAIL PROTECTED] ~]$ R
> R : Copyright 2005, The R Foundation for Statistical Computing
> Version 2.2.0  (2005-10-06 r35749)
> ISBN 3-900051-07-0
>> library(snow)
>> c1<-makeCluster(3, type="MPI")
> Loading required package: Rmpi
> 3 slaves are spawned successfully. 0 failed.
>>
>>
>>
>> clusterCall(c1,function() Sys.info()[c("nodename","machine")])
> [[1]]
>nodename machine
> "cheaha.ac.uab.edu""x86_64"
>
> [[2]]
> nodename  machine
> "compute-0-12.local" "x86_64"
>
> [[3]]
> nodename  machine
> "compute-0-13.local" "x86_64"
>
>> stopCluster(c1)
> [1] 1
>> q()
> --
>
> But on running the same script with mpirun i get the following error.
> *8
> [EMAIL PROTECTED] ~]$ mpirun -np 3 R -slave R CMD BATCH TestSnow.R
> /home/srividya/R/library/snow/RMPInode.sh: line 9: 19431 Segmentation
> fault  ${RPROG:-R} --vanilla  >${OUT:-/dev/null} 2>&1 <<EOF
> library(Rmpi)
> library(snow)
>
> runMPIslave()
> EOF
> **
>
> I am not sure about what the error could be and any help is greatly
> appreaciated.
>
> Thanks,
> Srividya
>


Re: [R] handling warning messages

2006-03-16 Thread Martin Morgan
also tryCatch

> tryCatch( warning("a warning"),
+  warning = function(w) {
+paste("Caught:", conditionMessage( w ))
+  })
[1] "Caught: a warning"
> 
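
For the original question (keep the warnings but do not print them), a
sketch with withCallingHandlers may be closer to what is wanted. Unlike
tryCatch, it does not abandon the computation when the warning is
signalled, so the result is still returned. The variable names below
are made up for illustration.

```r
# Collect warning messages in a list instead of printing them, while
# letting the expression run to completion and return its value.
collected <- list()
result <- withCallingHandlers(
  {
    warning("first warning")
    warning("second warning")
    "finished"
  },
  warning = function(w) {
    collected[[length(collected) + 1]] <<- conditionMessage(w)
    invokeRestart("muffleWarning")   # suppress the default printing
  }
)
```

Afterwards 'collected' holds the messages, and the code can decide
whether to print them.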



ronggui <[EMAIL PROTECTED]> writes:

> see ?options
>
>  'warn': sets the handling of warning messages.  If 'warn' is
>   negative all warnings are ignored.  If 'warn' is zero (the
>   default) warnings are stored until the top-level function
>   returns.  If fewer than 10 warnings were signalled they will
>   be printed otherwise a message saying how many (max 50) were
>   signalled.  A top-level variable called 'last.warning' is
>   created and can be viewed through the function 'warnings'.
>   If 'warn' is one, warnings are printed as they occur.  If
>   'warn' is two or larger all warnings are turned into errors.
>
>  'warning.expression': an R code expression to be called if a
>   warning is generated, replacing the standard message.  If
>   non-null it is called irrespective of the value of option
>   'warn'.
>
>  'warnings.length': sets the truncation limit for error and warning
>   messages.  A non-negative integer, with allowed values
>   100...8192, default 1000.
>
>
> 2006/3/16, Sigal Blay <[EMAIL PROTECTED]>:
>> Is there any way to store the value of warnings but avoid printing them?
>>
>> A simplified example:
>> > out <- c(0.2,0.3,0.4)
>> > var <- c(2,3,4)
>> > outcome <- glm(out ~ var, family=binomial())
>> Warning message:
>> non-integer #successes in a binomial glm! in: eval(expr, envir, enclos)
>>
>> I don't like the warning printed,
>> but I would like to be able to check its value after the top-level function
>> has completed and then decide whether to print it or not.
>>
>> Thanks,
>>
>> Sigal Blay
>> Statistical Genetics Working Group
>> Department of Statistics and Actuarial Science
>> Simon Fraser University
>>
>
>
> --
> 黄荣贵
> Department of Sociology
> Fudan University
>

Re: [R] Parallel computing with the snow package: external file I/O possible?

2006-03-13 Thread Martin Morgan
Hi Scott --

It took me a bit to figure it out, but the help page for system
indicates that system returns the exit status, rather than the
command's output, unless it is invoked with intern = TRUE. So why
does system("hostname") actually print the host name? It's a side
effect: the host name is the stdout of the system command, not the
result of a function evaluation in R. Compare
> res <- system("hostname")
> res
[1] 0

You'll get the side effect printed to the screen, but the result
returned to R (invisibly, I guess) is the exit status -- 0.

Snow captures the return value, rather than the side effect. So the
solution is to use either

system("hostname", intern = TRUE )

or

Sys.info()[["nodename"]]
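
Put together, a sketch of my.test using the second option might look
like the following. It is only an illustration: file.path and tempdir
are my substitutions for the path handling in your script, and the
function returns the file name so that the master can see what each
node wrote.

```r
# Hypothetical rewrite of my.test(): the host name comes from a return
# value (Sys.info) rather than from the side effect of system("hostname").
my.test <- function(base.dir) {
  this.host <- Sys.info()[["nodename"]]
  this.rnd <- sample(1:1e6, 1)
  test.file <- file.path(base.dir,
                         paste(this.host, "_", this.rnd, sep = ""))
  file.create(test.file)
  test.file   # return the path rather than just TRUE
}

out.file <- my.test(tempdir())
```

With clusterCall(cl, my.test, base.dir) each element of the result
would then be the path created on that node.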

Hope that helps!

Martin

"Waichler, Scott R" <[EMAIL PROTECTED]> writes:

>  
> Hello,
>
> I am trying to do model autocalibration using the snow and rgenoud
> packages.  The function I want to run in task-parallel fashion across
> multiple machines is one that pre- and post-processes data and runs an
> external model code.  My problem is that external file I/O is happening
> only in the master node and not in the slaves.  I have followed Jasjeet
> Sekhon's suggestion to test the cluster setup, and that is fine:
>
>> library(snow)
>> 
>> #pick two machines
>> cl <- makeCluster(c("moab","escalante"))
>> 
>> clusterCall(cl, sin, 2)
>  
>> The output should be:
>> > clusterCall(cl, sin, 2)
>> [[1]]
>> [1] 0.9092974
>> 
>> [[2]]
>> [1] 0.9092974
>> 
>
> I do indeed get the above result, so I presume the network setup is ok.
> Next I tested a function that creates a file.  Here is the code that I
> sourced from the master ("moab"):
>
> # begin script
> library(snow)
>
> setDefaultClusterOptions(outfile="/tmp/cluster1")
> setDefaultClusterOptions(master="moab")
> cl <- makeCluster(c("moab", "escalante"), type="SOCK")
>
> # Define base pathname for output from my.test() 
> base.dir <- "./test"
>
> # Define a function that includes some file I/O 
> my.test <- function(base.dir) {
>   this.host <- as.character(system("hostname")) # to tag the node that
> makes the file
>   this.rnd <- sample(1:1e6, 1)  # to be 'sure' the files have different
> names
>   test.file <- paste(sep="", base.dir, "_", this.host, "_", this.rnd)
>   file.create(test.file)
> }  # end my.test()
>
> g <- clusterCall(cl, my.test, base.dir)
> print(g)
> stopCluster(cl)
> #  end script  
>
>
> The output (g) was as follows:
>  
> [[1]]
> [1] TRUE
>
> [[2]]
> [1] TRUE
>
> But there was only one file created, which I suspect is by the master
> node.  A second file was not created by the process on the slave.  Also,
> system("hostname") returns the number 0 for moab instead of the name.
> Any ideas as to what might be wrong?  
>
> Thanks,
> Scott Waichler
> scott.waichler _at_ pnl.gov
>


Re: [R] Class & Method, S3 / S4

2006-03-02 Thread Martin Morgan
Again, R has a different paradigm from what you're used to. R is a
'pass by value' language. So 'obj' inside the method is a *copy* of
testObj, and your assignment changes the 'value' slot of the copy. An
R way of doing this might be

setMethod("test", signature=c("connect"),
  function( obj ) {
    obj@value <- obj@value / 2
    ## and other manipulations...
    obj
  })

testObj <- test( testObj )
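
A self-contained sketch of the copy semantics, reusing the class and
generic names from this thread (the variables 'ignored', 'unchanged'
and 'changed' are only there for illustration):

```r
library(methods)

setClass("connect", representation(value = "numeric"))
setGeneric("test", function(obj) standardGeneric("test"))
setMethod("test", signature = c("connect"),
  function(obj) {
    obj@value <- obj@value / 2   # modifies the local copy only
    obj                          # so the modified copy is returned
  })

testObj <- new("connect", value = 44)
ignored <- test(testObj)     # testObj itself is not touched
unchanged <- testObj@value   # still 44
testObj <- test(testObj)     # re-assign to keep the change
changed <- testObj@value     # now 22
```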

I 'know' these things by reading Chambers' Programming with Data, and
by looking at existing code (ok, and maybe some other ways, too
;). There are some example packages suggested in this thread

https://stat.ethz.ch/pipermail/r-devel/2005-November/035370.html

If you have an R source distribution, look in, for instance,

/src/library/stats4/R

At the R command prompt, do something like

library(stats4)
library(help=stats4)
?"update-methods"
getMethods("update")

A warning, I guess, is that looking at complicated code (and the help
pages for the methods package) can be confusing without the kind of
foundation that Chambers' book provides. Maybe others on the list will
provide some hints for introductory documentation on S4 classes and
methods.

Hope that's enough to get you going!

Martin


"Dominik Locher" <[EMAIL PROTECTED]> writes:

> Hi Martin
>
> Thanks a lot for your short example. If you input
>
> test(testObj)
>
> it will return
>
> 22
>
> However, how is it possible that the value will be saved in the object
>
> i.e. (does not work currently!!??)
> setMethod("test", signature=c("connect"),
>   function( obj ) { obj@value <- obj@value / 2 })
>
> so that test(testObj) will save a new value to testObj@value or 
>
> testObj@value will return 22.
>
> Thanks.
> Nik
>


Re: [R] Class & Method, S3 / S4

2006-03-01 Thread Martin Morgan
The object paradigm in R is different; I like to think of methods as
orthogonal to classes, instead of nested within them.

Probably what you want to try is

# a generic function, to operate on different classes
setGeneric("test", function( obj ) standardGeneric( "test" ))

# a class, containing data
setClass("connect", representation( value = "numeric" ))

# a specific method, operating on "connect" objects
setMethod("test", signature=c("connect"),
  function( obj ) { obj@value / 2 })

# a new object of class "connect" with value slot assigned 44
testObj <- new("connect", value=44 )
# application of the generic function to an object of class "connect",
# dispatched to the method operating on objects of class "connect"
test( testObj )  

This kind of structure makes some types of operations particularly
natural in R -- dispatch on multiple arguments, for instance.
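
As a rough illustration of dispatch on multiple arguments, here is a
sketch with a made-up generic called 'describe'; the method chosen
depends on the classes of both x and y.

```r
library(methods)

# 'describe' is a hypothetical generic that dispatches on two arguments.
setGeneric("describe", function(x, y) standardGeneric("describe"))
setMethod("describe", signature(x = "numeric", y = "numeric"),
  function(x, y) "two numbers")
setMethod("describe", signature(x = "numeric", y = "character"),
  function(x, y) "a number and a string")
setMethod("describe", signature(x = "character", y = "ANY"),
  function(x, y) "a string and anything else")

r1 <- describe(1, 2)
r2 <- describe(1, "a")
r3 <- describe("a", TRUE)
```

In a language where methods live inside one class, the second and
third cases would need extra type checks inside the method body.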

Hope that helps,

Martin


[EMAIL PROTECTED] writes:

> Hi
>
> I'm trying to create a class. However, I've some problems...
>
> I am used to program php. There I can create a new object and call a specific
> function of this object. In R, I hoped to create a class similar and then
> call the function like:
>
> Creating the Class:
> ---
> setClass("connect",
>   representation(test="function"))
> )
>
> setMethod("test", "connect", 
> function(value){ 
> value /2
>   }
> )
> ---
>
>
> testObj = new ("connect")
> testObj@test(44)
>
> That should return 22. However, unfortunately it does not work. How can I
> solve this problem. I read several instructions, but how does R work???.
> Does anybody have a very simple example like above.
>
> I highly appreciate any help.
> Thanks a lot.
> Nik
>


Re: [R] How can I see how %*% is implemented?

2006-02-22 Thread Martin Morgan
get("%*%")

tells you that it is a primitive (i.e., implemented in C). The file
/src/main/names.c directs you to do_matprod, in file
 writes:

> I would like to see how the matrix multiplication operator %*% is implemented 
> (because I want to see which external Fortran/C routines are used). How can I 
> do so?
> Best
> Søren
>


Re: [R] Turning control back over to the terminal

2006-02-14 Thread Martin Morgan
Hi Ross --

Not a direct answer to your question, but here's an idea.

I guess you've got some script in a file batch.sh

#! /bin/bash
R --no-save --no-restore --gui=none > `hostname`  2>&1  writes:

> I'm invoking R from within a shell script like this
> R --no-save --no-restore --gui=none > `hostname`  2>&1 <<BYE
> # various commands here
> BYE
>
> I would like to regain control from the invoking terminal at some point.
> I tried source(stdin()) but got a syntax error, presumably stdin is the
> little shell here snippet (the part between <<BYE and BYE).
>
> Is there some way to accomplish this?  I am trying to regain control
> inside an R session that has been launched inside an MPI environment
> (and I'm usually running inside emacs).  This is on a Mac OS X cluster.
>
> I suppose there might be a way to do this with expect, but I'm hoping
> for something simpler.  Potentially, I could make the script itself act
> differently on the head node (the only one I want to debug right now),
> including changing the redirection there.
>
>
> -- 
> Ross Boylan  wk:  (415) 514-8146
> 185 Berry St #5700   [EMAIL PROTECTED]
> Dept of Epidemiology and Biostatistics   fax: (415) 514-8150
> University of California, San Francisco
> San Francisco, CA 94107-1739 hm:  (415) 550-1062
>


Re: [R] Working with R in a multi-processor machine.

2006-01-10 Thread Martin Morgan
R is not thread safe, so you must not use it in a
re-entrant way.

If you want to exploit multiple processors, you can write code (e.g.,
in C) called from R (e.g., through .Call or .C) that performs
parallel/threaded computations in a thread-safe way (e.g., without
calling back into R).

Another possibility is to replace the BLAS/LAPACK library with a
thread-safe version. This provides a boost to those R algorithms
exploiting these libraries. Haven't done this myself, but there is
some info in this post

https://stat.ethz.ch/pipermail/r-devel/2005-December/035695.html

Hope that helps,

Martin


Aitor Mata Conde <[EMAIL PROTECTED]> writes:

> Hi everyone!!
>
> This is my first message to the list, so I hope not to disturb anyone if the
> subject of my message has been already treated.
>
> The question is that I have a tool, a GUI made with Java, connected to R 
> using Rserve, and I'd like to get R and Rserve in a multi-processor machine.
> Now, when I'm going to start the migration I wonder whether R is prepared,
> 'itself' to optimize the use of multiple processors or if I should change the
> code so that it could be a real multi-processor tool.
>
> In other words... Will the R code adapt itself to the new machine (Unix with
> at least 4 processors)? Or shall I change the code to have multiple real
> threads and transform the algorithms into parallel computing strategies?
>
> Thanks in advance,
> Aitor.
>
>


Re: [R] Install Rmpi on Fedora with mpich2 installed.

2005-12-20 Thread Martin Morgan
No direct experience with mpich2 on Fedora, but I think you can use

./configure --with-mpi=/usr/local/mpich2

from within the unpacked Rmpi tarball, or

R CMD INSTALL Rmpi_... --configure-args=--with-mpi=/usr/local/mpich2

from the command line. ... is the tab-completion to the tarball, and
/usr/local/mpich2 should be a path such that mpi.h is in
/usr/local/mpich2/include/mpi.h (your log shows
--with-mpi=/usr/local/mpich2/bin/, which points at the bin directory
instead). Some insight is in the configure.in file of Rmpi.

Hope that helps! Sorry for the repost, Bin, meant this originally to
reply to the newsgroup.

Martin

"Ye, Bin" <[EMAIL PROTECTED]> writes:

> Hi, everyone,
>
> I want to install Rmpi on a cluster with Fedora linux. It already installed 
> mpich2, but not lam-mpi. I installed R-2.2.0 on it already.
>
> And I got error as below:
>
> * Installing *source* package 'Rmpi' ...
> Try to find mpi.h ...
> checking for gcc... gcc
> checking for C compiler default output file name... a.out
> checking whether the C compiler works... yes
> checking whether we are cross compiling... no
> checking for suffix of executables...
> checking for suffix of object files... o
> checking whether we are using the GNU C compiler... yes
> checking whether gcc accepts -g... yes
> checking for gcc option to accept ANSI C... none needed
> checking how to run the C preprocessor... gcc -E
> checking for egrep... grep -E
> checking for ANSI C header files... yes
> checking for sys/types.h... yes
> checking for sys/stat.h... yes
> checking for stdlib.h... yes
> checking for string.h... yes
> checking for memory.h... yes
> checking for strings.h... yes
> checking for inttypes.h... yes
> checking for stdint.h... yes
> checking for unistd.h... yes
> checking mpi.h usability... no
> checking mpi.h presence... no
> checking for mpi.h... no
> Try to find mpi.h ...
> Cannot find mpi head file
> Please check if --with-mpi=/usr/local/mpich2/bin/ is right
> ERROR: configuration failed for package 'Rmpi'
> ** Removing '/usr/local/R-2.2.0/library/Rmpi'
>
> Somehow it can not find the mpi.h which is in usr/local/mpich2. Can anyone 
> kindly give me some hint on what should be done? Will installing lam-mpi 
> solve the problem? If so, should mpich2 be uninstalled first? Or just modify 
> the path will do?
>
> Thanks a lot!
>
>
> Bin
>


Re: [R] Snow & rvpm

2005-12-01 Thread Martin Morgan
In the example at
http://www.stat.uiowa.edu/~luke/R/cluster/cluster.html clusterCall
divides the parameter R by the number of nodes -- what you've done is
calculate 999 bootstraps on each node, and compared the execution time
to 999 bootstraps on one node.

You probably want 'clusterCall' to be smart enough to partition the
bootstraps between nodes, and then collate the results. It doesn't do
that.
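
The partition-and-collate step can be sketched in base R as below. This
is not the boot example itself: splitReplicates, the data vector x and
one.node are made up, and lapply stands in for the per-node calls (on a
snow cluster, clusterApply(cl, per.node, ...) would be the analogue).

```r
# Split R bootstrap replicates as evenly as possible across n.nodes.
splitReplicates <- function(R, n.nodes) {
  base <- R %/% n.nodes
  extra <- R %% n.nodes
  base + (seq_len(n.nodes) <= extra)
}

per.node <- splitReplicates(999, 2)   # 500 on one node, 499 on the other

# Each "node" computes its share of bootstrap means; the pieces are
# then collated into one vector of 999 replicates.
x <- c(2.1, 3.4, 1.9, 5.6, 4.2)       # made-up data
one.node <- function(r) replicate(r, mean(sample(x, replace = TRUE)))
pieces <- lapply(per.node, one.node)
boot.means <- unlist(pieces)
```

Timing this against a single run of all 999 replicates is the fair
comparison, since each node then does only part of the work.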

Martin

vittorio <[EMAIL PROTECTED]> writes:

> At office, using the internal LAN at my disposal,  I'm having a go at 
> parallel 
> computing - to begin with - with pvm, rpvm & snow.
> The two boxes are as follows
>
> Remote machine uffbsd:
> CPU: Intel(R) Pentium(R) 4 CPU 2.00GHz (1994.13-MHz 686-class CPU)
>   Origin = "GenuineIntel"  Id = 0xf24  Stepping = 4
>  real memory  = 260046848 (248 MB)
>
> This machine NbBSD:
> CPU: Mobile Intel(R) Pentium(R) 4 - M CPU 2.00GHz (1993.54-MHz 686-class CPU)
> real memory  = 536674304 (511 MB)
>
> And starting library snow under R I have the following situation
>
> clusterCall(cl, function() Sys.info()[c("sysname", 
> "release","nodename","machine")])
> [[1]]
>   sysname   release  nodename   machine
> "FreeBSD" "5.4-RELEASE" "uffbsd.myd""i386"
>
> [[2]]
>  sysname  release nodename  machine
>"FreeBSD""6.0-RELEASE" "NbBSD.myd"   "i386"
>
> NOW,
> using the example of boot in the end of page 
> http://www.stat.uiowa.edu/~luke/R/cluster/cluster.html
>
> I find this amazing result:
>
>> system.time(cl.nuke.boot<- 
> clusterCall(cl,boot,nuke.data,nuke.fun,R=999,m=1,fit.pred=new.fit,x.pred=new.data))
> [1]  0.0078125  0.0078125 27.9609375  0.000  0.000
>>  system.time(nuke.boot<- 
> boot(nuke.data,nuke.fun,R=999,m=1,fit.pred=new.fit,x.pred=new.data))
> [1] 26.976562  0.109375 28.484375  0.00  0.00
>> 
> There's not that much gain in time between my cluster and the local 
> computation, isn't it.
> Now my question is:
> What could have been gone wrong and what should I verify?
>
> Ciao
> Vittorio
>