[R] errors with compilation
Hi, I'm trying to compile R on a Cray XT3 using pgi/7.2.1 - CNL (compute node linux). The R version is 2.8.0. These are the options:

  --enable-R-static-lib=yes --disable-R-shlib CPICFLAGS=fpic FPICFLAGS=fpic
  CXXPICFLAGS=fpic SHLIB_LDFLAGS=shared --with-x=no SHLIB_CXXLDFLAGS=shared
  --disable-BLAS-shlib CFLAGS=-g -O2 -Kieee FFLAGS=-g -O2 -Kieee
  CXXFLAGS=-g -O2 -Kieee FCFLAGS=-g -O2 -Kieee CC=cc F77=ftn CXX=CC FC=ftn

R is now configured for x86_64-unknown-linux-gnu

  Source directory:          .
  Installation directory:    /lus/nid00036/jasont/R
  C compiler:                cc -g -O2 -Kieee
  Fortran 77 compiler:       ftn -g -O2 -Kieee
  C++ compiler:              CC -g -O2 -Kieee
  Fortran 90/95 compiler:    ftn -g -O2 -Kieee
  Obj-C compiler:            gcc -g -O2
  Interfaces supported:
  External libraries:        readline
  Additional capabilities:   PNG, JPEG, iconv, MBCS, NLS
  Options enabled:           static R library, R profiling, Java
  Recommended packages:      yes

The error is:

  cc -I../../src/extra/zlib -I../../src/extra/bzip2 -I../../src/extra/pcre -I. -I../../src/include -I../../src/include -I/usr/local/include -DHAVE_CONFIG_H -g -O2 -Kieee -c pcre.c -o pcre.o
  /opt/cray/xt-asyncpe/1.2/bin/cc: INFO: linux target is being used
  PGC-W-0155-64-bit integral value truncated (/usr/include/wctype.h: 108)
  PGC-W-0155-64-bit integral value truncated (/usr/include/wctype.h: 109)
  PGC-W-0155-64-bit integral value truncated (/usr/include/wctype.h: 110)
  PGC-W-0155-64-bit integral value truncated (/usr/include/wctype.h: 111)
  PGC/x86-64 Linux 7.2-1: compilation completed with warnings
  cc -I../../src/extra/zlib -I../../src/extra/bzip2 -I../../src/extra/pcre -I. -I../../src/include -I../../src/include -I/usr/local/include -DHAVE_CONFIG_H -g -O2 -Kieee -c platform.c -o platform.o
  /opt/cray/xt-asyncpe/1.2/bin/cc: INFO: linux target is being used
  PGC-S-0037-Syntax error: Recovery attempted by deleting identifier FALSE (platform.c: 1657)
  PGC-S-0094-Illegal type conversion required (platform.c: 1661)
  PGC/x86-64 Linux 7.2-1: compilation completed with severe errors
  make[3]: *** [platform.o] Error 2
  make[3]: Leaving directory `/lus/nid00036/jasont/R-2.8.0/src/main'
  make[2]: *** [R] Error 2
  make[2]: Leaving directory `/lus/nid00036/jasont/R-2.8.0/src/main'
  make[1]: *** [R] Error 1
  make[1]: Leaving directory `/lus/nid00036/jasont/R-2.8.0/src'
  make: *** [R] Error 1

Jason Tan
Senior Computer Support Officer
Western Australian Supercomputer Program (WASP)
The University of Western Australia
M024 35 Stirling Highway CRAWLEY WA 6009
Ph: +618 64888742 Fax: +618 6488 8088
Email: [EMAIL PROTECTED]
Web: www.wasp.uwa.edu.au

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] errors with compilation
On Tue, 9 Dec 2008, Jason Tan wrote:

> thanks. i noticed a syntax error. a missing semicolon.

Which, as I said, was corrected weeks ago.

> any idea about the error below?

Yes: you need to ensure that dynamic libraries are used on x86_64 Linux. That's something specific to your setup: standard R does not use pthreads.

>   cc -I. -I../../../src/include -I../../../src/include -I/usr/local/include -DHAVE_CONFIG_H -fPIC -g -O2 -c Rsock.c -o Rsock.o
>   /opt/cray/xt-asyncpe/1.2/bin/cc: INFO: linux target is being used
>   cc -I. -I../../../src/include -I../../../src/include -I/usr/local/include -DHAVE_CONFIG_H -fPIC -g -O2 -c internet.c -o internet.o
>   /opt/cray/xt-asyncpe/1.2/bin/cc: INFO: linux target is being used
>   cc -I. -I../../../src/include -I../../../src/include -I/usr/local/include -DHAVE_CONFIG_H -fPIC -g -O2 -c nanoftp.c -o nanoftp.o
>   /opt/cray/xt-asyncpe/1.2/bin/cc: INFO: linux target is being used
>   cc -I. -I../../../src/include -I../../../src/include -I/usr/local/include -DHAVE_CONFIG_H -fPIC -g -O2 -c nanohttp.c -o nanohttp.o
>   /opt/cray/xt-asyncpe/1.2/bin/cc: INFO: linux target is being used
>   cc -I. -I../../../src/include -I../../../src/include -I/usr/local/include -DHAVE_CONFIG_H -fPIC -g -O2 -c sock.c -o sock.o
>   /opt/cray/xt-asyncpe/1.2/bin/cc: INFO: linux target is being used
>   cc -I. -I../../../src/include -I../../../src/include -I/usr/local/include -DHAVE_CONFIG_H -fPIC -g -O2 -c sockconn.c -o sockconn.o
>   /opt/cray/xt-asyncpe/1.2/bin/cc: INFO: linux target is being used
>   cc -shared -fPIC -L/usr/local/lib64 -o internet.so Rsock.o internet.o nanoftp.o nanohttp.o sock.o sockconn.o
>   /opt/cray/xt-asyncpe/1.2/bin/cc: INFO: linux target is being used
>   /usr/bin/ld: /usr/lib64/libpthread.a(ptw-fcntl.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
>   /usr/lib64/libpthread.a: could not read symbols: Bad value
>   make[4]: *** [internet.so] Error 2
>
> jason
>
> On 09/12/2008, at 4:50 PM, Prof Brian Ripley wrote:
>> Please try the R-patched version of R (see the posting guide or the FAQ). That seems to be about an error in the R 2.8.0 sources that was corrected in October, and only happens if you do not ask for X11 support (which most R users need, including it seems all the pre-release testers).
>>
>> On Tue, 9 Dec 2008, Jason Tan wrote:
>>> Hi, i'm trying to compile R on a Cray XT3 using pgi/7.2.1 - CNL (compute node linux). The R version is 2.8.0. These are the options:
>>>
>>>   --enable-R-static-lib=yes --disable-R-shlib
>>
>> /* The latter is the default */
>>
>>>   CPICFLAGS=fpic FPICFLAGS=fpic CXXPICFLAGS=fpic SHLIB_LDFLAGS=shared
>>>   --with-x=no SHLIB_CXXLDFLAGS=shared --disable-BLAS-shlib
>>
>> Do you really need to set all these? And are you sure you know the correct values? ('-shared' is standard on Linux.)
>>
>>>   CFLAGS=-g -O2 -Kieee FFLAGS=-g -O2 -Kieee CXXFLAGS=-g -O2 -Kieee
>>>   FCFLAGS=-g -O2 -Kieee CC=cc F77=ftn CXX=CC FC=ftn
>>>
>>> R is now configured for x86_64-unknown-linux-gnu
>>>
>>>   Source directory:          .
[R] ANCOVA
Hello,

Could you please help me with the following question: I have 16 persons: 6 take 0.5 mg, 6 take 0.75 mg and 4 take placebo. Can I use ANCOVA and a t-test in this case? Is it possible in R?

Thank you in advance,
Samuel
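In R, a comparison of the three groups can be set up with aov() and pairwise contrasts with t.test(). A minimal sketch, assuming a numeric outcome y (the post does not say what was measured, so a placeholder is simulated here) and the group sizes described:

```r
set.seed(1)
dat <- data.frame(
  dose = factor(rep(c("0.5mg", "0.75mg", "placebo"), times = c(6, 6, 4))),
  y    = rnorm(16)   # placeholder outcome, not real data
)

fit <- aov(y ~ dose, data = dat)   # one-way ANOVA across the three groups
summary(fit)

# a pairwise t-test between two of the groups
t.test(y ~ dose, data = subset(dat, dose != "0.75mg"))
```

A proper ANCOVA would additionally need a covariate on the right-hand side of the formula (e.g. y ~ covariate + dose), which the post does not mention; with only 16 subjects, power will be limited either way.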
Re: [R] error from running WinBUGS in R
Anamika Chaudhuri wrote:
> Has anyone ever seen an error like this from running WinBUGS in R?
>
>   modelCompile(numChains=2)   # compile model with 1 chain
>   error for node p[3421] of type GraphLogit.Node node vetor contains undefined elements

No, but the error comes from WinBUGS directly. Hence I suggest running your model directly in WinBUGS and debugging interactively.

Best,
Uwe Ligges
Re: [R] bayesm package not downloading via any mirror or repository
ekwaters wrote:
> I am a pretty new R user. I am running the latest linux version on xandros, updated with some extra debian packages, and I also run the latest windows version, but prefer linux. I am having trouble downloading bayesm; it won't do it at all from any of the sites on the web. I resorted to this one, http://packages.debian.org/unstable/math/r-cran-bayesm

If you use install.packages() note that you are downloading a source package from a CRAN mirror rather than from some debian mirror.

Uwe Ligges

> and got slightly further, but I think it is still not recognising a mirror. This is my error message:
>
>   install.packages("bayesm")
>   Warning in install.packages("bayesm") :
>     argument 'lib' is missing: using /usr/local/lib/R/site-library
>   --- Please select a CRAN mirror for use in this session ---
>   Loading Tcl/Tk interface ... done
>   Warning in download.packages(unique(pkgs), destdir = tmpd, available = available, :
>     no package 'bayesm' at the repositories
>
> The package is certainly there. It could be my version of linux, I know. All I want to do is sample from a Dirichlet prior, so I want rdirichlet and ddirichlet commands basically. If anyone can help or suggest an alternate package that would be great. E
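One possible workaround, sketched here under the assumption that the mirror chooser is the thing failing: name a CRAN mirror explicitly via the repos argument instead of relying on the Tcl/Tk chooser. The gtools package also provides the two Dirichlet functions the poster asks for, so it can serve as a fallback:

```r
# point install.packages() at a specific CRAN mirror instead of the chooser
install.packages("bayesm", repos = "http://cran.r-project.org")

# alternatively, gtools provides rdirichlet()/ddirichlet()
install.packages("gtools", repos = "http://cran.r-project.org")
library(gtools)
rdirichlet(5, alpha = c(1, 1, 1))   # 5 draws from a symmetric Dirichlet prior
```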
Re: [R] expected variable name error pos 98349 WInBUGS in R
Hard to help without seeing the actual files. Anyway, this is a BUGS question rather than an R question, because the error comes from the BUGS interpreter. Hence wrong mailing list.

Best wishes,
Uwe Ligges

Anamika Chaudhuri wrote:
> I am using a random intercept model with SITEID as random and NAUSEA as outcome. I tried using a dataset without missing values and changed my model statement accordingly but still get the same error. Following is an excerpt.
>
>   anal.data <- read.table("nausea.txt", header=T, sep="\t")
>   list(names(anal.data))
>   [[1]]
>   [1] SITEID NAUSEA
>   #anal.data <- read.csv("simuldat.csv", header=T)
>   attach(anal.data)
>   The following object(s) are masked from anal.data ( position 3 ) : NAUSEA SITEID
>   The following object(s) are masked from anal.data ( position 4 ) : NAUSEA SITEID
>   data.bugs <- list(NAUSEA=NAUSEA, SITEID=SITEID, n.samples=n.samples, n.sites=n.sites, n.params=n.params)
>   bugsData(data.bugs, fileName = "nauseadata.txt")
>   inits.bugs <- list(alpha=rep(0, n.sites), tau=1)
>   bugsInits(list(inits.bugs), fileName = "nauseainit.txt")
>   modelCheck("nausea_random.txt")   # check model file
>   model is syntactically correct
>   modelData("nauseadata.txt")       # read data file
>   expected variable name error pos 98349
>
> MODEL:
>
>   model {
>     for (i in 1:n.samples) {
>       NAUSEA[i] ~ dbin(p[i], 1)
>       logit(p[i]) <- alpha[SITEID[i]]
>     }
>     #for (k in 1:n.params) {b[k] ~ dnorm(0.0, tau)}
>     for (j in 1:n.sites) {alpha[j] ~ dnorm(0.0, 1.0E-10)}
>     tau ~ dgamma(0.001, 0.001)
>   }
>
> Dataset:
>
>   SITEID NAUSEA
>   1 0
>   1 1
>   1 1
>   1 0
>   1 1
>   1 1
>   1 0
>   1 1
>   1 1
>   1 1
>   1 0
>   1 1
>   1 0
>   1 1
>   1 1
>   1 1
>   1 0
>   1 0
>   1 0
>   1 1
>
> -Anamika
Re: [R] R and Scheme
Stavros Macrakis wrote:
> I've read in many places that R semantics are based on Scheme semantics. As a long-time Lisp user and implementor, I've tried to make this more precise, and this is what I've found so far. I've excluded trivial things that aren't basic semantic issues: support for arbitrary-precision integers; subscripting; general style; etc. I would appreciate corrections or additions from more experienced users of R -- I'm sure that some of the points below simply reflect my ignorance.
>
> == Similarities to Scheme ==
>
> R has first-class function closures (i.e. correctly supports upward and downward funarg). R has a single namespace for functions and variables (Lisp-1).
>
> == Important dissimilarities to Scheme (as opposed to other Lisps) ==
>
> R is not properly tail-recursive. R does not have continuations or call-with-current-continuation or other mechanisms for implementing coroutines, general iterators, and the like.

there is callCC, for example, which however seems kind of obsolete.

> R supports keyword arguments.
>
> == Similarities to Lisp and other dynamic languages, including Scheme ==
>
> R is runtime-typed and garbage-collected. R supports nested read-eval-print loops for debugging etc. R expressions are represented as user-manipulable data structures.
>
> == Dissimilarities to all (modern) Lisps, including Scheme ==
>
> R has call-by-need, not call-by-object-value. R does not have macros. R objects are values, not pointers, so a <- 1:10; b <- a; b[1] <- 999; a[1] = 999. Similarly, functions cannot modify the contents of their arguments.

have you actually tried this code? even if the objects are values not pointers, assignment causes, in cases such as the above, copying the value with modifications applied as needed. thus, a[1] == 1, not 999, even though after b <- a, b and a are the same value object. try the following:

  system.time(x <- 1:(10^8))
  system.time(y <- x)
  system.time(y[1] <- 0)
  system.time(y[2] <- 0)
  head(x)
  head(y)

with some trickery, functions can modify the contents of their arguments, using deparse/substitute and assign:

  a <- 1
  f <- function(x) assign(deparse(substitute(x)), 0, parent.frame())
  f(a)
  a

the 'cannot modify the contents' does not apply to arguments that are environments:

  e <- new.env(parent=emptyenv())
  l <- list()
  f <- function(e) e$a <- 0
  f(e)
  e$a
  f(l)
  l$a

> There is no equivalent to set-car!/rplaca (not even pairlists and expressions). For example, r <- pairlist(1,2); r[[1]] <- r does not create a circular list. And in general there doesn't seem to be substructure sharing at the semantic level (though there may be in the implementation).

computations on environment objects seem not to be subject to the copy-value-on-assignment semantics:

  e <- new.env(parent=emptyenv())
  ee <- e
  e$a <- 0
  ee$a

> R does not have multiple value return in the Lisp sense. R assignment creates a new local variable on first assignment, dynamically. So static analysis is not enough to determine variable reference (R is not referentially transparent). Example: ff <- function(a){ if (a) x <- 1; x }; x <- 99; ff(T) is 1; ff(F) is 99.
>
> In R, most data types (including numeric vectors) do not have a standard external representation which can be read back in without evaluation. R coerces logicals to numbers and numbers to strings. Lisps are stricter about automatic type conversion -- except that false a.k.a. NIL == () in Lisps other than Scheme.

types are not treated coherently. in some situations, r coerces doubles to complex (according to the hierarchy of types specified here and there in the man pages), in others it won't:

  x <- as.double(-1)
  y <- as.complex(-1)
  x == y
  sqrt(x)
  sqrt(y)

in certain cases, r will also do implicit inverse (downward) coercion:

  is(y:y)

vQ
Re: [R] R and Scheme
Stavros Macrakis wrote:
> There is no equivalent to set-car!/rplaca (not even pairlists and expressions). For example, r <- pairlist(1,2); r[[1]] <- r does not create a circular list. And in general there doesn't seem to be substructure sharing at the semantic level (though there may be in the implementation).

again, you can achieve the effect with environments:

  e = new.env(parent=emptyenv())
  e$e = e
  e
  e$e$e$e

vQ
Re: [R] for loop query
Hi,

> Why isn't my loop incrementing i - the outer loop to 2 and then resetting j=3?

It is. It runs out of bounds with j > 26.

> Am I missing something obvious?

>   for (i in 1:25)
>   {
>     for (j in i+1:26)

You miss parentheses. i + 1:26 is i + (1:26), as the vector 1:26 is calculated first. What happens is that for i = 1, j goes over 2:27, with i = 2 over 3:28, ... What you want is (i + 1):26:

  for (i in 1:25)
    for (j in (i + 1):26)
      cat(i, j, "\n")

HTH Claudia

-- Claudia Beleites
Dipartimento dei Materiali e delle Risorse Naturali
Università degli Studi di Trieste
Via Alfonso Valerio 6/a I-34127 Trieste
phone: +39 (0 40) 5 58-34 47
email: [EMAIL PROTECTED]
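The precedence point above can be checked directly at the prompt; a minimal sketch:

```r
1 + 1:3     # parsed as 1 + (1:3): gives 2 3 4
(1 + 1):3   # what the poster wanted: 2 3

i <- 1
max(i + 1:26)     # 27 -- hence the out-of-bounds subscript on a 26-row matrix
max((i + 1):26)   # 26
```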
Re: [R] Pre-model Variable Reduction
Hi Harsh,

> I looked for other R packages that allow me to do variable reduction without considering a dependent variable.

Have a look at package subselect. This has an implementation of the genetic algorithm, along with some other methods. It should do what you want.

Regards,
Mark.

Harsh-7 wrote:
> Hello All, I am trying to carry out variable reduction. I do not have information about the dependent variable, and have only the X variables as it were.
Re: [R] display matrix on graphics device
[EMAIL PROTECTED] wrote:
> Folks, Is there a way to print a matrix in tabular form onto the graphics device? I want to create a display consisting of graphs and tables, so that I can do something like:
>
>   windows()
>   opar = par(mfrow = c(3, 2))
>   library(plotrix)
>   library(PerformanceAnalytics)
>   radial.plot(...)
>   radial.plot(...)
>   chart.BarVaR(...)
>   chart.RollingCorrelation(...)
>   print.matrix.onto.graphics.device(X)
>   par(opar)
>
> where X is a matrix with named rows and columns. If this is not easily done, is there any way I can embed graphics and tabular data into a PDF?

Hi Murali,

As you are using the plotrix package, will addtable2plot do what you want?

Jim
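A minimal sketch of that suggestion; the matrix X here is a made-up stand-in for the poster's data, and the coordinates are arbitrary:

```r
library(plotrix)

X <- matrix(round(rnorm(6), 2), nrow = 2,
            dimnames = list(c("r1", "r2"), c("a", "b", "c")))

plot.new()                        # open an empty panel in the mfrow layout
addtable2plot(0.1, 0.5, X,        # draw the matrix as a table at (0.1, 0.5)
              display.rownames = TRUE, bty = "o", hlines = TRUE)
```

Called inside a pdf() device instead of windows(), the same approach embeds both the plots and the table into one PDF.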
[R] Sorry, I have attached the data - Here is the code that causes wavCWTPeaks error
-----Original Message-----
From: [EMAIL PROTECTED]
Sent: Tue 09/12/2008 13.52
To: stephen sefick; Francesco Masulli; Stefano Rovetta
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Here is the code that causes wavCWTPeaks error

  aats <- create.signalSeries(aa, pos=list(from=0.0, by=0.033))
  aa.cwt <- wavCWT(aats)
  x11(width=10, height=12)
  plot(aats, main=paste(insig, " Cycle: ", j, sep=""))
  aa.maxtree <- wavCWTTree(aa.cwt, type="maxima")
  aa.mintree <- wavCWTTree(aa.cwt, type="minima")
  aa.maxpeak <- wavCWTPeaks(aa.maxtree)
  aa.minpeak <- wavCWTPeaks(aa.mintree)
  Error in `row.names<-.data.frame`(`*tmp*`, value = c(1, 0)) :
    invalid 'row.names' length

  bb <- -aa   # EXCHANGE MAXIMA WITH MINIMA
  bbts <- create.signalSeries(bb, pos=list(from=0.0, by=0.033))
  plot(bbts, main=paste(insig, " Cycle: ", j, sep=""))
  bb.cwt <- wavCWT(bbts)
  bb.maxtree <- wavCWTTree(bb.cwt, type="maxima")
  bb.mintree <- wavCWTTree(bb.cwt, type="minima")
  bb.maxpeak <- wavCWTPeaks(bb.maxtree)
  Error in `row.names<-.data.frame`(`*tmp*`, value = c(1, 0)) :
    invalid 'row.names' length
  bb.minpeak <- wavCWTPeaks(bb.mintree)
  Error in `row.names<-.data.frame`(`*tmp*`, value = c(1, 0)) :
    invalid 'row.names' length

Attached are aa breathing cycle amplitude values (zip compressed). I'd like to figure out what is wrong with my data. The wmtsa documentation does not mention any time series constraint. Thank you in advance.

Kind regards,
Maura

-----Original Message-----
From: stephen sefick [mailto:[EMAIL PROTECTED]]
Sent: Tue 09/12/2008 8.11
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [R] package wmtsa: wavCWTPeaks error

Are the names of the rows the same as the time series that you are using? I know that I am not being that helpful, but this seems like a mismatch in the time series object. Look at:

  length(rownames(your.data))
  length(your.data[,1])

Again, it is always helpful to have reproducible code.

On Tue, Dec 9, 2008 at 1:39 AM, [EMAIL PROTECTED] wrote:
> I keep getting the following error when I look for minima in the series:
>
>   aa.peak <- wavCWTPeaks(aa.tree)
>   Error in `row.names<-.data.frame`(`*tmp*`, value = c(1, 0)) :
>     invalid 'row.names' length
>
> How can I work around it? Thank you. Regards, Maura

-- Stephen Sefick
"Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals." - K. Mullis
Re: [R] Pre-model Variable Reduction
Harsh wrote:
> Hello All, I am trying to carry out variable reduction. I do not have information about the dependent variable, and have only the X variables as it were.

Take a look at the redun function in the Hmisc package, which does redundancy analysis.

Frank
-- Frank E Harrell Jr
Professor and Chair
School of Medicine
Department of Biostatistics
Vanderbilt University
[R] for loop query
Hi all, apologies if this is obvious - but I can't see it and would appreciate some quick help!

The matrix mhouse is 26x3 and I'm computing odds ratios. The simple code below should compute the odds vector for every pair (325, i.e. 26C2) in cols 1 and 2. On the first i=1 outer loop the inner j loop runs from 2 to 26 ok and then I get the error (Error: subscript out of bounds). Why isn't my loop incrementing i - the outer loop - to 2 and then resetting j=3? Am I missing something obvious?

thanks Gerard

  mhouse
         [,1] [,2]  [,3]
   [1,]   275  819    49
   [2,]   593 1323   192
   [3,]   813 1181   292
   [4,]  2177 5189  1320
   [5,]  1651 2243   270
   [6,]  1061 5629 11035
   [7,]  1690 2302   589
   [8,]  1130 1203   345
   [9,]   565 1898   655
  [10,]   580  730   234
  [11,]   343 1761    73
  [12,]   372  536    67
  [13,]   666 1713   397
  [14,]   382  918   279
  [15,]   486  921   247
  [16,]  1141  988   313
  [17,]   626 1135   666
  [18,]   438  436   168
  [19,]   425  691   101
  [20,]   609  716    99
  [21,]   467  661   141
  [22,]   879 1373    79
  [23,]   444 1101   130
  [24,]   459  898   351
  [25,]   995 1801   398
  [26,]   396 1107   201

  # set up the odds vector by declaring it to be null
  odds = NULL
  # compute the odds ratios for Individual House vs Scheme House
  for (i in 1:25)
  {
    for (j in i+1:26)
    {
      todds = (mhouse[i,1]*mhouse[j,2])/(mhouse[j,1]*mhouse[i,2])
      # compute the todds for row i with row j: j > i
      odds = c(odds, todds)
      # append todds to the odds vector
    }
  }
  Error: subscript out of bounds

  odds
   [1] 0.7491244 0.4877622 0.8003391 0.4561745 1.7814132 0.4573697 0.3574670 1.1279674 0.4226138 1.7239078 0.4838053
  [12] 0.8636384 0.8069156 0.6363150 0.2907502 0.6087939 0.3342421 0.5459312 0.3947703 0.4752623 0.5244818 0.8326321
  [23] 0.6569199 0.6077702 0.9386447

** The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited.
If you received this in error, please contact the sender and delete the material from any computer. It is the policy of the Department of Justice, Equality and Law Reform and the Agencies and Offices using its IT services to disallow the sending of offensive material. Should you consider that the material contained in this message is offensive you should contact the sender immediately and also mailminder[at]justice.ie.
[R] Pre-model Variable Reduction
Hello All,

I am trying to carry out variable reduction. I do not have information about the dependent variable, and have only the X variables as it were. In selecting variables I wish to keep, I have considered the following criteria:

1) Percentage of missing values in each column/variable
2) Variance of each variable, with a cut-off value.

I recently came across Weka and found that there is an RWeka package which would allow me to make use of Weka through R. Weka provides a Genetic search variable reduction method, but I could not find its R code implementation in the RWeka PDF file on CRAN. I looked for other R packages that allow me to do variable reduction without considering a dependent variable. I came across the 'dprep' package but it does not have a Windows implementation.

Moreover, I have a dataset that contains continuous and categorical variables, some categorical variables having 3 levels, 10 levels and so on, up to a max of 50 levels (e.g. states in the USA).

Any suggestions in this regard will be much appreciated.

Thank you
Harsh Singhal
Decision Systems, Mu Sigma, Inc.
Re: [R] Pre-model Variable Reduction
See:

  ?prcomp
  ?princomp

On Tue, Dec 9, 2008 at 5:34 AM, Harsh [EMAIL PROTECTED] wrote:
> Hello All, I am trying to carry out variable reduction. I do not have information about the dependent variable, and have only the X variables as it were.
[R] display matrix on graphics device
Folks, Is there a way to print a matrix in tabular form onto the graphics device? I want to create a display consisting of graphs and tables, so that I can do something like:

windows()
opar = par(mfrow = c(3, 2))
library(plotrix)
library(PerformanceAnalytics)
radial.plot(...)
radial.plot(...)
chart.BarVaR(...)
chart.RollingCorrelation(...)
print.matrix.onto.graphics.device(X)
par(opar)

where X is a matrix with named rows and columns. If this is not easily done, is there any way I can embed graphics and tabular data into a PDF? Thanks, Murali
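[Editorial sketch, not from the thread: `draw_matrix()` below is an invented helper showing one base-graphics way to render a named matrix as a text table in a panel, so it can sit inside an mfrow layout next to the charts.]

```r
# Hypothetical helper (base graphics only): draw a matrix with its row and
# column names as a text table in the current plot panel.
draw_matrix <- function(X, main = "") {
  nr <- nrow(X); nc <- ncol(X)
  plot(0, 0, type = "n", xlim = c(0, nc + 1), ylim = c(0, nr + 1),
       axes = FALSE, xlab = "", ylab = "", main = main)
  text(1:nc, nr + 0.7, colnames(X), font = 2)            # column headers
  text(0.2, nr:1 - 0.5, rownames(X), font = 2, adj = 0)  # row labels
  for (j in 1:nc) text(j, nr:1 - 0.5, format(X[, j]))    # cell values
}

X <- matrix(1:6 / 7, 2, 3, dimnames = list(c("r1", "r2"), c("a", "b", "c")))
pdf(f <- tempfile(fileext = ".pdf"))  # any device works, e.g. windows()
par(mfrow = c(3, 2))
plot(1:10)                            # stand-in for the real charts
draw_matrix(round(X, 2), main = "my table")
dev.off()
```

Packages such as gplots (`textplot`) and plotrix (`addtable2plot`) offer ready-made versions of this idea, if I recall their interfaces correctly; drawing to `pdf()` instead of `windows()` also answers the graphs-plus-tables-in-a-PDF question.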
Re: [R] for loop query
Hi, start simple! Work out *each* row combined with *each* row, to give (in your case) a 26-by-26 matrix. Only after you have got this working, start thinking about making it run faster [eg by only evaluating the upper triangular entries]. To do a nested loop, do

M <- matrix(0, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    M[i, j] <- f(i, j)
  }
}

which will fill in matrix M for you. HTH rksh

Gerard M. Keogh wrote: Hi all, apologies if this is obvious - but I can't see it and would appreciate some quick help! The matrix mhouse is 26x3 and I'm computing odds ratios. The simple code below should compute the odds vector for every pair (325, i.e. 26C2) in cols 1 and 2. On the first i=1 outer loop, the inner j loop runs from 2 to 26 OK, and then I get the error (Error: subscript out of bounds). Why isn't my loop incrementing i - the outer loop - to 2 and then resetting j=3? Am I missing something obvious? thanks Gerard

mhouse
      [,1] [,2] [,3]
 [1,]  275 81949
 [2,]  593 1323  192
 [3,]  813 1181  292
 [4,] 2177 5189 1320
 [5,] 1651 2243  270
 [6,] 1061 5629 11035
 [7,] 1690 2302  589
 [8,] 1130 1203  345
 [9,]  565 1898  655
[10,]  580  730  234
[11,]  343 176173
[12,]  372 53667
[13,]  666 1713  397
[14,]  382  918  279
[15,]  486  921  247
[16,] 1141  988  313
[17,]  626 1135  666
[18,]  438  436  168
[19,]  425  691  101
[20,]  609 71699
[21,]  467  661  141
[22,]  879 137379
[23,]  444 1101  130
[24,]  459  898  351
[25,]  995 1801  398
[26,]  396 1107  201

# set up the odds vector by declaring it to be null
odds=NULL
# compute the odds ratios for Individual House vs Scheme House
for (i in 1:25)
+ {
+ for (j in i+1:26)
+ {
+ todds = (mhouse[i,1]*mhouse[j,2])/(mhouse[j,1]*mhouse[i,2])
+ # compute the todds for row i with row j: j > i
+ odds = c(odds,todds)
+ # append todds to the odds vector
+ }
+ }
Error: subscript out of bounds
odds
 [1] 0.7491244 0.4877622 0.8003391 0.4561745 1.7814132 0.4573697 0.3574670 1.1279674 0.4226138 1.7239078 0.4838053
[12] 0.8636384 0.8069156 0.6363150 0.2907502 0.6087939 0.3342421 0.5459312 0.3947703
0.4752623 0.5244818 0.8326321 [23] 0.6569199 0.6077702 0.9386447
-- Robin K. S. Hankin Uncertainty Analyst University of Cambridge 19 Silver Street Cambridge CB3
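[An editorial aside on the original error: the culprit is operator precedence in the inner loop's range. In R, `:` binds more tightly than `+`, so `i+1:26` parses as `i + (1:26)`, not `(i+1):26`.]

```r
i <- 1
i + 1:26    # 2 3 4 ... 27  -- parsed as i + (1:26)
(i + 1):26  # 2 3 4 ... 26  -- the intended inner-loop range
# With i + 1:26, on the last outer iteration (i = 25) the inner index j
# reaches 25 + 26 = 51, and mhouse[51, ] on a 26-row matrix gives
# "subscript out of bounds".
```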
[R] (no subject)
Dear R help, I use the package plm and the function plm() to analyse panel data and estimate a dynamic model. Can I estimate a model without an intercept? Thanks, Andrea Ferroni
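[Editorial note, not a reply from the thread: in R's formula interface generally, the intercept is suppressed with `- 1` (equivalently `+ 0`) on the right-hand side, and plm() uses the same formula interface. Illustrated below with lm(), since the convention is the same; the plm call in the comment is a hypothetical, untested sketch.]

```r
# Model formulas drop the intercept with `- 1` (equivalently `+ 0`).
set.seed(1)
d <- data.frame(x = rnorm(20))
d$y <- 2 * d$x + rnorm(20, sd = 0.1)

with_int    <- coef(lm(y ~ x,     data = d))  # contains "(Intercept)"
without_int <- coef(lm(y ~ x - 1, data = d))  # slope only

# For a panel model the analogous (hypothetical, untested) call would be e.g.
# plm(y ~ lag(y) + x - 1, data = pdata, model = "pooling")
```

Note also that, as far as I recall, "within" (fixed-effects) panel models report no overall intercept in any case.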
Re: [R] Pre-model Variable Reduction
Harsh singhalblr at gmail.com writes: Hello All, I am trying to carry out variable reduction. I do not have information about the dependent variable, and have only the X variables as it were. ... I looked for other R packages that allow me to do variable reduction without considering a dependent variable. I came across 'dprep' package but it does not have a Windows implementation. I doubt that you will find what you are longing for, but: there is a Windows version available at the homepage of the dprep package at http://math.uprm.edu/~edgar/dprep.html. This version 2.0 can be loaded without errors into R 2.8.0, though it appears not to be fully compliant with the tests on CRAN.
Re: [R] Pre-model Variable Reduction
Principal components analysis does dimensionality reduction but NOT variable reduction. However, Jolliffe's 2004 book on PCA does discuss the problem of selecting a subset of variables, with the goal of representing the internal variation of the original multivariate vector as well as possible (see Section 6.3 of that book). I do not think that these methods can handle missing data. The most important issue is to think about the goal of variable reduction and then choose an appropriate optimality criterion for achieving that goal. In most instances of variable selection, the criterion that is optimized is never explicitly considered. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health, Division of Geriatric Medicine and Gerontology, Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
[R] Selecting rows that are the same in separate data frames
I want to compare two matrices or data frames and select or get an index for those rows which are the same in both. I have tried the following:

a = matrix ( 1:10, ncol = 2 )
a
b = matrix ( c ( 2,3,4,7,8,9 ), ncol = 2 )
b
a[a==b]
a = as.data.frame ( matrix ( 1:10, ncol = 2 ) )
a
b = as.data.frame ( matrix ( c ( 2,3,4,7,8,9 ), ncol = 2 ) )
b
a[a==b]

Any ideas please. Thanks. Simon Parker Imperial College
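[A possible approach, offered as an editorial sketch rather than a reply from the archive: `==` compares element-by-element with recycling, which is why `a[a==b]` does not do row matching when the objects have different sizes. To find rows of `a` that also occur in `b`, build row keys or use merge():]

```r
a <- as.data.frame(matrix(1:10, ncol = 2))
b <- as.data.frame(matrix(c(2, 3, 4, 7, 8, 9), ncol = 2))

common <- merge(a, b)  # rows occurring in both data frames
# or get the row indices in `a` via pasted row keys:
idx <- which(do.call(paste, a) %in% do.call(paste, b))
a[idx, ]
```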
[R] Need help optimizing/vectorizing nested loops
Hi, I'm analyzing a large number of large simulation datasets, and I've isolated one of the bottlenecks. Any help in speeding it up would be appreciated. `dat` is a dataframe of samples from a regular grid. The first two columns are the spatial coordinates of the samples, the remaining 20 columns are the abundances of species in each cell. I need to calculate the species richness in adjacent cells for each cell in the sample. For example, if I have nine cells in my dataframe (X = 1:3, Y = 1:3): a b c d e f g h i I need to calculate the neighbour-richness for each cell; for a, this is the richness of cells b, d and e combined. The neighbour richness of cell e would be the combined richness of all the other eight cells. The following code does what I want, but it's slow. The sample dataset 'dat', below, represents a 5x5 grid, 25 samples. It takes about 1.5 seconds on my computer. The largest samples I am working with have a 51 x 51 grid (2601 cells) and take 4.5 minutes. This is manageable, but since I have potentially hundreds of these analyses to run, trimming that down would be very helpful. After loading the function and the data, the call system.time(tmp <- time.test(dat)) will run the code. Note that I've excised this from a larger, more general function, after determining that for large datasets this section is responsible for a slowdown from 10-12 seconds to ca. 250 seconds.
Thanks for your patience, Tyler

time.test <- function(dat) {
  cen <- dat
  grps <- 5
  n.rich <- numeric(grps^2)
  n.ind <- 1
  for (i in 1:grps) for (j in 1:grps) {
    n.cen <- numeric(ncol(cen) - 2)
    neighbours <- expand.grid((j-1):(j+1), (i-1):(i+1))
    neighbours <- neighbours[-5, ]
    neighbours <- neighbours[which(neighbours[, 1] %in% 1:grps & neighbours[, 2] %in% 1:grps), ]
    for (k in 1:nrow(neighbours))
      n.cen <- n.cen + cen[cen$X == neighbours[k, 1] & cen$Y == neighbours[k, 2], -c(1:2)]
    n.rich[n.ind] <- sum(as.logical(n.cen))
    n.ind <- n.ind + 1
  }
  return(n.rich)
}

`dat` <- structure(list(
X = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
Y = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5),
V1 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 45L, 131L, 0L, 0L, 34L, 481L, 1744L),
V2 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 88L, 0L, 70L, 101L, 13L, 634L, 0L, 0L, 71L, 640L, 1636L),
V3 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 49L, 3L, 113L, 1L, 44L, 167L, 336L, 933L, 0L, 14L, 388L, 1180L, 1709L),
V4 = c(0L, 0L, 0L, 0L, 0L, 0L, 3L, 12L, 0L, 0L, 2L, 1L, 36L, 45L, 208L, 7L, 221L, 213L, 371L, 1440L, 26L, 211L, 389L, 1382L, 1614L),
V5 = c(96L, 7L, 0L, 0L, 0L, 10L, 17L, 0L, 5L, 0L, 0L, 11L, 151L, 127L, 160L, 27L, 388L, 439L, 1117L, 1571L, 81L, 598L, 1107L, 1402L, 891L),
V6 = c(16L, 30L, 13L, 0L, 0L, 10L, 195L, 60L, 29L, 29L, 1L, 107L, 698L, 596L, 655L, 227L, 287L, 677L, 1477L, 1336L, 425L, 873L, 961L, 1360L, 1175L),
V7 = c(249L, 101L, 69L, 0L, 18L, 186L, 331L, 291L, 259L, 248L, 336L, 404L, 642L, 632L, 775L, 455L, 801L, 697L, 1063L, 978L, 626L, 686L, 1204L, 1138L, 627L),
V8 = c(300L, 163L, 65L, 145L, 377L, 257L, 690L, 655L, 420L, 288L, 346L, 461L, 1276L, 897L, 633L, 812L, 1018L, 1337L, 1295L, 1163L, 550L, 1104L, 768L, 933L, 433L),
V9 = c(555L, 478L, 374L, 349L, 357L, 360L, 905L, 954L, 552L, 438L, 703L, 984L, 1616L, 1732L, 1234L, 1213L, 1518L, 1746L, 1191L, 967L, 1394L, 1722L, 1706L, 610L, 169L),
V10 = c(1527L, 1019L, 926L, 401L, 830L, 833L, 931L, 816L, 1126L, 1232L, 1067L, 1169L, 1270L, 1277L, 1145L, 1159L, 1072L, 1534L, 997L, 391L, 1328L, 1414L, 1037L, 444L, 1L), V11 = c(1468L, 1329L, 1013L, 603L, 1096L, 1237L, 1488L, 1189L, 1064L, 1303L, 1258L, 1479L, 1421L, 1365L, 1101L, 1415L, 1145L, 1329L, 1325L, 236L, 1379L, 1199L, 729L, 328L, 0L), V12 = c(983L, 1459L, 791L, 898L, 911L, 1215L, 1528L, 960L, 1172L, 1286L, 1358L, 722L, 857L, 1478L, 1452L, 1502L, 1013L, 745L, 455L, 149L, 1686L, 917L, 1013L, 84L, 0L), V13 = c(1326L, 1336L, 1110L, 1737L, 1062L, 1578L, 1382L, 1537L, 1366L, 1308L, 1301L, 1357L, 746L, 622L, 934L, 1132L, 954L, 460L, 270L, 65L, 957L, 699L, 521L, 18L, 1L), V14 = c(1047L, 1315L, 1506L, 1562L, 1254L, 1336L, 1106L, 1213L, 1220L, 1457L, 858L, 1606L, 590L, 726L, 598L, 945L, 732L, 258L, 45L, 6L, 937L, 436L, 43L, 0L, 0L), V15 = c(845L, 935L, 1295L, 1077L, 1400L, 1049L, 802L, 1247L, 1449L, 1046L, 1134L, 877L, 327L, 352L, 470L, 564L, 461L, 166L, 0L, 0L, 230L, 110L, 29L, 0L, 0L), V16 = c(784L, 675L, 1157L, 1488L, 1511L, 1004L, 420L, 523L, 733L, 724L, 833L, 542L, 171L, 116L, 384L, 357L, 197L, 0L, 0L, 0L, 246L, 0L, 0L, 0L, 0L), V17 = c(444L, 873L, 530L, 596L, 448L, 431L, 109L, 446L, 378L, 243L, 284L, 148L, 69L, 30L, 6L, 71L, 32L, 131L, 0L, 0L, 120L, 0L, 0L, 0L, 0L), V18 = c(307L, 128L, 823L, 566L,
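[Editorial sketch, not a reply from the archive: one way to vectorise this is to loop over the eight neighbour offsets instead of over every cell, operating on whole presence/absence arrays at once. `neighbour_richness()` is an invented helper; it assumes, like time.test(), a complete grps x grps grid with rows ordered so that X varies fastest.]

```r
neighbour_richness <- function(dat, grps) {
  nsp <- ncol(dat) - 2
  # presence[x, y, s] is TRUE if species s occurs in cell (x, y)
  presence <- array(FALSE, dim = c(grps, grps, nsp))
  presence[cbind(rep(dat$X, nsp), rep(dat$Y, nsp),
                 rep(seq_len(nsp), each = nrow(dat)))] <-
    as.logical(as.matrix(dat[, -(1:2)]))
  seen <- array(FALSE, dim = c(grps, grps, nsp))
  for (dx in -1:1) for (dy in -1:1) {
    if (dx == 0 && dy == 0) next
    # cells (x, y) whose neighbour (x + dx, y + dy) lies on the grid
    xs <- max(1, 1 - dx):min(grps, grps - dx)
    ys <- max(1, 1 - dy):min(grps, grps - dy)
    seen[xs, ys, ] <- seen[xs, ys, , drop = FALSE] |
      presence[xs + dx, ys + dy, , drop = FALSE]
  }
  as.vector(apply(seen, c(1, 2), sum))  # same cell order as time.test()
}
```

The work is now 8 array operations per grid rather than 8 data-frame subset scans per cell, which should scale far better to the 51 x 51 case.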
Re: [R] Pre-model Variable Reduction
Thank you everyone. The idea really is for me to get the variables themselves from a super-set of all variables:

x1 - numeric continuous
x2 - numeric continuous
x3 - numeric factor with 2 levels
x4 - character factor with 10 levels
x5 - numeric continuous
x6 - numeric integer

The variable reduction method then must ideally give me: keep x1, x3 and x6; drop x2, x4 and x5. The 'redun' function from the Hmisc package seems promising since it considers categorical variables as well. The variable to be dropped is the one which can be predicted by the other variables; I guess it's a check for multi-collinearity. The RWeka package, as I mentioned earlier, allows one to use Weka's variable reduction/selection techniques in R. I did come across an implementation of the 'Genetic Search' method, but have not been able to find relevant documentation for it to tweak to suit my needs. Thank you all for your time. Harsh Singhal Decision Systems, Mu Sigma Inc. On Tue, Dec 9, 2008 at 8:05 PM, Ravi Varadhan [EMAIL PROTECTED] wrote: [quoted message elided]
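[On the redun() idea mentioned above, an editorial sketch of the underlying principle in base R. This is NOT Hmisc's implementation (which, as I understand it, uses flexible additive models and handles categorical predictors); `drop_redundant` and its cutoff are invented for illustration on continuous variables only.]

```r
# Greedily drop the variable best predicted from the remaining ones,
# until no variable is predicted with R^2 above the cutoff.
drop_redundant <- function(X, r2 = 0.9) {
  repeat {
    r2s <- vapply(names(X), function(v)
      summary(lm(reformulate(setdiff(names(X), v), v), data = X))$r.squared,
      numeric(1))
    if (max(r2s) < r2 || ncol(X) <= 2) break
    X <- X[, names(X) != names(which.max(r2s)), drop = FALSE]
  }
  names(X)  # the variables kept
}

set.seed(42)
d <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
d$x3 <- d$x1 + d$x2 + rnorm(50, sd = 0.01)  # nearly redundant
drop_redundant(d)  # one of the collinear trio is dropped
```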
Re: [R] R and Scheme
On Tue, 9 Dec 2008, Wacek Kusnierczyk wrote: Stavros Macrakis wrote: I've read in many places that R semantics are based on Scheme semantics. As a long-time Lisp user and implementor, I've tried to make this more precise, and this is what I've found so far. I've excluded trivial things that aren't basic semantic issues: support for arbitrary-precision integers; subscripting; general style; etc. I would appreciate corrections or additions from more experienced users of R -- I'm sure that some of the points below simply reflect my ignorance. ==Similarities to Scheme== R has first-class function closures (i.e. correctly supports upward and downward funarg). R has a single namespace for functions and variables (Lisp-1). ==Important dissimilarities to Scheme (as opposed to other Lisps)== R is not properly tail-recursive. R does not have continuations or call-with-current-continuation or other mechanisms for implementing coroutines, general iterators, and the like. there is callCC, for example, which however seems kind of obsolete. There is nothing obsolete about it. It supports only downward or dynamic extent continuations and so is not useful (nor intended) for the things Stavros mentions. It is useful for escaping from deeply nested function calls, for example recursive examination of tree structures -- that is why it exists. At some point upward (at least one-shot) continuations may be added as well, but probably not soon. luke R supports keyword arguments. ==Similarities to Lisp and other dynamic languages, including Scheme== R is runtime-typed and garbage-collected. R supports nested read-eval-print loops for debugging etc. R expressions are represented as user-manipulable data structures. ==Dissimilarities to all (modern) Lisps, including Scheme== R has call-by-need, not call-by-object-value. R does not have macros. R objects are values, not pointers, so a <- 1:10; b <- a; b[1] <- 999; a[1] == 999. Similarly, functions cannot modify the contents of their arguments.
have you actually tried this code? even if the objects are values not pointers, assignment causes, in cases such as the above, copying the value with modifications applied as needed. thus, a[1] == 1, not 999, even though after b <- a, b and a are the same value object. try the following: system.time(x <- 1:(10^8)); system.time(y <- x); system.time(y[1] <- 0); system.time(y[2] <- 0); head(x); head(y). with some trickery, functions can modify the contents of their arguments, using deparse/substitute and assign: a <- 1; f <- function(x) assign(deparse(substitute(x)), 0, parent.frame()); f(a); a. the 'cannot modify the contents' does not apply to arguments that are environments: e <- new.env(parent=emptyenv()); l <- list(); f <- function(e) e$a <- 0; f(e); e$a; f(l); l$a. There is no equivalent to set-car!/rplaca (not even pairlists and expressions). For example, r <- pairlist(1,2); r[[1]] <- r does not create a circular list. And in general there doesn't seem to be substructure sharing at the semantic level (though there may be in the implementation). computations on environment objects seem not to be subject to the copy-value-on-assignment semantics: e <- new.env(parent=emptyenv()); ee <- e; e$a <- 0; ee$a. R does not have multiple value return in the Lisp sense. R assignment creates a new local variable on first assignment, dynamically. So static analysis is not enough to determine variable reference (R is not referentially transparent). Example: ff <- function(a){ if (a) x <- 1; x }; x <- 99; ff(T) == 1; ff(F) == 99. In R, most data types (including numeric vectors) do not have a standard external representation which can be read back in without evaluation. R coerces logicals to numbers and numbers to strings. Lisps are stricter about automatic type conversion -- except that false a.k.a. NIL == () in Lisps other than Scheme. types are not treated coherently.
in some situations, r coerces doubles to complex (according to the hierarchy of types specified here and there in the man pages), in others it won't: x <- as.complex(-1); actually: x <- as.double(-1); y <- as.complex(-1); x == y; sqrt(x); sqrt(y). in certain cases, r will also do implicit inverse (downward) coercion: is(y:y) vQ -- Luke Tierney Chair, Statistics and Actuarial Science Ralph E. Wareham Professor of Mathematical Sciences University of Iowa Phone: 319-335-3386 Department of Statistics and Actuarial Science Fax: 319-335-3017 241 Schaeffer Hall email: [EMAIL PROTECTED] Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu
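[To make the callCC() point above concrete, an editorial sketch (not from the thread) of a downward continuation used as an early escape from a recursive tree walk; `first_negative()` is an invented example.]

```r
# callCC passes an escape function k to its argument; calling k(value)
# unwinds the whole nested computation and makes callCC return value.
first_negative <- function(x) {
  callCC(function(k) {
    rec <- function(v) {
      if (is.list(v)) lapply(v, rec)
      else if (any(v < 0)) k(v[v < 0][1])  # escape immediately with the hit
    }
    rec(x)
    NULL  # nothing found: fall through normally
  })
}

first_negative(list(1:3, list(4, c(5, -7, 8)), 9))  # -7
first_negative(list(1, 2, list(3)))                 # NULL
```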
[R] motif search
Hi, I am very new to R and wanted to know if there is a package that, given very long nucleotide sequences, searches for and identifies short (7-10 nt) motifs. I would like to look for enrichment of certain motifs in genomic sequences. I tried using MEME (not an R package, I know), but the online version only allows sequences up to a maximum of 6 nucleotides, and that's too short for my needs. Thanks A
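[Editorial note: Bioconductor's Biostrings package (functions such as matchPattern()/countPattern()) is the usual tool here. As a zero-dependency illustration of the counting idea, overlapping occurrences of a short motif can be counted in base R with a zero-width lookahead; `count_motif` is an invented helper and assumes the motif is plain A/C/G/T with no regex metacharacters.]

```r
# Count (possibly overlapping) occurrences of a motif in one sequence.
count_motif <- function(motif, seq) {
  hits <- gregexpr(paste0("(?=", motif, ")"), seq, perl = TRUE)[[1]]
  if (hits[1] == -1) 0L else length(hits)
}

count_motif("ACGT", "ACGTACGTGATTACA")  # 2
count_motif("AAA",  "AAAAA")            # 3 (overlaps counted)
```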
Re: [R] Data Analysis Functions in R
On Mon, Dec 08, 2008 at 09:34:35PM -0800, Feanor22 wrote: Hi experts of R, Are there any functions in R to test a univariate series for long memory effects, structural breaks and time reversibility? I've found functions for ARCH effects (ArchTest) and for normality (shapiro.test, ks.test (comparing with randn) and lillie.test), but not for the above. Where can I find a comprehensive list of functions available by type? Please try the CRAN Task Views for EmpiricalFinance, Econometrics and TimeSeries. Dirk -- Three out of two people have difficulties with fractions.
Re: [R] Pre-model Variable Reduction
Hi All, I beg to differ with Ravi Varadhan's perspective. While it is true that principal component analysis does not itself do variable selection, it is an important method for pointing the way to what to select. This is what the methods in the subselect package rely on. (One of its authors was, I believe, a student of Jolliffe's.) For a modern perspective on this, see the following paper: Debashis Paul, Eric Bair, Trevor Hastie and Robert Tibshirani: Preconditioning for feature selection and regression in high-dimensional problems. "We show that supervised principal components followed by a variable selection procedure is an effective approach for variable selection in very high dimension." Annals of Statistics 36(4), 2008, 1595-1618. http://www-stat.stanford.edu/~hastie/Papers/Preconditioning_Annals.pdf Regards, Mark. Ravi Varadhan wrote: [quoted message elided]
Re: [R] Sorry, I have attached the data - Here is the code that causes wavCWTPeaks error
I'm stumped. On Tue, Dec 9, 2008 at 7:53 AM, [EMAIL PROTECTED] wrote: -----Original message----- From: [EMAIL PROTECTED] Sent: Tue 09/12/2008 13:52 To: stephen sefick; Francesco Masulli; Stefano Rovetta Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Here is the code that causes wavCWTPeaks error

aats <- create.signalSeries(aa, pos=list(from=0.0, by=0.033))
aa.cwt <- wavCWT(aats)
x11 (width=10, height=12)
plot (aats, main=paste(insig, " Cycle: ", j, sep=""))
aa.maxtree <- wavCWTTree (aa.cwt, type="maxima")
aa.mintree <- wavCWTTree (aa.cwt, type="minima")
aa.maxpeak <- wavCWTPeaks (aa.maxtree)
aa.minpeak <- wavCWTPeaks (aa.mintree)
Error in `row.names<-.data.frame`(`*tmp*`, value = c(1, 0)) : invalid 'row.names' length
bb <- -aa # EXCHANGE MAXIMA WITH MINIMA
bbts <- create.signalSeries(bb, pos=list(from=0.0, by=0.033))
plot (bbts, main=paste(insig, " Cycle: ", j, sep=""))
bb.cwt <- wavCWT(bbts)
bb.maxtree <- wavCWTTree (bb.cwt, type="maxima")
bb.mintree <- wavCWTTree (bb.cwt, type="minima")
bb.maxpeak <- wavCWTPeaks (bb.maxtree)
Error in `row.names<-.data.frame`(`*tmp*`, value = c(1, 0)) : invalid 'row.names' length
bb.minpeak <- wavCWTPeaks (bb.mintree)
Error in `row.names<-.data.frame`(`*tmp*`, value = c(1, 0)) : invalid 'row.names' length

Attached are the aa breathing cycle amplitude values (zip compressed). I'd like to figure out what is wrong with my data. The wmtsa documentation does not mention any time series constraint. Thank you in advance. Kind regards, Maura -----Original message----- From: stephen sefick [mailto:[EMAIL PROTECTED]] Sent: Tue 09/12/2008 8:11 To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: [R] package wmtsa: wavCWTPeaks error Are the names of the rows the same as the time series that you are using? I know that I am not being that helpful, but this seems like a mismatch in the time series object.
Look at:

length(rownames(your.data))
length(your.data[,1])

Again, it is always helpful to have reproducible code.

On Tue, Dec 9, 2008 at 1:39 AM, [EMAIL PROTECTED] wrote:

I keep getting the following error when I look for minima in the series:

aa.peak <- wavCWTPeaks(aa.tree)
Error in `row.names<-.data.frame`(`*tmp*`, value = c(1, 0)) :
  invalid 'row.names' length

How can I work around it? Thank you. Regards, Maura

-- Stephen Sefick
Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis
Re: [R] Selecting rows that are the same in separate data frames
I'm not sure what you want, but take a look at ?merge and %in%

ppaarrkk wrote:

I want to compare two matrices or data frames and select or get an index for those rows which are the same in both. I have tried the following:

a = matrix(1:10, ncol = 2)
a
b = matrix(c(2,3,4,7,8,9), ncol = 2)
b
a[a==b]

a = as.data.frame(matrix(1:10, ncol = 2))
a
b = as.data.frame(matrix(c(2,3,4,7,8,9), ncol = 2))
b
a[a==b]

Any ideas please. Thanks. Simon Parker, Imperial College
[R] R: Sorry, I have attached the data - Here is the code that causes wavCWTPeaks error
Maxima are found if I prolong the 1-cycle time series on both sides, apparently no matter which values I use. The following sequence works fine:

aal <- rep(aa[1], length(aa))
aar <- rep(aa[length(aa)], length(aa))
aa3 <- c(aal, aa, aar)  # APPEND CYCLE TO ITSELF TWICE
aa3ts <- create.signalSeries(aa3, pos=list(from=0.0, by=0.033))  # CONVERT AMPLITUDE INTO TIME-SERIES
plot(aa3ts, main=paste(insig, " Cycle: ", j, sep=""))
aa3.cwt <- wavCWT(aa3ts)  # CWT
aa3.maxtree <- wavCWTTree(aa3.cwt, type="maxima")  # GENERATE CWT MAXIMA TREE
aa3.maxpeak <- wavCWTPeaks(aa3.maxtree)  # GET MAXIMUM PEAKS

Since the minima of aa are the maxima of -aa, I reverse the sign and then repeat the above procedure. The following sequence works fine:

## EXCHANGE MAXIMA WITH MINIMA
bb <- -aa  # REVERSE SIGN
bbl <- rep(bb[1], length(bb))
bbr <- rep(bb[length(bb)], length(bb))
bb3 <- c(bbl, bb, bbr)  # APPEND CYCLE TO ITSELF TWICE
bb3ts <- create.signalSeries(bb3, pos=list(from=0.0, by=0.033))  # CONVERT AMPLITUDE INTO TIME-SERIES
plot(bb3ts, main=paste(insig, " Cycle: ", j, sep=""))
bb3.cwt <- wavCWT(bb3ts)  # CWT
bb3.maxtree <- wavCWTTree(bb3.cwt, type="maxima")  # GENERATE CWT MAXIMA TREE
bb3.maxpeak <- wavCWTPeaks(bb3.maxtree)  # GET MAXIMUM PEAKS

It looks like the wmtsa functions work like a moving-average algorithm, which uses a portion of the time series for a kind of self-training.

Best regards, Maura

-Original Message-
From: stephen sefick [mailto:[EMAIL PROTECTED]]
Sent: Tue 09/12/2008 16.15
To: [EMAIL PROTECTED]
Cc: Francesco Masulli; Stefano Rovetta; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: Sorry, I have attached the data - Here is the code that causes wavCWTPeaks error

I'm stumped.
On Tue, Dec 9, 2008 at 7:53 AM, [EMAIL PROTECTED] wrote:

[quoted text of the earlier wavCWTPeaks messages trimmed]
[R] problem with Vista
Hello, I want to import a txt table into R but the software gives me this message. I have Windows Vista.

Errore in file(file, "r") : cannot open this connection
In addition: Warning message:
In file(file, "r") :
  cannot open file 'C:/Users/Vincenzo/Desktop/prova/prova.txt': No such file or directory

thank you

Vincenzo Landi
Post-doctoral student, Animal genomics and breeding
cell: 0039/339538871
Fax: 075-5857122
Re: [R] Logical inconsistency
Many thanks for your help; perhaps I should have set my query in context! I'm simply calculating an indicator variable [0,1] based on whether the difference between two measured variables is < 1 or >= 1. I understand the FAQ about floating point arithmetic, but am still puzzled that it only apparently applies to certain elements, as follows:

> 8.8 - 7.8 > 1
[1] TRUE
> 8.3 - 7.3 > 1
[1] TRUE

However,

> 10.2 - 9.2 > 1
[1] FALSE
> 11.3 - 10.3 > 1
[1] FALSE

Emma Jane

From: Bernardo Rangel Tura [EMAIL PROTECTED]
To: Wacek Kusnierczyk [EMAIL PROTECTED]
Cc: R help [EMAIL PROTECTED]
Sent: Saturday, 6 December, 2008 10:00:48
Subject: Re: [R] Logical inconsistency

On Fri, 2008-12-05 at 14:18 +0100, Wacek Kusnierczyk wrote:

Berwin A Turlach wrote: Dear Emma, on Fri, 5 Dec 2008 04:23:53 -0800 (PST) you wrote:

Please could someone kindly explain the following inconsistencies I've discovered when performing logical calculations in R:

> 8.8 - 7.8 > 1
[1] TRUE
> 8.3 - 7.3 > 1
[1] TRUE

Gladly: FAQ 7.31 http://cran.at.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

well, this answers the question only partially. it explains why a system with finite precision arithmetic, such as r, will fail to be logically correct in certain cases. it does not explain why r, a language said to isolate a user from the underlying implementational choices, would have to fail this way. there is, in principle, no problem in having a high-level language perform the computation in a logically consistent way. for example, bc is an arbitrary precision calculator language, and has no problem with examples such as the above:

8.8 - 7.8 > 1   # 0, meaning 'no'
8.3 - 7.3 > 1   # 0, meaning 'no'
8.8 - 7.8 == 1  # 1, meaning 'yes'

the fact that r (and many others, including matlab and sage, perhaps not mathematica) does not perform logically here is a consequence of its implementation of floating point arithmetic.

the faq you were pointed to, and the goldberg article it refers to, show that r does not successfully isolate a user from details of the lower-level implementation.

vQ

Well, first of all, 8.3 - 7.3 is not equal to 1 [for computers]:

> 8.3 - 7.3 - 1
[1] 8.881784e-16

But if you use only one digit of precision:

> round(8.3 - 7.3, 1) - 1
[1] 0
> round(8.3 - 7.3, 1) - 1 > 0
[1] FALSE
> round(8.3 - 7.3, 1) == 1
[1] TRUE

So the problem is in the code as written, not in the software.

-- Bernardo Rangel Tura, M.D, MPH, Ph.D
National Institute of Cardiology, Brazil
Re: [R] ANCOVA
ANCOVA is usually used when you have a numerical covariate that you want to adjust for. Your description does not include any, so a regular ANOVA (the aov function) can be used. If you do have one or more covariates to adjust for, then it can be thought of as a regression problem (ANCOVA is a special case of linear models/regression) and you can fit it using either the aov or the lm function.

-- Gregory (Greg) L. Snow Ph.D.
Statistical Data Center, Intermountain Healthcare
[EMAIL PROTECTED] 801.408.8111

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Samuel Okoye
Sent: Tuesday, December 09, 2008 2:03 AM
To: [EMAIL PROTECTED]
Subject: [R] ANCOVA

Hello, could you please help me with the following question: I have 16 persons; 6 take 0.5 mg, 6 take 0.75 mg and 4 take placebo! Can I use ANCOVA and a t-test in this case? Is it possible in R? Thank you in advance, Samuel
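As a sketch of the two cases described above, here is a minimal example with a hypothetical 16-subject data set (group sizes as in the question; the response and baseline values are simulated, not real data):

```r
## Hypothetical sketch: 16 subjects in three dose groups.
## With no covariate, a one-way ANOVA via aov() is enough.
set.seed(1)
dose <- factor(rep(c("0.5mg", "0.75mg", "placebo"), times = c(6, 6, 4)))
response <- rnorm(16, mean = c(5, 6, 4)[as.integer(dose)])
fit.aov <- aov(response ~ dose)
summary(fit.aov)

## With a numeric covariate (e.g. a baseline measurement),
## the same model becomes an ANCOVA, fit with lm() or aov().
baseline <- rnorm(16, mean = 5)
fit.ancova <- lm(response ~ baseline + dose)
anova(fit.ancova)
```

The covariate is listed before the factor so that the sequential ANOVA table adjusts the dose effect for baseline.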
Re: [R] How to add accuracy, sensitivity, specificity to logistic regression output?
Dear Pufft,

You might be interested in the lroc function in the epicalc [1] package and in the ROCR [2] package itself.

[1] http://cran.r-project.org/web/packages/epicalc/index.html
[2] http://cran.r-project.org/web/packages/ROCR/index.html

HTH, Jorge

On Mon, Dec 8, 2008 at 11:39 PM, pufftissue pufftissue [EMAIL PROTECTED] wrote:

Hi, is there a way, when doing logistic regression, for the output to spit out accuracy, sensitivity, and specificity? I would like to know these basic measures for my model. Thanks!
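Beyond the packages mentioned above, the three measures can be computed directly in base R from a fitted glm. A minimal sketch (simulated data, 0.5 probability cut-off assumed):

```r
## Hedged sketch, base R only: accuracy, sensitivity, specificity
## from a logistic regression, classifying at fitted probability > 0.5.
set.seed(42)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(0.8 * x))        # simulated outcome
fit <- glm(y ~ x, family = binomial)
pred <- as.integer(fitted(fit) > 0.5)       # predicted class
tab <- table(observed = y, predicted = pred)
accuracy    <- sum(diag(tab)) / sum(tab)
sensitivity <- tab["1", "1"] / sum(tab["1", ])  # true positive rate
specificity <- tab["0", "0"] / sum(tab["0", ])  # true negative rate
c(accuracy = accuracy, sensitivity = sensitivity, specificity = specificity)
```

Note the cut-off is a choice; ROC packages exist precisely to examine all cut-offs at once.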
Re: [R] Selecting rows that are the same in separate data frames
Dear Simon,

Try this:

# Index -- FALSE, TRUE
sapply(1:nrow(a), function(x) all(a[x,] %in% b))
# Rows of a that are in b
which(sapply(1:nrow(a), function(x) all(a[x,] %in% b)))
# Reporting
a[sapply(1:nrow(a), function(x) all(a[x,] %in% b)), ]

HTH, Jorge

On Tue, Dec 9, 2008 at 9:57 AM, ppaarrkk [EMAIL PROTECTED] wrote:

I want to compare two matrices or data frames and select or get an index for those rows which are the same in both. I have tried the following:

a = matrix(1:10, ncol = 2)
b = matrix(c(2,3,4,7,8,9), ncol = 2)
a[a==b]

a = as.data.frame(matrix(1:10, ncol = 2))
b = as.data.frame(matrix(c(2,3,4,7,8,9), ncol = 2))
a[a==b]

Any ideas please. Thanks. Simon Parker, Imperial College
[R] Bootstrap a GLM model with Poisson family
Hello! Can anyone help me with how to do a bootstrap evaluation analysis of a GLM with family = poisson? I have some R code, but it was written only for family = binomial...

thanks a lot, Joana Vicente

CIBIO - Centro de Investigação em Biodiversidade e Recursos Genéticos
Faculdade de Ciências do Porto, Departamento de Botânica - Edifício FC4
Rua Do Campo Alegre, S/N, 4169-007 Porto - Portugal
Re: [R] Selecting rows that are the same in separate data frames
Thanks for the reply. What I want is the equivalent of this:

xxx = 1:10
which(xxx %in% c(2, 5))

...but where there is more than one criterion for matching. which(b %in% a) in the code I included does nothing useful (not surprisingly). I'm not sure that I can use merge, because I want the whole of a, but with the rows that are also in b marked. If I do merge(a, b), I just get b. If I do merge(a, b, all.x = TRUE), I get a.

bartjoosen wrote:

I'm not sure what you want, but take a look at ?merge and %in%

ppaarrkk wrote:

I want to compare two matrices or data frames and select or get an index for those rows which are the same in both. [code quoted earlier in the thread]
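One way to mark whole-row matches (all columns at once, rather than elementwise %in%) is to collapse each row to a single key string and compare the keys. A minimal sketch with the a and b from the thread:

```r
## Hedged sketch: flag rows of 'a' that also occur, as whole rows, in 'b',
## by pasting each row into one key string and using %in% on the keys.
a <- as.data.frame(matrix(1:10, ncol = 2))
b <- as.data.frame(matrix(c(2, 3, 4, 7, 8, 9), ncol = 2))

key <- function(d) do.call(paste, c(d, sep = "\r"))  # "\r" as an unlikely separator
a$in.b <- key(a) %in% key(b)
a                 # whole of 'a', with matching rows marked
which(a$in.b)     # rows 2, 3, 4
```

This keeps all of a and adds the mark, which merge(a, b) alone does not give (newer R versions also offer merge's all.x plus comparison of the row count, but the key approach is explicit).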
[R] creating standard curves for ELISA analysis
Hello R gurus,

I am a newbie to R. In my research work I usually generate a lot of ELISA data in the form of absorbance values. I usually use Excel to calculate the concentrations of the unknowns, but it is too tedious and manual, especially when I have 100s of files to process. I would appreciate some help in creating an R script to do this with minimal manual input. Wells A1-G1 and A2-G2 are serially diluted standards, H1 and H2 are blanks, and A3 to H12 are serum samples. I am pasting the structure of my data below:

A1 14821    B1 11577    C1 5781    D1 2580    E1 902     F1 264    G1 98     H1 4
A2 14569.5  B2 11060    C2 5612    D2 2535    E2 872     F2 285    G2 85     H2 3
A3 1016     B3 2951.5   C3 547     D3 1145    E3 4393    F3 4694   G3 1126   H3 1278
A4 974.5    B4 3112.5   C4 696.5   D4 2664.5  E4 184.5   F4 1908   G4 108.5  H4 1511
A5 463.5    B5 1365     C5 816     D5 806     E5 1341    F5 1157   G5 542.5  H5 749
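As a starting point, here is a minimal base-R sketch of the mechanics: fit a curve to the standards, then invert it for the unknowns. Real ELISA work usually uses a 4-parameter logistic fit (e.g. via the drc package); a log-log linear fit is shown here only to illustrate the workflow, and the standard concentrations below are hypothetical, since the post gives only absorbances:

```r
## Hedged sketch: standard curve and back-calculation in base R.
## 'conc' is an ASSUMED two-fold dilution series; replace with your own.
conc <- c(1000, 500, 250, 125, 62.5, 31.25, 15.625)   # hypothetical standards
od   <- c(14821, 11577, 5781, 2580, 902, 264, 98)     # A1..G1 from the post

fit <- lm(log(conc) ~ log(od))   # simple log-log standard curve

## invert the curve: concentration estimate for new absorbance values
conc.from.od <- function(new.od) exp(predict(fit, data.frame(od = new.od)))
conc.from.od(c(1016, 2951.5))    # e.g. samples A3 and B3
```

Wrapping this in a function and looping over files with list.files() removes the per-file manual work.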
Re: [R] Logical inconsistency
Some (possibly all) of those numbers cannot be represented exactly, so there is a chance of round-off error whenever you do some arithmetic; sometimes the errors cancel out, sometimes they don't. Consider:

> print(8.3 - 7.3, digits = 20)
[1] 1.0000000000000008882
> print(11.3 - 10.3, digits = 20)
[1] 1

So in the first case the rounding error gives a value that is slightly greater than 1, so the greater-than test returns true (if you round the result before comparing to 1, then it will return false). In the second case the uncertainties cancelled out so that you get exactly 1, which is not greater than 1, and so the comparison returns false.

Hope this helps,

-- Gregory (Greg) L. Snow Ph.D.
Statistical Data Center, Intermountain Healthcare
[EMAIL PROTECTED] 801.408.8111

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of emma jane
Sent: Tuesday, December 09, 2008 7:02 AM
To: Bernardo Rangel Tura; Wacek Kusnierczyk; Chuck Cleland
Cc: R help
Subject: Re: [R] Logical inconsistency

Many thanks for your help, perhaps I should have set my query in context! I'm simply calculating an indicator variable [0,1] based on whether the difference between two measured variables is < 1 or >= 1.
I understand the FAQ about floating point arithmetic, but am still puzzled that it only apparently applies to certain elements, as follows:

8.8 - 7.8 > 1   TRUE
8.3 - 7.3 > 1   TRUE

However,

10.2 - 9.2 > 1   FALSE
11.3 - 10.3 > 1   FALSE

Emma Jane

[remainder of the quoted thread trimmed]
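The explanations in this thread suggest the standard workaround: never test floating-point results with exact > or ==, but compare against a tolerance. A minimal sketch (gt is a hypothetical helper name, not a base function):

```r
## Compare with a tolerance instead of exact > / ==.
(8.8 - 7.8) > 1                    # TRUE, due to rounding error
isTRUE(all.equal(8.8 - 7.8, 1))    # TRUE: equal within all.equal's tolerance

tol <- sqrt(.Machine$double.eps)   # ~1.5e-8, all.equal's default
gt <- function(x, y) (x - y) > tol # "greater than, beyond rounding noise"
gt(8.8 - 7.8, 1)                   # FALSE
gt(11.3 - 10.3, 1)                 # FALSE
```

With such a comparison, the indicator variable in the original question behaves consistently for all the element pairs shown.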
Re: [R] motif search
Hi Alessia,

You may want to post this kind of question on the Bioconductor mailing list; it is more appropriate there. http://www.bioconductor.org/docs/mailList.html

About your question: I'm not 100% sure, but check whether the Biostrings package can do what you need. http://www.bioconductor.org/workshops/2008/SeattleNov08/MatchAlign/

Best Regards, Anna

Anna Freni Sterrantino
Ph.D Student, Department of Statistics
University of Bologna, Italy
via Belle Arti 41, 40124 BO.

From: Alessia Deglincerti [EMAIL PROTECTED]
To: r-help@r-project.org
Sent: Tuesday, 9 December 2008, 16:03:55
Subject: [R] motif search

Hi, I am very new to R and wanted to know if there is a package that, given very long nucleotide sequences, searches for and identifies short (7-10 nt) motifs. I would like to look for enrichment of certain motifs in genomic sequences. I tried using MEME (not an R package, I know), but the online version only allows sequences up to a maximum of 6 nucleotides, and that's too short for my needs.

Thanks, A
[R] [R-pkgs] SFA tools moved from micEcon to frontier
Dear R users,

I would like to inform you that everything in the micEcon package that is related to Stochastic Frontier Analysis (SFA) has been moved to the frontier package, because this is a more appropriate place for the functions front41WriteInput, front41ReadOutput, and front41Est, and the corresponding (S3) methods. The data sets riceProdPhil and Coelli have been removed from the micEcon package, because they were already included in the frontier package (the latter is named front41Data in the frontier package). The new frontier package (with the front41... functions) is available on CRAN now, and the new micEcon package (without the front41... functions) will be available on CRAN in some time. Please use the meantime to update your code.

Please don't hesitate to contact me if you have any questions or comments,

Arne

-- Arne Henningsen
http://www.arne-henningsen.name/

___ R-packages mailing list [EMAIL PROTECTED] https://stat.ethz.ch/mailman/listinfo/r-packages
[R] Replacing tabs with appropriate number of spaces
Colleagues,

Platform: OS X (but the issue applies to all platforms)
Version: 2.8.0

I have a mixture of text and data that I am outputting via R to a PDF document (using a fixed-width font). The text contains tabs that align columns properly with a fixed-width font in a terminal window. However, when the PDF document is created, the concept of a tab is not honored and the columns do not align. I could use brute force as follows:

1. identify lines of text containing tabs
2. strsplit on tabs
3. count characters preceding the tab, then replace the tab with the appropriate number of spaces (e.g., if the string preceding the tab has 29 characters, add 3 spaces), then paste(..., sep="")

However, I am sure a more elegant approach exists. Can anyone offer one?

Dennis

Dennis Fisher MD
P < (The P Less Than Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-415-564-2220
www.PLessThan.com
[R] Applying min to numeric vectors
I was surprised this morning that it seems as though the min() function does not work as *I* anticipated when given vector arguments. For example:

a <- 1:10
b <- c(rep(1, times=5), rep(10, times=5))

Result:

> min(a, b)
[1] 1

What I actually wanted was a term-by-term minimum, i.e.:

> ifelse(a <= b, a, b)
[1] 1 1 1 1 1 6 7 8 9 10

Am I losing much in terms of computational power if I use ifelse? I'm a little worried, because in my implementation the vectors are quite long, and I will be computing the min of many of them, min(a, b, c, d, e, f), where a through f are all vectors of the same length. Any insight that can be provided is much appreciated.
Re: [R] Applying min to numeric vectors
Try this:

pmin(a, b)

On Tue, Dec 9, 2008 at 2:58 PM, Brigid Mooney [EMAIL PROTECTED] wrote:

[question quoted above, trimmed]

-- Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
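pmin() also covers the six-vector case in the question directly, since it accepts any number of vectors; with a list of vectors, do.call() passes them all at once:

```r
## pmin() computes parallel (elementwise) minima over any number of vectors.
a <- 1:10
b <- c(rep(1, 5), rep(10, 5))
pmin(a, b)                    # 1 1 1 1 1 6 7 8 9 10

vecs <- list(a, b, rev(a))    # any number of equal-length vectors
do.call(pmin, vecs)           # elementwise min across all of them
```

Being vectorized in C, pmin is also much faster than an ifelse() comparison chain on long vectors.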
Re: [R] Bootstrap a GLM model with Poisson family
Hi, if you already have code for the binomial case, can you not just replace family = binomial by family = poisson in the code?

Cheers, Daniel

- cuncta stricte discussurus -

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Joana Vicente
Sent: Tuesday, December 09, 2008 11:03 AM
To: r-help@r-project.org
Subject: [R] Bootstrap a GLM model with Poisson family

Hello! Can anyone help me with how to do a bootstrap evaluation analysis of a GLM with family = poisson? I have some R code, but it was written only for family = binomial...

thanks a lot, Joana Vicente
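For a concrete starting point, here is a minimal sketch of a case-resampling bootstrap of Poisson GLM coefficients with the boot package (which ships with R). The data and model are hypothetical, just to make the example self-contained:

```r
## Hedged sketch: case-resampling bootstrap of Poisson GLM coefficients.
library(boot)

set.seed(123)
d <- data.frame(x = rnorm(100))
d$y <- rpois(100, exp(0.5 + 0.3 * d$x))   # simulated Poisson outcome

## statistic(data, indices): refit the model on the resampled rows
coef.fun <- function(data, idx) {
  coef(glm(y ~ x, family = poisson, data = data[idx, ]))
}

b <- boot(d, coef.fun, R = 200)
b
boot.ci(b, type = "perc", index = 2)      # percentile CI for the slope
```

Only the family argument in the refitting function changes relative to a binomial version, which is the point of the reply above.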
[R] ARMA
Hi! Is there any package or function in R for ARMA models (Box-Jenkins, without seasonality and trend) with support for automatic identification of p and q?

Regards, Raphael Saldanha, Brazil
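Base R has no built-in automatic (p, q) selection, but a small AIC grid search over arima() fits gets close; the forecast package's auto.arima() automates this more carefully. A minimal base-R sketch on simulated ARMA(1,1) data:

```r
## Hedged sketch: choose (p, q) for an ARMA model by minimum AIC.
set.seed(1)
y <- arima.sim(list(ar = 0.6, ma = 0.3), n = 300)   # simulated ARMA(1,1)

grid <- expand.grid(p = 0:3, q = 0:3)
grid$aic <- apply(grid, 1, function(g)
  tryCatch(AIC(arima(y, order = c(g["p"], 0, g["q"]))),
           error = function(e) NA))                 # skip non-converging fits

grid[which.min(grid$aic), ]   # the selected (p, q)
```

AIC (or BIC) selection is the usual automation of the Box-Jenkins identification step when eyeballing ACF/PACF plots is impractical.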
Re: [R] Replacing tabs with appropriate number of spaces
On Tue, Dec 9, 2008 at 10:39 AM, Dennis Fisher [EMAIL PROTECTED] wrote:

[question quoted above, trimmed]

Tabs usually just expand to a fixed number of spaces - so gsub("\t", "        ", text) would do the job.

Hadley

-- http://had.co.nz/
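A plain gsub() only works if every tab sits at the same column; to reproduce terminal alignment exactly, each tab must be padded to the next tab stop, as the original "brute force" outline describes. A minimal sketch (expand.tabs is a hypothetical helper, assuming 8-column tab stops):

```r
## Hedged sketch: expand each tab to the next tab stop (every 'width' columns),
## so columns align in a fixed-width PDF font as they did in the terminal.
expand.tabs <- function(line, width = 8) {
  out <- ""
  for (ch in strsplit(line, "")[[1]]) {
    if (ch == "\t") {
      pad <- width - (nchar(out) %% width)   # distance to next tab stop
      out <- paste0(out, strrep(" ", pad))
    } else {
      out <- paste0(out, ch)
    }
  }
  out
}

sapply(c("name\tvalue", "longer.name\tvalue"), expand.tabs)
```

Applied over readLines() output before writing to the PDF, this preserves the original column layout regardless of where the tabs fall.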
Re: [R] How to get Greenhouse-Geisser epsilons from anova?
Dear John and Peter,

Thank you both very much for your help! Everything works fine now! John, Anova also works very well, thank you. However, if I had more than 2 levels for the between factor, the same thing as mentioned occurred: the degrees of freedom showed that Anova calculated it as if all subjects came from the same group; for example, for main effect A the dfs are 1 and 35. Since I can get those values using anova, that causes no problem. I saw that x$G (to get the Greenhouse-Geisser epsilon) works for:

x <- anova(mlmfitD, X = ~C+B, M = ~A+C+B, test = "Spherical")

but y$G does not work for:

y <- anova(mlmfit, mlmfit0, X = ~C+B, M = ~A+C+B, idata = dd, test = "Spherical")

Finally, the Greenhouse-Geisser epsilons are identical using both methods and match the SPSS output. The Huynh-Feldt epsilons are not the same as those of SPSS, so I will use GG instead.
Re: [R] Pre-model Variable Reduction
Mark Difford wrote: Hi All, I beg to differ with Ravi Varadhan's perspective. While it is true that principal component analysis does not itself do variable selection, it is an important method for pointing the way to what to select. This is what the methods in the subselect package rely on. (One of its authors was I believe a student of Jolliffe's). For a modern perspective on this, see the following paper: Debashis Paul, Eric Bair, Trevor Hastie and Robert Tibshirani: Preconditioning for feature selection and regression in high-dimensional problems We show that supervised principal components followed by a variable selection procedure is an effective approach for variable selection in very high dimension. Annals of Statistics 36(4), 2008, 1595-1618. http://www-stat.stanford.edu/~hastie/Papers/Preconditioning_Annals.pdf Regards, Mark. Mark, Slightly more relevant is the unsupervised sparse principal component methods described in the following references. If anyone knows of better references for this please let me know. -Frank @Article{zou06spa, author = {Zhou, Hui and Hastie, Trevor and Tibshirani, Robert}, title ={Sparse principal component analysis}, journal = J Comp Graph Stat, year = 2006, volume = 15, pages ={265-286}, annote = {gene microarray;lasso/elastic net;multivariate analysis;data reduction;singular value decomposition;thresholding;principal components analysis that shrinks some loadings to zero} } @Article{wit08tes, author = {Witten, Daniela M. and Tibshirani, Robert}, title = {Testing significance of features by lassoed principal components}, journal = Annals Appl Stat, year = 2008, volume = 2, number = 3, pages ={986-1012}, annote = {reduction in false discovery rates over using a vector of t-statistics;borrowing strength across genes;``one would not expect a single gene to be associated with the outcome, since, in practice, many genes work together to effect a particular phenotype. 
LPC effectively down-weights individual genes that are associated with the outcome but that do not share an expression pattern with a larger group of genes, and instead favors large groups of genes that appear to be differentially expressed.''; regress principal components on outcome} } Ravi Varadhan wrote: Principal components analysis does dimensionality reduction but NOT variable reduction. However, Jolliffe's 2004 book on PCA does discuss the problem of selecting a subset of variables, with the goal of representing the internal variation of the original multivariate vector as well as possible (see Section 6.3 of that book). I do not think that these methods can handle missing data. The most important issue is to think about the goal of variable reduction and then choose an appropriate optimality criterion for achieving that goal. In most instances of variable selection, the criterion that is optimized is never explicitly considered. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Gabor Grothendieck Sent: Tuesday, December 09, 2008 8:00 AM To: Harsh Cc: r-help@r-project.org Subject: Re: [R] Pre-model Variable Reduction See: ?prcomp ?princomp On Tue, Dec 9, 2008 at 5:34 AM, Harsh [EMAIL PROTECTED] wrote: Hello All, I am trying to carry out variable reduction. I do not have information about the dependent variable, and have only the X variables, as it were. In selecting variables I wish to keep, I have considered the following criteria: 1) percentage of missing values in each column/variable; 2) variance of each variable, with a cut-off value.
I recently came across Weka and found that there is an RWeka package which would allow me to make use of Weka through R. Weka provides a genetic-search variable reduction method, but I could not find its R code implementation in the RWeka PDF file on CRAN. I looked for other R packages that allow me to do variable reduction without considering a dependent variable. I came across the 'dprep' package, but it does not have a Windows implementation. Moreover, I have a dataset that contains continuous and categorical variables, some categorical variables having 3 levels, 10 levels, and so on, up to a maximum of 50 levels (e.g. states in the USA). Any suggestions in this regard will be much appreciated. Thank you Harsh Singhal Decision Systems,
Re: [R] Applying min to numeric vectors
Dear Brigid, Try: pmin(a, b) HTH, Jorge On Tue, Dec 9, 2008 at 11:58 AM, Brigid Mooney [EMAIL PROTECTED] wrote: I was surprised this morning that it seems as though the min() function does not work as *I* anticipated when given vector arguments. For example: a <- 1:10 b <- c(rep(1, times=5), rep(10, times=5)) Result: min(a, b) [1] 1 What I actually wanted was a term-by-term minimum, i.e.: ifelse(a <= b, a, b) [1] 1 1 1 1 1 6 7 8 9 10 Am I losing much in terms of computation power if I use the ifelse? I'm a little worried, because in implementation my vectors are quite long, and I will be computing the min of many of them, min(a,b,c,d,e,f), where a through f are all vectors of the same length. Any insight that can be provided is much appreciated.
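For completeness, pmin() takes any number of equal-length vectors directly, so the many-vector case needs no chain of ifelse() calls; a quick base-R check:

```r
a <- 1:10
b <- c(rep(1, 5), rep(10, 5))

# element-wise minimum of two vectors
pmin(a, b)            # 1 1 1 1 1 6 7 8 9 10

# many vectors at once: pass them all, or fold a list with Reduce
vecs <- list(a, b, rev(a))
Reduce(pmin, vecs)    # same result as pmin(a, b, rev(a))
```

pmin() is vectorized in C, so it is also much faster than ifelse() on long vectors.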
Re: [R] Pre-model Variable Reduction
Hi Frank, If anyone knows of better references for this please let me know. Many thanks: I was not aware of the Witten paper. If I turn up anything else I will be sure to let you know. Best Regards, Mark. Frank E Harrell Jr wrote: Mark Difford wrote: Hi All, I beg to differ with Ravi Varadhan's perspective. While it is true that principal component analysis does not itself do variable selection, it is an important method for pointing the way to what to select. This is what the methods in the subselect package rely on. (One of its authors was I believe a student of Jolliffe's). For a modern perspective on this, see the following paper: Debashis Paul, Eric Bair, Trevor Hastie and Robert Tibshirani: Preconditioning for feature selection and regression in high-dimensional problems We show that supervised principal components followed by a variable selection procedure is an effective approach for variable selection in very high dimension. Annals of Statistics 36(4), 2008, 1595-1618. http://www-stat.stanford.edu/~hastie/Papers/Preconditioning_Annals.pdf Regards, Mark. Mark, Slightly more relevant is the unsupervised sparse principal component methods described in the following references. If anyone knows of better references for this please let me know. -Frank @Article{zou06spa, author ={Zhou, Hui and Hastie, Trevor and Tibshirani, Robert}, title = {Sparse principal component analysis}, journal = J Comp Graph Stat, year = 2006, volume =15, pages = {265-286}, annote ={gene microarray;lasso/elastic net;multivariate analysis;data reduction;singular value decomposition;thresholding;principal components analysis that shrinks some loadings to zero} } @Article{wit08tes, author ={Witten, Daniela M. 
and Tibshirani, Robert}, title = {Testing significance of features by lassoed principal components}, journal = {Annals Appl Stat}, year = 2008, volume = 2, number = 3, pages = {986-1012}, annote = {reduction in false discovery rates over using a vector of t-statistics; borrowing strength across genes; ``one would not expect a single gene to be associated with the outcome, since, in practice, many genes work together to effect a particular phenotype. LPC effectively down-weights individual genes that are associated with the outcome but that do not share an expression pattern with a larger group of genes, and instead favors large groups of genes that appear to be differentially expressed.''; regress principal components on outcome} }
[R] Was Logical inconsistency - algorithm portability
The logical inconsistency thread has wandered a bit, and discussion now veers towards portability of algorithms. On that subject, I have off-list been sharing ideas about the existence of functions to ensure optimizing compilers do not corrupt logical tests in numerical methods. Apparently the volatile qualifier may have a role here (I am generally a visiting programmer in C, doing just enough to get things to work, and cribbing other folks' code). One need is for a function that ensures the stored representation of numbers is returned, call it STORED, so that one can test if STORED(x) == STORED(y). Various hacks can do this. One time we used an array of length 2. But hacks are generally ugly and vulnerable to updates in compilers, so a properly written and documented function would be helpful. And, of course, may already exist. As this is not a short-term help request but a long-term need, I suggest contacting me off list about this subject. Cheers, JN
Re: [R] Need help optimizing/vectorizing nested loops
On Tue, 9 Dec 2008, tyler wrote: Hi, I'm analyzing a large number of large simulation datasets, and I've isolated one of the bottlenecks. Any help in speeding it up would be appreciated. Cast the neighborhoods as an indicator matrix, then use matrix multiplications: system.time(tmp <- time.test(dat)) user system elapsed 1.18 0.00 1.20 system.time({ + mn <- with(dat, outer(1:25, 1:25, function(i, j) abs(X[i] - X[j]) < 2 & abs(Y[i] - Y[j]) < 2 & i != j)) + print(all.equal(rowSums(mn %*% as.matrix(dat[, -(1:2)]) > 0), tmp)) }) [1] TRUE user system elapsed 0 0 0 HTH, Chuck `dat` is a dataframe of samples from a regular grid. The first two columns are the spatial coordinates of the samples, the remaining 20 columns are the abundances of species in each cell. I need to calculate the species richness in adjacent cells for each cell in the sample. For example, if I have nine cells in my dataframe (X = 1:3, Y = 1:3): a b c d e f g h i I need to calculate the neighbour-richness for each cell; for a, this is the richness of cells b, d and e combined. The neighbour richness of cell e would be the combined richness of all the other eight cells. The following code does what I want, but it's slow. The sample dataset 'dat', below, represents a 5x5 grid, 25 samples. It takes about 1.5 seconds on my computer. The largest samples I am working with have a 51 x 51 grid (2601 cells) and take 4.5 minutes. This is manageable, but since I have potentially hundreds of these analyses to run, trimming that down would be very helpful. After loading the function and the data, the call system.time(tmp <- time.test(dat)) will run the code. Note that I've excised this from a larger, more general function, after determining that for large datasets this section is responsible for a slowdown from 10-12 seconds to ca. 250 seconds.
Thanks for your patience, Tyler time.test <- function(dat) { cen <- dat grps <- 5 n.rich <- numeric(grps^2) n.ind <- 1 for(i in 1:grps) for (j in 1:grps) { n.cen <- numeric(ncol(cen) - 2) neighbours <- expand.grid((j-1):(j+1), (i-1):(i+1)) neighbours <- neighbours[-5,] neighbours <- neighbours[which(neighbours[,1] %in% 1:grps & neighbours[,2] %in% 1:grps),] for (k in 1:nrow(neighbours)) n.cen <- n.cen + cen[cen$X == neighbours[k,1] & cen$Y == neighbours[k,2], -c(1:2)] n.rich[n.ind] <- sum(as.logical(n.cen)) n.ind <- n.ind + 1 } return(n.rich) } `dat` <- structure(list( X = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5), Y = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5), V1 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 45L, 131L, 0L, 0L, 34L, 481L, 1744L), V2 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 88L, 0L, 70L, 101L, 13L, 634L, 0L, 0L, 71L, 640L, 1636L), V3 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 49L, 3L, 113L, 1L, 44L, 167L, 336L, 933L, 0L, 14L, 388L, 1180L, 1709L), V4 = c(0L, 0L, 0L, 0L, 0L, 0L, 3L, 12L, 0L, 0L, 2L, 1L, 36L, 45L, 208L, 7L, 221L, 213L, 371L, 1440L, 26L, 211L, 389L, 1382L, 1614L), V5 = c(96L, 7L, 0L, 0L, 0L, 10L, 17L, 0L, 5L, 0L, 0L, 11L, 151L, 127L, 160L, 27L, 388L, 439L, 1117L, 1571L, 81L, 598L, 1107L, 1402L, 891L), V6 = c(16L, 30L, 13L, 0L, 0L, 10L, 195L, 60L, 29L, 29L, 1L, 107L, 698L, 596L, 655L, 227L, 287L, 677L, 1477L, 1336L, 425L, 873L, 961L, 1360L, 1175L), V7 = c(249L, 101L, 69L, 0L, 18L, 186L, 331L, 291L, 259L, 248L, 336L, 404L, 642L, 632L, 775L, 455L, 801L, 697L, 1063L, 978L, 626L, 686L, 1204L, 1138L, 627L), V8 = c(300L, 163L, 65L, 145L, 377L, 257L, 690L, 655L, 420L, 288L, 346L, 461L, 1276L, 897L, 633L, 812L, 1018L, 1337L, 1295L, 1163L, 550L, 1104L, 768L, 933L, 433L), V9 = c(555L, 478L, 374L, 349L, 357L, 360L, 905L, 954L, 552L, 438L, 703L, 984L, 1616L, 1732L, 1234L, 1213L, 1518L, 1746L, 1191L, 967L, 1394L, 1722L, 1706L, 610L, 169L),
V10 = c(1527L, 1019L, 926L, 401L, 830L, 833L, 931L, 816L, 1126L, 1232L, 1067L, 1169L, 1270L, 1277L, 1145L, 1159L, 1072L, 1534L, 997L, 391L, 1328L, 1414L, 1037L, 444L, 1L), V11 = c(1468L, 1329L, 1013L, 603L, 1096L, 1237L, 1488L, 1189L, 1064L, 1303L, 1258L, 1479L, 1421L, 1365L, 1101L, 1415L, 1145L, 1329L, 1325L, 236L, 1379L, 1199L, 729L, 328L, 0L), V12 = c(983L, 1459L, 791L, 898L, 911L, 1215L, 1528L, 960L, 1172L, 1286L, 1358L, 722L, 857L, 1478L, 1452L, 1502L, 1013L, 745L, 455L, 149L, 1686L, 917L, 1013L, 84L, 0L), V13 = c(1326L, 1336L, 1110L, 1737L, 1062L, 1578L, 1382L, 1537L, 1366L, 1308L, 1301L, 1357L, 746L, 622L, 934L, 1132L, 954L, 460L, 270L, 65L, 957L, 699L, 521L, 18L, 1L), V14 = c(1047L, 1315L, 1506L, 1562L, 1254L, 1336L, 1106L, 1213L, 1220L, 1457L, 858L, 1606L, 590L, 726L, 598L, 945L, 732L, 258L, 45L, 6L, 937L, 436L, 43L, 0L, 0L), V15 = c(845L, 935L, 1295L, 1077L, 1400L, 1049L, 802L, 1247L, 1449L, 1046L, 1134L, 877L, 327L, 352L, 470L, 564L, 461L,
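The indicator-matrix idea from Chuck's reply can be packaged as a reusable function; a minimal sketch (the function name and the toy data below are mine, not from the thread):

```r
# Sketch of the indicator-matrix approach: nb[i, j] is TRUE when cell j
# is one of cell i's (up to 8) grid neighbours; a single matrix multiply
# then sums neighbour abundances per species, and a species counts toward
# richness whenever that sum is positive.
neighbour_richness <- function(dat) {
  n <- nrow(dat)
  nb <- outer(seq_len(n), seq_len(n), function(i, j)
    abs(dat$X[i] - dat$X[j]) < 2 & abs(dat$Y[i] - dat$Y[j]) < 2 & i != j)
  rowSums(nb %*% as.matrix(dat[, -(1:2)]) > 0)
}

# tiny 2 x 2 grid with two species: every cell's neighbours are the
# other three cells
toy <- data.frame(X = c(1, 2, 1, 2), Y = c(1, 1, 2, 2),
                  V1 = c(1, 0, 0, 0), V2 = c(0, 1, 1, 1))
neighbour_richness(toy)   # 1 2 2 2
```

The n-by-n indicator matrix costs memory (a 2601-cell grid gives a 2601 x 2601 matrix), but that is well within reach and avoids the per-cell loops entirely.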
[R] difftime
Hi. I'm trying to take the difference in days between two times. Can you point out what's wrong, or suggest a different function? The following code works fine: a <- strptime("1911100807", format = "%Y%m%d%H", tz = "GMT") b <- strptime("1911102718", format = "%Y%m%d%H", tz = "GMT") x <- difftime(b, a, units = "days") x But when I change the year, the following code returns NA for the time between a and b. Thanks. a <- strptime("1999100807", format = "%Y%m%d%H", tz = "GMT") b <- strptime("1999102718", format = "%Y%m%d%H", tz = "GMT") x <- difftime(b, a, units = "days") x
Re: [R] Need help optimizing/vectorizing nested loops
Hello, how about changing the last loop to an apply? time.test2 <- function(dat) { cen <- dat grps <- 5 n.rich <- numeric(grps^2) n.ind <- 1 for(i in 1:grps) for (j in 1:grps) { n.cen <- numeric(ncol(cen) - 2) neighbours <- expand.grid(X=(j-1):(j+1), Y=(i-1):(i+1)) neighbours <- neighbours[-5,] neighbours <- neighbours[which(neighbours[,1] %in% 1:grps & neighbours[,2] %in% 1:grps),] n.rich[n.ind] <- sum(as.logical(apply(merge(neighbours, cen)[,-c(1,2)], 2, sum))) n.ind <- n.ind + 1 } return(n.rich) } The timings on my system: system.time(time.test(dat)) user system elapsed 1.65 0.03 1.71 system.time(time.test2(dat)) user system elapsed 0.27 0.00 0.27 I'm still thinking about the optimisation for the selection of the cells, but for the moment I have no clue about any optimisation for this problem. Kind Regards Bart Tyler Smith wrote: Hi, I'm analyzing a large number of large simulation datasets, and I've isolated one of the bottlenecks. Any help in speeding it up would be appreciated. `dat` is a dataframe of samples from a regular grid. The first two columns are the spatial coordinates of the samples, the remaining 20 columns are the abundances of species in each cell. I need to calculate the species richness in adjacent cells for each cell in the sample. For example, if I have nine cells in my dataframe (X = 1:3, Y = 1:3): a b c d e f g h i I need to calculate the neighbour-richness for each cell; for a, this is the richness of cells b, d and e combined. The neighbour richness of cell e would be the combined richness of all the other eight cells. The following code does what I want, but it's slow. The sample dataset 'dat', below, represents a 5x5 grid, 25 samples. It takes about 1.5 seconds on my computer. The largest samples I am working with have a 51 x 51 grid (2601 cells) and take 4.5 minutes. This is manageable, but since I have potentially hundreds of these analyses to run, trimming that down would be very helpful.
After loading the function and the data, the call system.time(tmp <- time.test(dat)) will run the code. Note that I've excised this from a larger, more general function, after determining that for large datasets this section is responsible for a slowdown from 10-12 seconds to ca. 250 seconds. Thanks for your patience, Tyler
[R] extract the digits of a number
Hello, Does anyone know how I can do this in a cleaner way? mynumber = 1001 as.numeric(unlist(strsplit(as.character(mynumber), ""))) [1] 1 0 0 1 Thanks in advance, Gustavo
Re: [R] difftime
Perhaps you are getting warning messages from difftime like this: Warning messages: 1: In structure(.Internal(as.POSIXct(x, tz)), class = c("POSIXt", "POSIXct")) : unable to identify current timezone 'V': please set environment variable 'TZ' 2: In structure(.Internal(as.POSIXct(x, tz)), class = c("POSIXt", "POSIXct")) : unknown timezone 'localtime' Try executing the code twice. On Tue, Dec 9, 2008 at 3:33 PM, eric lee [EMAIL PROTECTED] wrote: Hi. I'm trying to take the difference in days between two times. Can you point out what's wrong, or suggest a different function? The following code works fine: a <- strptime("1911100807", format = "%Y%m%d%H", tz = "GMT") b <- strptime("1911102718", format = "%Y%m%d%H", tz = "GMT") x <- difftime(b, a, units = "days") x But when I change the year, the following code returns NA for the time between a and b. Thanks. a <- strptime("1999100807", format = "%Y%m%d%H", tz = "GMT") b <- strptime("1999102718", format = "%Y%m%d%H", tz = "GMT") x <- difftime(b, a, units = "days") x -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O
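Following the warning quoted in the reply, one possible fix is to pin TZ before parsing so no lookup of a misconfigured local zone is attempted; a minimal sketch (the diagnosis is an assumption on my part — the original poster's exact cause is not shown):

```r
Sys.setenv(TZ = "GMT")   # avoid consulting a broken local timezone setting

a <- strptime("1999100807", format = "%Y%m%d%H", tz = "GMT")
b <- strptime("1999102718", format = "%Y%m%d%H", tz = "GMT")
difftime(b, a, units = "days")   # 19 days and 11 hours = 19.45833 days
```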
Re: [R] How to display y-axis labels in Multcomp plot
Thanks Kingsford... that does the trick perfectly! Simon -- View this message in context: http://www.nabble.com/How-to-display-y-axis-labels-in-Multcomp-plot-tp20904977p20919920.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Replacing tabs with appropriate number of spaces
This is basically your approach, but automated a bit more than you describe: library(gsubfn) tmp <- strsplit('one\ttwo\nthree\tfour\n12345678\t910\na\tbc\tdef\tghi\n', '\n')[[1]] tmp2 <- gsubfn('([^\t]+)\t', function(x) { ln <- nchar(x) nsp <- 8 - (ln %% 8) sp <- paste(rep(' ', nsp), collapse='') paste(x, sp, sep='') }, tmp) tmp2 cat(tmp2, sep='\n') This is based on the assumption of tab stops every 8 columns; change the two 8's above if you want something different. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] 801.408.8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org] On Behalf Of Dennis Fisher Sent: Tuesday, December 09, 2008 9:40 AM To: [EMAIL PROTECTED] Subject: [R] Replacing tabs with appropriate number of spaces Colleagues, Platform: OS X (but issue applies to all platforms) Version: 2.8.0 I have a mixture of text and data that I am outputting via R to a PDF document (using a fixed-width font). The text contains tabs that align columns properly with a fixed-width font in a terminal window. However, when the PDF document is created, the concept of a tab is not invoked properly and columns do not align. I could use brute force as follows: 1. identify lines of text containing tabs 2. strsplit on tabs 3. count characters preceding the tab, then replace the tab with the appropriate number of spaces (e.g., if the string preceding the tab has 29 characters, add 3 spaces), then paste(..., sep="") However, I am sure a more elegant approach exists. Can anyone offer one? Dennis Dennis Fisher MD P (The P Less Than Company) Phone: 1-866-PLessThan (1-866-753-7784) Fax: 1-415-564-2220 www.PLessThan.com
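A base-R alternative (no gsubfn), assuming 8-column tab stops as above. Note the per-field padding in the gsubfn version measures each field in isolation; walking the line character by character instead keeps the running column count correct when several tabs appear on one line:

```r
# Expand tabs in a single line to spaces, assuming tab stops every `width`
# columns (the function name is mine).
expand_tabs <- function(line, width = 8) {
  out <- ""
  for (ch in strsplit(line, "")[[1]]) {
    if (ch == "\t") {
      # advance to the next tab stop from the current output column
      nsp <- width - (nchar(out) %% width)
      out <- paste(out, paste(rep(" ", nsp), collapse = ""), sep = "")
    } else {
      out <- paste(out, ch, sep = "")
    }
  }
  out
}

expand_tabs("one\ttwo")   # "two" starts at column 9
```

Applied over the lines with sapply(), this reproduces terminal-style alignment in the PDF output.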
[R] no implicit call of the print function within loops?
Dear R-users, I wonder why some functions produce output when they are called (I suppose due to an implicit call of the print function) but within a loop they do not: attach(anscombe) exp <- parse(text = "lm(x1 ~ y1)") eval(exp) Here the print() function seems to be called implicitly. If I do the same within a for-loop, it is not: for (i in c(1)){ eval(exp) } I know that I have to wrap it into a print function so it would work. But why is that so? In the eval() help I don't find any clues. As this happens with other functions as well, I would like to understand the causes and thus avoid some future mistakes. TIA, Mark
[R] keep function in stepAIC
Dear list: Does anyone use the keep function in the stepAIC command? If so, could you give an example? I am trying to use this function to choose some variables in all of the possible models. Many thanks! Xin
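A minimal sketch of the keep argument (MASS package; the model and the choice of what to keep are mine): keep is called at every step with the current fit and its AIC, and whatever it returns is collected in the result's keep component:

```r
library(MASS)

fit <- lm(mpg ~ cyl + disp + hp + wt, data = mtcars)

# keep = function(model, aic): record the coefficients and AIC of every
# model visited during the stepwise search
sel <- stepAIC(fit, trace = FALSE,
               keep = function(model, aic) list(coef = coef(model), aic = aic))

sel$keep   # what the keep function returned at each step visited
```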
[R] Resampling with index
Hallo, all, I have a question that needs your help. I just want to get a sample with the original index. For example, I have a dataset:
ind x
1 39
2 24
3 15
4 75
5 61
After resampling, I want to get a new dataset like this (with the original index):
ind x
3 15
5 61
1 39
Thank you in advance. Legendy -- View this message in context: http://www.nabble.com/Resampling-with-index-tp20919265p20919265.html Sent from the R help mailing list archive at Nabble.com.
[R] XYPLOT Help
This may be an easy question for most of you, but any prompt clues/hints/examples would be really appreciated. My dataset includes a varying number of records (by time points) for a variable by subject. I use xyplot to plot line/symbol graphs of the variable at each available time point by each subject on the same plot (see attached as an example). How do I suppress those symbols with no available time points (e.g. symbols for subject 1001, study days 5, 6, 7)? Below is a quick example of the data (columns are Subject, Study Day, % CD8 response): 1001 1 0.23 1001 2 0.25 1001 3 0.27 1001 4 0.29 1001 8 0.29 1002 1 0.23 1002 2 0.25 1002 3 0.27 1002 4 0.29 1002 5 0.23 1002 6 0.25 1002 7 0.27 1002 8 0.29 1003 1 0.23 1003 2 0.25 1003 3 0.27 1003 4 0.29 1003 5 0.23 1003 6 0.25 1003 7 0.27 1003 8 0.29 1004 1 0.23 1004 2 0.25 1004 3 0.27 1004 4 0.29 1004 8 0.29 Thanks a lot in advance! James
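A hedged lattice sketch (the column names and the reshaped data frame are mine): if a subject's absent days are simply not present as rows, rather than padded with placeholder values, xyplot draws nothing for them; stray symbols usually mean there are filler rows to drop first:

```r
library(lattice)

# subject 1001 has no rows for days 5-7; subject 1002 is complete
df <- data.frame(
  Subject = factor(rep(c(1001, 1002), c(5, 8))),
  Day     = c(1:4, 8, 1:8),
  CD8     = c(0.23, 0.25, 0.27, 0.29, 0.29,
              0.23, 0.25, 0.27, 0.29, 0.23, 0.25, 0.27, 0.29))

# if the data do contain placeholder rows, remove them before plotting:
# df <- df[!is.na(df$CD8), ]

xyplot(CD8 ~ Day, groups = Subject, data = df, type = "b", auto.key = TRUE)
```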
[R] R Graphics Device margins
Hello, I am relatively new to R and am using it to run Classification and Regression Tree analysis. My only issue at this point is that numbers are always cut off on the lower nodes. I've tried changing the margins with mai=c(0, 0.5, 0.5, 0) but this has not so far worked. Any suggestions would be appreciated. Best, John Z. Metcalfe, M.D., M.P.H. Division of Pulmonary and Critical Care Medicine University of California, San Francisco San Francisco General Hospital
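Assuming the trees come from rpart (the question does not say which package), two common fixes for clipped node labels are disabling clipping with xpd and leaving extra room around the tree:

```r
library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

par(mar = c(1, 1, 1, 1), xpd = NA)       # xpd = NA: text is not clipped at the plot region
plot(fit, uniform = TRUE, margin = 0.1)  # margin leaves space around the leaves
text(fit, use.n = TRUE, cex = 0.7)       # smaller labels are less likely to be cut off
```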
Re: [R] extract the digits of a number
An alternative that works for any base (not only 10) is the following: library(sfsmisc) digitsBase(1001, 10) Class 'basedInt'(base = 10) [1:1] [,1] [1,] 1 [2,] 0 [3,] 0 [4,] 1 -Christos -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Gustavo Carvalho Sent: Tuesday, December 09, 2008 12:49 PM To: r-help@r-project.org Subject: [R] extract the digits of a number Hello, Does anyone know how I can do this in a cleaner way? mynumber = 1001 as.numeric(unlist(strsplit(as.character(mynumber), ""))) [1] 1 0 0 1 Thanks in advance, Gustavo
Re: [R] extract the digits of a number
Try this also: library(gsubfn) strapply(as.character(mynumber), "[0-9]", simplify = as.numeric) On Tue, Dec 9, 2008 at 3:48 PM, Gustavo Carvalho [EMAIL PROTECTED] wrote: Hello, Does anyone know how I can do this in a cleaner way? mynumber = 1001 as.numeric(unlist(strsplit(as.character(mynumber), ""))) [1] 1 0 0 1 Thanks in advance, Gustavo -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O
Re: [R] extract the digits of a number
I don't know if this is cleaner or not, but here is another way: mynumber <- 1001 floor(mynumber / (10^(nchar(mynumber):1 - 1))) %% 10 [1] 1 0 0 1 mynumber <- 12345678 floor(mynumber / (10^(nchar(mynumber):1 - 1))) %% 10 [1] 1 2 3 4 5 6 7 8 mynumber <- 9753086421 floor(mynumber / (10^(nchar(mynumber):1 - 1))) %% 10 [1] 9 7 5 3 0 8 6 4 2 1 -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] 801.408.8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org] On Behalf Of Gustavo Carvalho Sent: Tuesday, December 09, 2008 10:49 AM To: r-help@r-project.org Subject: [R] extract the digits of a number Hello, Does anyone know how I can do this in a cleaner way? mynumber = 1001 as.numeric(unlist(strsplit(as.character(mynumber), ""))) [1] 1 0 0 1 Thanks in advance, Gustavo
Re: [R] Re sampling with index
Try this: x[x$ind %in% sample(x$ind, 3), ] On Tue, Dec 9, 2008 at 3:19 PM, legendy [EMAIL PROTECTED] wrote: Hallo, all, I have a question that needs your help. I just want to get a sample with the original index. For example, I have a dataset:
ind x
1 39
2 24
3 15
4 75
5 61
After resampling, I want to get a new dataset like this (with the original index):
ind x
3 15
5 61
1 39
Thank you in advance. Legendy -- View this message in context: http://www.nabble.com/Resampling-with-index-tp20919265p20919265.html Sent from the R help mailing list archive at Nabble.com. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O
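Note that sampling index values as in the reply draws without replacement and cannot return duplicates. For a bootstrap-style resample, one can sample row positions with replacement; the ind column (and the row names) then carry the original index. The data frame below reconstructs the poster's example:

```r
x <- data.frame(ind = 1:5, x = c(39, 24, 15, 75, 61))

set.seed(1)                                    # for a reproducible draw
res <- x[sample(nrow(x), 3, replace = TRUE), ]
res   # 'ind' preserves the original indices; repeats are allowed
```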
[R] Can elastic net do binary classification?
Hi, List,

The elastic net package (by Hastie and Zou at Stanford) is used for regularization and variable selection, and it can also do regression. I am wondering if it can perform binary classification (discrete outcome). Does anybody have similar experience?

Many thanks,

-Jack
[R] package plm
Dear R help,

I use the package plm and the function plm() to analyse panel data and estimate a dynamic model. Can I estimate a model without an intercept?

Thanks,

Andrea Ferroni
Re: [R] no implicit call of the print function within loops?
The general rule is that when you type something at the command line and that something returns a value, if you do not tell it what to do with the value, and the value has not been made invisible, then the value is printed. If you do:

tmp <- eval(exp)

you will not see the output printed, since you tell it what to do with the return value from eval; but without the assignment, it is not told what to do and so does the print. This is not unique to the eval function, but true for anything that returns a visible value. Compare:

3 + 4
x <- 3 + 4
(x <- 3 + 4)

The implicit print does not happen inside of loops, batch processing, inside of functions, and probably a few other places.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
801.408.8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Mark Heckmann
Sent: Tuesday, December 09, 2008 10:44 AM
To: r-help@r-project.org
Subject: [R] no implicit call of the print function within loops?

Dear R-users,

I wonder why some functions produce output when they are called (I suppose due to an implicit call of the print function) but within a loop they do not:

attach(anscombe)
exp <- parse(text = "lm(x1 ~ y1)")
eval(exp)

Here the print() function seems to be called implicitly. If I do the same within a for-loop, it is not:

for (i in c(1)) {
  eval(exp)
}

I know that I have to wrap it in a print function so it would work. But why is that so? In the eval() help I don't find any clues. As this happens with other functions as well, I would like to understand the causes and thus avoid some future mistakes.

TIA, Mark
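The visibility rule Greg describes can be checked directly with base R's withVisible(); a minimal sketch:

```r
# withVisible() reports both the value and whether top-level auto-printing
# would show it.
withVisible(3 + 4)        # value 7, visible TRUE  -> auto-printed at top level
withVisible(x <- 3 + 4)   # assignments return invisibly -> visible FALSE

# Inside a loop the auto-print step is skipped, so print explicitly:
for (i in 1:2) print(i * 10)
```

This is why `eval(exp)` shows output at the prompt but not inside `for`: the printing is done by the top-level read-eval-print loop, not by eval() itself.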
Re: [R] package plm
If nobody answered your question before, you probably have to formulate it differently, trying to give some context and possibly providing a small example, as requested by the posting guide. Resending your question unchanged multiple times is not going to help. Questions about contributed packages should be directed to the package maintainer.

Best,
Giovanni

Date: Tue, 09 Dec 2008 19:04:36 +0100
From: Andrea Ferroni [EMAIL PROTECTED]
Sender: [EMAIL PROTECTED]
Precedence: list

Dear R help, I use the package plm and the function plm() to analyse panel data and estimate a dynamic model. Can I estimate a model without an intercept? Thanks, Andrea Ferroni

--
Giovanni Petris [EMAIL PROTECTED]
Associate Professor
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/
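Whether plm() honors it is a question for the package maintainer, but in base-R modelling functions the intercept is dropped with `- 1` (or `0 +`) in the formula. A sketch with lm() on made-up data (my own example, not from the thread):

```r
# Simulated data: y roughly proportional to x, no true intercept
set.seed(42)
d <- data.frame(x = 1:10)
d$y <- 2 * d$x + rnorm(10)

coef(lm(y ~ x, data = d))      # fits "(Intercept)" and "x"
coef(lm(y ~ x - 1, data = d))  # "- 1" drops the intercept: slope only
```

Many formula-based modelling functions in R share this convention, which is why it is the first thing to try before asking the maintainer.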
Re: [R] R Graphics Device margins
Without the reproducible example asked for it is hard to tell, but ?text.rpart does say:

"Graphical parameters may also be supplied as arguments to this function (see 'par'). As labels often extend outside the plot region it can be helpful to specify 'xpd = TRUE'."

Might that be the issue (rather than your subject line)?

On Tue, 9 Dec 2008, Metcalfe, John wrote:

Hello, I am relatively new to R and am using it to run Classification and Regression Tree analysis. My only issue at this point is that numbers are always cut off on the lower nodes. I've tried changing the margins with mai = c(0, 0.5, 0.5, 0), but this has not so far worked. Any suggestions would be appreciated.

Best,
John Z. Metcalfe, M.D., M.P.H.
Division of Pulmonary and Critical Care Medicine
University of California, San Francisco
San Francisco General Hospital

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK, Fax: +44 1865 272595
Re: [R] extract the digits of a number
Actually what you have is not so bad. Here is a variation which has one fewer nested (...):

> as.numeric(strsplit(as.character(mynumber), "")[[1]])
[1] 1 0 0 1

# or try strapply in gsubfn:
> library(gsubfn)
> mynumber <- 1001
> strapply(as.character(mynumber), ".", as.numeric)[[1]]
[1] 1 0 0 1

On Tue, Dec 9, 2008 at 12:48 PM, Gustavo Carvalho [EMAIL PROTECTED] wrote:

Hello, does anyone know how I can do this in a cleaner way?

> mynumber = 1001
> as.numeric(unlist(strsplit(as.character(mynumber), "")))
[1] 1 0 0 1

Thanks in advance, Gustavo
Re: [R] Need help optimizing/vectorizing nested loops
Charles C. Berry writes:

On Tue, 9 Dec 2008, tyler wrote: I'm analyzing a large number of large simulation datasets, and I've isolated one of the bottlenecks. Any help in speeding it up would be appreciated.

Cast the neighborhoods as an indicator matrix, then use matrix multiplications:

system.time({
  mn <- with(dat, outer(1:25, 1:25, function(i, j)
    abs(X[i] - X[j]) < 2 & abs(Y[i] - Y[j]) < 2 & i != j))

Wow, that's fantastic! On my biggest matrix, your code took 3.854 seconds, compared to 207.769 seconds with my original function. I might be able to use `outer` to solve some other slowdowns. Thanks, Tyler

HTH, Chuck

`dat` is a data frame of samples from a regular grid. The first two columns are the spatial coordinates of the samples; the remaining 20 columns are the abundances of species in each cell. I need to calculate the species richness in adjacent cells for each cell in the sample. For example, if I have nine cells in my data frame (X = 1:3, Y = 1:3):

a b c
d e f
g h i

I need to calculate the neighbour-richness for each cell; for a, this is the richness of cells b, d and e combined. The neighbour richness of cell e would be the combined richness of all the other eight cells. The following code does what I want, but it's slow. The sample dataset 'dat', below, represents a 5x5 grid, 25 samples. It takes about 1.5 seconds on my computer. The largest samples I am working with have a 51 x 51 grid (2601 cells) and take 4.5 minutes. This is manageable, but since I have potentially hundreds of these analyses to run, trimming that down would be very helpful. After loading the function and the data, the call

system.time(tmp <- time.test(dat))

will run the code. Note that I've excised this from a larger, more general function, after determining that for large datasets this section is responsible for a slowdown from 10-12 seconds to ca. 250 seconds.
Thanks for your patience,

Tyler

time.test <- function(dat) {
  cen <- dat
  grps <- 5
  n.rich <- numeric(grps^2)
  n.ind <- 1
  for (i in 1:grps) for (j in 1:grps) {
    n.cen <- numeric(ncol(cen) - 2)
    neighbours <- expand.grid((j-1):(j+1), (i-1):(i+1))
    neighbours <- neighbours[-5, ]
    neighbours <- neighbours[which(neighbours[, 1] %in% 1:grps &
                                   neighbours[, 2] %in% 1:grps), ]
    for (k in 1:nrow(neighbours))
      n.cen <- n.cen + cen[cen$X == neighbours[k, 1] &
                           cen$Y == neighbours[k, 2], -c(1:2)]
    n.rich[n.ind] <- sum(as.logical(n.cen))
    n.ind <- n.ind + 1
  }
  return(n.rich)
}

`dat` <- structure(list(
X = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
Y = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5),
V1 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 45L, 131L, 0L, 0L, 34L, 481L, 1744L),
V2 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 88L, 0L, 70L, 101L, 13L, 634L, 0L, 0L, 71L, 640L, 1636L),
V3 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 49L, 3L, 113L, 1L, 44L, 167L, 336L, 933L, 0L, 14L, 388L, 1180L, 1709L),
V4 = c(0L, 0L, 0L, 0L, 0L, 0L, 3L, 12L, 0L, 0L, 2L, 1L, 36L, 45L, 208L, 7L, 221L, 213L, 371L, 1440L, 26L, 211L, 389L, 1382L, 1614L),
V5 = c(96L, 7L, 0L, 0L, 0L, 10L, 17L, 0L, 5L, 0L, 0L, 11L, 151L, 127L, 160L, 27L, 388L, 439L, 1117L, 1571L, 81L, 598L, 1107L, 1402L, 891L),
V6 = c(16L, 30L, 13L, 0L, 0L, 10L, 195L, 60L, 29L, 29L, 1L, 107L, 698L, 596L, 655L, 227L, 287L, 677L, 1477L, 1336L, 425L, 873L, 961L, 1360L, 1175L),
V7 = c(249L, 101L, 69L, 0L, 18L, 186L, 331L, 291L, 259L, 248L, 336L, 404L, 642L, 632L, 775L, 455L, 801L, 697L, 1063L, 978L, 626L, 686L, 1204L, 1138L, 627L),
V8 = c(300L, 163L, 65L, 145L, 377L, 257L, 690L, 655L, 420L, 288L, 346L, 461L, 1276L, 897L, 633L, 812L, 1018L, 1337L, 1295L, 1163L, 550L, 1104L, 768L, 933L, 433L),
V9 = c(555L, 478L, 374L, 349L, 357L, 360L, 905L, 954L, 552L, 438L, 703L, 984L, 1616L, 1732L, 1234L, 1213L, 1518L, 1746L, 1191L, 967L, 1394L, 1722L, 1706L, 610L, 169L),
V10 = c(1527L, 1019L, 926L, 401L, 830L, 833L, 931L, 816L, 1126L, 1232L, 1067L, 1169L, 1270L, 1277L, 1145L, 1159L, 1072L, 1534L, 997L, 391L, 1328L, 1414L, 1037L, 444L, 1L), V11 = c(1468L, 1329L, 1013L, 603L, 1096L, 1237L, 1488L, 1189L, 1064L, 1303L, 1258L, 1479L, 1421L, 1365L, 1101L, 1415L, 1145L, 1329L, 1325L, 236L, 1379L, 1199L, 729L, 328L, 0L), V12 = c(983L, 1459L, 791L, 898L, 911L, 1215L, 1528L, 960L, 1172L, 1286L, 1358L, 722L, 857L, 1478L, 1452L, 1502L, 1013L, 745L, 455L, 149L, 1686L, 917L, 1013L, 84L, 0L), V13 = c(1326L, 1336L, 1110L, 1737L, 1062L, 1578L, 1382L, 1537L, 1366L, 1308L, 1301L, 1357L, 746L, 622L, 934L, 1132L, 954L, 460L,
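The indicator-matrix idea in Chuck's reply can be illustrated on a small grid; this is my own minimal reconstruction, not the code from the thread:

```r
# 3 x 3 grid of cells; a cell's neighbours are the (up to 8) adjacent cells
g <- expand.grid(X = 1:3, Y = 1:3)

# n x n logical matrix: TRUE where cells i and j are adjacent and distinct
adj <- with(g, outer(1:9, 1:9, function(i, j)
  abs(X[i] - X[j]) < 2 & abs(Y[i] - Y[j]) < 2 & i != j))

rowSums(adj)  # corner cells have 3 neighbours, edge cells 5, the centre 8

# With an abundance matrix A (cells x species), neighbour richness would
# then be rowSums((adj %*% A) > 0) -- one matrix multiply replaces the loops.
```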
[R] glm error message when using family Gamma(link=inverse)
R 2.5, Windows XP

I am getting an error from glm() that I don't understand. Any help or suggestions would be appreciated. N.B. 1 <= AAMTCAREJ <= 327900

> summary(data$AAMTCAREJ)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    1.0   404.3  1430.0  6567.0  5457.0 327900.0

> fitglm <- glm(AAMTCAREJ ~ sexcat + H_AGE + SmokeCat + InsuranceCat + MedicadeCat +
+     incomegrp + racecat + MARSTATJS + EdCat + bmiNewjohn,
+     data = data, family = Gamma(link = "inverse"))
Error: no valid set of coefficients has been found: please supply starting values
In addition: Warning message:
NaNs produced in: log(x)

Thanks,
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}}
Re: [R] Replacing tabs with appropriate number of spaces
That's a clever use of gsubfn. Here is a very minor simplification using the same code but representing it in formula notation with sprintf:

gsubfn('([^\t]+)\t', ~ sprintf("%s%*s", x, 8 - nchar(x) %% 8, ""), tmp)

On Tue, Dec 9, 2008 at 12:51 PM, Greg Snow [EMAIL PROTECTED] wrote:

This is basically your approach, but automated a bit more than you describe:

library(gsubfn)
tmp <- strsplit('one\ttwo\nthree\tfour\n12345678\t910\na\tbc\tdef\tghi\n', '\n')[[1]]
tmp2 <- gsubfn('([^\t]+)\t', function(x) {
  ln <- nchar(x)
  nsp <- 8 - (ln %% 8)
  sp <- paste(rep(' ', nsp), collapse = '')
  paste(x, sp, sep = '')
}, tmp)
tmp2
cat(tmp2, sep = '\n')

This is based on the assumption of tab stops every 8 columns; change the two 8's above if you want something different.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
801.408.8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Dennis Fisher
Sent: Tuesday, December 09, 2008 9:40 AM
To: [EMAIL PROTECTED]
Subject: [R] Replacing tabs with appropriate number of spaces

Colleagues,

Platform: OS X (but the issue applies to all platforms)
Version: 2.8.0

I have a mixture of text and data that I am outputting via R to a PDF document (using a fixed-width font). The text contains tabs that align columns properly with a fixed-width font in a terminal window. However, when the PDF document is created, the concept of a tab is not invoked properly and columns do not align. I could use brute force as follows:
1. identify lines of text containing tabs
2. strsplit on tabs
3. count characters preceding the tab, then replace the tab with the appropriate number of spaces (e.g., if the string preceding the tab has 29 characters, add 3 spaces), then paste(..., sep = "")
However, I am sure a more elegant approach exists. Can anyone offer one?
Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-415-564-2220
www.PLessThan.com
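For readers without gsubfn, the same tab expansion can be sketched in base R. The helper below is my own (the name `expand_tabs` is not from the thread); it walks the line character by character and tracks the absolute output column, assuming 8-column tab stops as in Greg's version:

```r
# Expand tabs to spaces, assuming tab stops every `width` columns.
expand_tabs <- function(line, width = 8) {
  out <- ""
  for (ch in strsplit(line, "")[[1]]) {
    if (ch == "\t") {
      nsp <- width - nchar(out) %% width   # distance to the next tab stop
      out <- paste0(out, strrep(" ", nsp))
    } else {
      out <- paste0(out, ch)
    }
  }
  out
}

expand_tabs("a\tbc\tdef")  # columns: "a" at 1, "bc" at 9, "def" at 17
```

Applied to each element of a character vector (e.g. with vapply), this produces lines that align in a fixed-width PDF font the same way they did in the terminal.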
Re: [R] French IRC channel and mailing list ?
A few personal thoughts on this: I recently joined a newly created R user group on Google (http://groups.google.co.uk/group/gur-ugr) that started with a similar impulse. In my personal opinion, I see little overall benefit from such an approach. For one thing, a major strength of the R mailing list is the large number of very knowledgeable persons. A mailing list with only a few tens of users will never provide as good support as you can find in the main list. The advice you get could very easily be biased or even plain wrong, without much of a peer review, so to say. Another thing to consider is whether people who can help and understand French actually want to answer a question in French, thereby limiting their advice to a much narrower audience (people facing a similar problem later may be unable to get help from an answer in this language). Perhaps even more likely is the opposite situation, where the question has been solved many times in the main mailing list: it can be quite tempting to just send the link and say "well, here is the solution, let me know what you don't understand" rather than doing a translator's job. Solving an R problem and translating somebody's text have very unequal appeal. I don't know what the exact policy is for this mailing list (a search for "english" in the posting guide didn't return anything). Perhaps it is OK to send the occasional question in French, or Franglais. I know I don't mind seeing a few of these and answering them if I can, while I would not join a new list for the reasons stated above. This would have the advantage of keeping the knowledge together. Maybe a special tag could be used so that people not interested can filter out all questions posted in non-English languages.

Best wishes,

Baptiste

On 8 Dec 2008, at 21:10, Julien Barnier wrote:

Dear all, for some time now, R has been becoming more and more popular in more and more countries.
France is certainly one of them, but French people being French, one of the obstacles they might face is the lack of documentation and support in their native language. To offer this support in French, an IRC channel (#Rfr on irc.freenode.net) was created some months ago, beside the official English channel #R. We (me (~juba) and Pierre-Yves Chibon (~pingou)) have to recognize that it has very low activity right now, but that is related to our lack of promotion of it. Another tool that could be useful to bring support in French (and other languages) would be dedicated mailing lists. I've searched the archives to see if similar requests have already been made, but couldn't manage to find one. So I would like to ask the question here: has there been any thought on the creation of dedicated R-help mailing lists for major languages such as Spanish, French, Chinese and others? We have given it some thought, and we actually think it would be useful for users to be able to receive some help (especially when their English is not really fluent).

Thanks in advance for your answers.

Sincerely,

Pierre-Yves Chibon
Julien Barnier

_____________________________
Baptiste Auguié
School of Physics
University of Exeter
Stocker Road, Exeter, Devon, EX4 4QL, UK
Phone: +44 1392 264187
http://newton.ex.ac.uk/research/emag
Re: [R] glm error message when using family Gamma(link=inverse)
On Tue, 9 Dec 2008, John Sorkin wrote:

R 2.5

Please
1) do as the posting guide asks, and quote version numbers accurately.
2) do as the posting guide asks, and update *before* posting. That's too old a version to support here.

Windows XP. I am getting an error from glm() that I don't understand. Any help or suggestions would be appreciated. N.B. 1 <= AAMTCAREJ <= 327900

> summary(data$AAMTCAREJ)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    1.0   404.3  1430.0  6567.0  5457.0 327900.0

> fitglm <- glm(AAMTCAREJ ~ sexcat + H_AGE + SmokeCat + InsuranceCat + MedicadeCat +
+     incomegrp + racecat + MARSTATJS + EdCat + bmiNewjohn,
+     data = data, family = Gamma(link = "inverse"))
Error: no valid set of coefficients has been found: please supply starting values
In addition: Warning message:
NaNs produced in: log(x)

That model is not necessarily valid: the linear predictor has to be strictly positive. If you really know why it is applicable you will be able to give starting values (e.g. maybe all the columns of the design matrix are positive, in which case you will be able to find suitable positive initial coefficients).

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement: This email message, including any attachments, is for ...{{dropped:19}}
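The `start=` mechanism Prof. Ripley refers to can be shown on a self-contained simulated example (entirely my own, not the poster's model). With an inverse link the linear predictor eta must stay positive at the starting values, so the start vector is chosen so that eta = 1 for every observation:

```r
set.seed(1)
n   <- 200
x   <- runif(n)
eta <- 1 + 0.5 * x              # true linear predictor, strictly positive
mu  <- 1 / eta                  # inverse link: mean = 1/eta
y   <- rgamma(n, shape = 5, rate = 5 / mu)  # Gamma responses with mean mu

# start = c(intercept, slope); c(1, 0) gives eta = 1 > 0 everywhere
fit <- glm(y ~ x, family = Gamma(link = "inverse"), start = c(1, 0))
coef(fit)
```

On real data the point is to pick starting coefficients for which X %*% start is positive for every row; if no such vector is easy to find, that is itself a hint the inverse-link Gamma model may not suit the data.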
Re: [R] difftime
I guess it's the same problem as this (run after your code):

> as.POSIXct(a, tz = "")
[1] NA
> as.POSIXct(b, tz = "")
[1] NA
> difftime(b, a, units = "days")
Time difference of NA days

If you explicitly specify the tz as "GMT" then it works as expected:

> as.POSIXct(a, tz = "GMT")
[1] "1999-10-08 07:00:00 GMT"
> as.POSIXct(b, tz = "GMT")
[1] "1999-10-27 18:00:00 GMT"
> difftime(b, a, tz = "GMT", units = "days")
Time difference of 19.45833 days

Since the problem is confusion between the "" and "GMT" time zones, the easiest thing to do is to simply set the default time zone for the remainder of the session to GMT:

Sys.setenv(TZ = "GMT")

in which case tz = "" and tz = "GMT" are the same, and you should get the expected answer either way. Alternatively, use chron, which does not have time zones in the first place, so the problem cannot arise. See R News 4/1.

The above was done under:

> R.version.string # Vista
[1] "R version 2.8.0 Patched (2008-11-10 r46884)"

On Tue, Dec 9, 2008 at 12:33 PM, eric lee [EMAIL PROTECTED] wrote:

Hi. I'm trying to take the difference in days between two times. Can you point out what's wrong, or suggest a different function? The following code works fine:

a <- strptime("1911100807", format = "%Y%m%d%H", tz = "GMT")
b <- strptime("1911102718", format = "%Y%m%d%H", tz = "GMT")
x <- difftime(b, a, units = "days")
x

But when I change the year, the following code returns NA for the time between a and b. Thanks.

a <- strptime("1999100807", format = "%Y%m%d%H", tz = "GMT")
b <- strptime("1999102718", format = "%Y%m%d%H", tz = "GMT")
x <- difftime(b, a, units = "days")
x
Re: [R] How to get Greenhouse-Geisser epsilons from anova?
Dear Nils,

-----Original Message-----
From: Skotara [mailto:[EMAIL PROTECTED]]
Sent: December-09-08 12:21 PM
To: John Fox
Cc: 'Peter Dalgaard'; r-help@r-project.org
Subject: Re: [R] How to get Greenhouse-Geisser epsilons from anova?

Dear John and Peter, thank you both very much for your help! Everything works fine now! John, Anova also works very well. Thank you very much! However, if I had more than 2 levels for the between factor, the same thing as mentioned occurred. The degrees of freedom showed that Anova calculated it as if all subjects came from the same group; for example, for main effect A the dfs are 1 and 35.

That is odd -- the example that I sent had a factor with 3 levels, producing 2 df; here's a simplified version with a single between-subjects factor (treatment):

> OBrienKaiser$treatment
 [1] control control control control control A       A       A       A
[10] B       B       B       B       B       B       B
attr(,"contrasts")
        [,1] [,2]
control   -2    0
A          1   -1
B          1    1
Levels: control A B

> mod.ok <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5,
+                    post.1, post.2, post.3, post.4, post.5,
+                    fup.1, fup.2, fup.3, fup.4, fup.5) ~ treatment,
+              data = OBrienKaiser)
> av.ok <- Anova(mod.ok, idata = idata, idesign = ~phase*hour)
> summary(av.ok, multivariate = FALSE)

Univariate Type II Repeated-Measures ANOVA Assuming Sphericity

                         SS num Df Error SS den Df       F    Pr(>F)
treatment            186.75      2   416.58     13  2.9139  0.090041 .
phase                167.50      2    92.17     26 23.6257 1.419e-06 ***
treatment:phase       77.00      4    92.17     26  5.4304  0.002578 **
hour                 106.29      4    72.81     52 18.9769 1.128e-09 ***
treatment:hour         0.89      8    72.81     52  0.0798  0.999597
phase:hour            11.08      8   116.96    104  1.2319  0.287947
treatment:phase:hour   5.96     16   116.96    104  0.3312  0.992574
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Mauchly Tests for Sphericity

                     Test statistic p-value
phase                       0.85151 0.38119
treatment:phase             0.85151 0.38119
hour                        0.09859 0.00194
treatment:hour              0.09859 0.00194
phase:hour                  0.00837 0.09038
treatment:phase:hour        0.00837 0.09038

Greenhouse-Geisser and Huynh-Feldt Corrections for Departure from Sphericity

                      GG eps Pr(>F[GG])
phase                0.87071  5.665e-06 ***
treatment:phase      0.87071   0.004301 **
hour                 0.48867  1.016e-05 ***
treatment:hour       0.48867   0.986835
phase:hour           0.50283   0.308683
treatment:phase:hour 0.50283   0.950677
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

                      HF eps Pr(>F[HF])
phase                0.99390  1.515e-06 ***
treatment:phase      0.99390   0.002641 **
hour                 0.57413  2.190e-06 ***
treatment:hour       0.57413   0.992771
phase:hour           0.75630   0.298926
treatment:phase:hour 0.75630   0.981607
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

I didn't save your earlier postings, so I can't check your model and data. If you like, please feel free to send them to me privately.

Regards,
John

Since I can get those values using anova, that causes no problem. I saw that x$G to get the Greenhouse-Geisser epsilon works for:

x <- anova(mlmfitD, X = ~C + B, M = ~A + C + B, test = "Spherical")

but y$G does not work for:

y <- anova(mlmfit, mlmfit0, X = ~C + B, M = ~A + C + B, idata = dd, test = "Spherical")

Finally, the Greenhouse-Geisser epsilons are identical using both methods and match the SPSS output. The Huynh-Feldt values are not the same as those of SPSS. I will use GG instead.
Re: [R] opening multiple connections at once
Best Regards,
Subrat Rout
CNSI, Gaithers Dr.
(240) 399 2472
[EMAIL PROTECTED]
[R] RCurl::postForm() -- how does one determine what the names are of each form element in an online html form?
Dear R-Help,

I am looking into using the Open Calais web service (http://sws.clearforest.com/calaisViewer/) for text mining purposes. I would like to use R to post text into one of the forms on their website. In package RCurl, there is a function called postForm(). This sounds like it would do the job. Unfortunately the URL used in the example is no longer valid (I have emailed the maintainer about this).

Question: How does one determine the names of the form elements to use? Is there an R function which will print out the names of these elements, perhaps? [I am still learning, so please forgive me if I used the wrong terminology.]

### Example from ?postForm ###
library(RCurl)
# Now looking at POST method for forms.
postForm("http://www.speakeasy.org/~cgires/perl_form.cgi",
         "some_text" = "Duncan",
         "choice" = "Ho",
         "radbut" = "eep",
         "box" = "box1, box2")
### Example ends ###

So in the above code, I believe the form elements are: some_text, choice, radbut and box. But how does one find out the names of these form elements if one is not given them previously?

I hope that the above made sense, and thank you kindly in advance for any help.

Tony Breyal.
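One way to answer the "what are the form elements called" question is to fetch the page's HTML and read the name= attributes of its input/select/textarea tags. A real HTML parser (for example the XML package's htmlParse with an XPath query) is the robust route; the regex sketch below is my own, works only on a literal snippet, and just illustrates the idea:

```r
# A literal HTML fragment standing in for a fetched form page
html <- '<form action="/submit">
  <input type="text" name="some_text">
  <select name="choice"><option>Ho</option></select>
  <input type="radio" name="radbut" value="eep">
</form>'

# Crude extraction of name="..." attributes -- fine for a quick look,
# not a substitute for real HTML parsing
hits <- regmatches(html, gregexpr('name="[^"]*"', html))[[1]]
gsub('name="|"', '', hits)   # "some_text" "choice" "radbut"
```

These extracted names are exactly the argument names one would then pass to postForm().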
[R] TIFF capability in R 2.8.0
Hi, I installed R 2.8.0 (binary dmg) for Mac OS X 10.5.5 (MacBook Pro). It seems to have broken TIFF output. From the R console, capabilities() reports TIFF is not installed, and when I run CairoTIFF(), it throws an error -- TIFF not compiled. CairoTIFF() worked in R 2.7.2. Is this a bug, or am I doing something wrong? Thanks, Rhead
Re: [R] chron - when seconds data not included
Works like a charm! Thanks.

Gabor Grothendieck wrote:

On Mon, Dec 8, 2008 at 11:52 PM, Tubin [EMAIL PROTECTED] wrote:

I have date and time data which looks like this:

      [,1]     [,2]
 [1,] "7/1/08" "9:19"
 [2,] "7/1/08" "9:58"
 [3,] "7/7/08" "15:47"
 [4,] "7/8/08" "10:03"
 [5,] "7/8/08" "10:32"
 [6,] "7/8/08" "15:22"
 [7,] "7/8/08" "15:27"
 [8,] "7/8/08" "15:40"
 [9,] "7/9/08" "10:25"
[10,] "7/9/08" "10:27"

I would like to use chron on it, so that I can calculate intervals in time. I can't seem to get chron to accept a time format that doesn't include seconds. Do I have to go through and append ":00" on every line in order to use chron?

That's one way:

m <- matrix(c("7/1/08", "9:19",
              "7/1/08", "9:58",
              "7/7/08", "15:47",
              "7/8/08", "10:03",
              "7/8/08", "10:32",
              "7/8/08", "15:22",
              "7/8/08", "15:27",
              "7/8/08", "15:40",
              "7/9/08", "10:25",
              "7/9/08", "10:27"), nc = 2, byrow = TRUE)
chron(m[, 1], paste(m[, 2], "00", sep = ":"))

# another is to use as.chron
as.chron(apply(m, 1, paste, collapse = " "), "%m/%d/%y %H:%M")

--
View this message in context: http://www.nabble.com/chron---when-seconds-data-not-included-tp20908803p20922004.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] R and Scheme
Luke Tierney wrote:

On Tue, 9 Dec 2008, Wacek Kusnierczyk wrote: Stavros Macrakis wrote: R does not have continuations or call-with-current-continuation or other mechanisms for implementing coroutines, general iterators, and the like. There is callCC, for example, which however seems kind of obsolete.

There is nothing obsolete about it. It supports only downward, or dynamic-extent, continuations and so is not useful (nor intended) for the things Stavros mentions. It is useful for escaping from deeply nested function calls, for example recursive examination of tree structures -- that is why it exists. At some point upward (at least one-shot) continuations may be added as well, but probably not soon.

luke

ok, thanks for the explanation.

vQ
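The downward escape that base R's callCC provides can be shown in a few lines; this helper is my own illustration, not code from the thread:

```r
# Return the first element satisfying a predicate, escaping the loop
# as soon as it is found; k is the escape continuation.
first_match <- function(xs, pred) {
  callCC(function(k) {
    for (x in xs) if (pred(x)) k(x)  # jump straight out of callCC with x
    NULL                             # fall-through: nothing matched
  })
}

first_match(c(3, 8, 12, 5), function(x) x > 10)  # 12
```

The escape works from arbitrarily deep nesting (e.g. inside a recursive tree walk), which is the use case Luke describes; what it cannot do is resume later, since the continuation is valid only while callCC is still on the stack.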
[R] How can I draw bars
I need to make a graphic to show problems on different parts of chromosomes (think of a graphic showing the number of frayed threads as colors along different parts of a worn-out rope). I want to draw bars going from left to right across a page and color different parts of the bars in different shades. Each graphic will need to have several bars of different lengths, corresponding to the different lengths of the chromosomes. Is there a package around that can help me draw custom bars of different colors? I am a hardcore SAS guy who is just learning R, so I am mostly clueless.
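No special package is required for this kind of figure; base graphics rect() can draw horizontal bars with coloured segments. The sketch below is entirely my own construction (toy chromosome lengths, random segment breakpoints and colours), just to show the mechanics:

```r
# Toy data: three chromosomes of different lengths, each split into
# segments whose colour could encode, e.g., problem severity
chr_len <- c(100, 80, 60)
cols    <- c("grey80", "orange", "red")

# Empty plotting region sized to the longest chromosome
plot(NULL, xlim = c(0, max(chr_len)),
     ylim = c(0.5, length(chr_len) + 0.5),
     xlab = "Position", ylab = "", yaxt = "n")
axis(2, at = seq_along(chr_len),
     labels = paste("chr", seq_along(chr_len)), las = 1)

set.seed(2)
for (i in seq_along(chr_len)) {
  # random breakpoints along this chromosome; real data would supply these
  brk <- sort(c(0, runif(4, 0, chr_len[i]), chr_len[i]))
  rect(head(brk, -1), i - 0.3, tail(brk, -1), i + 0.3,
       col = sample(cols, length(brk) - 1, replace = TRUE), border = NA)
  rect(0, i - 0.3, chr_len[i], i + 0.3)  # outline of the whole bar
}
```

Replacing the random breakpoints and sampled colours with real segment coordinates and a severity-to-colour mapping gives the frayed-rope picture described.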
Re: [R] glm error message when using family Gamma(link=inverse)
It appears that I have sinned and for this I surely will wear sackcloth and grovel until my period of penitence is fulfilled. I updated R and my problem remains. Please see the code snippet (as per posting guidelines) below.

R 2.8.0, Windows XP

summary(data$AAMTCAREJ)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
     1.0    404.3   1430.0   6567.0   5457.0 327900.0

fitglm <- glm(AAMTCAREJ ~ sexcat + H_AGE + SmokeCat + InsuranceCat + MedicadeCat +
              incomegrp + racecat + MARSTATJS + EdCat + bmiNewjohn,
              data = data, family = Gamma(link = "inverse"))
Error: no valid set of coefficients has been found: please supply starting values
In addition: Warning message:
In log(ifelse(y == 0, 1, y/mu)) : NaNs produced

John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing)

Prof Brian Ripley [EMAIL PROTECTED] 12/9/2008 2:11 PM

On Tue, 9 Dec 2008, John Sorkin wrote: R 2.5

Please 1) do as the posting guide asks, and quote version numbers accurately. 2) do as the posting guide asks, and update *before* posting. That's too old a version to support here.

Windows XP. I am getting an error from glm() that I don't understand. Any help or suggestions would be appreciated. N.B. 1 <= AAMTCAREJ <= 327900

summary(data$AAMTCAREJ)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
     1.0    404.3   1430.0   6567.0   5457.0 327900.0

fitglm <- glm(AAMTCAREJ ~ sexcat + H_AGE + SmokeCat + InsuranceCat + MedicadeCat +
              incomegrp + racecat + MARSTATJS + EdCat + bmiNewjohn,
              data = data, family = Gamma(link = "inverse"))
Error: no valid set of coefficients has been found: please supply starting values
In addition: Warning message: NaNs produced in: log(x)

That model is not necessarily valid: the linear predictor has to be strictly positive. If you really know why it is applicable you will be able to give starting values (e.g. maybe all the columns of the design matrix are positive, in which case you will be able to find suitable positive initial coefficients).

Thanks, John

John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for ...{{dropped:16}}
Re: [R] Logical inconsistency
Patrick Connolly wrote:

On Mon, 08-Dec-2008 at 02:05AM +0800, Berwin A Turlach wrote:
| G'day Wacek,
|
| On Sat, 06 Dec 2008 10:49:24 +0100
| Wacek Kusnierczyk [EMAIL PROTECTED] wrote:
|
| [] | there is, in principle, no problem in having a high-level language | perform the computation in a logically consistent way.
|
| Is this now supposed to be a Radio Eriwan joke? As another saying | goes: in theory there is no difference between theory and practice, | in practice there is.
|
| no joke, sorry to disappoint you.
|
| Apparently it is, you seem to be a comedian without knowing it. :)

I think this guy's a riot! Self-effacing humour is not easy to do, but he's really good at it. That wonderful phrase a bit further back in this thread where he referred to his 'fiercely truculent' whining is one deserving preservation.

as an afterthought, patrick: i have already promised to keep my tongue behind my teeth, but I'd like it to be a fair deal -- please try not to provoke. ain't bovvered, but maybe you're getting unnecessarily personal, and it's really telling more about you than about me. i'd suggest that you actually read the funny quote at the bottom of your own post (and kept in the foot of this post).

It's a shame the brilliance is lost somewhat by the use of a keyboard that has no Shift key. I've found a lot of these wonderful musings too much effort to read. I'm not familiar with exotic keyboard layouts, but perhaps the Norwegian one uses Shift for something mere non-Nordics wouldn't understand. Pity that.

oncE you'rE sO couRageous to strIke at tHis leveL, YOU mAy wish to con.ta.ct the developeRs of the r base package. maybe we could avoid having to navigate between design features such as fun and FUN, NA and is.na, colMeans and colnames, Map and lapply, ifelse and tryCatch, and other sorts of typographical adhockery. can you find brilliance in this? is using these not too much effort for you?
vQ

--
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
 ___        Patrick Connolly
{~._.~}     Great minds discuss ideas
_( Y )_     Middle minds discuss events
(:_~*~_:)   Small minds discuss people
 (_)-(_)    . Anon
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
Re: [R] Can elastic net do binary classification?
There are at least two options. There is the sparseLDA package, which was written by the author of Sparse Discriminant Analysis. The paper can be found at http://www-stat.stanford.edu/~hastie/Papers/sda_line.pdf The initial version of the package is here: http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=5672 I'm working on another version that is more R-centric (e.g. object-oriented, has predict methods, etc.). I think that will be on CRAN soon.

Another option is the sparse PLS model in the spls package, which also uses methods similar to the elastic net. I've written functions to extend this to classification in the same manner as the plsda function in the caret package. I've submitted the code to the package maintainers, but I can send it to anyone who emails me off-list.

Max

On Tue, Dec 9, 2008 at 1:09 PM, Jack Luo [EMAIL PROTECTED] wrote:

Hi, List The elastic net package (by Hastie and Zou at Stanford) is used to do regularization and variable selection; it can also do regression. I am wondering if it can perform binary classification (discrete outcome). Anybody having similar experience? Many thanks, -Jack

-- Max
[R] Significance of slopes
Hello R community, I have a question regarding correlation and regression analysis. I have two variables, x and y. Both have a standard deviation of 1; thus, correlation and slope from the linear regression (which also must have an intercept of zero) are equal. I want to probe two particular questions: 1) Is the slope significantly different from zero? This should be easy with the lm function, as the p-value should reflect exactly that question. If I am wrong, please correct me. 2) Is the slope significantly different from a non-zero value (e.g. 0.5)? How can I probe that hypothesis? Any ideas? I apologize if this question is too trivial and already answered somewhere, but I did not find it. Thank you for the help! Christian
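For question 2, one standard approach (sketched here with simulated data, since none was posted) is a t-test of the fitted coefficient against the hypothesized value, using the standard error that summary(lm(...)) reports:

```r
# Simulated data with true slope 0.6 (illustration only)
set.seed(1)
x <- rnorm(100)
y <- 0.6 * x + rnorm(100, sd = 0.8)
fit <- lm(y ~ x)

b  <- coef(summary(fit))["x", "Estimate"]
se <- coef(summary(fit))["x", "Std. Error"]
t.stat <- (b - 0.5) / se                       # H0: slope == 0.5
p.val  <- 2 * pt(-abs(t.stat), df = fit$df.residual)
c(t = t.stat, p = p.val)

# Equivalent trick: shift the response so that H0 becomes "slope == 0";
# the p-value on x in this fit then tests the original slope against 0.5.
summary(lm(I(y - 0.5 * x) ~ x))
```

Both routes give the same t-statistic, because subtracting 0.5*x from y shifts the slope estimate by exactly 0.5 while leaving its standard error unchanged.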
[R] assign()ing within apply
Hello, I'm trying to convert a character column in several dataframes to lower case.

###
# Sample data and 'spp' column summaries:

# dput(ban.ovs.1993[sample(row.names(ban.ovs.1993), 20), 1:4])
ban.ovs.93 <- structure(list(
  oplt = c(43L, 43L, 38L, 26L, 35L, 8L, 39L, 1L, 34L, 50L, 10L, 29L, 31L, 24L, 18L, 12L, 27L, 49L, 28L, 51L),
  rplt = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_),
  tree = c(427L, 410L, 639L, 494L, 649L, 166L, 735L, 163L, 120L, 755L, 612L, 174L, 129L, 331L, 269L, 152L, 552L, 227L, 243L, 96L),
  spp = c("MH", "MST", "MH", "HE", "BE", "MH", "MH", "MH", "MH", "Or", "IW", "Or", "MH", "MH", "BY", "MH", "MH", "BE", "MH", "MR")),
  .Names = c("oplt", "rplt", "tree", "spp"),
  row.names = c(4587L, 4570L, 3947L, 2761L, 3653L, 652L, 4136L, 64L, 3567L, 5318L, 838L, 3091L, 3366L, 2423L, 1775L, 1061L, 2893L, 5161L, 2967L, 5395L),
  class = "data.frame")

# dput(pem.ovs.1994[sample(row.names(pem.ovs.1994), 20), 1:4])
pem.ovs.94 <- structure(list(
  oplt = c(8L, 17L, 36L, 9L, 31L, 11L, 35L, 51L, 51L, 49L, 40L, 1L, 9L, 17L, 4L, 42L, 6L, 3L, 39L, 25L),
  tree = c(531L, 557L, 546L, 261L, 592L, 134L, 695L, 933L, 945L, 114L, 34L, 54L, 549L, 574L, 193L, 96L, 70L, 4L, 546L, 789L),
  spp = c("MH", "MH", "MH", "BF", "BF", "MH", "IW", "OR", "OR", "BF", "MH", "IW", "OR", "MH", "SM", "BE", "BE", "BE", "OR", "OR"),
  oaz = c(38L, 205L, 140L, 277L, 329L, 209L, 222L, 24L, 67L, 187L, 156L, 181L, 174L, 248L, 42L, 279L, 273L, 357L, 160L, 183L)),
  .Names = c("oplt", "tree", "spp", "oaz"),
  row.names = c(1204L, 2943L, 5790L, 1616L, 5063L, 2013L, 5691L, 8188L, 8200L, 7822L, 6302L, 54L, 1698L, 2960L, 421L, 6690L, 775L, 245L, 6205L, 4121L),
  class = "data.frame")

# count per spp
invisible(lapply(ls(pat='ovs'), function(y) {
  cat(y, "\n"); print(summary.factor(get(y)[,'spp'])); cat("\n") }))

ban.ovs.93
 BE  BY  HE  IW  MH  MR MST  Or
  2   1   1   1  11   1   1   2

pem.ovs.94
BE BF IW MH OR SM
 3  3  2  6  5  1
###

I have tried variants on the following and cannot remember how to have the result assign()ed back to the original dataframes.

lapply(ls(pat='ovs'), function(y) {
  assign(paste(y, "[,'spp']", sep=""), tolower(get(y)[,'spp'])) })

I *do* get the expected results:

[[1]]
 [1] "mh" "mst" "mh" "he" "be" "mh" "mh" "mh" "mh" "or" "iw" "or" "mh" "mh" "by" "mh" "mh" "be" "mh" "mr"

[[2]]
 [1] "mh" "mh" "mh" "bf" "bf" "mh" "iw" "or" "or" "bf" "mh" "iw" "or" "mh" "sm" "be" "be" "be" "or" "or"

I just can't remember how to get them back into the original dataframe objects. Suggestions?

Thanx, DaveT.
*
Silviculture Data Analyst
Ontario Forest Research Institute
Ontario Ministry of Natural Resources
[EMAIL PROTECTED] http://ontario.ca/ofri
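One way to write the result back (a sketch; note that assign() needs an explicit envir, and that assign("x[,'spp']", ...) creates an oddly named new object rather than modifying a column):

```r
# Overwrite the 'spp' column of each matching data frame in the
# global environment with its lower-cased version.
for (y in ls(pattern = "ovs")) {
  d <- get(y)
  d$spp <- tolower(d$spp)
  assign(y, d, envir = .GlobalEnv)
}

# Often tidier: keep the data frames in one named list instead,
# so no get()/assign() gymnastics are needed:
# dfs <- mget(ls(pattern = "ovs"))
# dfs <- lapply(dfs, function(d) { d$spp <- tolower(d$spp); d })
```

The list-based variant also makes later per-dataframe operations a one-line lapply().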
Re: [R] R Graphics Device margins
Metcalfe, John wrote:

Hello, I am relatively new to R and am using it to run Classification and Regression Tree analysis. My only issue at this point is that numbers are always cut off on the lower nodes. I've tried changing the margins with mai=c(0, 0.5, 0.5, 0) but this has not so far worked. Any suggestions would be appreciated.

Try par(xpd=NA) to turn off clipping (that's from memory, so ?par if that does not work). Also it is often helpful to set plot margins (mar) to zero and set outer margins (oma) to some reasonable values.

THK
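A small demonstration of the two suggestions together (clipping off via xpd, per-plot margins moved to the outer margin); the plotted content is just a placeholder:

```r
par(xpd = NA,                 # allow drawing outside the plot region
    mar = c(0, 0, 0, 0),      # no per-plot margins
    oma = c(2, 2, 2, 2))      # a little outer margin instead
plot(1:10, ann = FALSE, axes = FALSE)
text(0.5, 5, "not clipped")   # falls outside the plot region, but with
                              # xpd = NA it is still drawn on the device
box("outer")
```

With the default xpd = FALSE, anything drawn outside the plot region (such as labels on a tree's lowest nodes) is silently clipped, which matches the symptom described above.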
Re: [R] glm error message when using family Gamma(link=inverse)
On Tue, 9 Dec 2008, John Sorkin wrote:

It appears that I have sinned and for this I surely will wear sackcloth and grovel until my period of penitence is fulfilled. I updated R and my problem remains. Please see the code snippet (as per posting guidelines) below.

R 2.8.0, Windows XP

summary(data$AAMTCAREJ)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
     1.0    404.3   1430.0   6567.0   5457.0 327900.0

fitglm <- glm(AAMTCAREJ ~ sexcat + H_AGE + SmokeCat + InsuranceCat + MedicadeCat +
              incomegrp + racecat + MARSTATJS + EdCat + bmiNewjohn,
              data = data, family = Gamma(link = "inverse"))
Error: no valid set of coefficients has been found: please supply starting values
In addition: Warning message:
In log(ifelse(y == 0, 1, y/mu)) : NaNs produced

Well, this still isn't something others can _reproduce_, but... The error message seems to be from glm.fit(). I'd guess you are throwing glm() a knuckleball and it is choking on its attempt to calculate the deviance - note the warning message. You might be able to calm glm()'s frayed nerves by supplying some decent start values, as it is asking. Possibly starting with a subset of variables, using the coef()s for the subset and setting the others to zero when you attempt to fit all together, will be enough to push it past its sticking point.

HTH, Chuck

John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing)

Prof Brian Ripley [EMAIL PROTECTED] 12/9/2008 2:11 PM

On Tue, 9 Dec 2008, John Sorkin wrote: R 2.5

Please 1) do as the posting guide asks, and quote version numbers accurately. 2) do as the posting guide asks, and update *before* posting. That's too old a version to support here.

Windows XP. I am getting an error from glm() that I don't understand. Any help or suggestions would be appreciated. N.B. 1 <= AAMTCAREJ <= 327900

summary(data$AAMTCAREJ)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
     1.0    404.3   1430.0   6567.0   5457.0 327900.0

fitglm <- glm(AAMTCAREJ ~ sexcat + H_AGE + SmokeCat + InsuranceCat + MedicadeCat +
              incomegrp + racecat + MARSTATJS + EdCat + bmiNewjohn,
              data = data, family = Gamma(link = "inverse"))
Error: no valid set of coefficients has been found: please supply starting values
In addition: Warning message: NaNs produced in: log(x)

That model is not necessarily valid: the linear predictor has to be strictly positive. If you really know why it is applicable you will be able to give starting values (e.g. maybe all the columns of the design matrix are positive, in which case you will be able to find suitable positive initial coefficients).

Thanks, John

John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for ...{{dropped:16}}

Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:[EMAIL PROTECTED] UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
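Chuck's suggestion can be sketched as follows (illustrative data and variable names only; the real model has many more terms): fit a smaller model first, then recycle its coefficients via glm's start argument, padding the new terms with zeros.

```r
# Illustrative data: positive, right-skewed response
set.seed(42)
d <- data.frame(y  = rgamma(200, shape = 2, rate = 0.01),
                x1 = rnorm(200), x2 = rnorm(200))

# Step 1: a small model that converges on its own
fit1 <- glm(y ~ x1, data = d, family = Gamma(link = "inverse"))

# Step 2: full model, started from fit1's coefficients,
# with 0 for the coefficient of the newly added term
fit2 <- glm(y ~ x1 + x2, data = d, family = Gamma(link = "inverse"),
            start = c(coef(fit1), 0))
coef(summary(fit2))
```

The start vector must have one entry per coefficient of the full model, in the same order as the columns of its design matrix.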
Re: [R] Significance of slopes
Hi Christian, please always give reproducible code, so we can see what you have done and give you the best answer. The lm function generally works as in this example from the lm man page (?lm):

trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
reg <- lm(trt ~ ctl)
summary(reg)

Call:
lm(formula = trt ~ ctl)

Residuals:
     Min       1Q   Median       3Q      Max
-1.09389 -0.33069 -0.15249  0.05128  1.45497

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   7.7957     2.1661   3.599  0.00699 **
ctl          -0.6230     0.4279  -1.456  0.18351
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.7485 on 8 degrees of freedom
Multiple R-squared: 0.2095, Adjusted R-squared: 0.1106
F-statistic: 2.12 on 1 and 8 DF, p-value: 0.1835

This returns (almost) all the answers to the questions that you ask: the p-value in the (Intercept) line is the p-value from the test (a t test) of whether the intercept is different from zero, and the ctl line has the same interpretation for the slope. Here, no, it is not significantly different from zero. If you want to test whether the estimates (slope or intercept) are different from a specific value, as in your case 0.5, you can apply a test. Type ?t.test in R and you can find all the information you need.

Hope this helps, Best Regards, Anna

Anna Freni Sterrantino Ph.D Student Department of Statistics University of Bologna, Italy via Belle Arti 41, 40124 BO.

From: Christian Arnold [EMAIL PROTECTED]
To: r-help@r-project.org
Sent: Tuesday, 9 December 2008, 21:54:23
Subject: [R] Significance of slopes

Hello R community, I have a question regarding correlation and regression analysis. I have two variables, x and y. Both have a standard deviation of 1; thus, correlation and slope from the linear regression (which also must have an intercept of zero) are equal. I want to probe two particular questions: 1) Is the slope significantly different from zero? This should be easy with the lm function, as the p-value should reflect exactly that question. If I am wrong, please correct me. 2) Is the slope significantly different from a non-zero value (e.g. 0.5)? How can I probe that hypothesis? Any ideas? I apologize if this question is too trivial and already answered somewhere, but I did not find it. [[elided Yahoo spam]] Christian
[R] controlling axes in plot.cuminc (cmprsk library)
Dear R-help list members, I am trying to create my own axes when plotting a cumulative incidence curve using the plot.cuminc function in the cmprsk library. The default x-axis places tick marks and labels at 0, 20, 40, 60, and 80 (my data has an upper limit of 96), whereas I want them at my own specified locations. Here is my example code:

library(cmprsk)
attach(MYDATA)
MYCUMINC <- cuminc(ftime=TIME, fstatus=STATUS, group=GROUP, rho=0, cencode=0, na.action=na.omit)
plot(MYCUMINC, xlim=c(0,96), ylim=c(0,0.5), xlab="", axes=F)
axis(1, at=c(0,8,16,24,32,48,72,96))

As you can see, I have tried using the axes=F parameter that works for most plotting functions, but I get the following error message:

Error in legend(wh[1], wh[2], legend = curvlab, col = color, lty = lty, :
  unused argument(s) (axes ...)

Is there any way I can get a customized x-axis when using the plot.cuminc function? I have searched online and R help manuals to no avail and would GREATLY appreciate your input. Please let me know if you need additional info from me. Thanks, Amy
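If plot.cuminc will not pass axes through, one workaround is to draw the curves yourself from the cuminc object: each component (apart from any $Tests element) is a list with $time and $est. A sketch, assuming a MYCUMINC object as in the question (I cannot run it without the data, so treat this as a starting point):

```r
# Keep only the estimated curves; the "Tests" element holds test results
curves <- MYCUMINC[names(MYCUMINC) != "Tests"]

plot(NA, xlim = c(0, 96), ylim = c(0, 0.5),
     xlab = "", ylab = "Cumulative incidence", xaxt = "n")
for (i in seq_along(curves))
  lines(curves[[i]]$time, curves[[i]]$est, lty = i)
axis(1, at = c(0, 8, 16, 24, 32, 48, 72, 96))    # custom tick locations
legend("topleft", legend = names(curves), lty = seq_along(curves))
```

Suppressing only the x-axis with xaxt="n" and then calling axis(1, ...) is the usual base-graphics idiom when a plotting function draws its own default axes.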
Re: [R] Transforming a string to a variable's name? help me newbie...
Greg, I found your suggestion very efficient! And I tried testing it; however, it didn't work:

###start of file
rm(list=ls())
setwd("/Users/John/Documents/Research/Experiments/expt108/")
##test out 3 files only
nms <- c("019-v1-msa1.data", "019-v1-msa2.data", "019-v1-msa3.data")
my.data <- list()
for(i in nms) my.data[[i]] <- read.table(i)
setwd("/Users/John/Programs/R/")
save(list=ls(all=TRUE), file="Expt108Files.Rdata")
###end of file

At first, I get this error:

Expt108Files.R: unexpected symbol at 10: ,019--v1-msa1.gazedata

Then I renamed the 3 files to the following:

nms <- c("019v1msa1.data", "019v1msa2.data", "019v1msa3.data")

And I received this error instead:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  line 2 did not have 36 elements

Thanks! - John

On Mon, Dec 8, 2008 at 4:10 PM, Greg Snow [EMAIL PROTECTED] wrote:

I really don't understand your concern. Something like:

nms <- c('file1','file2','file3')
my.data <- list()
for (i in nms) my.data[[ i ]] <- read.table(i)

will read in the files listed in the nms vector and put them into the list my.data (each data frame is a single element of the list). This list will take up about the same amount of memory as if you read each file into a dataframe in the global environment. And there is no transforming of data frames (into 1 row or otherwise).

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] 801.408.8111

-----Original Message----- From: tsunhin wong [mailto:[EMAIL PROTECTED]] Sent: Monday, December 08, 2008 1:34 PM To: Greg Snow Cc: Jim Holtman; r-help@r-project.org; [EMAIL PROTECTED] Subject: Re: [R] Transforming a string to a variable's name? help me newbie...

I want to combine all dataframes into one large list too... But each dataframe is a 35-column x varying-number-of-rows structure (from 2000 to 9000 rows). I have ~1500 dataframes of these to process, and that adds up to 1.5Gb of data...
Combining the dataframes into a single one would require me to transform each dataframe into one line, but I really don't have a good solution for the varying-number-of-rows scenario... And also, I don't want to stall my laptop every time I run the data set: maybe I can do that when my prof gives me a ~4Gb RAM desktop to run the script ;) Thanks! :) - John

On Mon, Dec 8, 2008 at 1:36 PM, Greg Snow [EMAIL PROTECTED] wrote:

In the long run it will probably make your life much easier to read all the dataframes into one large list (and have the names of the elements be what you currently name the dataframes); then you can just use regular list indexing (using [[]] rather than $ in most cases) instead of having to worry about get and assign and the risks/subtleties involved in using those. Hope this helps,

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] 801.408.8111

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org] On Behalf Of tsunhin wong Sent: Monday, December 08, 2008 8:45 AM To: Jim Holtman Cc: r-help@r-project.org Subject: Re: [R] Transforming a string to a variable's name? help me newbie...

Thanks Jim and All! It works:

tmptrial <- trialcompute(trialextract(
  get(paste("t", tmptrialinfo[1,2], tmptrialinfo[1,16], ".gz", sep="")),
  tmptrialinfo[1,32], secs, sdm), secs, binsize)

Can I use assign instead? How should it be coded then? Thanks! - John

On Mon, Dec 8, 2008 at 10:40 AM, Jim Holtman [EMAIL PROTECTED] wrote: ?get Sent from my iPhone

On Dec 8, 2008, at 7:11, tsunhin wong [EMAIL PROTECTED] wrote:

Dear all, I'm a newbie in R. I have a 45x2x2x8 design. A dataframe stores the metadata of trials, and each trial has its own data file: I used read.table to import every trial into R as a dataframe (variable). Now I dynamically ask R to retrieve trials that fit certain selection criteria, so I use subset, e.g.
tmptrialinfo <- subset(trialinfo, (Subject==24 & Filename=="v2msa8"))

The name of the dataframe / variable of an individual trial can be obtained using:

paste("t", tmptrialinfo[1,2], tmptrialinfo[1,16], ".gz", sep="")

Then I get a string, "t24v2msa8.gz", which is exactly the name of the dataframe / variable of that trial: t24v2msa8.gz

Can somebody tell me how I can change that string (obtained from paste() above) into a usable / manipulable variable name, so that I can do something such as:

(1) tmptrial <- trialcompute(trialextract(
      paste("t", tmptrialinfo[1,2], tmptrialinfo[1,16], ".gz", sep=""),
      tmptrialinfo[1,32], secs, sdm), secs, binsize)

instead of hardcoding:

(2) tmptrial <- trialcompute(trialextract(t24v2msa8.gz,
      tmptrialinfo[1,32], secs, sdm), secs, binsize)

Currently, (1) doesn't work... Thanks in advance for your help! Regards, John
[R] Help with subtitle for Barplot
I attached the R code (Barplot_qpcr_Isabella_Dce08_SL_cal5.r), the data (qpcr_barplot.txt) and the plot (plot2_JS-W2_HKG21-Cal5.pdf). I am wondering why I couldn't get the subtitle to show up under each plot, even though I didn't get any error message. Thanks for the help in advance, Shaorong

Sample  NKa1b_2       NKa1c_2       NKAa3_2       NKAb1_1       Group
1556   -0.14928982   -0.146699076  -0.022236441  -4.046102264  JS_HH
1626   -0.170271084   0.049696065  -0.225277585   0.851861452  JS_HH
1677   -0.023359567   0.054504736   0.299314951   0.975459979  JS_HH
1745   -0.280403029  -0.110558584  -0.058096141   0.643061398  JS_HH
1322    0.277170917  -1.018182797   0.153302613  -4.125755573  JS_UH
1355    0.029640828   0.013059459   0.10176975    0.73826325   JS_UH
1376    0.537833924   0.498861689   0.822027081   0.275237161  JS_UH
1404   -0.072231121  -0.009676466   0.081658873   0.529725597  JS_UH
1421    0.269019997   0.315229777   0.164296186   0.978087392  JS_UH
1448    0.423323368   0.325610763   0.188268333   0.604567952  JS_UH
1520   -0.152175597  -0.088982454  -0.00317745    0.694865259  JS_UH
1637    0.279098956   0.09674064   -0.60969932    0.969939918  JS_UH
1740   -0.305083125   0.162734477   0.139792974  -1.701646147  JS_UH
1769    0.013103623   0.090727937   0.051869886   0.919403273  JS_UH
24     -0.100648488   0.081747891  -0.091458258  -4.244210716  W_HH
34      0.041359599  -0.043213716  -0.047314396  -4.15756455   W_HH
83     -0.118185336   0.004305679  -0.044859266  -3.163651353  W_HH
98     -0.088802214  -0.056019396  -0.191949942   0.733591243  W_HH
122    -0.200487325  -0.209960419  -0.264005258   0.882023718  W_HH
144     0.263879466   0.311560822   0.1009223     0.407358097  W_HH
228     0.165844087   0.179544409   0.053720993   0.756049833  W_HH
308    -0.13334169    0.059431671   0.053755453   0.684082343  W_HH
362     0.137440519   0.279497504   0.161421911  -4.165294587  W_HH
75      0.16323564    0.119034461   0.217261002   0.84941149   W_UH
167     0.472053669   0.400633437   0.293564648   0.931563634  W_UH
193     0.358867692   0.420990019   0.263630909   0.649366339  W_UH
210     0.281419172   0.29477039    0.282901177   0.769709346  W_UH
214     0.272700428   0.312086763   0.345958722  -3.753016482  W_UH
237     0.183989578   0.2108481     0.208665138   0.574668113  W_UH
302    -0.270415946  -0.11194806   -0.196944588   0.814442285  W_UH
315     0.569414972   0.615197576   0.289420307  -0.059768719  W_UH
329    -0.040500835  -0.030366091  -0.204481188   0.944737301  W_UH
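Without the attached script it is hard to say for certain, but a common cause of a silently missing subtitle is that title(sub=...) writes below the x-axis annotation, and in a multi-panel layout the bottom margin is often too small, so the text lands outside the figure region and is clipped without any error. A sketch of two ways to get a visible subtitle under each panel (the bar heights and group labels here are just illustrative values taken from the data's Group column):

```r
par(mfrow = c(1, 2), mar = c(6, 4, 4, 2))   # extra bottom margin for the subtitle
for (g in c("JS_HH", "JS_UH")) {            # group labels from the data above
  barplot(c(0.2, -0.1, 0.3), main = g)
  title(sub = paste("Group", g))            # option 1: sub= below the x-axis
  # mtext(paste("Group", g), side = 1, line = 4)  # option 2: explicit margin line
}
```

If the subtitle still does not appear, checking par("mar") inside the script is a quick way to confirm whether the bottom margin leaves room for it.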