Re: [R] dataframe: visualization as tiles(?)
Whoops - didn't get what you meant.

?mosaicplot is your friend.

Cheers

Jason

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
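A minimal sketch of the mosaicplot suggestion, assuming simulated data (the objects `zz` and `tab` are made up for illustration). A two-way table is passed to mosaicplot(), which draws one tile per x/y combination with area proportional to the count:

```r
set.seed(1)
# Fake data: x is 0 or 1, y is -1, 0 or 1
zz <- data.frame(x = sample(0:1, 20, replace = TRUE),
                 y = sample(-1:1, 20, replace = TRUE))

# Cross-tabulate the two variables
tab <- table(zz$x, zz$y, dnn = c("x", "y"))

# One tile per cell of the table, tile area proportional to the count
mosaicplot(tab, main = "Counts by x and y", color = TRUE)
```

Whether this matches the remembered plot exactly is a guess; the tile widths also vary with the x margins.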
Re: [R] dataframe: visualization as tiles(?)
> > Dear R users,
> >
> > I remember seeing somewhere a method of visualizing a set of
> > observations on two variables x and y in the following way

Is this what you want?

> ## fake data
> zz <- data.frame(x = sample(0:1, 20, replace = TRUE), y = sample(-1:1, 20, replace = TRUE))
> zz
> ## tabulate it
> zz.tab <- data.frame(table(zz))
> zz.tab
> library(lattice)
> barchart(y ~ Freq | x, data = zz.tab)

Cheers

Jason
[R] dataframe: visualization as tiles(?)
Dear R users,

I remember seeing somewhere a method of visualizing a set of observations on two variables x and y in the following way

[ASCII sketch, flattened in the archive: two columns of stacked tiles labelled x=0 and x=1, with rows labelled y=-1, y=0 and y=1]

where x = 0 or 1; y = -1, 0, 1. The 'tile' area represents the count of observations with the corresponding x and y values.

Now, I don't remember the name of the functions that support such plots. I tried help.search("*tile*") and I skimmed the documentation of the 'lattice' package. Neither seems to be what I remembered.

Please send me pointers.

Thanks in advance
Itay
Re: [R] How does nlm work?
> 3) I have never heard of this step selections (line search, dogleg and
> optimal step). I would like to know something about it. I would
> appreciate if someone could send references for me to learn the subject.

IIRC, you'll find them here:

Nonlinear Regression Analysis and Its Applications
Douglas M. Bates and Donald G. Watts
John Wiley & Sons Inc., 1988
ISBN: 0471816434

Cheers

Jason
[R] How does nlm work?
Dear R users,

I have looked in the reference

Schnabel, R. B., Koontz, J. E. and Weiss, B. E. (1985) A modular system of algorithms for unconstrained minimization. _ACM Trans. Math. Software_, *11*, 419-440.

cited in the nlm help. This article says that the algorithm permits the use of step selection (line search, dogleg and optimal step), analytic or finite difference gradient, and analytic, finite difference or BFGS Hessian approximation.

Looking back in the nlm help, it says that:
a) it does just the line search step selection;
b) it has the option to supply the gradient and the Hessian as attributes if the user wants.

My questions are:
1) When I do not supply the Hessian, does the function use a finite difference or a BFGS approximation? (Is it possible to select one or the other?)
2) I have already used the option to supply the gradient but I don't know how to supply the Hessian. Does anybody have an example?
3) I have never heard of these step selections (line search, dogleg and optimal step). I would like to know something about them. I would appreciate it if someone could send references for me to learn the subject.

Sincerely,

--
Frederico Zanqueta Poleto
[EMAIL PROTECTED]
--
"An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem." J. W. Tukey
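On question 2, a minimal sketch of how the Hessian might be supplied, assuming a toy quadratic objective (the function `f`, the target `a` and the starting values are made up for illustration). As described in the nlm help, the value returned by the objective can carry "gradient" and "hessian" attributes:

```r
# Toy objective (an assumption for this sketch): squared distance from a
f <- function(x, a) {
  res <- sum((x - a)^2)
  attr(res, "gradient") <- 2 * (x - a)           # analytic gradient
  attr(res, "hessian")  <- 2 * diag(length(x))   # analytic Hessian
  res
}

# Extra arguments such as 'a' are passed on to f
fit <- nlm(f, p = c(10, 10), a = c(3, 5))
fit$estimate
```

The estimate should end up close to `a`. This does not answer question 1 about which approximation is used when no Hessian is given.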
[R] Makefile for installing all available packages
Below is a makefile I wrote to download and install all available R packages from the CRAN and BioConductor package repositories.

The primary advantage of using this makefile instead of R's built-in install.packages() and update.packages() is the creation of a separate installation log for every package. Further, if make is invoked with '-k', failure to install a single package will not derail the installation of other packages.

I hope that this script may be useful to other folks.

-Greg

# Download and install all available R packages from the CRAN and Bioconductor
# package repositories
#
RCMD ?= R-1.9.0
WGET ?= wget -N -nd -r -A gz -l 1 -nv

PACKAGE_FILES = $(wildcard *.gz )
PACKAGE_LOGS = $(addsuffix .log, $(basename $(basename $(PACKAGE_FILES))))

default: cran bioconductor install

cran:
	$(WGET) "http://cran.r-project.org/src/contrib/PACKAGES.html"

bioconductor: bioCmain bioCcontrib bioCdata

bioCmain:
	$(WGET) "http://www.bioconductor.org/repository/release1.3/package/html/index.html"

bioCcontrib:
	$(WGET) "http://www.bioconductor.org/contrib/index.html"

bioCdata:
	$(WGET) "http://www.bioconductor.org/data/metaData.html"

install: $(PACKAGE_LOGS)

# The archive's address scrubber replaced the temporary log file name in the
# rule below with "[EMAIL PROTECTED]"; "$@.tmp" is an assumed reconstruction.
%.log: %.tar.gz
	$(RCMD) INSTALL $< > $@.tmp 2>&1
	mv $@.tmp $@
[R] un-expected return by fdim
Browse[1]> Lframe
   v  v  v  v  v  v v v
1  8  7  6  5  4  3 2 1
2  9  8  7  6  5  4 3 2
3 10  9  8  7  6  5 4 3
4 11 10  9  8  7  6 5 4
5 12 11 10  9  8  7 6 5
6 13 12 11 10  9  8 7 6
7 14 13 12 11 10  9 8 7
8 15 14 13 12 11 10 9 8
Browse[1]> fdim(Lframe,q=2)
Error in slopeopt(AllPoints, Alpha) : Object "LineP" not found

thanks for any feed back
RE: [R] Non-homogeneity of variance - decreasing variance
Dear Simon,

I'm not sure that I follow this entirely, but if error variance decreases with the level of the response, you could try raising the response to a power greater than 1. Of course, the response has to be non-negative. You might take a look at the spread.level.plot function in the car package, which will produce a suggested transformation when applied to an lm object.

I hope that this helps,
John

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Simon Chamaillé
> Sent: Tuesday, April 13, 2004 12:36 PM
> To: [EMAIL PROTECTED]
> Subject: [R] Non-homogeneity of variance - decreasing variance
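A rough illustration of the power-transformation idea, using simulated data only. The variable names, the standard deviation of 1/mean and the power of 2 are all assumptions chosen for this sketch, not anything taken from Simon's data. By the delta method, Var(y^p) is roughly (p * mu^(p-1))^2 * Var(y), so when the standard deviation falls like 1/mu, squaring the response should roughly level the spread:

```r
set.seed(42)
n  <- 2000
x  <- runif(n, 0, 8)
mu <- 2 + x                          # mean of the response, from 2 to 10
y  <- mu + rnorm(n, sd = 1 / mu)     # spread shrinks as the mean grows

# Ratio of residual spread: lower half of the fitted values vs upper half
spread_ratio <- function(resp, x) {
  fit <- lm(resp ~ poly(x, 2))
  low <- fitted(fit) <= median(fitted(fit))
  sd(resid(fit)[low]) / sd(resid(fit)[!low])
}

ratio_raw <- spread_ratio(y, x)      # above 1: variance decreases with the mean
ratio_sq  <- spread_ratio(y^2, x)    # nearer 1 after squaring the response
c(raw = ratio_raw, squared = ratio_sq)
```

With real data the right power is not known in advance, which is where a spread-level plot of the kind John mentions would come in.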
Re: [R] Matrix question
Gideon,

Eigenvectors are normalized to unit length. The first eigenvector calculated by R is equal (ignoring the signs of course) to your stable distribution vector divided by its length.

Andy

__
Andy Jaworski
518-1-01
Process Laboratory
3M Corporate Research Laboratory
-
E-mail: [EMAIL PROTECTED]
Tel: (651) 733-6092
Fax: (651) 736-3122

> From: GIDEON WASSERBERG <[EMAIL PROTECTED]>
> Sent by: [EMAIL PROTECTED]
> Date: 04/13/2004 18:28
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Subject: [R] Matrix question
Re: [R] Matrix question
On Tue, 13 Apr 2004, GIDEON WASSERBERG wrote:

> Dear Friends
>
> I am doing a simple matrix analysis to calculate the eigenvalue,
> eigenvector using R for the below matrix, and comparing the result to
> those obtained from a projection (using excel)
>
> THE MATRIX:
> > c
>      [,1] [,2] [,3]
> [1,]  0.0  2.0    2
> [2,]  0.8  0.0    0
> [3,]  0.0  0.8    0
>
> The dominant eigenvalue comes out comparable to that calculated
> numerically, but the eigenvectors do not( see below)!

Yes, they do. Your dominant eigenvector is -0.6495461 times the R dominant eigenvector, and eigenvectors are defined only up to a scalar multiple (only the direction is determined). You probably want to rescale the eigenvector so that its entries sum to 1.

> EIGENVALUES (calculated by R):
> > eigen(c)
> $values
> [1]  1.5564082+0.00i -0.7782041+0.465623i -0.7782041-0.465623i
>
> EIGENVALUE numerically calculated: 1.556408145
>
> EIGENVECTORS (calculated by R):
> $vectors
>               [,1]                  [,2]                  [,3]
> [1,] -0.8658084+0i  0.6476861+0.000i      0.6476861+0.000i
> [2,] -0.4450290+0i -0.4902997-0.2933611i -0.4902997+0.2933611i
> [3,] -0.2287467+0i  0.2382837+0.4441499i  0.2382837-0.4441499i
>
> Stable age distribution (calculated numerically):
> 0.562365145
> 0.289057934
> 0.148576921
>
> My questions are: 1. Both eigenvalue and eigenvectors are associated
> with some imaginary value (i). How should I relate to that information?

The first eigenvalue has zero imaginary component, as does its eigenvector, so you may not need to relate to it.

-thomas

Thomas Lumley
Assoc. Professor, Biostatistics
[EMAIL PROTECTED]
University of Washington, Seattle
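The rescaling described above can be sketched as follows. The matrix is the one from Gideon's post; the object names `A`, `e`, `k`, `v` and `stable` are made up for this example:

```r
# The 3 x 3 projection matrix from the original post, entered row by row
A <- matrix(c(0.0, 2.0, 2,
              0.8, 0.0, 0,
              0.0, 0.8, 0), nrow = 3, byrow = TRUE)

e <- eigen(A)

# Pick the eigenvalue with the largest modulus (the dominant one)
k <- which.max(Mod(e$values))
lambda <- Re(e$values[k])

# Its eigenvector has zero imaginary part; drop it and rescale to sum to 1
v <- Re(e$vectors[, k])
stable <- v / sum(v)

lambda
stable
```

Dividing by sum(v) also removes the sign ambiguity, since a vector whose entries are all negative is flipped to all positive.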
Re: [R] "diff"^-1
Yes kjetil (and others), my question wasn't clear, but what I was looking for was just diffinv!!

thanks
michele
RE: [R] R 1.9.0 and cursors
Even if this is not covered in the FAQ, it has been asked on R-help often enough to qualify for it.

Make sure you have the readline libraries & headers. The last output of `configure' should show `readline' if such capability is found to be working by the configure script.

Andy

> From: Paolo Sirabella
[R] R 1.9.0 and cursors
Hi all,
I have successfully compiled and installed (OS: Linux Mandrake 9.2) the latest devel version of R (ver. 1.9.0). All seems to go well, but now I cannot use the cursor keys to go back through the command history (when I press a cursor key the following characters are shown: ^[[A, or ^[[B etc.).

Hints and suggestions are welcome.

Thanks.

Paolo
--
Paolo Sirabella, PhD
University of Rome "La Sapienza"
Dept. of Human Physiology and Pharmacology
Building of Human Physiology
P.le Aldo Moro, 5 - 00185 - Roma - Italy

Res Non Verba
[R] Reverse dendrogram plot in R
Is there a way to completely reverse a dendrogram plot? So, for dendrogram D, I want to generate a mirror image of plot(D). This works if one plots an hclust object in reverse order, but I need this to work as a dendrogram, since the dendrogram is plotted in a more complicated layout.

Thank you,

Mark Wall
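One possible approach in current versions of R, offered as a sketch rather than as an answer from this thread: the stats package has a rev() method for dendrograms that reverses the order of the nodes. The data set and the object names below are only for illustration:

```r
# Build a small dendrogram from a built-in data set
hc <- hclust(dist(USArrests[1:8, ]))
d  <- as.dendrogram(hc)

# rev() on a dendrogram reverses the nodes recursively
d_rev <- rev(d)

# Leaf order of the reversed tree is the mirror image of the original
labels(d)
labels(d_rev)

# Plot the two side by side to compare
op <- par(mfrow = c(1, 2))
plot(d, main = "original")
plot(d_rev, main = "reversed")
par(op)
```

Whether this fits into the more complicated layout mentioned above depends on how that layout is built.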
[R] Matrix question
Dear Friends

I am doing a simple matrix analysis to calculate the eigenvalue and eigenvector using R for the matrix below, and comparing the result to those obtained from a projection (using excel)

THE MATRIX:
> c
     [,1] [,2] [,3]
[1,]  0.0  2.0    2
[2,]  0.8  0.0    0
[3,]  0.0  0.8    0

The dominant eigenvalue comes out comparable to that calculated numerically, but the eigenvectors do not (see below)!

EIGENVALUES (calculated by R):
> eigen(c)
$values
[1]  1.5564082+0.00i -0.7782041+0.465623i -0.7782041-0.465623i

EIGENVALUE numerically calculated: 1.556408145

EIGENVECTORS (calculated by R):
$vectors
              [,1]                  [,2]                  [,3]
[1,] -0.8658084+0i  0.6476861+0.000i      0.6476861+0.000i
[2,] -0.4450290+0i -0.4902997-0.2933611i -0.4902997+0.2933611i
[3,] -0.2287467+0i  0.2382837+0.4441499i  0.2382837-0.4441499i

Stable age distribution (calculated numerically):
0.562365145
0.289057934
0.148576921

My questions are:
1. Both eigenvalue and eigenvectors are associated with some imaginary value (i). How should I relate to that information?
2. More importantly, I presume the 1st eigenvector column [,1] should correspond to the dominant eigenvalue. How come then that it comes out different from the one calculated numerically? Is there some conversion I should do?

Many thanks
Gideon

Gideon Wasserberg (Ph.D.)
Wildlife research unit, Department of wildlife ecology,
University of Wisconsin
218 Russell labs, 1630 Linden dr., Madison, Wisconsin 53706, USA.
Tel.:608 265 2130, Fax: 608 262 6099
Re: [R] Non-homogeneity of variance - decreasing variance
On 13 Apr 2004 at 19:36, Simon Chamaillé wrote:

You could maybe try gls in package nlme, where you can estimate parameters in variance functions. If you need a generalized linear model, you could have a look at glmmPQL in MASS, but I don't know if that accepts models without random effects.

Kjetil Halvorsen
Re: [R] lattice problem in R-1.9.0
Deepayan Sarkar wrote:

> Actually, in R 1.9.0, xyplot() does have a new argument that starts
> with i, namely 'index.cond' (as indicated by the error message above).

Sorry, should have caught that. As you suspected all I did was args(xyplot). I wasn't expecting a new argument. Thanks for the quick reply.

--sundar
Re: [R] "diff"^-1
Whoops, I was thinking of something totally different. Apologies. I think you might want diffinv().

-roger

Roger D. Peng wrote:

> What do you mean by "opposite"? Have you looked at patch?
>
> -roger
Re: [R] Complex sample variances
On Tue, 13 Apr 2004, Fred Rohde wrote:

> Looked through the publication, "Statistical Methods and Mathematical
> Algorithms Used in Sudaan" (Shah, et al, 1993) but the only reference to
> variances on quantiles is a 1991 presentation by David Binder. Googled
> the title and got this link.
>
> http://www.amstat.org/sections/srms/Proceedings/papers/1991_005.pdf

Ok. I see. I wouldn't have called this a Taylor series method, and I notice that Binder agrees with me. They are doing interval estimation by inverting a score test, which is an interval estimation method I want to add more generally in R. It works much better than Wald tests for a number of quasilikelihood/estimating function estimators in ordinary model-based analysis, too.

Taylor series methods have trouble with quantiles because the estimating function isn't differentiable. Asymptotic normality still applies, but the asymptotic standard error depends on the density of the variable at the quantile, and the asymptotic approximation is not as good as usual. Even the bootstrap needs larger sample sizes for quantiles than for many statistics.

-thomas
Re: [R] Need advice on using R with large datasets
"Liaw, Andy" <[EMAIL PROTECTED]> writes:

> On a dual Opteron 244 with 16GB ram, and
>
> [EMAIL PROTECTED]:cb1]% free
>              total       used       free     shared    buffers     cached
> Mem:      16278648   14552676    1725972          0     229420    3691824
> -/+ buffers/cache:   10631432    5647216
> Swap:      2096472      13428    2083044
>
> ... using freshly compiled R-1.9.0:
>
> > system.time(x <- numeric(1e9))
> [1]  3.60  8.09 15.11  0.00  0.00
> > object.size(x)/1024^3
> [1] 7.45058

Well,

> system.time(mean(x))
[1]   15.80   20.94 1323.01    0.00    0.00
> object.size(x)/1024^3
[1] 7.45058

I suppose I just have to look forward to RAM prices dropping...

(Actually, the OS should be able to do better. Should be able to read the data from disk at about 20s/GB.)

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])             FAX: (+45) 35327907
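The 7.45 figure quoted above follows from 8 bytes per double. A small sketch of the arithmetic, using a vector of 1e6 elements as a stand-in so that it runs on an ordinary machine (the object names are made up for this example):

```r
n <- 1e6                          # small stand-in for the 1e9 used above
x <- numeric(n)

bytes_per_double <- 8
sz <- as.numeric(object.size(x))  # 8 bytes per element plus a small header
sz

# Size in GiB that a numeric vector of length 1e9 would need
gib <- 1e9 * bytes_per_double / 1024^3
gib
```

The per-object header is a few dozen bytes, which is negligible at this scale.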
Re: [R] "diff"^-1
On 13 Apr 2004 at 18:58, michele lux wrote:

> Hallo all
> somebody knows if exist a command who makes the
> opposite of what "diff" command do?
> I'he to write code?
> thanks Michele

As other responses have shown, your Q could have been clearer!

?diffinv

Note that this is not really an inverse, as the following shows:

> diffinv(diff(1:10))
 [1] 0 1 2 3 4 5 6 7 8 9

If you know the first element of your original series, you can do:

> diffinv(diff(3:13), xi=3)
 [1]  3  4  5  6  7  8  9 10 11 12 13

Kjetil Halvorsen
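The round trip above can be written as a small check. The series `x` is made up for this sketch; both diffinv() with the first element supplied as `xi` and cumsum() with the first element prepended should recover it:

```r
x <- c(3, 5, 4, 9, 12)   # an arbitrary example series
d <- diff(x)

# diffinv() needs the starting value to undo diff()
back1 <- diffinv(d, xi = x[1])

# Equivalent: prepend the first element and take cumulative sums
back2 <- cumsum(c(x[1], d))

all.equal(back1, x)
all.equal(back2, x)
```

Without `xi`, diffinv() starts from 0, which is why the first example in the message above comes back shifted.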
Re: [R] Opposite of 'diff'? [was: (no subject)]
On Tue, 13 Apr 2004 10:58:35 -0700, Spencer Graves <[EMAIL PROTECTED]> wrote :

>What is "patch"? I don't find it in R 1.8.1. However, ?"diff" mentions
>"diffinv"; that and "cumsum" perform as follows:

"diff" is a Unix command to calculate differences between files. "patch" is a Unix command to apply such differences to a file.

I'm pretty sure I misunderstood the question; "cumsum" is probably the right answer.

Duncan Murdoch
Re: [R] Opposite of 'diff'? [was: (no subject)]
Spencer Graves <[EMAIL PROTECTED]> writes:

> What is "patch"? I don't find it in R 1.8.1. However, ?"diff"
> mentions "diffinv"; that and "cumsum" perform as follows:

diff is a Unix command for comparing files. Using output from diff to patch a file is done by a program called ... well you guessed it (an acronym for "please apply this clever hack" according to legend).

I think that your guess, that it was the R function that was intended, was a better one, though.

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])             FAX: (+45) 35327907
Re: [R] lattice problem in R-1.9.0
On Tuesday 13 April 2004 12:51, Sundar Dorai-Raj wrote:

> Hi all,
>    I just installed R-1.9.0 on Windows 2000 from binaries. Yesterday,
> on R-1.8.1 I ran a script that looked like:
>
> library(lattice)
> tmp <- expand.grid(A = 1:3, B = letters[1:2])
> tmp$z <- runif(NROW(tmp))
> trellis.device(png, file = "x1081.png", theme = col.whitebg)
> xyplot(z ~ A | B, data = tmp,
>         panel = function(x, y, i) {
>           panel.xyplot(x, y)
>           ltext(1, 0.95, paste("i =", i), adj = 0)
>         },
>         ylim = c(0, 1),
>         i = 10)
> dev.off()
>
> In R-1.9.0, the same script gives the following error message:
>
> Error in trellis.skeleton(cond = structure(list(B =
> structure(as.integer(c(1, :
>          Invalid value of index.cond
                            ^^
> I've tracked it down to including the argument "i" to the panel
> function. If I change the argument to
>
> xyplot(z ~ A | B, data = tmp,
>         panel = function(x, y, I) {
>           panel.xyplot(x, y)
>           ltext(1, 0.95, paste("i =", I), adj = 0)
>         },
>         ylim = c(0, 1),
>         I = 10)
>
> all is copacetic. There is no argument in xyplot that starts with "i"
> so I don't know where the partial matching is occurring.

Actually, in R 1.9.0, xyplot() does have a new argument that starts with i, namely 'index.cond' (as indicated by the error message above). This (along with many other arguments) doesn't show up in args(xyplot) because of the way arguments common to high-level lattice functions are handled by common code (they are formally part of ...); but it is documented in ?xyplot.

Deepayan
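The mechanism can be reproduced in base R without lattice. In this sketch `outer_fun` and `inner_fun` are hypothetical stand-ins (they are not lattice functions): the outer function shows only `...` in its argument list, but forwards it to an inner function that has a formal named `index.cond`, so a short name such as `i` gets partially matched there:

```r
# Hypothetical inner function with a formal argument 'index.cond'
inner_fun <- function(x, index.cond = NULL, ...) {
  list(index.cond = index.cond, extra = list(...))
}

# Hypothetical outer function: args(outer_fun) shows only x and ...
outer_fun <- function(x, ...) inner_fun(x, ...)

# 'i' partially matches 'index.cond' inside inner_fun
r1 <- outer_fun(1, i = 10)

# 'I' does not match (matching is case sensitive), so it stays in ...
r2 <- outer_fun(1, I = 10)

str(r1)
str(r2)
```

This is why renaming the panel argument from `i` to `I` sidesteps the problem.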
Re: [R] Need advice on using R with large datasets
Liaw, Andy wrote:

> I was under the impression that R has been run on 64-bit Solaris (and other
> 64-bit Unices) for quite a while (as 64-bit app).

Yes, on Solaris it has worked for quite a while. I don't use it a lot, but have one problem that I have been running from time to time for a few years. There are two "issues" that I know about.

1/ Some extra capabilities (like png I think) also need to be compiled as 64 bit apps, and in some cases this is a non-trivial effort (on Solaris for someone like me that does not do that kind of thing often). For this reason I have both a 32-bit version for regular use and a 64-bit version for special problems.

2/ Some R functions make copies of the data sets used and attach them to the result. For small data sets that can be very useful. If the result is then used as an argument to another function then very quickly there are multiple copies. If the data set is large then one is quickly making heavy use of swap, and the processing is very slow. This is not just a 64-bit problem, but with a 32-bit architecture it is hard to work on a data set big enough that this becomes an issue. In some cases performance can be improved a lot by hacking the code and not attaching the dataset to the result (with some risk that functions using the result get broken).

Paul Gilbert

> We've been running 64-bit R on amd64 for a few months (and had quite a few
> opportunities to get the R processes using over 8GB of RAM). Not much
> problem as far as I can see...
>
> Best,
> Andy
>
>> From: Roger D. Peng
>>
>> As far as I know, R does compile on AMD Opterons and runs as a
>> 64-bit application. So it can store objects larger than 4GB.
>> However, I don't think R gets tested very often on 64-bit
>> machines with such large objects so there may be yet undiscovered bugs.
>>
>> -roger
>>
>> Sunny Ho wrote:
>>
>>> Hello everyone,
>>>
>>> I would like to get some advice on using R with some really large
>>> datasets.
>>>
>>> I'm using RH9 Linux R 1.8.1 for a research with a lot of numerical
>>> data. The datasets total to around 200Mb (shown by memory.size).
>>> During my data manipulation, the system memory usage grew to 1.5Gb,
>>> and this caused a lot of swapping activities on my 1Gb PC. This is
>>> just a small-scale experiment, the full-scale one will be using data
>>> 30 times as large (on a 4Gb machine). I can see that I'll need to
>>> deal with memory usage problem very soon.
>>>
>>> I notice that R keeps all datasets in memory at all times. I wonder
>>> whether there is any way to instruct R to push some of the
>>> less-frequently-used data tables out of main memory, so as to free up
>>> memory for those that are actively in use. It'll be even better if R
>>> can keep only part of a table in memory only when that part is
>>> needed. Using save & load could help, but I just wonder whether R is
>>> intelligent enough to do this by itself, so I don't need to keep
>>> track of memory usage at all times.
>>>
>>> Another thought is to use a 64-bit machine (AMD64). I find there is a
>>> pre-compiled R for Fedora Linux on AMD64. Anyone knows whether this
>>> version of R runs as 64-bit? If so, then will R be able to go beyond
>>> the 32-bit 4Gb memory limit?
>>>
>>> Also, from the manual, I find that the RPgSQL package (for PostgreSQL
>>> database) supports a feature "proxy data frame". Does anyone have
>>> experience with this? Can "proxy data frame" handle memory
>>> efficiently for very large datasets? Say, if I have a 6Gb database
>>> table defined as a proxy data frame, will R & RPgSQL be able to
>>> handle it with just 4Gb of memory?
>>>
>>> Any comments will be useful. Many thanks.
>>>
>>> Sunny Ho
>>> (Hong Kong University of Science & Technology)
Re: [R] Opposite of 'diff'? [was: (no subject)]
What is "patch"? I don't find it in R 1.8.1. However, ?"diff" mentions "diffinv"; that and "cumsum" perform as follows:

> cumsum(diff(1:11))
 [1]  1  2  3  4  5  6  7  8  9 10
> diffinv(diff(1:11))
 [1]  0  1  2  3  4  5  6  7  8  9 10
>

spencer graves

Duncan Murdoch wrote:

> Sounds like "patch" is what you want.
>
> Duncan Murdoch
[R] lattice problem in R-1.9.0
Hi all,

I just installed R-1.9.0 on Windows 2000 from binaries. Yesterday, on R-1.8.1, I ran a script that looked like this:

    library(lattice)
    tmp <- expand.grid(A = 1:3, B = letters[1:2])
    tmp$z <- runif(NROW(tmp))
    trellis.device(png, file = "x1081.png", theme = col.whitebg)
    xyplot(z ~ A | B, data = tmp,
           panel = function(x, y, i) {
             panel.xyplot(x, y)
             ltext(1, 0.95, paste("i =", i), adj = 0)
           },
           ylim = c(0, 1), i = 10)
    dev.off()

In R-1.9.0, the same script gives the following error message:

    Error in trellis.skeleton(cond = structure(list(B = structure(as.integer(c(1, :
            Invalid value of index.cond

I've tracked it down to including the argument "i" to the panel function. If I change the argument to

    xyplot(z ~ A | B, data = tmp,
           panel = function(x, y, I) {
             panel.xyplot(x, y)
             ltext(1, 0.95, paste("i =", I), adj = 0)
           },
           ylim = c(0, 1), I = 10)

all is copacetic. There is no argument in xyplot that starts with "i", so I don't know where the partial matching is occurring.

Thanks,
Sundar
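The error text suggests that `i` was partially matched to an `index.cond` argument somewhere below xyplot(). The following base-R sketch shows that mechanism in isolation. The functions `worker` and `wrapper` are hypothetical stand-ins, not lattice code, and the post itself does not confirm where the matching happens.

```r
# Hypothetical inner function: index.cond comes before `...`, so it can be
# matched by any unambiguous prefix of its name.
worker <- function(x, index.cond = NULL, ...) {
  list(index.cond = index.cond, extra = names(list(...)))
}

# Hypothetical outer function that only forwards its extra arguments.
wrapper <- function(x, ...) worker(x, ...)

res_i <- wrapper(1, i = 10)   # `i` is a prefix of `index.cond`
res_I <- wrapper(1, I = 10)   # matching is case-sensitive, so `I` stays in `...`

print(res_i$index.cond)
print(res_I$extra)
```

This is consistent with the observation that renaming the argument to `I` avoids the error.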
Re: [R] (no subject)
On Tue, 13 Apr 2004 18:57:59 +0200 (CEST), michele lux <[EMAIL PROTECTED]> wrote :

>Hallo all
>somebody knows if exist a command who makes the
>opposite of what "diff" command do?
>I'he to write code?

Sounds like "patch" is what you want.

Duncan Murdoch
[R] Non-homogeneity of variance - decreasing variance
Hello all,

I'm running a very simple regression but face a problem of non-homogeneity of variance: the variance decreases with increasing mean, and I do not know how to deal with that. The relationship doesn't seem to be strong, but it's the first time I have seen something like this, and I would like to know what to do if one day it becomes stronger. I tried some transformations just for fun but was not able to get a better model. I do not know if it helps, but my predictor variable has a kind of gamma-Poisson-shaped, zero-rich distribution (continuous, of course) and is highly overdispersed.

If anyone knows how to deal with decreasing variance, I would appreciate any advice. (I tried to model a negative variance-mean relationship in a new quasi- family, but this was prohibited; only constant, mu, mu^x (and mu(1-mu) for binomial) were allowed.) I've definitely reached the border of the statistical black box for me.

thanks
simon
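One generic way to handle a variance that shrinks with the mean is weighted least squares with weights estimated from the residuals. The sketch below runs on simulated data, since the poster's data are not available. The power-of-the-mean variance model and all variable names are assumptions made for illustration, not something taken from the thread.

```r
set.seed(1)

# Simulated data whose spread decreases as the mean increases.
x  <- runif(200, 1, 10)
mu <- 2 + 3 * x
y  <- mu + rnorm(200, sd = 5 / mu)

# Step 1: ordinary fit, ignoring the unequal variances.
fit0 <- lm(y ~ x)

# Step 2: assume Var(y) is proportional to mean^power and estimate the power
# from a regression of log squared residuals on log fitted values.
aux   <- lm(log(resid(fit0)^2) ~ log(fitted(fit0)))
power <- unname(coef(aux)[2])

# Step 3: refit with weights inversely proportional to the estimated variance.
w    <- 1 / fitted(fit0)^power
fit1 <- lm(y ~ x, weights = w)

print(power)        # negative when the variance decreases with the mean
print(coef(fit1))
```

Nothing in this approach requires the estimated power to be positive, which is the restriction the poster ran into with the quasi family.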
Re: [R] "diff"^-1
What do you mean by "opposite"? Have you looked at patch?

-roger

michele lux wrote:
> Hallo all
> somebody knows if exist a command who makes the
> opposite of what "diff" command do?
> I'he to write code?
> thanks Michele
Re: [R] "diff"^-1
?cumsum

michele lux wrote:
> Hallo all
> somebody knows if exist a command who makes the
> opposite of what "diff" command do?
> I'he to write code?
> thanks Michele
[R] "diff"^-1
Hello all,

does anybody know whether there is a command that does the opposite of what the "diff" command does? Or do I have to write the code myself?

thanks
Michele
[R] (no subject)
Hello all,

does anybody know whether there is a command that does the opposite of what the "diff" command does? Or do I have to write the code myself?

thanks
Michele
RE: [R] Need advice on using R with large datasets
On a dual Opteron 244 with 16GB ram, and

[EMAIL PROTECTED]:cb1]% free
             total       used       free     shared    buffers     cached
Mem:      16278648   14552676    1725972          0     229420    3691824
-/+ buffers/cache:   10631432    5647216
Swap:      2096472      13428    2083044

... using freshly compiled R-1.9.0:

> system.time(x <- numeric(1e9))
[1]  3.60  8.09 15.11  0.00  0.00
> object.size(x)/1024^3
[1] 7.45058

Andy

> From: Peter Dalgaard
>
> "Roger D. Peng" <[EMAIL PROTECTED]> writes:
>
> > I've been running R on 64-bit SuSE Linux on Opterons for a few months
> > now and it certainly runs fine in what I would call standard
> > situations. In particular there seems to be no problem with
> > workspaces > 4GB. But I seldom handle single objects (like matrices,
> > vectors) that are > 4GB. The only exception is lists, but I think
> > those are okay since they are composed of various sub-objects (like
> > Peter mentioned).
>
> I just tried, and x <- numeric(1e9) (~8GB) doesn't appear to be a
> problem, except that it takes "forever" since the machine in question
> has only 1GB of memory, and numeric() zero fills the allocated
> block...
>
> --
>    O__  ---- Peter Dalgaard             Blegdamsvej 3
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
> ~~~~~~~~~~ - ([EMAIL PROTECTED])                     FAX: (+45) 35327907
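The 7.45 figure above follows directly from 8 bytes per double. A small sketch that checks the arithmetic on a vector small enough for any machine (the length `n` is chosen arbitrarily):

```r
# A numeric vector stores 8 bytes per element plus a small fixed header.
n <- 1e6
x <- numeric(n)
size_bytes <- as.numeric(object.size(x))
overhead   <- size_bytes - 8 * n
print(overhead)                  # header size in bytes, platform dependent

# Scaling up: 1e9 doubles expressed in units of 1024^3 bytes.
expected_gib <- 8 * 1e9 / 1024^3
print(round(expected_gib, 5))    # 7.45058, the value reported in the post
```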
Re: [R] Need advice on using R with large datasets
"Roger D. Peng" <[EMAIL PROTECTED]> writes: > I've been running R on 64-bit SuSE Linux on Opterons for a few months > now and it certainly runs fine in what I would call standard > situations. In particular there seems to be no problem with > workspaces > 4GB. But I seldom handle single objects (like matrices, > vectors) that are > 4GB. The only exception is lists, but I think > those are okay since they are composed of various sub-objects (like > Peter mentioned). I just tried, and x <- numeric(1e9) (~8GB) doesn't appear to be a problem, except that it takes "forever" since the machine in question has only 1GB of memory, and numeric() zero fills the allocated block... -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] randomForest: more than one variable needed?
On Tue, 13 Apr 2004, Hui Han wrote:

> Hi,
>
> I am doing feature selection for my dataset. The following is
> the extreme case where only one feature is left. But I got
> the error below. So my question is that do I have to use
> more than one features?
>
> sample.subset
>   udomain.edu hpclass
> 1        -1.0     not
> 2        -1.0     not
> 3        -0.2     not
> 4         1.0      hp
> 5         1.0      hp
> > randomForest(hpclass ~., data=sample.subset, importance=TRUE);
> Error in if (n == 0) stop("data (x) has 0 rows") :
>         argument is of length zero
>

no idea about the error message, but there is no need for feature selection before using random forests - give it a try without preselection of variables.

best
Torsten

> Best regards,
> Hui Han
> Department of Computer Science and Engineering,
> The Pennsylvania State University
> University Park, PA,16802
> email: [EMAIL PROTECTED]
> homepage: http://www.cse.psu.edu/~hhan
RE: [R] randomForest: more than one variable needed?
With only one `x' variable, RF will be identical to bagging. This looks like a bug. I will check it out.

Andy

> From: Hui Han
>
> I agree with you about the less practical meaning of this sample of
> the extreme case. I am just curious about the "grammar" syntax of
> randomForest.
>
> Thanks.
> Hui
>
> On Tue, Apr 13, 2004 at 05:29:06PM +0200, Philippe Grosjean wrote:
> > I don't see much why to use random forest with only one predictive variable!
> > Recall that random forest grow trees with a random subset of variables "in
> > competition" for growing each node of the trees in the forest... How do you
> > make such a random subset with only one predictive variable? there is no
> > point here!
> >
> > Philippe Grosjean
> >
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] Behalf Of Hui Han
> > Sent: Tuesday, 13 April, 2004 17:16
> > To: [EMAIL PROTECTED]
> > Subject: [R] randomForest: more than one variable needed?
> >
> > Hi,
> >
> > I am doing feature selection for my dataset. The following is
> > the extreme case where only one feature is left. But I got
> > the error below. So my question is that do I have to use
> > more than one features?
> >
> > sample.subset
> >   udomain.edu hpclass
> > 1        -1.0     not
> > 2        -1.0     not
> > 3        -0.2     not
> > 4         1.0      hp
> > 5         1.0      hp
> > > randomForest(hpclass ~., data=sample.subset, importance=TRUE);
> > Error in if (n == 0) stop("data (x) has 0 rows") :
> >         argument is of length zero
> >
> > Best regards,
> > Hui Han
> > Department of Computer Science and Engineering,
> > The Pennsylvania State University
> > University Park, PA,16802
> > email: [EMAIL PROTECTED]
> > homepage: http://www.cse.psu.edu/~hhan
>
> Hui Han
> Department of Computer Science and Engineering,
> The Pennsylvania State University
> University Park, PA,16802
> email: [EMAIL PROTECTED]
> homepage: http://www.cse.psu.edu/~hhan
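The wording of the error quoted above ("argument is of length zero") is what R produces when `n` is NULL, which is what nrow() returns for a plain vector. The base-R sketch below reproduces that situation with the posted data. That randomForest hit exactly this path is an assumption; the thread only says it looks like a bug.

```r
# The five rows from the post.
sample.subset <- data.frame(
  udomain.edu = c(-1, -1, -0.2, 1, 1),
  hpclass     = factor(c("not", "not", "not", "hp", "hp"))
)

# Selecting a single column drops the result to a plain vector by default.
x_vec <- sample.subset[, "udomain.edu"]
x_df  <- sample.subset[, "udomain.edu", drop = FALSE]

n_vec <- nrow(x_vec)   # NULL: a vector has no dim attribute
n_df  <- nrow(x_df)    # 5: still a one-column data frame

# Comparing NULL with 0 yields logical(0), which if() cannot evaluate.
err <- tryCatch(
  if (n_vec == 0) stop("data (x) has 0 rows"),
  error = function(e) conditionMessage(e)
)
print(err)
```

Keeping the predictor as a one-column data frame with `drop = FALSE` avoids the NULL row count in this sketch.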
Re: [R] Need advice on using R with large datasets
I've been running R on 64-bit SuSE Linux on Opterons for a few months now and it certainly runs fine in what I would call standard situations. In particular there seems to be no problem with workspaces > 4GB. But I seldom handle single objects (like matrices, vectors) that are > 4GB. The only exception is lists, but I think those are okay since they are composed of various sub-objects (like Peter mentioned). -roger Liaw, Andy wrote: I was under the impression that R has been run on 64-bit Solaris (and other 64-bit Unices) for quite a while (as 64-bit app). We've been running 64-bit R on amd64 for a few months (and had quite a few oppertunities to get the R processes using over 8GB of RAM). Not much problem as far as I can see... Best, Andy From: Roger D. Peng As far as I know, R does compile on AMD Opterons and runs as a 64-bit application. So it can store objects larger than 4GB. However, I don't think R gets tested very often on 64-bit machines with such large objects so there may be yet undiscovered bugs. -roger Sunny Ho wrote: Hello everyone, I would like to get some advices on using R with some really large datasets. I'm using RH9 Linux R 1.8.1 for a research with a lot of numerical data. The datasets total to around 200Mb (shown by memory.size). During my data manipulation, the system memory usage grew to 1.5Gb, and this caused a lot of swapping activities on my 1Gb PC. This is just a small-scale experiment, the full-scale one will be using data 30 times as large (on a 4Gb machine). I can see that I'll need to deal with memory usage problem very soon. I notice that R keeps all datasets in memory at all times. I wonder whether there is any way to instruct R to push some of the less-frequently-used data tables out of main memory, so as to free up memory for those that are actively in used. It'll be even better if R can keep only part of a table in memory only when that part is needed. 
Using save & load could help, but I just wonder whether R is intelligent enough to do this by itself, so I don't need to keep track of memory usage at all times. Another thought is to use a 64-bit machine (AMD64). I find there is a pre-compiled R for Fedora Linux on AMD64. Anyone knows whether this version of R runs as 64-bit? If so, then will R be able to go beyond the 32-bit 4Gb memory limit? Also, from the manual, I find that the RPgSQL package (for PostgreSQL database) supports a feature "proxy data frame". Does anyone have experience with this? Can "proxy data frame" handle memory efficiently for very large datasets? Say, if I have a 6Gb database table defined as a proxy data frame, will R & RPgSQL be able to handle it with just 4Gb of memory? Any comments will be useful. Many thanks. Sunny Ho (Hong Kong University of Science & Technology) __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Notice: This e-mail message, together with any attachments, contains information of Merck & Co., Inc. (One Merck Drive, Whitehouse Station, New Jersey, USA 08889), and/or its affiliates (which may be known outside the United States as Merck Frosst, Merck Sharp & Dohme or MSD and in Japan as Banyu) that may be confidential, proprietary copyrighted and/or legally privileged. It is intended solely for the use of the individual or entity named on this message. If you are not the intended recipient, and have received this message in error, please notify us immediately by reply e-mail and then delete it from your system. -- __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] randomForest: more than one variable needed?
I agree with you about the less practical meaning of this sample of the extreme case. I am just curious about the "grammar" syntax of randomForest.

Thanks.
Hui

On Tue, Apr 13, 2004 at 05:29:06PM +0200, Philippe Grosjean wrote:
> I don't see much why to use random forest with only one predictive variable!
> Recall that random forest grow trees with a random subset of variables "in
> competition" for growing each node of the trees in the forest... How do you
> make such a random subset with only one predictive variable? there is no
> point here!
>
> Philippe Grosjean
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Hui Han
> Sent: Tuesday, 13 April, 2004 17:16
> To: [EMAIL PROTECTED]
> Subject: [R] randomForest: more than one variable needed?
>
> Hi,
>
> I am doing feature selection for my dataset. The following is
> the extreme case where only one feature is left. But I got
> the error below. So my question is that do I have to use
> more than one features?
>
> sample.subset
>   udomain.edu hpclass
> 1        -1.0     not
> 2        -1.0     not
> 3        -0.2     not
> 4         1.0      hp
> 5         1.0      hp
> > randomForest(hpclass ~., data=sample.subset, importance=TRUE);
> Error in if (n == 0) stop("data (x) has 0 rows") :
>         argument is of length zero
>
> Best regards,
> Hui Han
> Department of Computer Science and Engineering,
> The Pennsylvania State University
> University Park, PA,16802
> email: [EMAIL PROTECTED]
> homepage: http://www.cse.psu.edu/~hhan

Hui Han
Department of Computer Science and Engineering,
The Pennsylvania State University
University Park, PA,16802
email: [EMAIL PROTECTED]
homepage: http://www.cse.psu.edu/~hhan
RE: [R] randomForest: more than one variable needed?
I don't see much point in using random forest with only one predictor variable! Recall that random forest grows trees with a random subset of variables "in competition" for growing each node of the trees in the forest... How do you make such a random subset with only one predictor variable? There is no point here!

Philippe Grosjean

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Hui Han
Sent: Tuesday, 13 April, 2004 17:16
To: [EMAIL PROTECTED]
Subject: [R] randomForest: more than one variable needed?

Hi,

I am doing feature selection for my dataset. The following is
the extreme case where only one feature is left. But I got
the error below. So my question is that do I have to use
more than one features?

sample.subset
  udomain.edu hpclass
1        -1.0     not
2        -1.0     not
3        -0.2     not
4         1.0      hp
5         1.0      hp
> randomForest(hpclass ~., data=sample.subset, importance=TRUE);
Error in if (n == 0) stop("data (x) has 0 rows") :
        argument is of length zero

Best regards,
Hui Han
Department of Computer Science and Engineering,
The Pennsylvania State University
University Park, PA,16802
email: [EMAIL PROTECTED]
homepage: http://www.cse.psu.edu/~hhan
Re: [R] fractal calculation using fdim
> Is that how you got your data or are
> you using real data?

I am using some synthetic data I happened to have.

> argument X is to be a dataframe
> not a matrix (mat??). Could that be giving you
> problems? Do you get
> better results with as.data.frame(mat)?

No, even with a data.frame it gives the same error.
[R] randomForest: more than one variable needed?
Hi,

I am doing feature selection for my dataset. The following is the extreme case where only one feature is left, and I got the error below. So my question is: do I have to use more than one feature?

sample.subset
  udomain.edu hpclass
1        -1.0     not
2        -1.0     not
3        -0.2     not
4         1.0      hp
5         1.0      hp
> randomForest(hpclass ~., data=sample.subset, importance=TRUE);
Error in if (n == 0) stop("data (x) has 0 rows") :
        argument is of length zero

Best regards,
Hui Han
Department of Computer Science and Engineering,
The Pennsylvania State University
University Park, PA,16802
email: [EMAIL PROTECTED]
homepage: http://www.cse.psu.edu/~hhan
Re: [R] Need advice on using R with large datasets
"Roger D. Peng" <[EMAIL PROTECTED]> writes: > As far as I know, R does compile on AMD Opterons and runs as a 64-bit > application. So it can store objects larger than 4GB. However, I > don't think R gets tested very often on 64-bit machines with such > large objects so there may be yet undiscovered bugs. There are a few such machines around among R users, and R seems to work OK on them. One slight gotcha is that the Fortran numeric libraries (Lapack, ATLAS) tend to use integer indexing, which might overflow for large objects. Things like data frames which consist of multiple subobjects might be less sensitive to this. -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Need advice on using R with large datasets
I was under the impression that R has been run on 64-bit Solaris (and other 64-bit Unices) for quite a while (as 64-bit app). We've been running 64-bit R on amd64 for a few months (and had quite a few oppertunities to get the R processes using over 8GB of RAM). Not much problem as far as I can see... Best, Andy > From: Roger D. Peng > > As far as I know, R does compile on AMD Opterons and runs as a > 64-bit application. So it can store objects larger than 4GB. > However, I don't think R gets tested very often on 64-bit > machines with such large objects so there may be yet undiscovered > bugs. > > -roger > > Sunny Ho wrote: > > > Hello everyone, > > > > I would like to get some advices on using R with some > really large datasets. > > > > I'm using RH9 Linux R 1.8.1 for a research with a lot of > numerical data. The datasets total to around 200Mb (shown by > memory.size). During my data manipulation, the system memory > usage grew to 1.5Gb, and this caused a lot of swapping > activities on my 1Gb PC. This is just a small-scale > experiment, the full-scale one will be using data 30 times as > large (on a 4Gb machine). I can see that I'll need to deal > with memory usage problem very soon. > > > > I notice that R keeps all datasets in memory at all times. > I wonder whether there is any way to instruct R to push some > of the less-frequently-used data tables out of main memory, > so as to free up memory for those that are actively in used. > It'll be even better if R can keep only part of a table in > memory only when that part is needed. Using save & load could > help, but I just wonder whether R is intelligent enough to do > this by itself, so I don't need to keep track of memory usage > at all times. > > > > Another thought is to use a 64-bit machine (AMD64). I find > there is a pre-compiled R for Fedora Linux on AMD64. Anyone > knows whether this version of R runs as 64-bit? If so, then > will R be able to go beyond the 32-bit 4Gb memory limit? 
> > > > Also, from the manual, I find that the RPgSQL package (for > PostgreSQL database) supports a feature "proxy data frame". > Does anyone have experience with this? Can "proxy data frame" > handle memory efficiently for very large datasets? Say, if I > have a 6Gb database table defined as a proxy data frame, will > R & RPgSQL be able to handle it with just 4Gb of memory? > > > > Any comments will be useful. Many thanks. > > > > Sunny Ho > > (Hong Kong University of Science & Technology) > > > > __ > > [EMAIL PROTECTED] mailing list > > https://www.stat.math.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide! > http://www.R-project.org/posting-guide.html > > > > > __ > [EMAIL PROTECTED] mailing list > https://www.stat.math.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! > http://www.R-project.org/posting-guide.html > > -- Notice: This e-mail message, together with any attachments,...{{dropped}} __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] R apache and PHP
I've developed a web application in PHP and R. My script is

    ...
    exec("R CMD BATCH --silent /home/marcello/R_in/myfile.bat /home/marcello/R_out/myfile.out");
    ...
    ?>

This script runs R in batch mode and writes myfile.out. On Win2000 a similar script works, but on Linux I have a problem. I suppose it is a permission problem, because the same script runs fine from the shell and also in the Zend debugger (my IDE for PHP). In those cases the owner is "marcello"; if I run the script from the browser, the owner is "apache". I've changed the ownership of the R directory and bin to the apache user, but it still does not work. Running

    exec("ls > mydir.txt");

works (so it is not a general PHP problem!).

Can someone help me?

Thanks (and excuse my poor English)
Marcello Verona
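If the cause really is a permission problem, as the poster suspects (nothing in the thread confirms it), one way to narrow it down is to have the R script report whether the current user may write to the output directory. A minimal sketch follows. The helper `can_write` is hypothetical, and `tempdir()` stands in for the real output path.

```r
# Hypothetical helper: TRUE if the directory exists and the user running R
# has write permission on it (file.access() returns 0 on success).
can_write <- function(dir) {
  file.exists(dir) && file.access(dir, mode = 2) == 0
}

# tempdir() is used here only so that the sketch runs anywhere; in the
# poster's setup the directory to test would be the one holding myfile.out.
out_dir <- tempdir()
print(Sys.info()[["user"]])   # which account R is actually running under
print(can_write(out_dir))
```

Run from the browser, the reported user would show whether R is started as "apache", and the second line whether that account can write where the output file is expected.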
Re: [R] Need advice on using R with large datasets
On Tue, 13 Apr 2004, Roger D. Peng wrote: > As far as I know, R does compile on AMD Opterons and runs as a > 64-bit application. So it can store objects larger than 4GB. > However, I don't think R gets tested very often on 64-bit > machines with such large objects so there may be yet undiscovered > bugs. Using more than 4Gb memory is reasonably tested now. Single objects of that size may not be -- I think you still can't have a vector whose length() is more than 2^31, for example. -thomas > > -roger > > Sunny Ho wrote: > > > Hello everyone, > > > > I would like to get some advices on using R with some really large datasets. > > > > I'm using RH9 Linux R 1.8.1 for a research with a lot of numerical data. The > > datasets total to around 200Mb (shown by memory.size). During my data > > manipulation, the system memory usage grew to 1.5Gb, and this caused a lot of > > swapping activities on my 1Gb PC. This is just a small-scale experiment, the > > full-scale one will be using data 30 times as large (on a 4Gb machine). I can see > > that I'll need to deal with memory usage problem very soon. > > > > I notice that R keeps all datasets in memory at all times. I wonder whether there > > is any way to instruct R to push some of the less-frequently-used data tables out > > of main memory, so as to free up memory for those that are actively in used. It'll > > be even better if R can keep only part of a table in memory only when that part is > > needed. Using save & load could help, but I just wonder whether R is intelligent > > enough to do this by itself, so I don't need to keep track of memory usage at all > > times. > > > > Another thought is to use a 64-bit machine (AMD64). I find there is a pre-compiled > > R for Fedora Linux on AMD64. Anyone knows whether this version of R runs as > > 64-bit? If so, then will R be able to go beyond the 32-bit 4Gb memory limit? 
> > > > Also, from the manual, I find that the RPgSQL package (for PostgreSQL database) > > supports a feature "proxy data frame". Does anyone have experience with this? Can > > "proxy data frame" handle memory efficiently for very large datasets? Say, if I > > have a 6Gb database table defined as a proxy data frame, will R & RPgSQL be able > > to handle it with just 4Gb of memory? > > > > Any comments will be useful. Many thanks. > > > > Sunny Ho > > (Hong Kong University of Science & Technology) > > > > __ > > [EMAIL PROTECTED] mailing list > > https://www.stat.math.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > > > > __ > [EMAIL PROTECTED] mailing list > https://www.stat.math.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
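The limits mentioned in this thread (integer indexing in the numeric libraries, vector lengths near 2^31) come from the range of a 32-bit signed integer. A short sketch of the arithmetic, with the matrix dimension `n` picked only for illustration:

```r
# Largest value an R integer can hold: 2^31 - 1.
max_int <- .Machine$integer.max
print(max_int)

# A 50000 x 50000 matrix would have more cells than that, so any code that
# indexes its elements with 32-bit integers could overflow.
n     <- 50000
cells <- as.numeric(n) * n      # computed in double precision on purpose
print(cells > max_int)

# Integer arithmetic past the limit yields NA (with a warning).
overflowed <- suppressWarnings(max_int + 1L)
print(overflowed)
```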
[R] Re: model-based clustering
Dear Talita,

you may start with

    library(mclust)
    ?EMclust
    example(EMclust)  # The example is with Iris data.

To understand what you're doing, read a paper from the web page of the developers, cited on the help page.

Best,
Christian

PS: What a "good" or "optimal" clustering is, is by no means well defined. Some cluster algorithms will find 2 or more than 3 clusters on Iris, and that's not necessarily an argument against these algorithms.

On Tue, 13 Apr 2004, Talita Leite wrote:

> Hello again,
>
> Let me explain this better. I've been working in clustering methods during
> this year and now i'm starting (or trying to) with model-based clustering.
> I've been searching for help to understand how the functions works and i
> found some. The problem is that i don't know the steps to follow. For
> example: working with the data set IRIS. What steps do i have to follow and
> what functions do i have to use to make a good clustering? To find the three
> groups on that case?
>
> Thanks,
>
> Talita

***
Christian Hennig
Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
[EMAIL PROTECTED], http://www.math.uni-hamburg.de/home/hennig/
###
I recommend www.boag-online.de
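For readers following along today, the steps asked about can be sketched as below. This assumes a recent mclust in which `Mclust()` is the main entry point. The `EMclust` name used in the reply belongs to the version current at the time, and the guard only makes the sketch skip itself when the package is not installed.

```r
if (requireNamespace("mclust", quietly = TRUE)) {
  library(mclust)

  # Fit Gaussian mixture models over a range of cluster counts and
  # covariance structures; the best model is chosen by BIC.
  fit <- Mclust(iris[, 1:4])

  # Which model and how many components were selected.
  print(summary(fit))

  # Cross-tabulate the known species against the fitted clusters.
  print(table(iris$Species, fit$classification))
}
```

As the PS in the reply says, the selected number of clusters need not be three, and that is not by itself a defect of the method.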
RE: [R] model-based clustering
In the time that you posted the numerous uninformative messages, you can do yourself a great favor by following the posting guide mentioned in the footer.

Try:

    install.packages("mclust")
    library(mclust)
    ?mclust
    example(mclust)

Andy

> From: Talita Leite
>
> Hello again,
>
> Let me explain this better. I've been working in clustering methods during
> this year and now i'm starting (or trying to) with model-based clustering.
> I've been searching for help to understand how the functions works and i
> found some. The problem is that i don't know the steps to follow. For
> example: working with the data set IRIS. What steps do i have to follow and
> what functions do i have to use to make a good clustering? To find the three
> groups on that case?
>
> Thanks,
>
> Talita

__ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Complex sample variances
On Mon, 12 Apr 2004, Fred Rohde wrote:

> Thanks. I'll update the survey package. Sudaan does the standard
> errors on quantiles using Taylor series. If I can hunt down the formula
> it uses, could you add that to svyquantile?

If I can bring myself to believe it. Computing standard errors for the normal approximation to the median is not easy even in simple random samples.

-thomas

> Fred
>
> Thomas Lumley <[EMAIL PROTECTED]> wrote:
> On Mon, 12 Apr 2004, Fred Rohde wrote:
>
> > Hello,
> > Is there a way to get complex sample variances in the survey package on
> > summary statistics other than means? If not, can they be added to a
> > future version? It would be be great to have them on totals, quantiles,
> > ratios, and tables (eg row percent, columns percent, etc).
> >
>
> svytotal() and svyratio() will do this for totals and ratios if you have a
> new enough version. At the moment the easiest way to get row or column
> percentages is to think of them as ratios of means of binary
> variables and use svyratio().
>
> Quantiles are more difficult, since neither Taylor series nor jackknife
> approaches work.
>
> -thomas

Thomas Lumley			Assoc. Professor, Biostatistics
[EMAIL PROTECTED]	University of Washington, Seattle
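A minimal sketch of the two functions named in the quoted reply, using the `api` example data that ships with the survey package. The choice of data set and variables is an assumption made for illustration, and the guard skips the sketch when the package is not installed.

```r
if (requireNamespace("survey", quietly = TRUE)) {
  library(survey)
  data(api)

  # One-stage cluster sample of school districts from the example data.
  dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

  # Estimated population total of enrollment, with its standard error.
  tot <- svytotal(~enroll, dclus1)
  print(tot)

  # Ratio of students tested to students enrolled, with its standard error.
  rat <- svyratio(~api.stu, ~enroll, dclus1)
  print(rat)
}
```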
[R] Execute function at startup
It would be convenient to have something like

Rgui runfirst="myfunction()"

in Windows. The reason: AFAIK Rgui does not accept piped input (RGui < myfile.R does not seem to work). A solution could be to put a few functions in Rprofile and then give the name of one of these functions, to be executed at startup, as a command line parameter to Rgui. Can something like this be done?

--
Erich Neuwirth, Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-38624 Fax: +43-1-4277-9386
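One way the Rprofile idea above might be sketched. Everything named here is hypothetical: the environment variable R_RUNFIRST, the helper run_first(), and the two example functions are made up for illustration, and reading an environment variable stands in for the command line parameter, since how Rgui passes arguments through is not covered in this thread.

```r
# Hypothetical functions that would live in the Rprofile.
hello  <- function() cat("hello from startup\n")
report <- function() cat("running report\n")

# Hypothetical dispatcher: run the function whose name is given in the
# environment variable R_RUNFIRST, but only if it is on a known list.
run_first <- function(name = Sys.getenv("R_RUNFIRST"),
                      allowed = c("hello", "report")) {
  if (!nzchar(name)) return(invisible(FALSE))   # nothing requested
  if (!name %in% allowed) stop("unknown startup function: ", name)
  get(name, mode = "function")()                # look up and call it
  invisible(TRUE)
}

# Simulate a startup with the variable set.
Sys.setenv(R_RUNFIRST = "hello")
run_first()
```

Restricting the call to a fixed list of names avoids evaluating arbitrary text taken from outside R.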
[R] model-based clustering
Hello again,

Let me explain this better. I've been working on clustering methods during this year and now I'm starting (or trying to) with model-based clustering. I've been searching for help to understand how the functions work and I found some. The problem is that I don't know the steps to follow. For example: working with the data set IRIS. What steps do I have to follow and what functions do I have to use to make a good clustering? To find the three groups in that case?

Thanks,

Talita
Re: [R] model-based clustering
Dear Talita,

no help is possible unless you tell us what exactly you want to do and what exactly your difficulties are.

Best,
Christian

On Tue, 13 Apr 2004, Talita Leite wrote:

> Hello,
>
> I'm trying to use the model-based clustering functions but I'm having some
> difficulties. Could anybody help me to make a good analysis of a data
> set using these functions?

***
Christian Hennig
Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
[EMAIL PROTECTED], http://www.math.uni-hamburg.de/home/hennig/
###
I recommend www.boag-online.de
Re: [R] Need advice on using R with large datasets
As far as I know, R does compile on AMD Opterons and runs as a 64-bit application. So it can store objects larger than 4GB. However, I don't think R gets tested very often on 64-bit machines with such large objects, so there may be as yet undiscovered bugs.

-roger

Sunny Ho wrote:

> Hello everyone,
>
> I would like to get some advice on using R with some really large
> datasets.
>
> I'm using RH9 Linux R 1.8.1 for research with a lot of numerical data.
> The datasets total around 200Mb (shown by memory.size). During my
> data manipulation, the system memory usage grew to 1.5Gb, and this caused
> a lot of swapping activity on my 1Gb PC. This is just a small-scale
> experiment; the full-scale one will be using data 30 times as large (on a
> 4Gb machine). I can see that I'll need to deal with the memory usage
> problem very soon.
>
> I notice that R keeps all datasets in memory at all times. I wonder
> whether there is any way to instruct R to push some of the
> less-frequently-used data tables out of main memory, so as to free up
> memory for those that are actively in use. It'll be even better if R can
> keep only part of a table in memory only when that part is needed. Using
> save & load could help, but I just wonder whether R is intelligent enough
> to do this by itself, so I don't need to keep track of memory usage at all
> times.
>
> Another thought is to use a 64-bit machine (AMD64). I find there is a
> pre-compiled R for Fedora Linux on AMD64. Does anyone know whether this
> version of R runs as 64-bit? If so, then will R be able to go beyond the
> 32-bit 4Gb memory limit?
>
> Also, from the manual, I find that the RPgSQL package (for the PostgreSQL
> database) supports a feature "proxy data frame". Does anyone have
> experience with this? Can "proxy data frame" handle memory efficiently for
> very large datasets? Say, if I have a 6Gb database table defined as a
> proxy data frame, will R & RPgSQL be able to handle it with just 4Gb of
> memory?
>
> Any comments will be useful. Many thanks.
>
> Sunny Ho
> (Hong Kong University of Science & Technology)
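The manual "save & load" route mentioned in the question can be sketched as below. This is only an illustration under my own assumptions: the object name big, its size, and the temporary file are made up, and how much memory the operating system actually gets back after gc() depends on the platform.

```r
# Stand-in for a large data table (about 8 Mb of doubles).
big <- matrix(rnorm(1e6), ncol = 100)
print(object.size(big), units = "Mb")

# Park the object on disk and drop it from the workspace.
path <- tempfile(fileext = ".RData")
save(big, file = path)
rm(big)
invisible(gc())          # ask R to collect the freed memory

# ... work with other data here ...

# Bring the object back under its original name when it is needed again.
load(path)
dim(big)
```

This keeps the bookkeeping in the user's hands, which is exactly what the poster hoped to avoid; R itself does not page objects out on its own.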
[R] mts
Hi!

I am new to R. I need your help. I have the time series of fifteen variables in a data file. I would like to plot them in R on separate ps pages, not on the same ps page. I was reading about mts, but I could not figure out how to do it. Can anyone help me out?

with regards;
Santosh

--
Santosh Kumar
URL http://www.igidr.ac.in/~santosh/
PhD Student
Indira Gandhi Institute of Development Research
Gen A. K. Vaidya Marg
Goregaon ( East )
Mumbai pin 400065 India
Phone 28400919
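A possible approach to the question above, sketched with simulated data since the poster's file is not available. The series names, the quarterly frequency and the output file name are assumptions; the point is only that each call to plot() on a single column starts a new page of the PostScript device.

```r
set.seed(1)

# Simulated stand-in: 15 quarterly series of length 40 as one "mts" object.
x <- ts(matrix(rnorm(15 * 40), ncol = 15), start = c(1990, 1), frequency = 4)
colnames(x) <- paste0("var", 1:15)

# One PostScript file; every plot() call below begins a new page.
out <- file.path(tempdir(), "series.ps")
postscript(out, onefile = TRUE)
for (j in seq_len(ncol(x))) {
  plot(x[, j], main = colnames(x)[j], ylab = colnames(x)[j])
}
dev.off()
```

If separate files are wanted instead of separate pages, postscript() also accepts a file name pattern such as "series%02d.ps" together with onefile = FALSE.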
[R] model-based clustering
Hello,

I'm trying to use the model-based clustering functions but I'm having some difficulties. Could anybody help me to make a good analysis of a data set using these functions?
Re: [R] Complex sample variances
Looked through the publication, "Statistical Methods and Mathematical Algorithms Used in Sudaan" (Shah, et al, 1993), but the only reference to variances on quantiles is a 1991 presentation by David Binder. Googled the title and got this link:

http://www.amstat.org/sections/srms/Proceedings/papers/1991_005.pdf

Point estimation (section 1 of this reference) is already implemented in R; variance estimation for quantiles is presented in the last part of section 3. Can you make sense of it? It's beyond me.

Fred

Fred Rohde <[EMAIL PROTECTED]> wrote:

> Thanks. I'll update the survey package. Sudaan does the standard errors
> on quantiles using Taylor series. If I can hunt down the formula it uses,
> could you add that to svyquantile?
>
> Fred
>
> Thomas Lumley wrote:
> On Mon, 12 Apr 2004, Fred Rohde wrote:
>
> > Hello,
> > Is there a way to get complex sample variances in the survey package on
> > summary statistics other than means? If not, can they be added to a
> > future version? It would be great to have them on totals, quantiles,
> > ratios, and tables (e.g. row percent, column percent, etc.).
>
> svytotal() and svyratio() will do this for totals and ratios if you have a
> new enough version. At the moment the easiest way to get row or column
> percentages is to think of them as ratios of means of binary variables
> and use svyratio().
>
> Quantiles are more difficult, since neither Taylor series nor jackknife
> approaches work.
>
> -thomas
[R] model-based clustering
Hello,

I'm trying to use the model-based clustering functions R provides but I'm having some difficulties. Could anybody help me with how to make a good analysis of a data set with these functions?
[R] Need advice on using R with large datasets
Hello everyone,

I would like to get some advice on using R with some really large datasets.

I'm using RH9 Linux R 1.8.1 for research with a lot of numerical data. The datasets total around 200Mb (shown by memory.size). During my data manipulation, the system memory usage grew to 1.5Gb, and this caused a lot of swapping activity on my 1Gb PC. This is just a small-scale experiment; the full-scale one will be using data 30 times as large (on a 4Gb machine). I can see that I'll need to deal with the memory usage problem very soon.

I notice that R keeps all datasets in memory at all times. I wonder whether there is any way to instruct R to push some of the less-frequently-used data tables out of main memory, so as to free up memory for those that are actively in use. It'll be even better if R can keep only part of a table in memory only when that part is needed. Using save & load could help, but I just wonder whether R is intelligent enough to do this by itself, so I don't need to keep track of memory usage at all times.

Another thought is to use a 64-bit machine (AMD64). I find there is a pre-compiled R for Fedora Linux on AMD64. Does anyone know whether this version of R runs as 64-bit? If so, then will R be able to go beyond the 32-bit 4Gb memory limit?

Also, from the manual, I find that the RPgSQL package (for the PostgreSQL database) supports a feature "proxy data frame". Does anyone have experience with this? Can "proxy data frame" handle memory efficiently for very large datasets? Say, if I have a 6Gb database table defined as a proxy data frame, will R & RPgSQL be able to handle it with just 4Gb of memory?

Any comments will be useful. Many thanks.

Sunny Ho
(Hong Kong University of Science & Technology)
class of seq(length=n) values (was Re: [R] Zero Index Origin?)
Dear Brian Ripley,

following your advice (very much appreciated) of using seq(length=n) instead of 1:n constructs in loops (usually prepended by the not so elegant if (n > 0)), I was somewhat surprised to find the following (in R 1.8.1 and R 1.9.0 beta (2004-03-29) on Debian 3.0, using the methods package):

carrel>a <- seq(1,5)
carrel>class(a)
[1] "integer"
carrel>class(seq(along=a))
[1] "integer"

but

carrel>class(seq(length=length(a)))
[1] "numeric"

and from ?seq

...
Value:
     The result is of 'mode' '"integer"' if 'from' is (numerically
     equal to an) integer and 'by' is not specified.
...

it is not (to me, of course) obvious that a numeric result sequence is returned if length is specified and from is omitted.

This might only matter in conjunction with S4 class slot assignments, where the integer class requirement is important. OTOH a

for (i in as.integer(seq(length=length(a)))) { ... }

is not so elegant either.

Could you please enlighten me as to why this behaviour was chosen, and whether there is any more elegant way than using as.integer(seq(length=length(a))) to get the desired result (given that I have to use a loop in the first place)?

Regards,
Matthias

Prof Brian Ripley wrote:

> Much of R is itself written in R, so you cannot possibly change something
> as fundamental as this. Further, index 0 has a special meaning that you
> would lose if R had 0-based indexing.
>
> However, the R thinking is to work with whole objects (vectors, arrays,
> lists ...) and you rather rarely need to know what numbers are in an index
> vector. There are usages such as 1:n, and those are quite often wrong:
> they should be seq(length=n) or seq(along=x) or some such, since n might
> be zero. If you are writing code that works with single elements, you are
> probably a lot better off writing C code to link into R (and C is 0-based
> ...).
>
> On Wed, 31 Mar 2004, Bob Cain wrote:
>
> > I'm very new to R and utterly blown away by not only the language but
> > the unbelievable set of packages and the documentation and the
> > documentation standards and...
> >
> > I was an early APL user and never lost my love for it, and in R I find
> > most of the essential things I loved about APL except for one thing. At
> > this early stage of my learning I can't yet determine if there is a way
> > to effect what in APL was zero index origin, where the ordinality of
> > indexes starts with 0 instead of 1. Is it possible to effect that in R
> > without a lot of difficulty?
> >
> > I come here today from the world of DSP research and development, where
> > Matlab has a near hegemony. I see no reason whatsoever that R couldn't
> > replace it with a _far_ better and _far_ less idiosyncratic framework.
> > I'd be interested in working on a Matlab equivalent DSP package for R
> > (if that isn't being done by someone), and one of the things most
> > criticized about Matlab from the standpoint of the DSP programmer is its
> > insistence on 1 origin indexing.
> >
> > Any feedback greatly appreciated.
> >
> > Thanks,
> > Bob

--
Matthias Burger
Bioinformatics R&D
Epigenomics AG                 www.epigenomics.com
Kleine Präsidentenstraße 1     fax:   +49-30-24345-555
10178 Berlin Germany           phone: +49-30-24345-0
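To make the zero-length point concrete, here is a small sketch. It uses seq_len() and seq_along(), which are base functions in later versions of R and were not available in the versions discussed in this thread, so treat them as a present-day substitute for seq(length=n) and seq(along=x) rather than as what the posters had in mind.

```r
n <- 0

# 1:n counts DOWN when n is zero, so a loop over it runs twice.
bad <- 1:n
print(bad)                 # the two values 1 and 0

# seq_len(n) is empty when n is zero, so a loop over it never runs.
good <- seq_len(n)
print(length(good))

count <- 0
for (i in seq_len(n)) count <- count + 1   # body is skipped entirely

# Both helpers return integer vectors, which matters for slots or
# arguments that insist on class "integer".
a <- c(10, 20, 30)
print(class(seq_len(length(a))))
print(class(seq_along(a)))
```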
Re: [R] from .csv file to a pca plot
Hello

On 13 Apr 2004 at 13:25, Dansen, Ing. M.C. wrote:

> Hi,
> I'm just a beginner, who has just encountered a problem!
>
> 1. - I wanted to load a csv file with names in the rows (1st column)
>      and numbers in the 2nd to 10th columns. The file contains
>      names in the headers.
>
>    - I used: a <- as.matrix(read.table("filename", sep=",",
>      row.names=1, header=TRUE))

Maybe read.csv("filename") will do the same without need for other specifications. Why do you convert it to a matrix? What is wrong with a data frame?

> Question; 1 - I would like to select the first four columns

a[, 1:4]

>           2 - and execute a pca(plot) from the mva package on
>               those four columns

Go through the examples in mva.

>           3 - how can I set the data type eg(string, integer,
>               double) separate for each column

During the loading process you can use the colClasses argument to read.csv or read.table. Or you can change the column type afterwards with an as.xxx statement.

> Can anyone help me out, Help!
>
> Thanks in advance,
>
> Marinus

Cheers
Petr

Petr Pikal
[EMAIL PROTECTED]
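The three steps discussed above, sketched end to end. The poster's file is not available, so the built-in USArrests data is written to a temporary CSV as a stand-in; the column classes are chosen to fit that stand-in. prcomp() is used here from the stats package of current R (as far as I know the mva functions were merged into stats with 1.9.0).

```r
# Stand-in for the poster's file: names in the first column, header row.
csv <- tempfile(fileext = ".csv")
write.csv(USArrests, csv)

# Read it back, using the first column as row names and stating the
# type of every column (the row-name column included) via colClasses.
a <- read.csv(csv, row.names = 1,
              colClasses = c("character", "numeric", "integer",
                             "integer", "numeric"))
str(a)

# Step 1: the first four COLUMNS (note the comma before 1:4).
first4 <- a[, 1:4]

# Step 2: principal components on those columns, scaled to unit variance.
pca <- prcomp(first4, scale. = TRUE)
print(summary(pca))
biplot(pca)    # observations and variable loadings in one plot

# Step 3 (alternative): change a column type after loading.
a$Assault <- as.numeric(a$Assault)
```

Keeping the data as a data frame, as suggested in the reply, is what allows a different type per column; a matrix would force one common type.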
Re: [R] par() in .Rprofile
Hi,

it's in the release notes from Peter:

>Users may notice that code in .Rprofile is run with only the
>new base loaded and so functions may now not be found. For
>example, ps.options(horizontal = TRUE) should be preceded by
>library(graphics) or called as graphics::ps.options or,
>better, set as a hook -- see ?setHook.

detlef

On Tue, 13 Apr 2004 13:39:20 +0200 "Petr Pikal" <[EMAIL PROTECTED]> wrote:

> Dear all
>
> I installed new version (from binaries) and I noticed that
>
> par(bg="white")
>
> which I have in my .Rprofile causes error message on startup
> But if I issued this command immediately after startup everything worked as
> expected. I did not see any note in changes file or elsewhere.

--
Detlef Steuer --- http://fawn.unibw-hamburg.de/steuer.html
* Encrypted mail preferred *
"The ruling ideas are the ideas of the rulers." --- K. Marx
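Applied to the par() case from this thread, the first remedy in the release note would look roughly like the sketch below. This is my reading of the note, not a tested 1.9.0 profile; note that calling par() with no device open starts the default graphics device, just as it did in the poster's 1.8.1 setup.

```r
# Sketch of a .Rprofile fragment: make sure the graphics package is
# attached before using par(), since only base is loaded when the
# profile is run.
library(graphics)
par(bg = "white")

# Check the setting on the device that par() has just opened.
print(par("bg"))
```

Calling graphics::par(bg = "white") without the library() line is the second variant the note mentions.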
Re: [R] Question
Hi

On 13 Apr 2004 at 14:51, Ivan Yegorov wrote:

> I use R for Windows. I got error "Can not allocate 100 Mb for vector".
> Does R use only physical memory or it can operate with virtual memory?
> What should I do if it does. Thanks in advance.

Starting R with the --max-mem-size=550M option can help. The figure depends on your memory size. Newer versions are better at using memory resources.

Cheers
Petr

Petr Pikal
[EMAIL PROTECTED]
[R] par() in .Rprofile
Dear all

I installed the new version (from binaries) and I noticed that

par(bg="white")

which I have in my .Rprofile causes an error message on startup. But if I issue this command immediately after startup, everything works as expected. I did not see any note in the changes file or elsewhere.

Should I specify the white background in .Rprofile differently? Or is there some other recommended way to set up a white background on startup?

Everything was OK in version 1.8.1. Using W2000.

Startup example:

R : Copyright 2004, The R Foundation for Statistical Computing
Version 1.9.0 (2004-04-12), ISBN 3-900051-00-3

Attaching package 'fun':

        The following object(s) are masked from package:base :

         interaction

Error: couldn't find function "par"
[Previously saved workspace restored]

> par("bg")
[1] "transparent"
> par(bg="white")
> par("bg")
[1] "white"
>

My .Rprofile:

library(fun)
par(bg="white")
RNGkind("Mersenne-Twister", "Inversion")
data(stand)

Thank you
Best regards.

Petr Pikal
[EMAIL PROTECTED]
[R] from .csv file to a pca plot
Hi,

I'm just a beginner, who has just encountered a problem!

1. - I wanted to load a csv file with names in the rows (1st column) and numbers in the 2nd to 10th columns. The file contains names in the headers.

   - I used: a <- as.matrix(read.table("filename", sep=",", row.names=1, header=TRUE))

Question:
  1 - I would like to select the first four columns
  2 - and execute a pca(plot) from the mva package on those four columns
  3 - how can I set the data type (e.g. string, integer, double) separately for each column

Can anyone help me out? Help!

Thanks in advance,

Marinus
[R] Question
I use R for Windows. I got the error "Can not allocate 100 Mb for vector". Does R use only physical memory, or can it operate with virtual memory? What should I do if it does? Thanks in advance.
Re: [R] R 1.9.0 is release
On 12 Apr 2004 14:05:25 +0200, Peter Dalgaard <[EMAIL PROTECTED]> wrote :

>I've rolled up R-1.9.0.tgz a short while ago. This is a new version
>with a number of new features, most notably a substantial
>reorganization of the standard packages, a major update of the grid
>package, and the fact that underscore can now be used as a regular
>character in variable names.

I've just uploaded the Windows build. It should appear on CRAN and the mirrors by tomorrow.

The main Windows-specific changes are the following:

- A "stay on top" option for windows.
- Rcmd can now be written R CMD, as on Unix.
- Tony Plate's "Paste commands only" to paste the commands from a copied block of output

There are many other changes and bug fixes. Here's an extract from the CHANGES file:

rw1090
======

Both Rterm and Rgui now give usage information via the --help or -h command-line flag.

There is now a "Misc|Break to debugger" menu option, enabled when a debugger is detected (somewhat fallibly), or infallibly by the "--debug" command line option. This will cause a trap to an external debugger, e.g. for running Rgui under gdb. If the menu item is selected when not running under a debugger R is likely to crash. If the "--debug" option is used, R will break to the debugger during command line processing, allowing the startup process to be debugged.

Added "stay" argument to bringToTop(), to allow the user to specify that a window should stay on top of other windows. Also added "stay on top" item to the popup menus. All of these require R to be running in SDI mode ("Rgui --sdi" or via the settings in file `Rconsole').

Changed windows() so that new windows fit within the MDI client area.

Added winMenuNames() and winMenuItems() functions to query user menus.

Added menu items for www.r-project.org and CRAN on the help menu. (Wishlist PR#6492)

Added "R" command to be similar to Unix invocation of scripts, e.g. "R CMD INSTALL" is the same as "Rcmd INSTALL". Rcmd still exists for backwards compatibility (and to avoid conflicts over the name `R'). All of R, R CMD and Rcmd now accept --help.

Rcmd Rd2dvi can now be specified as such rather than as Rcmd Rd2dvi.sh.

Added "Paste commands only" to edit and popup menus in the Rgui console. This allows copying of a block of output, but pasting only the commands back to the console for re-execution. (Code contributed by Tony Plate.)

Installation

Parallel make (make -j2, say) can be used, but only usefully on dual-processor (or perhaps hyperthreaded) hosts with at least 384Mb of memory.

Installing now sorts in the C locale to ensure that a consistent sort order is used. (Some aspects of sorting used to be done in the locale of the host machine, but Perl and the cygwin-based tools used the ASCII collation order.)

The long-untested support for making Windows .hlp files has been withdrawn.

There is support for using K. Goto's fast BLAS. On a 2.6Ghz P4 with 1Gb RAM and A a 1000 x 1000 matrix we had the following timings

                R BLAS   ATLAS   Goto
    A %*% A       3.7     0.65   0.56
    svd(A)       16.2     7.77   6.83

Note that using a fast BLAS is much less effective for smaller matrices as are more common in statistical applications.

Faster assembler code for exponentiation is used.

Cross-building of R itself now works again. (It had been broken since 1.8.0.)

Building/installing packages

R CMD INSTALL/build/check map path names with spaces in to their short forms.

R CMD INSTALL now supports versioned install via --with-package-versions.

Installing (binary) package bundles now checks the MD5 sums and reports success, just as for packages.

Added "* DONE" to the end of INSTALL logs so --install option to CHECK will work. (This is a repository maintainer option; see src/scripts/check.in for docs).

Internal changes

The fast bmp/png/jpeg code introduced in R 1.8.0 is used even for 256-color displays (as we have now been able to test it on such).

R's internal malloc etc are now remapped to Rm_malloc etc and only used in allocating memory for R objects, the Wilcoxon tests and a few other memory-intensive applications.

Improved malloc routines from the current version of Doug Lea's malloc (as suggested by David Teller) should enable large memory areas to be used more effectively, in particular those over 2Gb where OS support has been enabled. The initially requested memory is no longer reserved, but as this malloc is able to work with non-contiguous memory chunks that should not matter.

The installer uses LZMA compression, so Inno Setup >= 4.1.5 is required.

Version 1.2.5 of libpng is now used in binary builds.

Bug fixes

Fixed list.files() to properly handle paths like "C:", etc.

Fixed unlink() to accept empty file list for Unix consistency.

Fixed handling of whitespace in Rd2dvi.sh processing of DESCRIPTION files.

Fixed handling of "--max-mem-size" syntax error on command line.

In RGui, ^T would n