Re: [R] EM unsupervised clustering
Federico, you might also want to have a look at the packages flexclust or flexmix, so you can take into account that you have binary data. The mclust package can be used to estimate mixtures of Gaussian distributions. flexclust implements kmeans-like algorithms, but you can specify a distance measure appropriate for binary data. flexmix allows latent class analysis with binary data, using FLXMCmvbinary() for the component-specific model.

Best, Bettina

Federico Calboli wrote:

Hi All,

I have an n x m matrix. The n rows are individuals, the m columns are variables. The matrix is a collection of 1s (if a variable is observed for an individual) and 0s (if there is no observation). Something like:

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    0    1    1    0    0
[2,]    1    0    1    1    0    0
[3,]    1    0    1    1    0    0
[4,]    0    1    0    0    0    0
[5,]    1    0    1    1    0    0
[6,]    0    1    0    0    1    0

I use kmeans to find 2 or 3 clusters in this matrix:

k2 = kmeans(data, 2, 1000)
k3 = kmeans(data, 3, 1000)

but I would like to use something a bit more refined, so I thought about EM-based clustering. I am using the Mclust() function from the mclust package, but I get the following (to me incomprehensible) error message:

plot(Mclust(as.data.frame(data)), as.data.frame(data))
Hit <Return> to see next plot:
Hit <Return> to see next plot:
Hit <Return> to see next plot:
Error in 1:L : NA/NaN argument
In addition: Warning messages:
1: best model occurs at the min or max # of components considered in: summary.mclustBIC(Bic, data, G = G, modelNames = modelNames)
2: optimal number of clusters occurs at min choice in: Mclust(as.data.frame(anc.st.mat))
3: insufficient input for specified plot in: coordProj(data = data, parameters = x$parameters, z = x$z, what = classification,

That's puzzling, because the example given by ?Mclust is something like

plot(Mclust(iris[,-5]), iris[,-5])

which is pretty simple and dumbproof and works flawlessly...
best,
Federico

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
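For readers landing on this thread: Bettina's flexmix suggestion can be sketched as below. This is a minimal, untested sketch against simulated data, not the poster's matrix; it assumes FLXMCmvbinary() accepts a 0/1 matrix response, as its documentation describes.

```r
# Sketch: latent class analysis of a binary matrix with flexmix,
# following Bettina's suggestion. `m` is a stand-in 0/1 matrix.
library(flexmix)

set.seed(1)
m <- matrix(rbinom(60, 1, 0.5), nrow = 10)  # 10 individuals x 6 variables

fit <- flexmix(m ~ 1, k = 2, model = FLXMCmvbinary())
clusters(fit)  # hard cluster assignments, one per row
```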
[R] R
Hello! I am using the svyglm procedure for logistic regression in survey data. I wondered how the strata affect the estimates' SEs (in addition to the weights given proportional to population size). I know that for simple regression the measurements of each stratum are assumed to have different variances, but in a logistic model this is not the case. Can anyone help me here?

Thank you
Ron
Re: [R] RAM, swap, Error: cannot allocate vector of size, Linux:
Feldman, Maximilian Jeffrey wrote:

Dear Community,

I am very new to the world of Linux and R and I have stumbled upon a problem that I cannot seem to resolve on my own. Here is the relevant background: I am working on a 64-bit Linux Fedora Core 6 OS. I am using R version 2.5.1. I have 3.8 Gb of RAM and 1.9 Gb of swap. As I see it, there are no restraints on the amount of memory that R can use imposed by this particular OS build. When I type the 'ulimit' command at the command line, the response is 'unlimited'.

Here is the problem: I have uploaded and normalized 48 ATH1 microarray slides using the justRMA function:

library(affy)
setwd("/Data/cel")
Data <- justRMA()

The next step in my analysis is to calculate a distance matrix for my dataset using the bioDist package. This is where I get my error:

library(bioDist)
x <- cor.dist(exprs(Data))
Error: cannot allocate vector of size 3.9 Gb

I used the following function to examine my memory limitations:

mem.limits()
nsize vsize
   NA    NA

I believe this means there isn't any specified limit to the amount of memory R can allocate to my task. I realize I only have 3.8 Gb of RAM, but I would expect that R would use my 1.9 Gb of swap.

It does, if swap works at all on your machine. However, the error message relates to the object that R fails to create, not the total memory usage. I.e. this might very well be the _second_ object of size 3.9 Gb that you are trying to fit into 5.7 Gb of memory. You could try increasing the swap space (the expedient, although perhaps not efficient, way is to find a file system with a few tens of Gb to spare and create a large swapfile on it).

Does R not use my swap space? Can I explicitly tell R to use my swap space for large tasks such as this? I was not able to find any information regarding this particular issue in the R Linux manual, Linux FAQ, or in previous listserv threads. Many of the users who had similar questions resolved their problems in a different manner.
Thanks to anyone who thinks they can provide assistance!

Max
Graduate Student
Molecular Plant Sciences
Washington State University
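As a back-of-the-envelope check of why the failing object is 3.9 Gb (a sketch; the probe-set count is an assumption based on the ATH1 array having roughly 22,810 probe sets):

```r
# An n x n matrix of doubles needs 8 * n^2 bytes. A gene-by-gene
# correlation/distance matrix over ~22810 ATH1 probe sets is therefore:
n <- 22810
8 * n^2 / 2^30  # about 3.9 (Gb), matching the allocation error
```

So the allocation that fails is roughly the size of the full gene-gene matrix, independent of how much swap is configured.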
Re: [R] Classification
For all who sent help on the topic Classification: thank you very much, folks. I have got some inspiration on how to solve this task.

Michael

----- Original Message -----
From: Marc Schwartz [EMAIL PROTECTED]
To: Ing. Michal Kneifl, Ph.D. [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Sent: Wednesday, July 18, 2007 7:53 PM
Subject: Re: [R] Classification

On Wed, 2007-07-18 at 19:36 +0200, Ing. Michal Kneifl, Ph.D. wrote:

Hi, I am also a quite new user of R and would like to ask you for help: I have a data frame where all columns are numeric variables. My aim is to convert one column to factors. Example:

MD
0.2
0.1
0.8
0.3
0.7
0.6
0.01
0.2
0.5
1
1

I want to make classes:

0-0.2     A
0.21-0.4  B
0.41-0.6  C
... and so on

So after classification I will get:

MD
A
A
D
B
... and so on

Please could you give an advice to a newbie? Thanks a lot in advance.

Michael

See ?cut

You can then do something like:

DF
     MD
1  0.20
2  0.10
3  0.80
4  0.30
5  0.70
6  0.60
7  0.01
8  0.20
9  0.50
10 1.00
11 1.00

cut(DF$MD, breaks = seq(0, 1, .2), labels = LETTERS[1:5])
 [1] A A D B D C A A C E E
Levels: A B C D E

HTH,
Marc Schwartz
Re: [R] Strange warning in summary.lm
ONKELINX, Thierry wrote:

Dear useRs,

Lately I noticed a strange warning in the summary of an lm object. Any idea what this warning is about? I'm using R 2.5.1 on Win XP pro.

x <- rnorm(100)
y <- rnorm(100)
summary(lm(y~x))

Call:
lm(formula = y ~ x)

Residuals:
     Min       1Q   Median       3Q      Max
-1,77809 -0,68438 -0,04409  0,63891  2,30863

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0,00217    0,09244  -0,023    0,981
x            0,01315    0,09628   0,137    0,892

Residual standard error: 0,9236 on 98 degrees of freedom
Multiple R-Squared: 0.0001903, Adjusted R-squared: -0.01001
F-statistic: 0.01866 on 1 and 98 DF, p-value: 0,8916

Warning message:
NAs introduced by coercion in: as.double.default(Cf[okP])

Probably you have an object in your workspace or in another attached environment in the search path that conflicts with objects that are required to call summary(lm(...)). E.g. some lm... or summary... function?

Best,
Uwe Ligges

Thanks,
Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4, 9500 Geraardsbergen, Belgium
tel. + 32 54/436 185
[EMAIL PROTECTED]
www.inbo.be

"Do not put your faith in what statistics say until you have carefully considered what they do not say." ~William W. Watt
"A statistical analysis, properly conducted, is a delicate dissection of uncertainties, a surgery of suppositions." ~M.J. Moroney
Re: [R] Strange warning in summary.lm
The problem also exists in a clean workspace. But I've found the troublemaker: I had set options(OutDec = ","). Resetting this to options(OutDec = ".") solved the problem.

Thanks,
Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4, 9500 Geraardsbergen, Belgium
tel. + 32 54/436 185
[EMAIL PROTECTED]
www.inbo.be

"Do not put your faith in what statistics say until you have carefully considered what they do not say." ~William W. Watt
"A statistical analysis, properly conducted, is a delicate dissection of uncertainties, a surgery of suppositions." ~M.J. Moroney

-----Original Message-----
From: Uwe Ligges [mailto:[EMAIL PROTECTED]]
Sent: Thursday, 19 July 2007 10:39
To: ONKELINX, Thierry
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Strange warning in summary.lm

Probably you have an object in your workspace or in another attached environment in the search path that conflicts with objects that are required to call summary(lm(...)). E.g. some lm... or summary... function?

Best,
Uwe Ligges
[R] tapply
hello,

i want to compute the mean of a variable (aps) for every class (1, 2, and 3). every id has a few observations; aps and class are constant over id. like this:

id aps class
 1  11     2
 1  11     2
 1  11     2
 1  11     2
 1  11     2
 2   8     3
 2   8     3
 2   8     3
 3  12     2
 3  12     2
 .
 .

i tried:

tapply(icu1$aps_st, icu1$hidclass, function(z) mean(unique(z)))

but it's counting every row and not every id.

thank you, Sigalit.
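One way to get a per-id-then-per-class mean is to first reduce the data to one row per id and only then average within class. A sketch, using the column names from the posted example (the real names in icu1 apparently differ):

```r
# Collapse to one row per id (aps and class are constant within id),
# then average aps within each class.
icu1 <- data.frame(id    = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3),
                   aps   = c(11, 11, 11, 11, 11, 8, 8, 8, 12, 12),
                   class = c(2, 2, 2, 2, 2, 3, 3, 3, 2, 2))

one_per_id <- unique(icu1[, c("id", "aps", "class")])
tapply(one_per_id$aps, one_per_id$class, mean)
# class 2 -> mean(11, 12) = 11.5; class 3 -> 8
```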
Re: [R] Is there a facility in R similar to MatLab
Dylan Arena <darena at stanford.edu> writes:

Hi, I'm trying to use R to get eigenvalues and eigenvectors of a matrix whose elements are of the form (2 * lambda), -(lambda + mu), etc. I'd like R to treat this matrix as a numeric matrix without treating lambda and mu as variable names, but rather as some sort of atomic quantities (and hence give eigenvectors in terms of mu and/or lambda). MatLab and Mathematica both do this, but I'm not sure whether R does. Does anyone have any ideas about how to do this? Please let me know,

Dylan

You are asking for a computer algebra problem, and thus you should apply a Computer Algebra System (CAS) such as Maple, Mathematica, MuPAD, or the free software Maxima. By the way, Matlab does _not_ do this itself; it utilizes the Maple kernel in its Symbolic Toolbox.

In R you can try the Ryacas package, but I personally would prefer one of the more extended systems. Maxima with the wxMaxima interface is quite nice, and for MuPAD there is an inexpensive personal copy available.

Hans Werner Borchers
[R] Subsetting dataframes
Dear all!

W2k, R 2.5.1

I am working with an ongoing malting barley variety evaluation within Sweden. The structure is 25 cultivars tested each year at four sites, in field trials with three replicates and 'lattice' structure (the replicates are divided into five sub-blocks in a structured way). As we normally keep around 15 varieties from each year to the next, and take in 10 new ones for the next year, we have tested 72 different varieties in total during five years.

I store the data in a field trial database, generate text tables with the subset of data I want, and import the frame into R. I take in all cultivars in R and use 'subset' to select what I want to look at. Using lme{nlme} works with no problems to get mean results over the years, but as I now have a number of years I want to analyse the general site x cultivar relation. I am testing AMMI{agricolae} for this, and it seems to work except for the subsetting. This is what happens. If I do the subsetting like this:

x62_samvar <- subset(x62_5, cn %in% c("Astoria", "Barke", "Christina", "Makof", "Prestige", "Publican", "Quench"))

a test run with AMMI seems to work in the first part:

AMMI(site, cn, rep, yield)

ANALYSIS AMMI: yield
Class level information
ENV: Hag Klb Bjt Ska
GEN: Astoria Prestige Makof Christina Publican Quench
REP: 1 2 3
Number of observations: 240
model Y: yield ~ ENV + REP%in%ENV + GEN + ENV:GEN

Analysis of Variance Table

Response: Y
           Df    Sum Sq  Mean Sq F value    Pr(>F)
ENV         3 120092418 40030806 90.0424 1.665e-06 ***
REP(ENV)    8   3556620   444578  0.5674  0.803923
GEN         5  21376142  4275228  5.4564 9.680e-05 ***
ENV:GEN    15  28799807  1919987  2.4504  0.002555 **
Residuals 208 162973213   783525
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Coeff var  Mean yield
13.08629   6764.098

After this something goes wrong, as AMMI finds a cultivar name not selected in the subsetting.
(The plotting might go wrong anyhow, but I haven't got that far yet):

Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
  factor 'y' has new level(s) Arkadia

Looking at the data frame using edit(x62_samvar) only shows the selected lines, but using levels() gives another answer, as levels(x62_samvar$cn) gives back all 72 cultivar names used during the five years (starting with Arkadia).

Where do I go wrong, and how do I use subset in a proper way?

Thanks /CG

-- 
CG Pettersson, PhD
Swedish University of Agricultural Sciences (SLU)
Dept. of Crop Production Ecology. Box 7043. SE-750 07 Uppsala, Sweden
[EMAIL PROTECTED]
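A hedged note on the likely cause: subset() keeps the factor's full set of levels, so `cn` still carries all 72 cultivar names even though the rows are gone. Re-encoding the factor after subsetting drops the unused levels. A minimal sketch (object and column names borrowed from the post, data made up):

```r
# Unused factor levels survive subsetting; re-apply factor() to drop them.
x62_5 <- data.frame(cn = factor(c("Arkadia", "Barke", "Quench")))
x62_samvar <- subset(x62_5, cn %in% c("Barke", "Quench"))

levels(x62_samvar$cn)                   # still includes "Arkadia"
x62_samvar$cn <- factor(x62_samvar$cn)  # drop unused levels
levels(x62_samvar$cn)                   # now only "Barke" "Quench"
```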
[R] mfrow is ignored by some plots
Hi all,

I was just informed that the plots in the radial.plot family in the plotrix package do not plot correctly when using mfrow or mfcol to subdivide the plot window. I found one related message, an answer from Deepayan Sarkar to a question about lattice graphics, but that shed no light on this problem.

If I invoke par(mfrow=c(2,2)) and run radial.plot a few times, the plots appear in the upper left corner. _Sometimes_ the second plot appears in the upper right corner once, but then it returns to the upper left. If I invoke par(new=TRUE), the plots appear in the lower right corner. For instance, the following just keeps writing the plots in the upper left:

par(mfrow=c(2,2))
radial.plot(rnorm(7), 0:6, line.col="red", lwd=2)
radial.plot(rnorm(7), 0:6, line.col="red", lwd=2)
radial.plot(rnorm(7), 0:6, line.col="red", lwd=2)
radial.plot(rnorm(7), 0:6, line.col="red", lwd=2)

If I then enter par(new=TRUE) and repeat the plots, they all accumulate in the lower right.

I think this may have something to do with the calls to points or lines or symbols within radial.plot, but I could find no information on this in the help for plot, par, or several other pages. Any clues?

Jim
[R] write.table linebreaks
Hello R users,

I am a newbie using R 2.5.0 on an Apple PowerBook G4 with Mac OS X 10.4.10. When I use the write.table function, I always get the output with Unix linebreaks, which I have to change to Macintosh linebreaks to be able to import the data into Excel 2004 for Mac. Is there a possibility to do this automatically in R, specifically in the write.table function?

Thanks in advance.

Birgit

Birgit Lemcke
Institut für Systematische Botanik
Zollikerstrasse 107
CH-8008 Zürich
Switzerland
Ph: +41 (0)44 634 8351
[EMAIL PROTECTED]
[R] fSeries GARCH(1,1)
Hello all,

I am trying to use the garchFit function in the fSeries package to fit a GARCH(1,1) model with a t distribution. I am using the following code:

fit <- garchFit(~garch(1,1), data, cond.dist = "dstd")
fitted(fit)

I was expecting fitted(fit) to return the fitted volatility, but the result turns out to be a series of the same repeated value. I tried changing the distribution to normal, and the same thing happened. Could anyone give me some advice? Many thanks.
Re: [R] mfrow is ignored by some plots
I'd ask the plotrix maintainer to fix his code!

radial.plot saves and restores all the par() values, including mfrow and mfg. When you do par(mfrow=c(2,2)) the plot position is reset to the bottom right, and the next plot will advance to the top left (but par(new=TRUE) negates that).

Please restore only the par values you have changed, not all the no.readonly ones. That values can be set does not make it a good idea to do so.

On Thu, 19 Jul 2007, Jim Lemon wrote:

Hi all, I was just informed that the plots in the radial.plot family in the plotrix package do not plot correctly when using mfrow or mfcol to subdivide the plot window. [...]

-- 
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford,            Tel: +44 1865 272861 (self)
1 South Parks Road,                   +44 1865 272866 (PA)
Oxford OX1 3TG, UK               Fax: +44 1865 272595
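Prof. Ripley's advice (save and restore only the par values you actually change) can be sketched like this; a generic illustration, not the actual plotrix code:

```r
# Query only the settings we intend to modify, and restore exactly those.
my_plot <- function(x, y) {
  oldpar <- par("mar", "xpd")   # returns a list of just these two values
  on.exit(par(oldpar))          # restore them when the function exits
  par(mar = c(2, 2, 2, 2), xpd = TRUE)
  plot(x, y)
}
```

Because only "mar" and "xpd" are touched, settings like mfrow and mfg keep advancing the plot position as the user expects.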
Re: [R] mfrow is ignored by some plots
Prof Brian Ripley wrote:

I'd ask the plotrix maintainer to fix his code! radial.plot saves and restores all the par() values, including mfrow and mfg. When you do par(mfrow=c(2,2)) the plot position is reset to the bottom right, and the next plot will advance to the top left (but par(new=TRUE) negates that). Please restore only the par values you have changed, not all the no.readonly ones. That values can be set does not make it a good idea to do so.

Thank you Prof. Ripley, an erudite and functional answer as usual.

Jim
Re: [R] R and Copula
I have installed all the packages for copula: I have installed the package 'gnlm' and the others from Jim Lindsey's web site, and the other packages like 'repeated' which are necessary for calculating copulas. But when I start with copula it says: function 'gausscop' not found. The same for 'fitted.gausscop(z)' and 'residuals.gausscop(z)'.

Also, what I want to say is: when I installed the package 'repeated' from Jim Lindsey's web site in R, this is what I received:

utils:::menuInstallLocal()
updating HTML package descriptions

What do I have to do next to solve this problem and to continue with copula?

Thanks a lot for previous help!

gyadav wrote:

hi meeryana, maybe this time nobody is responding, but don't worry, you will get a lot of help eventually, so always post a copy to the mailing list. The reason is that there are a lot of newbies, although I am also not so old here, who have even the simplest questions but are hesitant to ask. The objective of the list is to help, so feel free to ask, stand up and contribute :-) I hope you will understand, if not today then tomorrow, as you get associated with the mailing list more closely. Personally I have found friends here. Further, never post a mail with a loose subject, and secondly try to maintain the thread, i.e. reply to the mail which you want to reply to (for more details refer to the posting guide) :-) That would really help. Now, regarding your question :-)

Step 1: The list of all packages can be found at http://cran.r-project.org/src/contrib/PACKAGES.html
Step 2: Click on the package you want to install, e.g. http://cran.r-project.org/src/contrib/Descriptions/copula.html
Step 3: Then download the binary for your operating system. If Windows, then download the corresponding zip file.
For copula it is http://cran.r-project.org/bin/windows/contrib/r-release/copula_0.5-3.zip — save the zip file on your system.
Step 4: Open your R program.
Step 5: Go to Packages -> Install packages from local zip file.
Step 6: Select the package zip file you want to install.
Step 7: Sit back and relax.
Step 8: Load the library using library(LibraryName) at the R prompt.

There are alternate ways of installing the package directly from the R prompt. It didn't work for me a long time back, so I always adopt this method. Somebody on the list may help you in this regard :-)

bye and learn. Join and stand with the Open Source and R community.

Cheers and Ciao,
Welcome
-gaurav

dear Mr. Yadav, I want to thank you for the help, and for being the only one willing to help, but I have one question: because I'm new to R as well, I think I should install a package for copula. I have only installed the R program. How should I install this package? And is this what I also have to do for credit metrics, Value at Risk, matrix and the other formulas, I mean install packages? I hope that you have a little time for me and my problem, and I hope I'm not disturbing you.
Thank you for all you can do for me and best regards,
Mirjana

gyadav wrote:

hi, see the code below. I hope this will make your understanding of copulas better. This code plots two normal distributions and their joint distribution: N[0,2] and N[0,4]. HTH

## code (copy in two parts into R)
library(copula)

## PART A ##
## construct a bivariate distribution whose marginals
## are normal and Normal respectively, coupled
## together via a normal copula
op <- par(mfrow = c(2, 2),  # 2 x 2 pictures on one plot
          pty = "s")        # square plotting region, independent of device size

x <- mvdc(normalCopula(0.75), c("norm", "norm"),
          list(list(mean = 0, sd = 2), list(mean = 0, sd = 4)))
x.samp <- rmvdc(x, 1)

par(mfrow = c(2, 3))
hist(x.samp[, 1], xlab = "Normal")
hist(x.samp[, 2], xlab = "Normal")
plot(x.samp[, 2], x.samp[, 1], pch = 21, xlab = "Normal", ylab = "Normal")
plot(dmvdc(x, x.samp))
plot(pmvdc(x, x.samp))

## PART B ##
## At end of plotting, reset to previous settings:
par(op)
for (i in seq(1:360)) {
  persp(x, dmvdc, xlim = c(-4, 4), ylim = c(0, 1), theta = i)
}

Regards,
Gaurav Yadav
+++ Assistant Manager, CCIL, Mumbai (India)
Mob: +919821286118  Email: [EMAIL PROTECTED]
Bhagavad Gita: Man is made by his Belief, as He believes, so He is

copula [EMAIL PROTECTED] wrote:

it would be great when somebody will help me. thanks

copula wrote:

hi, first I want to say that I'm new here, and new with copula and R. That is the reason why I'm writing, if somebody can help me. I have to make an example of Copula. On the internet I've found this forum and that copula can
[R] df manipulation
I have multicolumn data.frames with the first column giving an ordinal observation index (e.g. 1 4 7 9 11 13 etc.). I would like to fill up the missing observations (i.e. 2 3 5 6 8 etc.) with NAs.

Thank you
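One way to do this (a sketch; the index column name `idx` and the data are made up) is to merge against a data frame containing the complete index, so missing rows come back as NA:

```r
# Expand a data frame so every ordinal index appears, with NA in the
# other columns where no observation exists.
df <- data.frame(idx = c(1, 4, 7), value = c(10, 40, 70))

full <- data.frame(idx = seq_len(max(df$idx)))
filled <- merge(full, df, by = "idx", all.x = TRUE)
filled$value  # 10 NA NA 40 NA NA 70
```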
Re: [R] write.table linebreaks
What do you think the 'eol' argument to write.table is for? I don't have a Mac to hand, but eol = "\r" does this on Linux and Windows.

On Thu, 19 Jul 2007, Birgit Lemcke wrote:

Hello R users, I am a newbie using R 2.5.0 on an Apple PowerBook G4 with Mac OS X 10.4.10. When I use the write.table function, I always get the output with Unix linebreaks, which I have to change to Macintosh linebreaks to be able to import the data into Excel 2004 for Mac. Is there a possibility to do this automatically in R, specifically in the write.table function? Thanks in advance. Birgit

-- 
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford,            Tel: +44 1865 272861 (self)
1 South Parks Road,                   +44 1865 272866 (PA)
Oxford OX1 3TG, UK               Fax: +44 1865 272595
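A minimal sketch of the suggestion (the file name and data frame are made up):

```r
# Write a table with CR ("\r") line endings, as used by classic Mac tools.
df <- data.frame(a = 1:2, b = c("x", "y"))
f <- tempfile(fileext = ".txt")
write.table(df, file = f, sep = "\t", eol = "\r", quote = FALSE)
```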
[R] Error for TukeyHSD ?
Hi, one question about TukeyHSD in R: it seems to work only for models without Error terms, but not when an Error is specified.

Examples:

aov1 <- aov(MMN_ind ~ (GenCond*Lang) + Error(VP), data = cutie_all)
TukeyHSD(aov1, "Lang", ordered = TRUE, conf.level = 0.95)

The German error message tells me it's not a method that can be used with TukeyHSD:

Fehler in TukeyHSD(aov1, ordered = TRUE, conf.level = 0.95) :
  keine anwendbare Methode für TukeyHSD
(i.e. "no applicable method for TukeyHSD")

A model like this is no problem:

aov1 <- aov(MMN_ind ~ GenCond*Lang, data = cutie_all)

I am very grateful for any help!

Silvia

-- 
Dr. Silvia C. Lipski
University of Konstanz, Department of Linguistics, Neurolinguistics
Fach D 185, 78457 Konstanz, Germany
Tel: +49 7531 88-2927  Fax: +49 7531 88-2741
Email: [EMAIL PROTECTED]
www.uni-konstanz.de/FuF/Philo/Sprachwiss/neuroling/Lipski.html
Re: [R] Strange warning in summary.lm
ONKELINX, Thierry wrote:

The problem also exists in a clean workspace. But I've found the troublemaker: I had set options(OutDec = ","). Resetting this to options(OutDec = ".") solved the problem. Thanks, Thierry

Oups. That sounds like there's a bug somewhere. Can you cook up a minimal example which shows the behaviour?

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                     FAX: (+45) 35327907
Re: [R] Strange warning in summary.lm
Dear Peter,

Here's an example. Notice the warning in the last two lines of the summary with options(OutDec = ","). It's not present with options(OutDec = ".").

Cheers,
Thierry

x <- runif(100)
y <- rnorm(100)
options(OutDec = ",")
summary(lm(y~x))

Call:
lm(formula = y ~ x)

Residuals:
      Min        1Q    Median        3Q       Max
-2,389749 -0,607002  0,006969  0,689535  1,713197

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0,03397    0,17774   0,191    0,849
x           -0,09219    0,29518  -0,312    0,755

Residual standard error: 0,868 on 98 degrees of freedom
Multiple R-Squared: 0.0009943, Adjusted R-squared: -0.0092
F-statistic: 0.09754 on 1 and 98 DF, p-value: 0,7555

Warning message:
NAs introduced by coercion in: as.double.default(Cf[okP])

options(OutDec = ".")
summary(lm(y~x))

Call:
lm(formula = y ~ x)

Residuals:
      Min        1Q    Median        3Q       Max
-2.389749 -0.607002  0.006969  0.689535  1.713197

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.03397    0.17774   0.191    0.849
x           -0.09219    0.29518  -0.312    0.755

Residual standard error: 0.868 on 98 degrees of freedom
Multiple R-Squared: 0.0009943, Adjusted R-squared: -0.0092
F-statistic: 0.09754 on 1 and 98 DF, p-value: 0.7555

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4, 9500 Geraardsbergen, Belgium
tel. + 32 54/436 185
[EMAIL PROTECTED]
www.inbo.be

-----Original Message-----
From: Peter Dalgaard [mailto:[EMAIL PROTECTED]]
Sent: Thursday, 19 July 2007 13:37
To: ONKELINX, Thierry
Cc: Uwe Ligges; r-help@stat.math.ethz.ch
Subject: Re: [R] Strange warning in summary.lm

Oups. That sounds like there's a bug somewhere. Can you cook up a minimal example which shows the behaviour?
Re: [R] Strange warning in summary.lm
On Thu, 19 Jul 2007, Peter Dalgaard wrote:

> ONKELINX, Thierry wrote:
>> The problem also exists in a clean workspace. But I've found the troublemaker. I had set options(OutDec = ","). Resetting this to options(OutDec = ".") solved the problem. Thanks, Thierry
> Oups. That sounds like there's a bug somewhere. Can you cook up a minimal example which shows the behaviour?

Any use of summary.lm will do it (e.g. example(lm)). The problem is in printCoefmat, at

    x0 <- (xm[okP] == 0) != (as.numeric(Cf[okP]) == 0)

and yes, it looks like an infelicity to me.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK
Tel: +44 1865 272861 (self), +44 1865 272866 (PA)  Fax: +44 1865 272595
[R] test about distribution of data in a single population
Dear all,

I would like to know how I can test which intervals of my data have significantly fewer or more counts than the other intervals. Example:

Interval  [1:200]  [200:400]  [400:600]  [600:800]  ... (900 more columns)
Count     122875

Thanks in advance,
Best regards,
João Fadista
Ph.D. student, University of Aarhus, Faculty of Agricultural Sciences
Dept. of Genetics and Biotechnology
Blichers Allé 20, P.O. Box 50, DK-8830 Tjele
Phone: +45 8999 1900  E-mail: [EMAIL PROTECTED]  Web: www.agrsci.org
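One standard approach (a sketch, not from the thread): treat this as a chi-square goodness-of-fit problem against a uniform distribution over the intervals, then inspect the Pearson residuals to see which intervals deviate. The counts below are invented for illustration, since the count row in the original post is garbled.

```r
# Hypothetical counts per interval; H0: all intervals are equally likely.
counts <- c(12, 28, 7, 5, 20, 18)
gof <- chisq.test(counts)
gof$p.value    # overall test of uniformity across intervals
gof$residuals  # Pearson residuals; large |values| flag deviating intervals
```

With many intervals, a multiple-testing correction (e.g. p.adjust) on per-interval tests would also be needed.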
Re: [R] tapply
I do not understand what you want. If aps is constant over each class, then the mean for each class is equal to any single value of aps. Using your example you can do

    tapply(icu1$aps, icu1$d, mean)

but it does not give you anything new. Can you explain the problem a bit more?

--- sigalit mangut-leiba [EMAIL PROTECTED] wrote:

> hello,
> I want to compute the mean of a variable (aps) for every class (1, 2 and 3). Every id has a few obs.; aps and class are constant over id. Like this:
>
> id aps class
>  1  11     2
>  1  11     2
>  1  11     2
>  1  11     2
>  1  11     2
>  2   8     3
>  2   8     3
>  2   8     3
>  3  12     2
>  3  12     2
>  ...
>
> I tried:
>     tapply(icu1$aps_st, icu1$hidclass, function(z) mean(unique(z)))
> but it's counting every row and not every id.
> Thank you, Sigalit.
[R] multinomial logit estimation
Good morning,

I'd like to estimate a simple multinomial logit model in R (not a McFadden conditional logit). For instance, I'd like to estimate the probability of someone having one of eight titles in a company, with the independent variables being the company characteristics. A binary logit is well documented. What about the multinomial?

Thanks,
Walt Paczkowski
Re: [R] Error for TukeyHSD ?
TukeyHSD, and glht in the multcomp package, are designed for aov objects. They do not have methods for aovlist objects. The workaround is to download and install the HH package and look at the maiz example.

    library(HH)
    ?MMC

In that example I show how to get the same sequential sums of squares without the Error() term as you would get with the Error term.

Rich
Re: [R] Strange warning in summary.lm
Prof Brian Ripley wrote:
> On Thu, 19 Jul 2007, Peter Dalgaard wrote:
>> [earlier discussion snipped]
> Any use of summary.lm will do it (e.g. example(lm)). The problem is in printCoefmat, at
>     x0 <- (xm[okP] == 0) != (as.numeric(Cf[okP]) == 0)
> and yes, it looks like an infelicity to me.

Ick. Any better ideas than

    printsAs0 <- scan(con <- textConnection(Cf[okP]), dec = options("outDec")); close(con)
    x0 <- (xm[okP] == 0) != printsAs0

?

--
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr. B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918  Fax: (+45) 35327907
Re: [R] df manipulation
Hi, see below:

> df
  index value
1     1     1
2     4     6
3     7     4
4     9     5
5    11     3
6    13     2

> foo <- function(x){
+   index <- ifelse(x %in% df$index, df$value[which(df$index %in% x)], NA)
+   return(index)
+ }
> df_ok <- data.frame(index = 1:13, value = sapply(1:13, foo))
> df_ok
   index value
1      1     1
2      2    NA
3      3    NA
4      4     6
5      5    NA
6      6    NA
7      7     4
8      8    NA
9      9     5
10    10    NA
11    11     3
12    12    NA
13    13     2

--
Henrique Dallazuanna
Curitiba - Paraná - Brasil
25° 25' 40" S 49° 16' 22" O

On 19/07/07, Nikola Markov [EMAIL PROTECTED] wrote:
> I have multicolumn data.frames with the first column giving an ordinal observation index (ex.: 1 4 7 9 11 13 etc.). I would like to fill up the missing observations (i.e. 2 3 5 6 8 etc.) with NAs.
> Thank you
Re: [R] Strange warning in summary.lm
Ah, in printCoefmat() there is Cf, which is at some point:

> Cf
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0,2542     0,1875   1,356
x            -0,5346     0,3463  -1,544

and

> cat(Cf)
0,2542 -0,5346 0,1875 0,3463 1,356 -1,544

(i.e. the values are character), and then

    x0 <- (xm[okP] == 0) != (as.numeric(Cf[okP]) == 0)

includes as.numeric(Cf[okP]), which won't work any more. I think this design is not the best possible.

Nevertheless, the real problem is in format(), because it does not use its default `decimal.mark = "."` if options(OutDec = ",") has been set, and even ignores manually chosen values:

# nice:
options(OutDec = ".")
format(0.1, decimal.mark = ":")
# [1] "0:1"

# bug:
options(OutDec = ",")
format(0.1, decimal.mark = ":")
# [1] "0,1"

Best,
Uwe

ONKELINX, Thierry wrote:
> Dear Peter,
> Here's an example. Notice the warning in the last two lines of the summary with options(OutDec = ","). It's not present with options(OutDec = ".").
> Cheers, Thierry
> [quoted example, signature, and earlier exchange snipped]
Re: [R] multinomial logit estimation
Walter Paczkowski wrote:
> Good morning,
> I'd like to estimate a simple multinomial logit model in R (not a McFadden conditional logit). For instance, I'd like to estimate the probability of someone having one of eight titles in a company with the independent variables being the company characteristics. A binary logit is well documented. What about the multinomial?
> Thanks, Walt Paczkowski

Have you considered the results of

    RSiteSearch("multinomial", restrict = "functions")

which point to a number of relevant packages and functions?

--
Chuck Cleland, Ph.D.
NDRI, Inc., 71 West 23rd Street, 8th floor, New York, NY 10010
tel: (212) 845-4495 (Tu, Th)  tel: (732) 512-0171 (M, W, F)  fax: (917) 438-0894
Re: [R] Subsetting dataframes
CG Pettersson wrote:

Dear all!

W2k, R 2.5.1

I am working with an ongoing malting barley variety evaluation within Sweden. The structure is 25 cultivars tested each year at four sites, in field trials with three replicates and "lattice" structure (the replicates are divided into five sub-blocks in a structured way). As we normally keep around 15 varieties from each year to the next, and take in 10 new ones for the next year, we have tested 72 different varieties in total during five years.

I store the data in a field trial database, generate text tables with the subset of data I want, and import the frame to R. I take in all cultivars in R and use subset() to select what I want to look at. Using lme{nlme} works with no problems to get mean results over the years, but as I now have a number of years I want to analyse the general site x cultivar relation. I am testing AMMI{agricolae} for this, and it seems to work except for the subsetting. This is what happens.

If I do the subsetting like this:

x62_samvar <- subset(x62_5, cn %in% c("Astoria", "Barke", "Christina", "Makof",
                                      "Prestige", "Publican", "Quench"))

a test run with AMMI seems to work in the first part:

> AMMI(site, cn, rep, yield)
ANALYSIS AMMI: yield
Class level information
ENV: Hag Klb Bjt Ska
GEN: Astoria Prestige Makof Christina Publican Quench
REP: 1 2 3
Number of observations: 240
model Y: yield ~ ENV + REP%in%ENV + GEN + ENV:GEN

Analysis of Variance Table

Response: Y
           Df    Sum Sq  Mean Sq F value    Pr(>F)
ENV         3 120092418 40030806 90.0424 1.665e-06 ***
REP(ENV)    8   3556620   444578  0.5674  0.803923
GEN         5  21376142  4275228  5.4564 9.680e-05 ***
ENV:GEN    15  28799807  1919987  2.4504  0.002555 **
Residuals 208 162973213   783525
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Coeff var: 13.08629   Mean yield: 6764.098

After this something goes wrong, as AMMI finds a cultivar name not selected in the subsetting.
(The plotting might go wrong anyhow, but I haven't got that far yet):

Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
  factor 'y' has new level(s) Arkadia

Looking at the dataframe using edit(x62_samvar) only shows the selected lines, but using levels() gives another answer, as levels(x62_samvar$cn) gives back all 72 cultivar names used during the five years (starting with Arcadia). Where do I go wrong, and how do I use subset in a proper way?

So you have to drop the levels you are excluding. Example:

x <- factor(letters[1:4])
x
x[1:2]
x[1:2, drop = TRUE]

Uwe Ligges

> Thanks /CG
Re: [R] Error: evaluation nested too deeply when doing heatmap with binary distfunction
zhihua li wrote:
> Hi netters,
> I have a matrix X of size (1000, 100). The values are from -3 to +3. When I tried
>
> heatmap(X, distfun=function(c),dist(c,method="bin"),hclustfun=function(m),hclust(m,method="average"))
>
> I got the error message:
> Error: evaluation nested too deeply: infinite recursion / options(expressions=)?

So, does it help to increase the thresholds? If not, please specify an easily reproducible example that helps us to investigate your problem.

Best,
Uwe Ligges

> However, if I used default parameters for the dist function:
>
> heatmap(X, hclustfun=function(m),hclust(m,method="average"))
>
> there are no error messages at all. But the problem is that I have to use the binary method in my dist function. How can I resolve the problem? Thanks a lot!
Re: [R] multinomial logit estimation
Hi all,

You can consult the help pages of the functions multinom() (in package nnet, i.e. library(nnet)) and vglm() (library(VGAM)).

Hope this helps,
Pierre

Selon Walter Paczkowski [EMAIL PROTECTED]:
> Good morning,
> I'd like to estimate a simple multinomial logit model in R (not a McFadden conditional logit). For instance, I'd like to estimate the probability of someone having one of eight titles in a company with the independent variables being the company characteristics. A binary logit is well documented. What about the multinomial?
> Thanks, Walt Paczkowski
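A minimal sketch with made-up data (the job-title setup is hypothetical): multinom() from package nnet fits the baseline-category multinomial logit Walt describes.

```r
library(nnet)

set.seed(1)
# hypothetical data: job title explained by company characteristics
d <- data.frame(
  title = factor(sample(c("analyst", "manager", "director"), 200, replace = TRUE)),
  size  = rnorm(200),
  sales = rnorm(200)
)
fit <- multinom(title ~ size + sales, data = d, trace = FALSE)
summary(fit)                        # coefficients relative to the baseline title
head(predict(fit, type = "probs")) # fitted class probabilities per row
```

Each non-baseline title gets its own coefficient vector, interpreted as log-odds against the baseline level of the factor.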
Re: [R] Strange warning in summary.lm
On Thu, 19 Jul 2007, Peter Dalgaard wrote:

> Prof Brian Ripley wrote:
>> Any use of summary.lm will do it (e.g. example(lm)). The problem is in printCoefmat, at
>>     x0 <- (xm[okP] == 0) != (as.numeric(Cf[okP]) == 0)
>> and yes, it looks like an infelicity to me.
> Ick. Any better ideas than
>     printsAs0 <- scan(con <- textConnection(Cf[okP]), dec = options("outDec")); close(con)
>     x0 <- (xm[okP] == 0) != printsAs0
> ?

Yes, several. That assumes that getOption("OutDec") (which is what you need) is a single char, since scan() does. If so, chartr() will do the job, and more generally sub() would. Alternatively, we could figure out what representations of zero are possible and grep for them.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK
Tel: +44 1865 272861 (self), +44 1865 272866 (PA)  Fax: +44 1865 272595
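A sketch of the chartr() idea (my illustration, not the actual patch applied to printCoefmat): translate the OutDec character back to "." before coercing the printed coefficients to numeric.

```r
options(OutDec = ",")
printed <- format(c(0.25, -0.53))   # formatted with "," as the decimal mark
# as.numeric(printed) would give NAs with a coercion warning here, whereas
as.numeric(chartr(getOption("OutDec"), ".", printed))  # recovers the numbers
options(OutDec = ".")
```

chartr() is a single-character translation, which is exactly the assumption Ripley points out: it works only because OutDec must be one character.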
Re: [R] df manipulation
Nikola Markov wrote:
> I have multicolumn data.frames with the first column giving an ordinal observation index (ex.: 1 4 7 9 11 13 etc.). I would like to fill up the missing observations (i.e. 2 3 5 6 8 etc.) with NAs.

Please specify reproducible examples, they make it much easier to help!

If you have:

df <- data.frame(index = c(1, 4, 7), x1 = 1:3, x2 = 3:1)
df

then you can say:

temp <- data.frame(index = 1:max(df$index))
df <- merge(temp, df, all.x = TRUE)
df

Uwe Ligges
Re: [R] maximum likelihood estimation
> I need to perform maximum likelihood estimation in R, but I am not sure which command to use. I searched on Google and found an example using the function mlogl, but I couldn't find the package on R. Is there such a function? Or how should I perform my MLE?

http://www.mayin.org/ajayshah/KB/R/documents/mle/mle.html might help.

--
Ajay Shah
http://www.mayin.org/ajayshah
[EMAIL PROTECTED]
http://ajayshahblog.blogspot.com
*(:-? - wizard who doesn't know the answer.
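For a self-contained starting point (a sketch, independent of the linked tutorial): stats4::mle() minimises a user-supplied negative log-likelihood via optim(). The normal example below is invented for illustration.

```r
library(stats4)

set.seed(42)
x <- rnorm(200, mean = 5, sd = 2)

# negative log-likelihood of a normal sample
nll <- function(mu, sigma) -sum(dnorm(x, mean = mu, sd = sigma, log = TRUE))

fit <- mle(nll, start = list(mu = mean(x), sigma = sd(x)))
coef(fit)   # estimates should land near mu = 5, sigma = 2
```

Any custom likelihood can be plugged in the same way; for hard problems you would call optim() directly with method and bounds of your choosing.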
Re: [R] tapply
I also don't understand, but perhaps:

with(df, tapply(aps, list(class, id), mean))

--
Henrique Dallazuanna
Curitiba - Paraná - Brasil
25° 25' 40" S 49° 16' 22" O

On 19/07/07, sigalit mangut-leiba [EMAIL PROTECTED] wrote:
> hello,
> I want to compute the mean of a variable (aps) for every class (1, 2 and 3). Every id has a few obs.; aps and class are constant over id. Like this:
>
> id aps class
>  1  11     2
>  1  11     2
>  1  11     2
>  1  11     2
>  1  11     2
>  2   8     3
>  2   8     3
>  2   8     3
>  3  12     2
>  3  12     2
>  ...
>
> I tried:
>     tapply(icu1$aps_st, icu1$hidclass, function(z) mean(unique(z)))
> but it's counting every row and not every id.
> Thank you, Sigalit.
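A worked version of the per-id-then-per-class idea, with made-up data in the shape Sigalit describes (aps and class constant within id): collapse to one row per id first, then average within class.

```r
icu1 <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2, 3, 3),
  aps   = c(11, 11, 11, 8, 8, 8, 12, 12),
  class = c(2, 2, 2, 3, 3, 3, 2, 2)
)
per.id <- unique(icu1[, c("id", "aps", "class")])  # one row per id
tapply(per.id$aps, per.id$class, mean)
#    2    3
# 11.5  8.0
```

Collapsing first avoids weighting each id by its number of repeated rows.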
[R] Can I test if there are statistical significance between different rows in R*C table?
Dear friends,

My R*C table is as follows:

        better good bad
Group1      16   71  37
Group2       0    4  61
Group3       1    6  57

Can I test whether there are statistically significant differences between Group1 and Group2, Group2 and Group3, and Group1 and Group3, taking into account the multiple comparisons? The table can be set up using the following program:

a <- matrix(data = c(16, 71, 37, 0, 4, 61, 1, 6, 57), nrow = 3, byrow = TRUE)

Thanks very much.

--
With Kind Regards,
Zhi Jie, Zhang, PhD  Tel: 86-21-54237149
Dept. of Epidemiology, School of Public Health, Fudan University
Address: No. 138 Yi Xue Yuan Road, Shanghai, China  Postcode: 200032
Email: [EMAIL PROTECTED]  Website: www.statABC.com
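One way to do this (my sketch, not from the thread): run a chi-square test on each pair of rows and adjust the p-values for the three comparisons, e.g. with Bonferroni. With cells this small, fisher.test() on the same sub-tables is the safer choice.

```r
a <- matrix(c(16, 71, 37,
              0,  4, 61,
              1,  6, 57),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("Group", 1:3), c("better", "good", "bad")))

pairs <- combn(3, 2)   # the three row pairs: 1-2, 1-3, 2-3
pvals <- apply(pairs, 2, function(i) fisher.test(a[i, ])$p.value)
p.adjust(pvals, method = "bonferroni")
```

chisq.test() would warn here because several expected counts are below 5, which is why the sketch uses Fisher's exact test on each 2x3 sub-table.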
[R] (R) Using arguments for the empirical cumulative distribution function
Hi,

I have just started using R. Now I have the following problem: I want to create an empirical cumulative distribution function, and I only came this far:

x <- c(1.6, 1.8, 2.4, 2.7, 2.9, 3.3, 3.4, 3.4, 4, 5.2)
F10 <- ecdf(x)
plot(F10, verticals = TRUE, do.p = TRUE, lwd = 3)

Now I'd like to use arguments such as xlab and main, but I don't know how to integrate them. I hope someone can help me, I am really stuck!
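Titles and axis labels can be passed straight through plot(); plot.ecdf forwards them to the underlying step-function plot. A minimal sketch:

```r
x <- c(1.6, 1.8, 2.4, 2.7, 2.9, 3.3, 3.4, 3.4, 4, 5.2)
F10 <- ecdf(x)
plot(F10, verticals = TRUE, do.points = TRUE, lwd = 3,
     main = "Empirical CDF of x", xlab = "x", ylab = "Fn(x)")
```

(do.p in the original call is just partial matching for do.points.)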
[R] plot centered line on barplot
Dear R users,

I need to plot a histogram of the occurrences in one dataset, and then add a line corresponding to the frequency of another, similar dataset. In order to do this I used the function

hist_data1 <- hist(data1, breaks = seq(0, 50, 5), plot = FALSE)
hist_data2 <- hist(data2, breaks = seq(0, 50, 5), plot = FALSE)

then I plotted the frequency:

barplot(hist_data1$density)
lines(hist_data1$density)

but the line is shifted with respect to the centres of the bars. How can I properly plot the line?

Another question (this one is easy): how can I smooth the curve (not fit with loess or spline)?

tnx
--
Claudio
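barplot() invisibly returns the bar midpoints; using those as the x-coordinates keeps an overlaid line centred on the bars. A sketch with simulated data (data1 is invented here):

```r
set.seed(1)
data1 <- runif(200, 0, 50)
h1 <- hist(data1, breaks = seq(0, 50, 5), plot = FALSE)

mids <- barplot(h1$density)       # mids[i] is the x-centre of bar i
lines(mids, h1$density, lwd = 2)  # line now passes through the bar centres
```

The shift in the original code comes from lines() defaulting to x = 1, 2, 3, ..., while barplot() places bars at its own (wider) midpoint positions.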
Re: [R] Subsetting dataframes
Thanks a lot. But an ignorant R user, like me, needed the code example from Jim Holtman, posted off-list earlier today, to understand that

x62_samvar$cn <- x62_samvar$cn[, drop = TRUE]

was the way to code it. Thank you both! /CG

On Thu, July 19, 2007 3:01 pm, Uwe Ligges said:
> CG Pettersson wrote:
>> Dear all! W2k, R 2.5.1
>> I am working with an ongoing malting barley variety evaluation within Sweden. The structure is 25 cultivars tested each year at four sites, in
>> [snip]
>> Where do I go wrong and how do I use subset in a proper way?
> So you have to drop the levels you are excluding. Example:
>     x <- factor(letters[1:4])
>     x
>     x[1:2]
>     x[1:2, drop = TRUE]
> Uwe Ligges
>> Thanks /CG

--
CG Pettersson, PhD
Swedish University of Agricultural Sciences (SLU)
Dept. of Crop Production Ecology. Box 7043. SE-750 07 Uppsala, Sweden
[EMAIL PROTECTED]
Re: [R] Error: evaluation nested too deeply when doing heatmap with binary distfunction
Sorry, that was a typo. Actually there wasn't a comma after 'function(m)' in my expression. So I'll try to increase the threshold to see if that works. Thanks a lot!

From: jim holtman [EMAIL PROTECTED]
To: zhihua li [EMAIL PROTECTED]
Subject: Re: [R] Error: evaluation nested too deeply when doing heatmap with binary dist function
Date: Thu, 19 Jul 2007 00:19:54 -0400

You seem to have a syntax error in your statement. There appear to be extra commas after 'function(m),'. I think it should be:

heatmap(X, distfun = function(c) dist(c, method = "binary"),
        hclustfun = function(m) hclust(m, method = "average"))

On 7/18/07, zhihua li [EMAIL PROTECTED] wrote:
> Hi netters,
> I have a matrix X of size (1000, 100). The values are from -3 to +3. When I tried
> heatmap(X, distfun=function(c),dist(c,method="bin"),hclustfun=function(m),hclust(m,method="average"))
> I got the error message:
> Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
> However, if I used default parameters for the dist function there are no error messages at all. But the problem is that I have to use the binary method in my dist function. How can I resolve the problem? Thanks a lot!

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
[R] A small problem with as.timeSeries() function...
Dear Users,

I have a problem with the as.timeSeries() function. When I am using this function, I cannot assign my own date column. Is it possible to do this? For example, my data frame is like below:

Date        x1   x2  ...
01/01/2005  2.3  1.5 ...
01/02/2005  1.2  2.2 ...
and so on...

For getReturns(), this data frame must be a time series, so the result is the same as what as.timeSeries() converts. The problem is that the rownames begin from 1970, and the year, day and month differ from my time range. Plotting is another problem as well. What is the way to fix this, or how can I convert getReturns()'s results back to the original data frame format above?

Thanks already for your hints.
With my best regards,
Levent TERLEMEZ
Re: [R] dates() is a great date function in R
This is great. I haven't seen the as.Date function before, and was using as.date from library(date) (note the lowercase 'd').

I have an alternative which might or might not be faster... If the date is formatted yyyymmdd (e.g. 20070719):

library(date)
formatted <- gsub("^(\\d{4})(\\d{2})(\\d{2})$", "\\2-\\3-\\1", d$yyyymmdd, perl = TRUE)
d$dates <- as.date(formatted)

Since as.date only accepts certain types of date formats, I had to use gsub to reshuffle the date substrings around. as.Date returns objects of class "Date", whereas as.date returns objects of class "date". Not sure what the differences are, but a simple test below shows that the as.date conversion is slightly faster, given a character vector of date entries.

FYI, I ran a quick performance comparison test on a 64-bit Linux machine on a 2.6.9 kernel. The test is very rudimentary, but hopefully useful... I have two scripts:

## as.date_test.R ##
library(date)
d <- read.table("/tmp/dates", as.is = TRUE,
                col.names = c("yyyymmdd"), colClasses = c("character"))
formatted <- gsub("^(\\d{4})(\\d{2})(\\d{2})$", "\\2-\\3-\\1", d$yyyymmdd, perl = TRUE)
d$dates <- as.date(formatted)
print(nrow(d))
print(d$dates[1:3])

## as.Date_test.R ##
d <- read.table("/tmp/dates", as.is = TRUE,
                col.names = c("yyyymmdd"), colClasses = c("character"))
d$dates <- as.Date(d$yyyymmdd, format = "%Y%m%d")
print(nrow(d))
print(d$dates[1:3])

Both scripts read in the same text file containing the date strings, and then convert them into appropriate date objects.
# 1 date records in a flat file
[EMAIL PROTECTED]$ wc -l /tmp/dates
1 /tmp/dates

# just to illustrate what the dates look like
[EMAIL PROTECTED]$ head -2 /tmp/dates
19900817
19900820

# Running each test script 5 times
[EMAIL PROTECTED]$ for i in 1 2 3 4 5; do time R --vanilla < as.date_test.R > /dev/null; done
real 0m1.29s  user 0m1.23s  sys 0m0.05s
real 0m1.28s  user 0m1.23s  sys 0m0.06s
real 0m1.28s  user 0m1.22s  sys 0m0.06s
real 0m1.29s  user 0m1.22s  sys 0m0.06s
real 0m1.28s  user 0m1.21s  sys 0m0.07s

[EMAIL PROTECTED]$ for i in 1 2 3 4 5; do time R --vanilla < as.Date_test.R > /dev/null; done
real 0m1.65s  user 0m0.99s  sys 0m0.64s
real 0m1.64s  user 0m0.98s  sys 0m0.66s
real 0m1.63s  user 0m0.98s  sys 0m0.65s
real 0m1.64s  user 0m1.00s  sys 0m0.64s
real 0m1.64s  user 0m0.98s  sys 0m0.65s

Notice that the as.date conversion is slightly faster than the as.Date conversion, on average... Just thought it was interesting to share. (And thanks, Mark Leeds, for the reference.)

Regards, JB

On 07/18/07 16:13:49, Gavin Simpson wrote:
> On Wed, 2007-07-18 at 12:14 -0700, Mr Natural wrote:
>> Proper calendar dates in R are great for plotting and calculating. However for the non-wonks among us, they can be very frustrating. I have recently discussed the pains that people in my lab have had with dates in R, especially the frustration of bringing date data into R from Excel, which we have to do a lot.
> I've always found the following reasonably intuitive. Given the csv file that I've pasted in below, the following reads the csv file in, formats the dates as class Date, and then draws a plot.
> I have dates in DD/MM/YYYY format, so year is not first - thus attesting to R not hating dates in this format ;-)
>
> ## read in csv data
> ## as.is = TRUE stops characters being converted to factors,
> ## thus saving us an extra step to convert them back
> dat <- read.csv("date_data.csv", as.is = TRUE)
> ## we convert to class Date
> ## format tells R how the dates are formatted in our character strings
> ## see ?strftime for the meaning and available codes
> dat$Date <- as.Date(dat$Date, format = "%d/%m/%Y")
> ## check this worked ok
> str(dat$Date)
> dat$Date ## see nicely formatted dates and not a drop of R-related hatred
> ## but just about the most boring graph I could come up with
> plot(Data ~ Date, dat, type = "l")
>
> And you can keep your Excel file formatted as dates as well - bonus! Oh, and before you get Martin'd, it is the chron *package*!
>
> HTH, G
>
> CSV file I used, generated in OpenOffice.org, but I presume it stores Dates in the same way as Excel?:
>
> Data,Date
> 1,01/01/2007
> 2,02/01/2007
> 3,03/01/2007
> 4,04/01/2007
> 5,05/01/2007
> 6,06/01/2007
> 7,07/01/2007
> 8,08/01/2007
> 9,09/01/2007
> 10,10/01/2007
> 11,11/01/2007
> 10,12/01/2007
> 9,13/01/2007
> 8,14/01/2007
> 7,15/01/2007
> 6,16/01/2007
> 5,17/01/2007
> 4,18/01/2007
> 3,19/01/2007
> 2,20/01/2007
> 1,21/01/2007
> 1,22/01/2007
> 2,23/01/2007
> 3,24/01/2007
[R] Tukey or Newman-Keuls
Hello, I need to group some means and I wonder how to get the critical range table for Tukey or for Newman-Keuls. I think it's possible to do that with the qtukey() function? Thanks. _ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
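For reference, a sketch of how qtukey() can supply a Tukey critical range (the k, df, n and MSE values below are invented example numbers, not from the question): for a one-way layout, the critical range is the studentized-range quantile times sqrt(MSE/n).

```r
## Critical range for Tukey's HSD -- a minimal sketch with assumed values
k   <- 3     # number of group means being compared
df  <- 20    # error (within-group) degrees of freedom
n   <- 7     # observations per group
MSE <- 2.5   # mean squared error from the ANOVA

## upper 5% point of the studentized range distribution
q.crit <- qtukey(0.95, nmeans = k, df = df)

## two means are declared different (Tukey HSD) if they differ by more than:
crit.range <- q.crit * sqrt(MSE / n)
crit.range
```

For Newman-Keuls the same quantile is used, but with nmeans set to the number of means spanned by each comparison step rather than the total number of groups.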
Re: [R] R
On Thu, 19 Jul 2007, Fluss wrote: Hello! I am using the svyglm procedure for logistic regression in survey data. I wondered how the strata affect the estimates' SEs (in addition to the weights given proportional to population size). I know that for simple regression the measurements in each stratum are assumed to have a different variance, but in a logistic model this is not the case.

It is simpler (and more complicated) than that. The survey package uses the same formula for nearly all designs and estimators, so it doesn't have to handle special cases like estimating separate stratum variances for stratified models. For a population total the variance estimator is just the Horvitz-Thompson estimator. Other estimators are defined by the estimating equations they solve, so the mean solves

sum_i w_i (x_i - mu) = 0

and logistic regression solves

sum_i w_i x_i (y_i - mu_i) = 0

We now compute the Horvitz-Thompson estimator for the sum of the estimating functions (V) and also the population total of the derivative of the estimating functions (D). The variance of the estimator is D^{-1} V D^{-1}. The standard reference for this in the survey literature seems to be Binder, David A. (1983). On the variances of asymptotically normal estimators from complex surveys. International Statistical Review, 51, 279-292. which is in the References section of help(svyrecvar). -thomas __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
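A minimal sketch of the setup being discussed, assuming the survey package is installed; the data frame and variable names (strat, wt, x, y) are invented for illustration:

```r
library(survey)  # assumed installed

## hypothetical stratified sample with made-up weights
df <- data.frame(
  strat = rep(1:2, each = 50),
  wt    = rep(c(10, 20), each = 50),
  x     = rnorm(100),
  y     = rbinom(100, 1, 0.4)
)

## declare the design: strata and sampling weights
des <- svydesign(ids = ~1, strata = ~strat, weights = ~wt, data = df)

## design-based logistic regression; the SEs come from the
## sandwich (estimating-equation) variance D^{-1} V D^{-1}
fit <- svyglm(y ~ x, design = des, family = quasibinomial())
summary(fit)
```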
Re: [R] [R-sig-DB] RODBC on Oracle DB
Yes, it's a question of rights. My system administrator just confirmed that. I also tried with a small Oracle client and, same as in R, had no permission to read tables. In R with RODBC I could list the tables, but had no permission to read or import them. But I was confused because I was using the same Oracle user with an MS Access client, which works perfectly, even to import tables from the Oracle DB. My system admin told me that this Oracle user is in fact a synonym or duplicate of the real Oracle account. But why can this very same account import tables via the MS Access client? This remains a mystery to me. As soon as my sysadmin returns from vacation, I will clear that up. Greetings and thanks to everyone. Eric R.

On Wed Jul 18 17:50, Marc Schwartz sent: I think that you are on to something there Don. I just tried accessing a table from our Oracle server, which I do know exists, but for which I do not have access permissions. Using the following query in the Oracle Instant Client:

select table_name from all_tables;

I can get a list of all tables on the server, which includes a table called INCOMPATIBLE_USER_AGENTS, for which I do not have access permissions. When attempting to query the table in the Instant Client I get:

SQL> select * from INCOMPATIBLE_USER_AGENTS;
select * from INCOMPATIBLE_USER_AGENTS
*
ERROR at line 1:
ORA-00942: table or view does not exist

When running the same query from R using RODBC I get:

sqlQuery(db, "select * from INCOMPATIBLE_USER_AGENTS")
[1] "[RODBC] ERROR: Could not SQLExecDirect"
[2] "42S02 942 [Oracle][ODBC][Ora]ORA-00942: table or view does not exist\n"

So it looks like permission issues may be possible here. Eric, definitely confirm with your SysAdmins that you have appropriate permissions. HTH, Marc

On Wed, 2007-07-18 at 07:43 -0700, Don MacQueen wrote: I believe I have seen that error message from Oracle when I tried to query a table for which I did not have select privileges (and when I knew for certain that the table existed).
Ask your database administrator about the table, and make sure that you do have that privilege. What I am uncertain about is whether Oracle, when asked to list tables, returns a list that includes tables for which the user does not have select privileges. -Don

At 9:24 AM +0200 7/17/07, [EMAIL PROTECTED] wrote:

essai uid=osis_r, pwd=12miss15, case=oracle)
sqlTables(essai)$ORESTE
...
1315 ORESTE S_PROFESSIONS_OLD   TABLE
1316 ORESTE S_PROVENANCES       TABLE
1317 ORESTE S_SEXES             TABLE
1318 ORESTE S_SOUS_CLASSES      TABLE
1319 ORESTE S_TYP_COLLEGES      TABLE
1320 ORESTE S_TYP_ENSEIGNEMENT  TABLE
...
sqlQuery(essai, "select * from S_TYP_COLLEGES")
[1] "[RODBC] ERROR: Could not SQLExecDirect"
[2] "42S02 942 [Oracle][ODBC][Ora]ORA-00942: Table ou vue inexistante\n"

I have also tried essai2, but with no success.

On Mon Jul 16 15:32, Prof Brian Ripley [EMAIL PROTECTED] sent: The problem could be quoting, if Oracle is not standards-compliant. See the options in ?odbcConnect. If sqlQuery(essai, "select * from S_TYP_COLLEGES") works, this is likely to be the problem.

On Mon, 16 Jul 2007, [EMAIL PROTECTED] wrote:

odbcGetInfo(essai)
       DBMS_Name         DBMS_Ver  Driver_ODBC_Ver
        "Oracle"     "09.00.0121"          "03.51"
Data_Source_Name      Driver_Name       Driver_Ver
   "ORESTE_prod"    "SQORA32.DLL"     "09.00.0101"
        ODBC_Ver      Server_Name
         "03.52"           "weba"

sqlTables(essai)

The result of this function is a list of tables, one of which is called S_TYP_COLLEGES.

sqlFetch(essai, "S_TYP_COLLEGES")
[1] "[RODBC] ERROR: Could not SQLExecDirect"
[2] "42S02 942 [Oracle][ODBC][Ora]ORA-00942: Table ou vue inexistante\n"
sqlFetch(essai, "S_TYP_COLLEGES", colnames=TRUE, rownames=FALSE)
[1] "[RODBC] ERROR: Could not SQLExecDirect"
[2] "42S02 942 [Oracle][ODBC][Ora]ORA-00942: Table ou vue inexistante\n"

What could be the problem here?
Any help is welcome Eric Röthlisberger, Neuchâtel __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
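For reference, a sketch of the quoting alternatives Prof. Ripley's reply alludes to (the DSN and table name are taken from the thread; credentials are omitted, and whether either form works still depends on the server-side permissions discussed above):

```r
library(RODBC)  # assumed installed and configured for the Oracle driver

## connection as in the thread; uid/pwd omitted here
essai <- odbcConnect("ORESTE_prod", case = "oracle")

## sqlFetch quotes identifiers according to the connection settings;
## with a non-standards-compliant DBMS this can yield ORA-00942
## even for tables that exist
sqlFetch(essai, "S_TYP_COLLEGES")

## bypass RODBC's identifier handling by sending the SQL verbatim
sqlQuery(essai, "select * from S_TYP_COLLEGES")

## and the explicitly double-quoted (case-sensitive) form for comparison
sqlQuery(essai, 'select * from "S_TYP_COLLEGES"')

close(essai)
```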
Re: [R] memory error with 64-bit R in linux
You might try running top while R runs, to get a better idea of what is happening. 64-bit R takes more memory than 32-bit (longer pointers), and for a large problem I would say that 2 GB RAM is a minimum if you want any speed. Slowness is likely related to needing to use swap space. The "cannot allocate" error occurs because you run out of both RAM and swap. If you are close to finishing your calculation you may resolve things by increasing swap, but don't expect it to be fast. There is also a possibility that your user id is restricted, but I'm not sure how that works anymore. It used to be controlled by ulimit, but that does not seem to be the case in newer versions of Linux. If you are still debugging your code, and there is some chance you are just gobbling up memory endlessly until it runs out, then you can speed things up (i.e. fail more quickly) by turning swap off. There are debugging situations where this turns out to be useful. HTH, Paul

zhihua li wrote: Hi netters, I'm using the 64-bit R-2.5.0 on an x86-64 CPU with 2 GB of RAM. The operating system is SUSE 10. The system information is:

uname -a
Linux someone 2.6.13-15.15-smp #1 SMP Mon Feb 26 14:11:33 UTC 2007 x86_64 x86_64 x86_64 GNU/Linux

I used heatmap to process a matrix of dim [16000,100]. After 3 hours of desperate waiting, R told me: cannot allocate vector of size 896 MB. I know the matrix is very big, but since I have 2 GB of RAM in a 64-bit system, there should be no problem dealing with a vector smaller than 1 GB? (I was not running any other applications on my system.) Does anyone know what's going on? Is there a hardware limit where I have to add more RAM, or is there some way to resolve it in software? Also, is it possible to speed up the computation? (I don't want to wait another 3 hours just to get another error message.) Thank you in advance!
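A plausible explanation for the near-gigabyte allocation in particular (a sketch, since the exact allocation depends on heatmap's internals): heatmap() clusters rows via hclust(dist(x)), and for 16000 rows the pairwise-distance object alone approaches 1 GB before any working copies are made.

```r
## a "dist" object for n rows holds n*(n-1)/2 doubles at 8 bytes each
n <- 16000
dist.bytes <- n * (n - 1) / 2 * 8
dist.bytes / 2^20   # size in MB -- roughly 976 MB for 16000 rows
```

hclust then needs additional working copies on top of that, so exhausting 2 GB of RAM plus swap is entirely plausible here.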
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] tapply
I'm sorry for the unfocused questions, I'm new here... the output should be:

class  aps_mean
1      NA
2      11.5
3      8

the mean aps of every class, where every id counts *once*, for example: class 2, mean = (11+12)/2 = 11.5. Hope it's clearer. sigalit. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help with heatmap - how to remove annoying X before numeric values?
Sorry, I just realized I didn't send this to the list! (See below.) Thanks for all the help! All is working fine now. If anyone knows of a more straightforward way to change the "Value" string for the key, please let me know (just to satisfy my curiosity). I got it to work by modifying the source code (specifically, heatmap.2.R in the gplots package). However, before, I didn't make the call source("heatmap.2.R") before I called source("mysillyheatmap.R"). Making the additional call to source heatmap.2.R fixed everything. Thanks again for all your help!

On 7/19/07, Suzanne Matthews [EMAIL PROTECTED] wrote: Thank you all for your prompt replies! The check.names=FALSE parameter fixed things entirely. One more question: Is there a straightforward way to modify the "Values" string that labels the key to a user-defined value? For example, let's say I want to change "Values" to "Silly Values". So far, what I have found is that I need to actually go and change a static string in the source code. Is there a more direct way? Also, after I make the source code change (in the gplots package, file heatmap.2.R), how do I have R build from that? If I remember correctly, if I put the new heatmap.2.R in my directory, R is supposed to check for functions there before it goes and builds from the main source code base (located at /usr/bin/R). I am a touch confused about which directory is "my directory", where R will check first. I tried putting the modified heatmap.2.R file in the directory that my script is in, and where I initially run R. But that didn't work! Is there anything that I should add to my R script that will force it to read from my directory? If not, which directory should I place this in? My OS is OS X, so I think Unix-based instructions will work. Thank you once again for your time and patience! Sincerely, Suzanne

On 7/18/07, Moshe Olshansky [EMAIL PROTECTED] wrote: I was right saying that my solution was not the best possible!
--- Prof Brian Ripley [EMAIL PROTECTED] wrote: read.table('temp.txt', check.names = FALSE) would be easier (and more general, since make.names can do more than prepend an 'X'). On Wed, 18 Jul 2007, Moshe Olshansky wrote: Hi Suzanne, My solution (which I am sure is not the best) would be:

heat <- read.table('temp.txt')
heat
      X1905 X1910 X1950 X1992 X2011 X2020
Gnat   0.08  0.29  0.29  0.37  0.39  0.43
Snake  0.16  0.34  0.32  0.40  0.41  0.53
Bat    0.40  0.54  0.52  0.60  0.60  0.63
Cat    0.16  0.27  0.29  0.39  0.37  0.41
Dog    0.43  0.54  0.52  0.61  0.60  0.62
Lynx   0.50  0.57  0.54  0.59  0.50  0.59
a <- names(heat)
b <- strsplit(a, split = "X")
w <- unlist(b)
z <- w[seq(2, length(w), by = 2)]
z
[1] "1905" "1910" "1950" "1992" "2011" "2020"
names(heat) <- z
heat
      1905 1910 1950 1992 2011 2020
Gnat  0.08 0.29 0.29 0.37 0.39 0.43
Snake 0.16 0.34 0.32 0.40 0.41 0.53
Bat   0.40 0.54 0.52 0.60 0.60 0.63
Cat   0.16 0.27 0.29 0.39 0.37 0.41
Dog   0.43 0.54 0.52 0.61 0.60 0.62
Lynx  0.50 0.57 0.54 0.59 0.50 0.59

Regards, Moshe.

--- Suzanne Matthews [EMAIL PROTECTED] wrote: Hello All, I have a simple question about how things are labeled on my heat map; in particular, there is this annoying X that appears before the numeric value of all the labels of my columns. Let's say I have the following silly data, stored in temp.txt:

      1905 1910 1950 1992 2011 2020
Gnat  0.08 0.29 0.29 0.37 0.39 0.43
Snake 0.16 0.34 0.32 0.40 0.41 0.53
Bat   0.40 0.54 0.52 0.60 0.60 0.63
Cat   0.16 0.27 0.29 0.39 0.37 0.41
Dog   0.43 0.54 0.52 0.61 0.60 0.62
Lynx  0.50 0.57 0.54 0.59 0.5  0.59

I use the following commands to generate my heatmap:

heat <- read.table('temp.txt')
x <- as.matrix(heat)
heatmap.2(x, keysize = 1.2, dendrogram = "none", trace = "none",
    Colv = FALSE, main = "Silly Data", labCol = NULL, margin = c(7,8))

This generates a very nice heatmap, but there is one thing I have an issue with: How do I get rid of the 'X' that seems to come automatically before my numeric column values? I just want those columns to be labeled 1905, 1910, 1950, and so on.
I cannot find anything in the heatmap.2 documentation that suggests how I should do this. Thank you very much for your time, and patience in reading this! Sincerely, Suzanne [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read
Re: [R] tapply
sigalit mangut-leiba wrote: I'm sorry for the unfocused questions, I'm new here... the output should be:

class  aps_mean
1      NA
2      11.5
3      8

the mean aps of every class, where every id counts *once*, for example: class 2, mean = (11+12)/2 = 11.5. Hope it's clearer.

Much... Get the first record for each individual from (e.g.)

icul.redux <- subset(icul, !duplicated(id))

then use tapply as before using variables from icul.redux. Or in one go:

with( subset(icul, !duplicated(id)), tapply(aps, class, mean, na.rm=TRUE) )

-- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
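A toy reproduction of that suggestion (the data frame below is invented to match the numbers in the question; the real icul data are not shown in the thread):

```r
## hypothetical data shaped like the question: repeated rows per id
icul <- data.frame(
  id    = c(1, 1, 2, 3, 3, 4),
  class = c(2, 2, 2, 3, 3, 1),
  aps   = c(11, 11, 12, 8, 8, NA)
)

## keep each individual's first record, then average aps by class
with(subset(icul, !duplicated(id)),
     tapply(aps, class, mean, na.rm = TRUE))
```

Class 2 averages (11+12)/2 = 11.5 as in the question; class 1, whose only value is NA, comes back as NaN once na.rm removes it.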
[R] plot3d labels
Hello R users, I am a newbie using R 2.5.0 on an Apple PowerBook G4 with Mac OS X 10.4.10. Sorry that I ask such stupid questions again, but I haven't found how to label the points created with plot3d (rgl). Hope somebody can help me. Thanks in advance. Birgit Birgit Lemcke Institut für Systematische Botanik Zollikerstrasse 107 CH-8008 Zürich Switzerland Ph: +41 (0)44 634 8351 [EMAIL PROTECTED] [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help with heatmap - how to remove annoying X before numeric values?
Try this. It makes a copy of heatmap.2 whose scope is changed to first look within heatmap.3 for functions like mtext. We redefine mtext to intercept the text argument and change it appropriately. Then we call our copy of heatmap.2. With this there is no need to change the source of heatmap.2.

heatmap.3 <- function(...) {
    environment(heatmap.2) <- environment()
    mtext <- function(text, ...) {
        if (text == "Value") text <- "Silly Value"
        graphics::mtext(text, ...)
    }
    heatmap.2(...)
}

heatmap.3(as.matrix(heat), keysize = 1.2, dendrogram = "none",
    trace = "none", Colv = FALSE, main = "Silly Data",
    labCol = NULL, margin = c(7,8))

On 7/19/07, Suzanne Matthews [EMAIL PROTECTED] wrote: Sorry, I just realized I didn't send this to the list! (See below.) Thanks for all the help! All is working fine now. If anyone knows of a more straightforward way to change the "Value" string for the key, please let me know (just to satisfy my curiosity). I got it to work by modifying the source code (specifically, heatmap.2.R in the gplots package). However, before, I didn't make the call source("heatmap.2.R") before I called source("mysillyheatmap.R"). Making the additional call to source heatmap.2.R fixed everything. Thanks again for all your help! On 7/19/07, Suzanne Matthews [EMAIL PROTECTED] wrote: Thank you all for your prompt replies! The check.names=FALSE parameter fixed things entirely. One more question: Is there a straightforward way to modify the "Values" string that labels the key to a user-defined value? For example, let's say I want to change "Values" to "Silly Values". So far, what I have found is that I need to actually go and change a static string in the source code. Is there a more direct way? Also, after I make the source code change (in the gplots package, file heatmap.2.R), how do I have R build from that?
Re: [R] plot3d labels
The documentation has: text3d(x, y = NULL, z = NULL, texts, adj = 0.5, justify, ...) Does this do it for you? On 7/19/07, Birgit Lemcke [EMAIL PROTECTED] wrote: Hello R users, I am a newbie using R 2.5.0 on an Apple PowerBook G4 with Mac OS X 10.4.10. Sorry that I ask such stupid questions again, but I haven't found how to label the points created with plot3d (rgl). Hope somebody can help me. Thanks in advance. Birgit Birgit Lemcke Institut für Systematische Botanik Zollikerstrasse 107 CH-8008 Zürich Switzerland Ph: +41 (0)44 634 8351 [EMAIL PROTECTED] [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
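Putting that together, a minimal sketch of labelling plot3d() points with text3d() from the rgl package (the coordinates and labels below are invented; offsetting z slightly keeps the labels clear of the points):

```r
library(rgl)  # assumed installed, needs a working OpenGL device

## invented example data
x <- c(1, 2, 3)
y <- c(3, 1, 2)
z <- c(2, 3, 1)
labs <- c("A", "B", "C")

plot3d(x, y, z, size = 5, col = "blue")
## draw each label just above its point
text3d(x, y, z + 0.1, texts = labs, adj = 0.5)
```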
[R] Fw: Do GLM by groups
Dear All, I'm trying to do a 'glm' analysis by groups, just like in SAS where you use "by variable". I don't know how to do it in R; can anyone help with this? i.e.

group line rep value
1     1    1   0.2
1     1    2   0.3
1     1    3   0.23
1     2    1   0.2
1     2    2   0.3
1     2    3   0.23
2     1    1   0.2
2     1    2   0.3
2     1    3   0.23
2     2    1   0.2
2     2    2   0.3
2     2    3   0.23

in SAS we say: model value = line rep; by group; How can I do this in R? Thanks, Hongmei Jia __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fw: Do GLM by groups
Check the following example for by():

require(stats)
attach(warpbreaks)
by(warpbreaks, tension, function(x) lm(breaks ~ wool, data = x))

or just type: example(by) b

On Jul 19, 2007, at 1:18 PM, Hongmei Jia wrote: Dear All, I'm trying to do a 'glm' analysis by groups, just like in SAS where you use "by variable". I don't know how to do it in R; can anyone help with this? i.e.

group line rep value
1     1    1   0.2
1     1    2   0.3
1     1    3   0.23
1     2    1   0.2
1     2    2   0.3
1     2    3   0.23
2     1    1   0.2
2     1    2   0.3
2     1    3   0.23
2     2    1   0.2
2     2    2   0.3
2     2    3   0.23

in SAS we say: model value = line rep; by group; How can I do this in R? Thanks, Hongmei Jia __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
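To mirror SAS's "model value = line rep; by group;" with glm() specifically, a sketch using split()/lapply() (the data frame below is invented to match the shape in the question, and the default Gaussian family is an assumption since the question does not name one):

```r
## invented data in the shape the question describes
dat <- data.frame(
  group = rep(1:2, each = 6),
  line  = rep(rep(1:2, each = 3), 2),
  rep   = rep(1:3, 4),
  value = rep(c(0.2, 0.3, 0.23), 4)
)

## one glm per level of group, like SAS's "by group;"
fits <- lapply(split(dat, dat$group),
               function(d) glm(value ~ line + rep, data = d))

## coefficients for each group's fit
lapply(fits, coef)
```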
[R] Sweave on mac os x
Dear all, is Sweave working on the Mac? I installed R-2.5.1, but it seems that Sweave does not come with the distribution as it does on Linux. Do I need to install it separately? In that case, is there a .dmg for the Mac? Thanks for the help! Regards Marco [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] linear interpolation of multiple random time series
Hi all, Looking for tips on how I might more optimally solve this. I have time series data (samples from a force sensor) that are not guaranteed to be sampled at the same time values across trials. ex.

trial time x
1     1    1
1     5    4
1     7    9
1     12   20
2     1    0
2     3    5
2     9    10
2     13   14
2     19   22
2     24   32

Within each trial I'd like to use linear interpolation between each successive time sample to fill in intermediary timepoints and x-values, ex.

trial time x
1     1    1
1     2    1.75
1     3    2.5
1     4    3.25
1     5    4
1     6    6.5
1     7    9
1     8    11.2
1     9    13.4
1     10   15.6
1     11   17.8
1     12   20
2     1    0
2     2    2.5
2     3    5
2     4    5.83
2     5    6.67
2     6    7.5
2     7    8.33
2     8    9.17
2     9    10
2     10   11
2     11   12
2     12   13
2     13   14
2     14   15.3
2     15   16.7
2     16   18
2     17   19.3
2     18   20.7
2     19   22
2     20   24
2     21   26
2     22   28
2     23   30
2     24   32

The solution I've coded (below) involves going through the original data frame line by line and is thus very slow (indeed, I had to resort to writing to file, as with a large data set I started running into memory issues if I tried to create the new data frame in memory). Any suggestions on a faster way to achieve what I'm trying to do?
#assumes the first data frame above is stored as 'a'
arows = (length(a$x)-1)
write('', 'temp.txt')
for(i in 1:arows){
    if(a$time[i+1] > a$time[i]){
        write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append = T)
        x1 = a$time[i]
        x2 = a$time[i+1]
        dx = x2 - x1
        if(dx != 1){
            y1 = a$x[i]
            y2 = a$x[i+1]
            dy = y2 - y1
            slope = dy/dx
            int = -slope*x1 + y1
            temp = a[i,]
            for(j in (x1+1):(x2-1)){
                temp$time = j
                temp$x = slope*j + int
                write.table(temp, 'temp.txt', row.names = F, col.names = F, append = T)
            }
        }
    }else{
        write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append = T)
    }
}
i = i + 1
write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append = T)
b = read.table('temp.txt', skip = 1)
names(b) = names(a)

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sweave on mac os x
Marco, Yes, absolutely, Sweave comes with the binary Mac OS X distribution; it works great for me. Regards, Tom marco.R.help marco.R.help wrote: Dear all, is Sweave working on the Mac? I installed R-2.5.1, but it seems that Sweave does not come with the distribution as it does on Linux. Do I need to install it separately? In that case, is there a .dmg for the Mac? Thanks for the help! Regards Marco [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Thomas E Adams National Weather Service Ohio River Forecast Center 1901 South State Route 134 Wilmington, OH 45177 EMAIL: [EMAIL PROTECTED] VOICE: 937-383-0528 FAX: 937-383-0033 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
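Indeed, Sweave lives in the utils package, which is loaded in every standard R session, so nothing extra needs installing; a quick check and a sketch of typical use (the .Rnw file name is a placeholder):

```r
## Sweave ships in the utils package, attached by default
exists("Sweave")          # should be TRUE in a stock installation

## typical use: weave an .Rnw file into a .tex file
## ("report.Rnw" is a hypothetical file name)
## Sweave("report.Rnw")
```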
Re: [R] linear interpolation of multiple random time series
This should do it for you:

x <- read.table(textConnection("trial time x
1 1 1
1 5 4
1 7 9
1 12 20
2 1 0
2 3 5
2 9 10
2 13 14
2 19 22
2 24 32"), header = TRUE)

# compute for each trial
trial.list <- lapply(split(x, x$trial), function(set){
    .xval <- seq(min(set$time), max(set$time))
    .yval <- approx(set$time, set$x, xout = .xval)$y
    cbind(trial = set$trial[1], time = .xval, x = .yval)
})
do.call('rbind', trial.list)

      trial time     x
 [1,]     1    1  1.00
 [2,]     1    2  1.75
 [3,]     1    3  2.50
 [4,]     1    4  3.25
 [5,]     1    5  4.00
 [6,]     1    6  6.50
 [7,]     1    7  9.00
 [8,]     1    8 11.20
 [9,]     1    9 13.40
[10,]     1   10 15.60
[11,]     1   11 17.80
[12,]     1   12 20.00
[13,]     2    1  0.00
[14,]     2    2  2.50
[15,]     2    3  5.00
[16,]     2    4  5.83
[17,]     2    5  6.67
[18,]     2    6  7.50
[19,]     2    7  8.33
[20,]     2    8  9.17
[21,]     2    9 10.00
[22,]     2   10 11.00
[23,]     2   11 12.00
[24,]     2   12 13.00
[25,]     2   13 14.00
[26,]     2   14 15.33
[27,]     2   15 16.67
[28,]     2   16 18.00
[29,]     2   17 19.33
[30,]     2   18 20.67
[31,]     2   19 22.00
[32,]     2   20 24.00
[33,]     2   21 26.00
[34,]     2   22 28.00
[35,]     2   23 30.00
[36,]     2   24 32.00

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch
Re: [R] linear interpolation of multiple random time series
This can be done compactly using the zoo package. The first statement after library() converts the rows for each trial into a separate zoo object and then uses by() to merge these into a single zoo object with one column per trial. The second statement converts it from zoo to ts, which has the effect of filling in all missing times. na.approx() does the actual linear interpolation. The result is an mts object (or you could use as.zoo to convert that to a zoo object if you prefer).

library(zoo)
z <- do.call(merge, by(a, a$trial, function(DF) zoo(DF$x, DF$time)))
na.approx(as.ts(z), na.rm = FALSE)

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help with Dates
I am taking an Excel dataset and reading it into R using read.table. (Actually I am dumping the data into a .txt file first and then reading the data into R.) Here is a snippet:

head(data)
       Date  Price Open.Int. Comm.Long Comm.Short net.comm
1 15-Jan-86 673.25    175645     65910      28425    37485
2 31-Jan-86 677.00    167350     54060      27120    26940
3 14-Feb-86 680.25    157985     37955      25425    12530
4 28-Feb-86 691.75    162775     49760      16030    33730
5 14-Mar-86 706.50    163495     54120      27995    26125
6 31-Mar-86 709.75    164120     54715      30390    24325

The dataset runs from 1986 to 2007. I want to be able to take subsets of my data based on date, e.g. data between 2000 and 2005. As it stands, I can't work with the dates as they are not in the correct format. I successfully converted the dates to just the year using:

transform(data, Yr = format(as.Date(as.character(Date), format = '%d-%b-%y'), '%y'))

This gives the following format:

       Date  Price Open.Int. Comm.Long Comm.Short net.comm Yr
1 15-Jan-86 673.25    175645     65910      28425    37485 86
2 31-Jan-86 677.00    167350     54060      27120    26940 86
3 14-Feb-86 680.25    157985     37955      25425    12530 86
4 28-Feb-86 691.75    162775     49760      16030    33730 86
5 14-Mar-86 706.50    163495     54120      27995    26125 86
6 31-Mar-86 709.75    164120     54715      30390    24325 86

I can subset for a single year, e.g.:

head(subset(df, Yr == '00'))

But how can I subset for multiple periods, e.g. '00' to '05'? The following won't work:

head(subset(df, Yr == '00' & Yr == '01'))

or

head(subset(df, Yr = c('00','01','02','03')))

I can't help feeling that I am missing something and there is a simpler route. I leafed through R Newsletter 4/1, which deals with dates and times, but it seemed that strptime and POSIXct / POSIXlt are not what I need either. Can anybody help me? Regards Alex __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fwd: tapply
thank you! sigalit. -- Forwarded message -- From: sigalit mangut-leiba [EMAIL PROTECTED] Date: Jul 19, 2007 7:03 PM Subject: tapply To: r-help r-help@stat.math.ethz.ch

I'm sorry for the unfocused questions, I'm new here... The output should be:

class aps_mean
    1       NA
    2     11.5
    3        8

that is, the mean aps of every class, where every id counts *once*. For example: class 2, mean = (11+12)/2 = 11.5. Hope it's clearer. sigalit.

[[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
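Sigalit's rule (each id counted once, then the mean aps per class) can be sketched in base R. This is not from the thread; the toy data frame and the column names id/class/aps are assumptions, since the original data was not included:

```r
# Hedged sketch (not from the thread); id/class/aps are assumed names.
d <- data.frame(id    = c(1, 1, 2, 3, 3, 4),
                class = c(2, 2, 2, 3, 3, 1),
                aps   = c(11, 11, 12, 8, 8, NA))
u <- unique(d[, c("id", "class", "aps")])  # keep each id once
tapply(u$aps, u$class, mean)               # class 1: NA, class 2: 11.5, class 3: 8
```

The unique() step is what makes duplicated id rows count only once before tapply() takes the per-class mean.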
Re: [R] R and Copula
Hi Meeryana, It seems you have not loaded the package. To use a package that you have already installed, do:

library(copula)

I recommend you review the R documentation. There are several good references and tutorials on the CRAN site:

http://cran.r-project.org/manuals.html
http://cran.r-project.org/other-docs.html

Any time you spend learning the basics will make your work more efficient. :) Good luck, Julian
Fisheries Acoustics Research Lab, School of Aquatic and Fishery Science

copula wrote: I have installed almost all packages for copula. I have installed package 'gnlm' and the others from Jim Lindsey's web site, and the other packages like 'repeated' which are necessary for calculating copulas. But when I start with copula it says: function 'gausscop' not found, and likewise for 'fitted.gausscop(z)' and 'residuals.gausscop(z)'. Also, what I want to say is: when I installed package 'repeated' from Jim Lindsey's web site, this is what I received:

utils:::menuInstallLocal()
updating HTML package descriptions

What do I have to do next to solve this problem and to continue with copula? Thanks a lot for the previous help!

gyadav wrote: hi meeryana, maybe this time nobody is responding, but don't worry, you will get a lot of help eventually, so always post a copy to the mailing list. The reason is there are a lot of newbies, although I am also not such an old hand myself, who have even the simplest questions but are hesitant to ask. The objective of the list is to help, so feel free to ask, stand up and contribute :-) I hope you will understand, if not today then tomorrow, as you get associated with the mailing list more closely. Personally I have found friends here. Further, never post a mail with a loose subject, and secondly try to maintain the thread, i.e. reply to the mail which you want to reply to (for more details refer to the posting guide) :-) That would really help.
Yes, now regarding your question :-)

Step 1: The list of all packages can be found at this link :-) http://cran.r-project.org/src/contrib/PACKAGES.html
Step 2: Click on the package you want to install :-) http://cran.r-project.org/src/contrib/Descriptions/copula.html
Step 3: Then download the binary for your operating system. If Windows, then download the corresponding zip file; for copula it is http://cran.r-project.org/bin/windows/contrib/r-release/copula_0.5-3.zip and save the zip file on your system.
Step 4: Open your R program
Step 5: Go to Packages, then Install packages from local zip file
Step 6: Select the package zip file which you want to install
Step 7: Sit back and relax
Step 8: Load the library using library(LibraryName) at the R prompt

There are alternate ways of installing the package directly from the R prompt. It didn't work for me a long time back, so I always adopt this method. Somebody on the list may help you in this regard :-) bye and learn. Join and stand with the Open Source and R community. Cheers and Ciao, Welcome -gaurav

dear Mr. Yadav, I want to thank you for the help, and for being the only one willing to help, but I have one question: because I'm new to R as well, I think I should install a package for copula. I have only installed the R program. How should I install this package? And is that what I also have to do for CreditMetrics, Value at Risk, matrix and the other formulas, I mean install packages? I hope that you have a little time for me and my problem, and I hope I'm not disturbing you.
thank you for all you can do for me and best regards, Mirjana

gyadav wrote: hi, see the code below. I hope this will make your understanding of copulas better; this code plots two normal distributions, N[0,2] and N[0,4], and their joint distribution. HTH

## code
library(copula)
### copy in two parts into R ###
## PART A ##
## construct a bivariate distribution whose marginals
## are normal and normal respectively, coupled
## together via a normal copula
op <- par(mfrow = c(2, 2), # 2 x 2 pictures on one plot
          pty = "s")       # square plotting region, independent of device size
x <- mvdc(normalCopula(0.75), c("norm", "norm"),
          list(list(mean = 0, sd = 2), list(mean = 0, sd = 4)))
x.samp <- rmvdc(x, 1)
par(mfrow = c(2, 3))
hist(x.samp[, 1], xlab = "Normal")
hist(x.samp[, 2], xlab = "Normal")
plot(x.samp[, 2], x.samp[, 1], pch = 21, xlab = "Normal", ylab = "Normal")
plot(dmvdc(x, x.samp))
plot(pmvdc(x, x.samp))
## At end of plotting, reset to previous settings:
### PART B ###
par(op)
for (i in seq(1:360)) {
  persp(x, dmvdc, xlim = c(-4, 4), ylim = c(0, 1), theta = i)
}

Regards, Gaurav Yadav +++ Assistant Manager, CCIL, Mumbai (India) Mob: +919821286118 Email: [EMAIL PROTECTED] Bhagavad Gita: Man is made by his Belief, as
Re: [R] linear interpolation of multiple random time series
Jim Holtman's solution works great, and I will try the zoo method just for fun as well. Thanks to all of you :) Mike

-- Mike Lawrence, Graduate Student, Department of Psychology, Dalhousie University. Website: http://memetic.ca Public calendar: http://icalx.com/public/informavore/Public "The road to wisdom? Well, it's plain and simple to express: Err and err and err again, but less and less and less." - Piet Hein __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Questions regarding R and fitting GARCH models
Dear all, I've recently switched from EViews to R with Rmetrics/fSeries (newest version of July 10) for my analysis because of the much bigger flexibility it offers. So far my experiences had been great (prior to this I had already worked extensively with S-PLUS, so I was already kind of familiar with the language) until I got to the fSeries package. My problem with the documentation of fSeries is that it is pretty sparse; therefore I don't know whether I am doing something wrong or if my problem is related to elementary statistics (though I hold an MSc in Econometrics, it's been quite a while :o). Next to that I have two questions related to the syntax of the R language itself; I've been searching for a good couple of hours but couldn't find the answers. Hope you can help me out.

From the descriptive statistics of my series, I had determined that my GARCH error term should follow a Student's t distribution, preferably skewed; the sstdFit function returns nu=2.002 and xi=1.012. I know that for this distribution 2 degrees of freedom means the second, third and fourth moments are not defined; however, the QQ plot looks good (I had to use the skwt package though, because rsstd didn't work) so I decided to give it a try.

1) That didn't go all too well. garchFit returns alpha1 and beta1 of 0.1 and 0.8 respectively (and beta1 has a standard error of NaN), which are not consistent at all with what EViews reported. How is this possible? Is it because GARCH estimation with a Student's t distribution with 2 degrees of freedom is impossible? What are my options in this case? Or do I need to use garchSpec, of which I have absolutely *no* idea what its use is (i.e. the entire idea is to obtain estimates for a given order of AR-GARCH, right)?

2) R/RGui crashes frequently, also if I just want to estimate a GARCH(1,1) with a Student's t distributed error term and 4 or 5 degrees of freedom. I think it just gets into an endless calculation loop; is there a way to abort this?

3) Is there a way to extract the data from the garchFit objects? So if fit is my object, how can I extract specific data, like the fit$coef[1] I can use with other objects? Ultimately, fSeries comes with a really nice plotting function (i.e., plot(fit, which=2)), but I want to extract the series it actually plots from there; is that possible, like with the hist function?

Thanks a lot for your time. Best regards, Jeroen van der Heide __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Questions regarding R and fitting GARCH models
As for (3) below, we are also interested in the xi parameter from the gpdFit function in fExtremes. It seems the fit classes returned by the associated functions don't have the full set of properties that FinMetrics provides for its classes. My solution was to modify the appropriate Rmetrics code, build a new package, and install it. I haven't completed that yet. Are there any other things I can try?

Dale Smith, Vicis Capital, LLC

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeroen van der Heide
Sent: Thursday, July 19, 2007 4:54 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Questions regarding R and fitting GARCH models

Dear all, I've recently switched from EViews to R with Rmetrics/fSeries (newest version of July 10) for my analysis because of the much bigger flexibility it offers. So far my experiences had been great (prior to this I had already worked extensively with S-PLUS, so I was already kind of familiar with the language) until I got to the fSeries package. My problem with the documentation of fSeries is that it is pretty sparse; therefore I don't know whether I am doing something wrong or if my problem is related to elementary statistics (though I hold an MSc in Econometrics, it's been quite a while :o). Next to that I have two questions related to the syntax of the R language itself; I've been searching for a good couple of hours but couldn't find the answers. Hope you can help me out.

From the descriptive statistics of my series, I had determined that my GARCH error term should follow a Student's t distribution, preferably skewed; the sstdFit function returns nu=2.002 and xi=1.012. I know that for this distribution 2 degrees of freedom means the second, third and fourth moments are not defined; however, the QQ plot looks good (I had to use the skwt package though, because rsstd didn't work) so I decided to give it a try.

1) That didn't go all too well. garchFit returns alpha1 and beta1 of 0.1 and 0.8 respectively (and beta1 has a standard error of NaN), which are not consistent at all with what EViews reported. How is this possible? Is it because GARCH estimation with a Student's t distribution with 2 degrees of freedom is impossible? What are my options in this case? Or do I need to use garchSpec, of which I have absolutely *no* idea what its use is (i.e. the entire idea is to obtain estimates for a given order of AR-GARCH, right)?

2) R/RGui crashes frequently, also if I just want to estimate a GARCH(1,1) with a Student's t distributed error term and 4 or 5 degrees of freedom. I think it just gets into an endless calculation loop; is there a way to abort this?

3) Is there a way to extract the data from the garchFit objects? So if fit is my object, how can I extract specific data, like the fit$coef[1] I can use with other objects? Ultimately, fSeries comes with a really nice plotting function (i.e., plot(fit, which=2)), but I want to extract the series it actually plots from there; is that possible, like with the hist function?

Thanks a lot for your time. Best regards, Jeroen van der Heide __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Trend lines on a scatterplot matrix
I'm using pairs() to generate a scatterplot matrix:

pairs(~ Fuzzy.gray.white.ratio + Fuzzy.gw.t.score + AgeWhenTested + signal_mean.noise,
      data = datam, subset = status == "control", main = "Controls",
      labels = c("G/W", "Peak Separation", "Age", "S/N"))

How can I add regression lines to the plots? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
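One standard route (not from the thread) is to pass pairs() a custom panel function that draws the points and then a least-squares line; the toy data frame below stands in for the poster's variables:

```r
# Hedged sketch (not from the thread): a panel function that adds a
# regression line to every off-diagonal panel of the scatterplot matrix.
set.seed(1)
d <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))
pairs(d, panel = function(x, y, ...) {
  points(x, y, ...)              # the usual scatter
  abline(lm(y ~ x), col = "red") # least-squares fit for this panel
})
```

The same idea with panel.smooth instead of abline(lm(...)) gives lowess trend curves.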
Re: [R] Help with Dates
Try some of the following:

head(subset(df, Yr %in% c('00','01','02','03')))
subset(df, (Yr >= '00') & (Yr <= '03'))  # same as above
subset(df, (Yr == '00') | (Yr == '01') | (Yr == '02') | (Yr == '03'))  # same

On 7/19/07, Alex Park [EMAIL PROTECTED] wrote:

I am taking an Excel dataset and reading it into R using read.table. (Actually I am dumping the data into a .txt file first and then reading the data into R.) Here is a snippet:

head(data)
       Date  Price Open.Int. Comm.Long Comm.Short net.comm
1 15-Jan-86 673.25    175645     65910      28425    37485
2 31-Jan-86 677.00    167350     54060      27120    26940
3 14-Feb-86 680.25    157985     37955      25425    12530
4 28-Feb-86 691.75    162775     49760      16030    33730
5 14-Mar-86 706.50    163495     54120      27995    26125
6 31-Mar-86 709.75    164120     54715      30390    24325

The dataset runs from 1986 to 2007. I want to be able to take subsets of my data based on date, e.g. data between 2000 and 2005. As it stands, I can't work with the dates as they are not in the correct format. I successfully converted the dates to just the year using:

transform(data, Yr = format(as.Date(as.character(Date), format = '%d-%b-%y'), '%y'))

This gives the following format:

       Date  Price Open.Int. Comm.Long Comm.Short net.comm Yr
1 15-Jan-86 673.25    175645     65910      28425    37485 86
2 31-Jan-86 677.00    167350     54060      27120    26940 86
3 14-Feb-86 680.25    157985     37955      25425    12530 86
4 28-Feb-86 691.75    162775     49760      16030    33730 86
5 14-Mar-86 706.50    163495     54120      27995    26125 86
6 31-Mar-86 709.75    164120     54715      30390    24325 86

I can subset for a single year, e.g.:

head(subset(df, Yr == '00'))

But how can I subset for multiple periods, e.g. '00' to '05'? The following won't work:

head(subset(df, Yr == '00' & Yr == '01'))

or

head(subset(df, Yr = c('00','01','02','03')))

I can't help feeling that I am missing something and there is a simpler route. I leafed through R Newsletter 4/1, which deals with dates and times, but it seemed that strptime and POSIXct / POSIXlt are not what I need either. Can anybody help me?
Regards Alex __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
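A complementary approach (not from the thread) is to keep the column as Date class rather than a year string, which makes range subsetting direct. This sketch assumes an English locale for the %b month names; note that with %y, two-digit years 00-68 map to 2000-2068 and 69-99 to 1969-1999, so 86 becomes 1986 as intended:

```r
# Hedged sketch (not from the thread): subset by an actual date range.
df <- data.frame(Date  = c("15-Jan-86", "31-Jan-99", "14-Feb-03"),
                 Price = c(673.25, 677.00, 680.25))
df$Date <- as.Date(df$Date, format = "%d-%b-%y")
subset(df, Date >= as.Date("2000-01-01") & Date <= as.Date("2005-12-31"))
# keeps only the 2003 row
```

Once the column is a Date, comparisons, sorting, and plotting all work on the calendar scale instead of on character strings.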
[R] can I paste 'newline'?
It is ok to embed '\n' (an escape sequence, not a regular expression) when using 'cat', but not 'paste'. e.g.

cat('I need to move on to a new line', '\n', 'at here')    # changes line!
paste('I need to move on to a new line', '\n', 'at here')  # '\n' is just a character as it is

Is there a way around pasting '\n'? Thanks a lot. -- View this message in context: http://www.nabble.com/can-I-paste-%27newline%27--tf4114350.html#a11699845 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] can I paste 'newline'?
On 19/07/2007 7:41 PM, runner wrote: It is ok to embed '\n' when using 'cat', but not 'paste'. Is there a way around pasting '\n'? Thanks a lot.

What do you want to get? Do you want a two-element vector? Then use c(). Do you want a one-element vector that prints on two lines? Use either form; they both work (but you need to use cat() to do the display).

x <- paste('I need to move on to a new line', '\n', 'at here')
cat(x)
I need to move on to a new line
 at here

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] can I paste 'newline'?
Notice the difference:

> cat('I need to move on to a new line', '\n', 'at here')  # changes line!
I need to move on to a new line
 at here
> paste('I need to move on to a new line', '\n', 'at here')  # '\n' is just a character
[1] "I need to move on to a new line \n at here"
> cat(paste('I need to move on to a new line', '\n', 'at here'))
I need to move on to a new line
 at here
> paste("a long string
+ with carriage
+ returns")
[1] "a long string\nwith carriage\nreturns"
> cat(paste("a long string
+ with carriage
+ returns"))
a long string
with carriage
returns

paste is showing you the characters in the string; cat is actually outputting to a print device, where '\n' is a line feed.

On 7/19/07, runner [EMAIL PROTECTED] wrote: It is ok to embed '\n' when using 'cat', but not 'paste'. Is there a way around pasting '\n'? Thanks a lot. -- View this message in context: http://www.nabble.com/can-I-paste-%27newline%27--tf4114350.html#a11699845 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
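The distinction above can be summed up in a few lines (not from the thread): paste() only builds the string, while cat() or writeLines() interprets the embedded "\n" at output time.

```r
# Hedged sketch (not from the thread): one string, two printed lines.
s <- paste("line one", "line two", sep = "\n")
length(s)     # 1: a single string with an embedded newline
writeLines(s) # prints on two lines, like cat(s, "\n")
```

writeLines() is often the tidier choice when the goal is simply to display a character vector with its escapes interpreted.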
[R] exclude1 in summary.formula from Hmisc
Dear Users, Still having trouble sharing my results, I'm trying to display a contingency table using summary.formula. The computation works well, but I'd like to display information on redundant entries, say, males AND females. I wonder if this code is correct:

desqualjum <- summary.formula(varqual1$X ~ ., data = subset(varqual1, select = -X),
                              method = "reverse", overall = TRUE, test = TRUE,
                              exclude1 = FALSE)

where varqual1 is the data frame containing the variable sex. I'm using R 2.5.1 and Hmisc 3.4-2 on a Mac (OS X 10.4, Intel), but I had the same trouble with former versions of R and the package. Thank you for your help. David [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] BOA (Bayesian Output Analysis)
Dear all, Can anyone help me to use this package with R? My procedure has been:

> dir()
[1] "line1.ind" "line1.out" "line1.txt" "line2.ind" "line2.out" "line2.txt" "regressao.odc"
> boa.menu()
Bayesian Output Analysis Program (BOA)
Version 1.1.6 for i386, mingw32
Copyright (c) 2007 Brian J. Smith [EMAIL PROTECTED]

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License or any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. For a copy of the GNU General Public License write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA, or visit their web site at http://www.gnu.org/copyleft/gpl.html

NOTE: in the event of a menu system crash, type boa.menu(recover = TRUE) to restart and recover your work.

BOA MAIN MENU
*************
1: File   2: Data   3: Analysis   4: Plot   5: Options   6: Window
Selection: 1

FILE MENU
=========
1: Back
2: --------------------+
3: Import Data         |
4: Save Session        |
5: Load Session        |
6: Exit BOA            |
7: --------------------+
Selection: 3

IMPORT DATA MENU
================
1: Back
2: -----------------------------+
3: CODA Output Files            |
4: Flat ASCII Files             |
5: Data Matrix Objects          |
6: View Format Specifications   |
7: Options...                   |
8: -----------------------------+
Selection: 3

Enter index filename prefix without the .ind extension
[Working Directory: ]
1: line1
Read 1 item
Enter output filename prefix without the .out extension
[Default: line1]
1: line1
Read 1 item
Warning: import failed
Could not find '/line1.ind' or '/line1.out'.

What is happening?? Thanks a lot. Marcus Vinicius [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
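The empty "[Working Directory: ]" in the prompt suggests BOA is not looking in the folder that holds the CODA files. One thing to try (not from the thread; the path below is made up) is to point R's working directory at those files before starting the menu:

```r
# Hedged sketch (not from the thread): make sure R can see the CODA files
# before launching the BOA menu. The path is a hypothetical example.
setwd("C:/Users/me/boa-files")              # folder containing line1.ind etc.
file.exists("line1.ind", "line1.out")       # both should report TRUE
library(boa)
boa.menu()
```

If file.exists() already reports TRUE and the import still fails, entering the full absolute path at the filename-prefix prompt is the other obvious thing to test.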
[R] how to determine/assign a numeric vector to Y in the cor.test function for spearman's correlations?
Hello to all of you, R-expeRts! I am trying to compute cor.test for a matrix that I labelled mydata, read in via mydata=read.csv... Then I converted my csv file into a matrix with mydata=as.matrix(mydata). NOW, I need to get the p-values from the correlations... I can successfully get the Spearman correlation matrix with:

cor(mydata, method = "s", use = "pairwise.complete.obs")

but then if I try

cor.test(mydata, method = "s", use = "pairwise.complete.obs")

I always get an error message as follows: the y argument is missing and does not have any default value assigned... (excuse my translation) WHAT SHOULD I DO??? HOW CAN I DEFINE Y AS A NUMERIC VECTOR??? thanx guys, i just can't express myself as a computer would... G :) -- View this message in context: http://www.nabble.com/how-to-determine-assign-a-numeric-vector-to-%22Y%22-in-the-cor.test-function-for-spearman%27s-correlations--tf4114486.html#a11700314 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
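The error arises because cor.test() compares exactly two vectors (x and y), unlike cor(), which accepts a whole matrix. A p-value matrix therefore has to be built by looping over column pairs; this sketch is not from the thread and uses a random stand-in matrix:

```r
# Hedged sketch (not from the thread): Spearman p-values for all column pairs.
set.seed(1)
m <- matrix(rnorm(40), ncol = 4, dimnames = list(NULL, paste0("v", 1:4)))
p <- matrix(NA, ncol(m), ncol(m), dimnames = list(colnames(m), colnames(m)))
for (i in 1:(ncol(m) - 1)) {
  for (j in (i + 1):ncol(m)) {
    p[i, j] <- p[j, i] <- cor.test(m[, i], m[, j],
                                   method = "spearman", exact = FALSE)$p.value
  }
}
p   # symmetric matrix of pairwise p-values, NA on the diagonal
```

exact = FALSE asks for the large-sample approximation, which avoids warnings when the data contain ties.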
Re: [R] Error: evaluation nested too deeply when doing heatmap with binary distfunction
Yes. After I increased the threshold to 1 it got through. Thanks a lot!

From: Uwe Ligges [EMAIL PROTECTED] To: zhihua li [EMAIL PROTECTED] CC: r-help@stat.math.ethz.ch Subject: Re: [R] Error: evaluation nested too deeply when doing heatmap with binary dist function Date: Thu, 19 Jul 2007 15:18:29 +0200

zhihua li wrote: Hi netters, I have a matrix X of size (1000, 100). The values are from -3 to +3. When I tried

heatmap(X, distfun = function(c) dist(c, method = "binary"),
        hclustfun = function(m) hclust(m, method = "average"))

I got the error message:

Error: evaluation nested too deeply: infinite recursion / options(expressions=)?

So, does it help to increase the thresholds? If not, please specify an easily reproducible example that helps us to investigate your problem. Best, Uwe Ligges

However, if I used default parameters for the dist function:

heatmap(X, hclustfun = function(m) hclust(m, method = "average"))

there were no error messages at all. But the problem is that I have to use the binary method in my dist function. How can I resolve the problem? Thanks a lot! __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
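The corrected call can be exercised on a small random 0/1 matrix; this demo is not from the thread. Note that dist's "binary" method treats any nonzero value as 1, so a matrix with values in -3..3 is effectively reduced to presence/absence:

```r
# Hedged sketch (not from the thread): heatmap with asymmetric binary
# distance and average-linkage clustering on a toy 0/1 matrix.
set.seed(1)
X <- matrix(rbinom(200, 1, 0.5), nrow = 20)
heatmap(X, scale = "none",
        distfun   = function(c) dist(c, method = "binary"),
        hclustfun = function(m) hclust(m, method = "average"))
```

scale = "none" is used here so that rows with little variation do not produce NaN values during the default row scaling.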
[R] Dataframe of factors transform speed?
Hello, This is a speed question. I have a dataframe genoT:

> dim(genoT)
[1]   1002 238304
> str(genoT)
'data.frame':   1002 obs. of  238304 variables:
 $ SNP_A.4261647: Factor w/ 3 levels "0","1","2": 3 3 3 3 3 3 3 3 3 3 ...
 $ SNP_A.4261610: Factor w/ 3 levels "0","1","2": 1 1 3 3 1 1 1 2 2 2 ...
 $ SNP_A.4261601: Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ SNP_A.4261704: Factor w/ 3 levels "0","1","2": 3 3 3 3 3 3 3 3 3 3 ...
 $ SNP_A.4261563: Factor w/ 3 levels "0","1","2": 3 1 2 1 2 3 2 3 3 1 ...
 $ SNP_A.4261554: Factor w/ 3 levels "0","1","2": 1 1 NA 1 NA 2 1 1 2 1 ...
 $ SNP_A.4261666: Factor w/ 3 levels "0","1","2": 1 1 2 1 1 1 1 1 1 2 ...
 $ SNP_A.4261634: Factor w/ 3 levels "0","1","2": 3 3 2 3 3 3 3 3 3 2 ...
 $ SNP_A.4261656: Factor w/ 3 levels "0","1","2": 1 1 2 1 1 1 1 1 1 2 ...
 $ SNP_A.4261637: Factor w/ 3 levels "0","1","2": 1 3 2 3 2 1 2 1 1 3 ...
 $ SNP_A.4261597: Factor w/ 3 levels "AA","AB","BB": 2 2 3 3 3 2 1 2 2 3 ...
 $ SNP_A.4261659: Factor w/ 3 levels "AA","AB","BB": 3 3 3 3 3 3 3 3 3 3 ...
 $ SNP_A.4261594: Factor w/ 3 levels "AA","AB","BB": 2 2 2 1 1 1 2 2 2 2 ...
 $ SNP_A.4261698: Factor w/ 2 levels "AA","AB": 1 1 1 1 1 1 1 1 1 1 ...
 $ SNP_A.4261538: Factor w/ 3 levels "AA","AB","BB": 2 3 2 2 3 2 2 1 1 2 ...
 $ SNP_A.4261621: Factor w/ 3 levels "AA","AB","BB": 1 1 1 1 1 1 1 1 1 1 ...
 $ SNP_A.4261553: Factor w/ 3 levels "AA","AB","BB": 1 1 2 1 1 1 1 1 1 1 ...
 $ SNP_A.4261528: Factor w/ 2 levels "AA","AB": 1 1 1 1 1 1 1 1 1 1 ...
 $ SNP_A.4261579: Factor w/ 3 levels "AA","AB","BB": 1 1 1 1 1 2 1 1 1 2 ...
 $ SNP_A.4261513: Factor w/ 3 levels "AA","AB","BB": 2 1 2 2 2 NA 1 NA 2 1 ...
 $ SNP_A.4261532: Factor w/ 3 levels "AA","AB","BB": 1 2 2 1 1 1 3 1 1 1 ...
 $ SNP_A.4261600: Factor w/ 2 levels "AB","BB": 2 2 2 2 2 2 2 2 2 2 ...
 $ SNP_A.4261706: Factor w/ 2 levels "AA","BB": 1 1 1 1 1 1 1 1 1 1 ...
 $ SNP_A.4261575: Factor w/ 3 levels "AA","AB","BB": 1 1 1 1 1 1 1 2 2 1 ...

Its columns are factors with different numbers of levels (from 1 to 3; that's what I got from read.table, i.e., it dropped missing levels). I want to convert it to uniform factors with 3 levels. The first 10 rows above show already converted columns; the rest are not yet converted. Here's my attempt, which is a complete failure speed-wise:

system.time(
  for (j in 1:10) {          # try the first 10 columns and measure the time;
                             # otherwise it is ncol(genoT) instead of 10
    gt <- genoT[[j]]         # this is to avoid 2D indices
    for (l in 1:length(levels(gt))) {
      levels(gt)[l] <- switch(levels(gt)[l], AA = 0, AB = 1, BB = 2)  # convert levels to 0, 1, or 2
      genoT[[j]] <- factor(gt, levels = 0:2)  # make a 3-level factor and put it back
    }
  }
)
[1] 785.085 4.358 789.454 0.000 0.000

789 seconds for 10 columns only! To me it seems like replacing 10 x 3 levels and then making a factor of a 1002-element vector x 10 is a negligible amount of work. So, what's wrong with me? Any idea how to accelerate the transformation significantly, or (to go back to the very beginning) how to make read.table use a fixed set of levels (AA, AB, and BB) and not drop any (missing) level?

R-devel_2006-08-26, Sun Solaris 10 OS - x86 64-bit. The machine has 32G RAM and an AMD Opteron 285 (2.? GHz), so it's not that.

Thank you very much for the help, Latchezar Dimitrov, Analyst/Programmer IV, Wake Forest University School of Medicine, Winston-Salem, North Carolina, USA __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
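The per-level switch() loop can be avoided entirely: factor() with an explicit levels/labels pair recodes a whole column in one vectorized call and never drops missing levels. This sketch is not from the thread; the two-column toy frame stands in for genoT, and in real data the columns already coded 0/1/2 would need to be skipped or handled separately:

```r
# Hedged sketch (not from the thread): recode every AA/AB/BB column in one
# pass; explicit levels= keeps all three levels even when some are absent.
genoT <- data.frame(s1 = factor(c("AA", "AB")),   # toy stand-ins for SNP columns
                    s2 = factor(c("BB", "BB")))   # levels "AA"/"AB" were dropped here
genoT[] <- lapply(genoT, function(gt)
  factor(as.character(gt), levels = c("AA", "AB", "BB"),
         labels = c("0", "1", "2")))
str(genoT)  # every column is now a 3-level factor with levels "0","1","2"
```

The `genoT[] <- lapply(...)` idiom replaces the columns in place while keeping the data-frame shape, and one vectorized factor() call per column is far cheaper than rebuilding the factor once per level.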
Re: [R] Dataframe of factors transform speed?
it looks like whatever method you used to genotype the 1002 samples on the STY array gave you a transposed matrix of genotype calls. :-) i'd use:

genoT = read.table(yourFile, stringsAsFactors = FALSE)

as a starting point... but I don't think that would be efficient (as you'd need to fix one column at a time - lapply). i'd preprocess yourFile before trying to load it:

cat yourFile | sed -e 's/AA/1/g' | sed -e 's/AB/2/g' | sed -e 's/BB/3/g' > outFile

and, now, in R:

genoT = read.table(outFile, header = TRUE)

b

On Jul 19, 2007, at 11:51 PM, Latchezar Dimitrov wrote: Hello, This is a speed question. I have a dataframe genoT: dim(genoT) [1] 1002 238304 str(genoT) 'data.frame': 1002 obs. of 238304 variables: $ SNP_A.4261647: Factor w/ 3 levels 0,1,2: 3 3 3 3 3 3 3 3 3 3 ... $ SNP_A.4261610: Factor w/ 3 levels 0,1,2: 1 1 3 3 1 1 1 2 2 2 ... $ SNP_A.4261601: Factor w/ 3 levels 0,1,2: 1 1 1 1 1 1 1 1 1 1 ... $ SNP_A.4261704: Factor w/ 3 levels 0,1,2: 3 3 3 3 3 3 3 3 3 3 ... $ SNP_A.4261563: Factor w/ 3 levels 0,1,2: 3 1 2 1 2 3 2 3 3 1 ... $ SNP_A.4261554: Factor w/ 3 levels 0,1,2: 1 1 NA 1 NA 2 1 1 2 1 ... $ SNP_A.4261666: Factor w/ 3 levels 0,1,2: 1 1 2 1 1 1 1 1 1 2 ... $ SNP_A.4261634: Factor w/ 3 levels 0,1,2: 3 3 2 3 3 3 3 3 3 2 ... $ SNP_A.4261656: Factor w/ 3 levels 0,1,2: 1 1 2 1 1 1 1 1 1 2 ... $ SNP_A.4261637: Factor w/ 3 levels 0,1,2: 1 3 2 3 2 1 2 1 1 3 ... $ SNP_A.4261597: Factor w/ 3 levels AA,AB,BB: 2 2 3 3 3 2 1 2 2 3 ... $ SNP_A.4261659: Factor w/ 3 levels AA,AB,BB: 3 3 3 3 3 3 3 3 3 3 ... $ SNP_A.4261594: Factor w/ 3 levels AA,AB,BB: 2 2 2 1 1 1 2 2 2 2 ... $ SNP_A.4261698: Factor w/ 2 levels AA,AB: 1 1 1 1 1 1 1 1 1 1 ... $ SNP_A.4261538: Factor w/ 3 levels AA,AB,BB: 2 3 2 2 3 2 2 1 1 2 ... $ SNP_A.4261621: Factor w/ 3 levels AA,AB,BB: 1 1 1 1 1 1 1 1 1 1 ... $ SNP_A.4261553: Factor w/ 3 levels AA,AB,BB: 1 1 2 1 1 1 1 1 1 1 ... $ SNP_A.4261528: Factor w/ 2 levels AA,AB: 1 1 1 1 1 1 1 1 1 1 ...
 $ SNP_A.4261579: Factor w/ 3 levels "AA","AB","BB": 1 1 1 1 1 2 1 1 1 2 ...
 $ SNP_A.4261513: Factor w/ 3 levels "AA","AB","BB": 2 1 2 2 2 NA 1 NA 2 1 ...
 $ SNP_A.4261532: Factor w/ 3 levels "AA","AB","BB": 1 2 2 1 1 1 3 1 1 1 ...
 $ SNP_A.4261600: Factor w/ 2 levels "AB","BB": 2 2 2 2 2 2 2 2 2 2 ...
 $ SNP_A.4261706: Factor w/ 2 levels "AA","BB": 1 1 1 1 1 1 1 1 1 1 ...
 $ SNP_A.4261575: Factor w/ 3 levels "AA","AB","BB": 1 1 1 1 1 1 1 2 2 1 ...

Its columns are factors with different numbers of levels (from 1 to 3 - that's what I got from read.table, i.e., it dropped missing levels). I want to convert it to uniform factors with 3 levels.

[... rest of quoted message snipped; see the original post above ...]
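The read.table() route mentioned in the reply above can also stay entirely in R: read everything as character so no levels get dropped, then impose the fixed AA/AB/BB level set. A minimal sketch (not code from the thread); the temporary file here just stands in for the real genotype file:

```r
## Demo input: a temporary file standing in for the real genotype file.
tf <- tempfile()
writeLines(c("snp1 snp2", "AA AB", "AA AB", "BB AB"), tf)

## Read as plain character so read.table() cannot drop any levels...
genoT <- read.table(tf, header = TRUE, stringsAsFactors = FALSE)

## ...then convert every column to a factor over the full level set.
genoT[] <- lapply(genoT, factor, levels = c("AA", "AB", "BB"))
str(genoT)
```

Even though snp2 contains only "AB", it comes out as a 3-level factor, which is the uniformity the original post asks for.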
Re: [R] Dataframe of factors transform speed?
Is this what you want? It took 0.01 seconds to convert 20 columns of the test data:

# create some data (20 columns with 1000 rows)
> n <- 20
> result <- list()
> vals <- c("AA", "AB", "BB")
> for (i in 1:n){
+   result[[as.character(i)]] <- sample(vals, 1000, replace=TRUE, prob=c(9000,1,1))
+ }
> result.df <- do.call('data.frame', result)
> str(result.df)
'data.frame': 1000 obs. of 20 variables:
 $ X1 : Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X2 : Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X3 : Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X4 : Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X5 : Factor w/ 2 levels "AA","AB": 1 1 1 1 1 1 1 1 1 1 ...
 $ X6 : Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X7 : Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X8 : Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X9 : Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X10: Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X11: Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X12: Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X13: Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X14: Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X15: Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X16: Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X17: Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X18: Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X19: Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ X20: Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...

# go through each column and convert the factors according to 'vals' above
> system.time({ # time to convert 20 columns
+   x <- lapply(result.df, function(facts){
+     factor(match(as.character(facts), vals) - 1, levels=0:2)
+   })
+   result.df <- do.call('data.frame', x)
+ })
   user  system elapsed
   0.01    0.00    0.01
> str(result.df)
'data.frame': 1000 obs. of 20 variables:
 $ X1 : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X2 : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X3 : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X4 : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X5 : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X6 : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X7 : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X8 : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X9 : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X10: Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X11: Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X12: Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X13: Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X14: Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X15: Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X16: Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X17: Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X18: Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X19: Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ X20: Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...

On 7/19/07, Latchezar Dimitrov [EMAIL PROTECTED] wrote:

Hello, This is a speed question. I have a dataframe genoT:

[... quoted dim() and str() output snipped; see the original post above ...]

Its columns are factors with