[R] splitting a vector of strings
Hi everybody,

I have a vector of character strings. Each string has the same pattern, and I want to split them into pieces and get a vector made of the first piece of each string. The problem is that strsplit returns a list. All I found is

  uu <- matrix(unlist(strsplit(x,";")),ncol=3,byrow=T)[,1]

where x is the vector, ";" is the delimiting character, and I know that each string will be cut into 3 pieces.

That works for my problem, but I would prefer a more elegant solution. Besides, it would not work if the strings did not all have the same number of pieces.

Does someone have a better solution?

Sorry if this topic was discussed recently. There is too much traffic on the r-help list, I cannot keep up.

--
Eric Elguero
MIVEGEC. - UMR (CNRS/IRD/UM) 5290
Maladies Infectieuses et Vecteurs, Génétique, Evolution et Contrôle
Institut de Recherche pour le Développement (IRD)
911, Avenue Agropolis BP 64501
34394 Montpellier Cedex 5, France

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
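One possible answer to the question above, offered as a minimal sketch rather than as the reply given on the list: index each element of the list that strsplit() returns, or delete everything from the first delimiter onward with sub(). The sample vector x is made up for illustration, and its strings deliberately have different numbers of pieces.

```r
# Hypothetical input; the number of pieces differs between strings
x <- c("a;b;c", "d;e", "f;g;h;i")

# Split every string, then take element 1 of each list component
first1 <- vapply(strsplit(x, ";", fixed = TRUE), `[`, character(1), 1)

# Same result without splitting: drop the first ";" and all that follows
first2 <- sub(";.*$", "", x)

print(first1)
print(first2)
```

Neither form needs to know how many pieces each string has, which was the weakness of the matrix() approach.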
Re: [R] problem with function "polygon"
On 11/07/2014 04:35 PM, Duncan Murdoch wrote:
> You are not using the polygon() function from the graphics package,
> you're using one coming from somewhere else (maybe an old version of R,
> or some package). The polygon() function in the graphics package
> doesn't call .Internal(polygon(..., it calls
> .External.graphics(C_polygon, ...
>
> If at some point you made a copy of the polygon() function and saved
> it, you're stuck with that one forever (or at least until you delete it
> from your workspace, or even better, delete the whole saved workspace).

You're absolutely right. I was using a "polygon" function from package ade4 (that I copied to my workspace, I don't remember why). I will ask the ade4 developers.

Thank you.

e.e.
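The diagnosis above can be checked from the R prompt. The following is a small sketch, assuming the default packages are attached; the stale copy is simulated here with a dummy function, which is not the ade4 code.

```r
# Simulate the situation: a stale copy of polygon() in the workspace
polygon <- function(...) stop("stale copy")

# find() lists every place on the search path that defines the name;
# a workspace entry listed before "package:graphics" masks the real one
where <- find("polygon")
print(where)

# Remove the copy so that the graphics version is found again
rm(polygon)
print(environmentName(environment(polygon)))
```

If the copy lives in a saved .RData file, it comes back at the next start-up unless the workspace is saved again without it or the file is deleted.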
[R] problem with function "polygon"
Hi all,

I'm trying to use the polygon function from the graphics package, and I get this error message:

> polygon(x=c(1,2,3,1),y=c(1,4,5,1))
Error in .Internal(polygon(xy$x, xy$y, col, border, lty, ...)) :
  there is no .Internal function 'polygon'

That annoys me because polygon is actually called by several other functions I need.

My R version:

R version 3.1.2 (2014-10-31) -- "Pumpkin Helmet"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

and I just updated everything.

e.e.
[R] transmission of parameters to the glmmadmb function
Hi everybody,

I wrote a function where several variables are created and then used in a generalized mixed model from the glmmADMB package. Here is part of the function:

  print(length(spy))
  uu<-summary(glmmadmb(spy~sex+poswing+spx+(1|host),data=ni,
      family="nbinom",zeroInflation=TRUE))

When I run the function I get

[1] 596
Error in eval(expr, envir, enclos) : object 'spy' not found

(so spy is known to the function "print" but not to the function glmmadmb).

Now I modify my function:

  print(length(spy))
  ni$spy<-spy
  ni$spx<-spx
  uu<-summary(glmmadmb(spy~sex+poswing+spx+(1|host),data=ni,
      family="nbinom",zeroInflation=TRUE))

and that works.

However, when I call glmmadmb interactively, it accepts in the formula variables which are in the data frame specified by the 'data' argument, as well as variables which are not. And if in my function I replace glmmadmb by glm, it works even if spy and spx are not in the 'ni' data frame.

That puzzles me.
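The contrast with glm described above comes from where a fitting function looks up the variables of a formula. Here is a minimal sketch using base R only (glmmADMB is not used, and the names fit_local, d and y2 are made up): glm() looks first in data and then in the environment where the formula was created, so a variable local to the calling function is found. Copying the variables into the data frame, as in the workaround above, does not depend on that second lookup.

```r
set.seed(1)
d <- data.frame(x = rnorm(30))

fit_local <- function(d) {
  # y2 exists only inside this function, not in d
  y2 <- rpois(nrow(d), lambda = exp(0.3 * d$x))

  # Works with glm(): the formula carries this function's environment,
  # which is searched after the data frame
  fit1 <- glm(y2 ~ x, data = d, family = poisson)

  # More robust: put every model variable into the data frame
  d$y2 <- y2
  fit2 <- glm(y2 ~ x, data = d, family = poisson)

  list(fit1 = fit1, fit2 = fit2)
}

fits <- fit_local(d)
print(coef(fits$fit1))
print(coef(fits$fit2))
```

A modelling function that evaluates the formula only inside data, or in a different environment, would fail on the first call in the way reported above; whether that is what glmmadmb does is an assumption here, not something the thread confirms.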
[R] censored counts and glmer/glmmADMB
Dear R-users,

I have to model counts where all counts above some threshold have been censored. In the same dataset I have too many zeroes for a Poisson or even a negative binomial distribution to make sense, so I would need a zero-inflated censored negative binomial family for use in glmer (or glmmADMB?). That seems not to exist.

My question is: how could I add a custom-built family of distributions that I could call in glmer/glmmADMB?

If it's not possible, I am considering imputing fake values to replace the censored ones, but I am unsure whether this is bad or very bad...

Eric Elguero
MIVEGEC (UM1- UM2 -CNRS 5290-IRD 224)
Maladies infectieuses et vecteurs : écologie, génétique, évolution et contrôle
Centre IRD de Montpellier
911 Av Agropolis - BP 64501
34394 Montpellier Cedex
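To make the distribution asked about above concrete, here is a rough sketch of its likelihood, fitted by direct maximisation with optim(). It is not a glmer or glmmADMB family and has no random effects. The function name zicnb_nll, the parameterisation and the simulated data are all assumptions made for illustration, and censored counts are assumed to be recorded as the threshold value cens.

```r
# Negative log-likelihood of a zero-inflated negative binomial in which
# every count >= cens is only known to be >= cens.
# par = (logit of zero-inflation probability, log mean, log size)
zicnb_nll <- function(par, y, cens) {
  pz   <- plogis(par[1])
  mu   <- exp(par[2])
  size <- exp(par[3])
  censored <- y >= cens
  # P(Y >= cens): only the count component can reach the threshold
  p_tail <- (1 - pz) *
    pnbinom(cens - 1, size = size, mu = mu, lower.tail = FALSE)
  # Exact probability of an uncensored count, with extra mass at zero
  p_obs <- pz * (y == 0) + (1 - pz) * dnbinom(y, size = size, mu = mu)
  -sum(log(ifelse(censored, p_tail, p_obs)))
}

# Simulated data: 30% structural zeros, counts censored at 10
set.seed(42)
n      <- 500
true_y <- ifelse(runif(n) < 0.3, 0, rnbinom(n, size = 2, mu = 5))
cens   <- 10
y      <- pmin(true_y, cens)

fit <- optim(c(0, 1, 0), zicnb_nll, y = y, cens = cens)
est <- c(pz = plogis(fit$par[1]), mu = exp(fit$par[2]),
         size = exp(fit$par[3]))
print(est)
```

Writing the censored observations as a tail probability uses the information they actually carry, which imputing fake values does not.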
[R] call to system returns warning : status 2 (Ubuntu)
Hi everybody,

I have to run a program repeatedly under Ubuntu with different arguments, and I am using R just to generate the data files and call the external program.

Basically, in my script I have these two lines inside a loop:

  command <- paste(,sep="")
  system(command,intern=T,wait=T)

When I run this script, I get a number of warnings, like this one:

16: running command '~/LDhat22/ldconvert -seq ld/serca/serca-Trs.fas -freqcut 0.0 -missfreqcut 100.0 -sites 1 3687 -nous 6 > ld/serca/serca-Trs.out' had status 2

However, when I run the very same command at the bash prompt, everything seems fine (no complaint). In either case, the output is the same and looks correct.

So, may I just ignore these warnings, or is there something I should fix?

Thank you in advance,

Eric Elguero
MIVEGEC IRD -CNRS - UM1
Montpellier - France
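The warning above reports the exit status of the external program, which bash does not print unless asked (echo $? right after the command shows it). A small sketch of how the status can be inspected from R, assuming a Unix-like system; the command used here is a made-up stand-in, not ldconvert.

```r
# Hypothetical command: prints one line, then exits with status 2
cmd <- "echo hello; exit 2"

# intern = TRUE returns the captured output; a non-zero exit status is
# attached as the "status" attribute (and raises the warning, which is
# silenced here)
out    <- suppressWarnings(system(cmd, intern = TRUE))
status <- attr(out, "status")
print(out)
print(status)

# intern = FALSE returns the exit status itself
status2 <- system("exit 2")
print(status2)
```

What status 2 means is decided by the external program, so its documentation or source is the place to check whether it can be ignored.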
[R] problem with missing package
Hi everybody,

I just tried to run R on one of my projects, but it would not start:

R version 2.12.1 (2010-12-16)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)

Loading required package: utils
Error in loadNamespace(i[[1L]], c(lib.loc, .libPaths())) :
  there is no package called 'nlme'
Fatal error: unable to restore saved data in .RData

There are two things I do not understand:

i) I actually had nlme installed, and working, but when I look in /usr/lib/R/library/nlme I find only a text file named "COPYING" containing the GNU license. Where has the package gone (by itself)?

ii) I tried to reinstall nlme but could not find it in the usual repositories.

In any case, I would like to recover at least those R objects that do not depend on nlme. I tried:

  $ mv .RData xxx
  $ R
  > load("xxx")

but that doesn't help.

Is there a method to extract some information from .RData without loading it?

Eric Elguero
[R] modifying some elements of a vector
Hi everybody,

I want to add 1 to some elements of a vector:

x is a vector
u is a vector of indices, that is, integers assumed to be within the range 1..length(x)

and I want to add 1 to the elements of x each time their index appears in u.

  x[u]<-x[u]+1

works only when there are no duplicated values in u.

I found this solution:

  tu <- table(u)
  indices <- as.numeric(names(tu))
  x[indices] <- x[indices]+tu

but it looks ugly to me, and I would prefer to avoid calling the function 'table' since this is to be done millions of times as part of a simulation program.

Eric Elguero
Génétique & Adaptation des Plasmodium
IRD Montpellier - France
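One way to do the update described above without table(), given as a sketch rather than as the answer from the list: tabulate() counts how often each index from 1 to length(x) occurs in u, and the counts can be added to x in one step. The values of x and u below are made up.

```r
x <- c(10, 20, 30, 40)      # hypothetical vector
u <- c(1, 3, 3, 3, 4)       # indices, with duplicates

# Counts per position 1..length(x); positions absent from u get 0
counts <- tabulate(u, nbins = length(x))

x_new <- x + counts
print(x_new)
```

tabulate() works on integer bins directly and skips the factor conversion and name handling that table() performs, which matters when the call is repeated millions of times.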
[R] using gedit
Dear all,

I'm using R (2.10.1) under Ubuntu (9.10) and, as I don't like vi, I edit my functions with the command:

  edit(.,editor="gedit")

which works fine, except when gedit happens to be already running. Then a new tab is created, and on exit all changes are lost, regardless of whether I close the tab or the editor altogether.

Is there a solution to this problem?

Thanks in advance.

Eric Elguero
Re: [R] randomness in stepclass (klaR) or lda (MASS) ?
On Thu, 2010-04-29 at 15:08 +0200, Uwe Ligges wrote:
> Well, it is called cross validation which is based on random sampling if
> you do not have k=n -fold CV (=leave-one-out).
> Again, to get reproducible results, you will need to set a seed.
>

Thank you. I thought that "leave-one-out" was the default. I looked at the reference file and I am not sure how to get it. Is that by setting fold=1 ?

>
> If the results are that unstable: Do you really have a sufficient number
> of observations for your classification problem?

You're probably right.

e.e.

Eric Elguero
Laboratory Genetics and Evolution of Infectious Diseases,
Team: Genetics and Adaptation of Plasmodium
UMR 2724 CNRS-IRD, IRD Montpellier,
911 Avenue Agropolis, BP 64501,
34394 Montpellier Cedex 5, France
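The point about seeds quoted above can be illustrated with base R alone. This is only a sketch of how a random k-fold split behaves, not the code klaR uses internally; make_folds is a made-up helper, and 89 observations and 10 folds are taken from the stepclass output in this thread.

```r
n <- 89   # observations
k <- 10   # folds

# Assign each observation to one of k folds at random
make_folds <- function(n, k) sample(rep(seq_len(k), length.out = n))

set.seed(123)
folds_a <- make_folds(n, k)

set.seed(123)
folds_b <- make_folds(n, k)   # same seed, same split

folds_c <- make_folds(n, k)   # seed not reset: a different split

print(identical(folds_a, folds_b))
print(identical(folds_a, folds_c))
```

With k equal to n every observation is its own fold, so the set of train/test splits no longer depends on the random numbers; that is the leave-one-out case mentioned in the quoted reply.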
[R] randomness in stepclass (klaR) or lda (MASS) ?
Hi,

a colleague ran a stepwise discriminant analysis twice in a row and got different results, suggesting some "stochasticity" in the algorithms involved.

I looked at her data and found that there was a lot of collinearity, so I reckoned that maybe "stepclass" (klaR) cannot find a clear winner when trying to include a new variable and makes a random choice. Is that true?

Another possibility is that "lda" (from MASS) computes CV classification rates from a random subsample instead of using all the data (?) That might be a sensible choice with a very large sample.

I advised her to run the function several times and see if a consensus emerges, but that doesn't seem to be the case, and besides, I would like to know what is really going on.

Thanks

Eric Elguero
Laboratory Genetics and Evolution of Infectious Diseases,
Team: Genetics and Adaptation of Plasmodium
UMR 2724 CNRS-IRD, IRD Montpellier,
911 Avenue Agropolis, BP 64501,
34394 Montpellier Cedex 5, France

> f4.U.spDA <- stepclass(f.mes, f.gp4, "lda",improvement=0.01,prior=rep(0.25,4))
 `stepwise classification', using 10-fold cross-validated correctness rate of method lda'.
89 observations of 31 variables in 4 classes; direction: both
stop criterion: improvement less than 1%.
correctness rate: 0.58333;  in: "X2";  variables (1): X2
correctness rate: 0.66389;  in: "X9";  variables (2): X2, X9
correctness rate: 0.69583;  in: "X27";  variables (3): X2, X9, X27

 hr.elapsed min.elapsed sec.elapsed
       0.00        0.00       20.77

> f4.U.spDA <- stepclass(f.mes, f.gp4, "lda",improvement=0.01,prior=rep(0.25,4))
 `stepwise classification', using 10-fold cross-validated correctness rate of method lda'.
89 observations of 31 variables in 4 classes; direction: both
stop criterion: improvement less than 1%.
correctness rate: 0.60556;  in: "X2";  variables (1): X2
correctness rate: 0.71806;  in: "X6";  variables (2): X2, X6

 hr.elapsed min.elapsed sec.elapsed
       0.00        0.00       15.14
[R] sigma in glmer (lme4)
Dear R-users,

I am trying to understand what the sigma parameter returned by glmer is. I thought it was (an estimate of) the sigma parameter defined by McCullagh & Nelder (e.g. p 126 of 2nd edition), but I ran some simulations and it seems that this is something else.

I simulated data corresponding to a binomial model, intended to be fitted by this command:

  glmer(cbind(success,failure)~X+(1|group),family=binomial)

but I instead fitted the following model:

  glmer(cbind(success,failure)~X+(1|group),family=quasibinomial)

(and repeated this process 500 times).

I expected sigma to be close to 1, but I found that the mean sigma was about 0.05 (sd = 0.003).

If I do the analogous simulation study with glm, that is, I simulate binomial data and fit them with family=quasibinomial instead of binomial, I find a mean dispersion parameter = 0. (sd=0.09).

Changing parameter values does not alter this pattern. In both cases, the fixed effects parameters are correctly estimated.

Here is the function I used to simulate data (taking 0 as the standard deviation of the random effect provides data suitable for glm):

function(x,theta,sigmag,nb.groups=10,size=50)
#-
# sim.data.mixed
#-
# simulates data for glmer
# Y is Binom(p,size)
# with logit(p) = theta1 + theta2*X + B
# where B is Norm(0,sigmag)
#-
# x : the x values (same for each group)
#     length(x) is the number of observations per group
# theta: the fixed effects parameters (intercept & slope)
# sigmag : the random effects standard deviation
# size : the binomial parameter (same for everybody)
{
  group<-rep(1:nb.groups,rep(length(x),nb.groups))
  random.effect<-rnorm(nb.groups,mean=0,sd=sigmag)
  xmat<-expand.grid(x,random.effect)
  eta<-theta[1]+theta[2]*xmat[,1]+xmat[,2]
  # invlogit() is not part of base R; plogis() is the base equivalent
  y<-rbinom(length(eta),size=size,prob=invlogit(eta))
  return(data.frame(success=y,failure=size-y,x=xmat[,1],group=group,b=xmat[,2]))
  #--
  # b (random effects) is returned here but not used by glmer
}
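The glm half of the comparison above can be reproduced with base R. This is a sketch of a single replicate under made-up settings (200 observations, binomial size 50, intercept -0.5 and slope 1.2), not the simulation from the post: when the data really are binomial, the dispersion estimated by a quasibinomial fit should be close to 1.

```r
set.seed(2024)
n    <- 200    # number of observations (hypothetical)
size <- 50     # binomial denominator, as in the post's default
x    <- runif(n, -1, 1)
p    <- plogis(-0.5 + 1.2 * x)   # hypothetical intercept and slope

success <- rbinom(n, size = size, prob = p)

# Fit a quasibinomial model to data that are truly binomial
fit  <- glm(cbind(success, size - success) ~ x, family = quasibinomial)

# Pearson chi-square divided by residual degrees of freedom
disp <- summary(fit)$dispersion
print(disp)
```

This gives a baseline for what a dispersion parameter in the McCullagh & Nelder sense looks like; it says nothing about what glmer reports as sigma.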
Re: [R] refit with binomial model (lme4)
On Mon, 2009-04-27 at 08:30 -0500, Douglas Bates wrote:
> This is related to using the matrix form of the response for a
> binomial glmm. The refit method for a model fit by lmer is based on a
> numeric vector response.
>

Thank you for this explanation.

> Is it possible to use the expanded form (i.e. a vector of 0/1 values)
> of the responses instead of the matrix form?
>

Yes I could, but I found that I could use the probability/weights form, at least in my case where I am simulating new binomial data with the observed number of trials.

Eric Elguero

> On Mon, Apr 27, 2009 at 7:20 AM, Eric Elguero wrote:
> > Dear R users,
> >
> > I'm trying to use function 'refit' from lme4
> > and I get this error that I can't understand:
> >
> >> refit(dolo4.model4,cbind(uu,50-uu))
> > Error in function (classes, fdef, mtable) :
> > unable to find an inherited method for function "refit", for signature
> > "mer", "matrix"
> >
> > if I try:
> >
> >> refit(dolo4.model4,uu)
> > Error in asMethod(object) : matrix is not symmetric [1,2]
> >
> > I get this error message that I can no more
> > understand but which suggests that refit expects
> > two columns.
> >
> > the initial model was:
> >
> >> dolo4.mod...@call
> > glmer(formula = cbind(sortis, restes) ~ mean.co2 + (1 | sujet),
> >    data = dollo4.df, family = binomial)
> >
> > R version 2.9.0 (2009-04-17)
> >
> > and
> >
> > Package: lme4
> > Version: 0.999375-28
> > Date: 2008-12-13
> >
> > thank you in advance
> >
> > e.e.
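The "probability/weights form" mentioned above can be illustrated with plain glm(), where it gives the same fit as the two-column matrix form. This is a sketch with simulated data and made-up names (s, trials, x); it does not use lme4 or refit().

```r
set.seed(7)
n      <- 40
trials <- rep(50, n)                 # number of trials per observation
x      <- rnorm(n)
s      <- rbinom(n, size = trials, prob = plogis(0.2 + 0.8 * x))

# Matrix form: successes and failures in two columns
m_matrix <- glm(cbind(s, trials - s) ~ x, family = binomial)

# Probability/weights form: a numeric vector response plus the
# number of trials as prior weights
m_weights <- glm(s / trials ~ x, weights = trials, family = binomial)

print(coef(m_matrix))
print(coef(m_weights))
```

The second form has a plain numeric vector as response, which is the kind of response the quoted explanation says refit is based on.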
[R] refit with binomial model (lme4)
Dear R users,

I'm trying to use function 'refit' from lme4 and I get this error that I can't understand:

> refit(dolo4.model4,cbind(uu,50-uu))
Error in function (classes, fdef, mtable) :
  unable to find an inherited method for function "refit", for signature "mer", "matrix"

If I try:

> refit(dolo4.model4,uu)
Error in asMethod(object) : matrix is not symmetric [1,2]

I get this error message, which I understand no better, but which suggests that refit expects two columns.

The initial model was:

> dolo4.mod...@call
glmer(formula = cbind(sortis, restes) ~ mean.co2 + (1 | sujet),
   data = dollo4.df, family = binomial)

R version 2.9.0 (2009-04-17)

and

Package: lme4
Version: 0.999375-28
Date: 2008-12-13

Thank you in advance

e.e.
[R] strptime
Hi,

what's wrong with that?

> strptime("06:00:00 03.01.2008",format="%H:%M%:%S %d.%m.%Y",tz="GMT")
[1] NA

The command seems to comply with the rules in the help file but returns NA.

(R 2.6.1 Windows XT)

Eric Elguero
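A likely culprit in the call above is the stray "%" between "%M" and ":" in the format string: "%:" is not a conversion specification. A sketch of the same call with that one character removed:

```r
# Same input as above; format is "%H:%M:%S" instead of "%H:%M%:%S"
good <- strptime("06:00:00 03.01.2008",
                 format = "%H:%M:%S %d.%m.%Y", tz = "GMT")
print(good)

# Round trip back to text to confirm every field was parsed
print(format(good, "%Y-%m-%d %H:%M:%S"))
```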
Re: [R] precision in seq
Thank you to all who answered.

> 0+0.05+
+ 0.05+0.05+0.05+0.05+0.05+0.05+
+ 0.05+0.05+0.05+0.05+0.05+0.05+
+ 0.05+0.05+0.05+0.05+0.05+0.05 - 0.95
[1] 3.330669e-16
> seq(0,1,0.05)[20] - 0.95
[1] 1.110223e-16
> 0+19*0.05 - 0.95
[1] 1.110223e-16

So this is the way seq calculates. I would have guessed that addition was more accurate than multiplication, but that is not the case.

This one however bothers me:

> 19/20-0.95
[1] 0

I noticed this problem when I tried to extract rows of a matrix according to whether values of some vector were in the set (0,0.05,...,0.95,1), with something like

  x%in%seq(0,1,0.05)

Now I understand that I should not use this construction unless x is of type integer. Would you agree?

Eric Elguero
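For the row-selection problem described above, one common alternative to exact matching is to compare within a tolerance. The helper near() below is made up for this sketch, as are the values in x and the tolerance of 1e-8.

```r
grid <- seq(0, 1, 0.05)
x    <- c(0.95, 0.30, 0.33)   # hypothetical values to look up

# TRUE where a value of x lies within tol of some element of table
near <- function(x, table, tol = 1e-8) {
  vapply(x, function(v) any(abs(v - table) < tol), logical(1))
}

approx_match <- near(x, grid)
print(approx_match)

# Exact matching, for comparison; depends on the last bit of each value
print(x %in% grid)
```

Another option is to work on an integer scale from the start, for example by storing the values as 0:20 and dividing by 20 only for display.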
[R] precision in seq
Hi everybody,

this is a warning more than a question. I noticed that seq produces approximate results:

> seq(0,1,0.05)[19]==0.9
[1] TRUE
> seq(0,1,0.05)[20]==0.95
[1] FALSE
> seq(0,1,0.05)[21]==1
[1] TRUE
> seq(0,1,0.05)[20]-0.95
[1] 1.110223024625157e-16

I do not understand why 0.9 and 1 are correct (within some tolerance or strictly exact?) and 0.95 is not.

This one works:

> ((0:20)/20)[20]==0.95
[1] TRUE

Eric Elguero
[R] Tabulations in command-line under linux
Hi everybody,

I'm trying to use R (2.4.1) under Linux (Debian) and one thing bothers me: sometimes I paste lines from a text editor into the R command line, and tab characters are caught by the name-completion function of the csh.

How can this behaviour be disabled?

Thanks in advance

Eric Elguero
Re: [R] Contour plot (level curves)
> I have a sample of n values from a bivariate distribution (from a MCMC
> procedure). How could I draw a contour plot of "the joint density" based on
> that sample ?

Here is a fast 2D density estimator. Not very sophisticated, but it works. The function assumes that the data are in the form of a matrix whose first two columns contain the x and y coordinates.

To plot the result:

  image(dens2d(x))

or

  contour(dens2d(x))

Play with the h parameter to change the smoothness of the surface.

> dens2d
function(x, nx = 20, ny = 20, margin = 0.05, h = 1)
{
    # grid limits: data range widened by 'margin' on each side
    xrange <- max(x[, 1]) - min(x[, 1])
    yrange <- max(x[, 2]) - min(x[, 2])
    xmin <- min(x[, 1]) - xrange * margin
    xmax <- max(x[, 1]) + xrange * margin
    ymin <- min(x[, 2]) - yrange * margin
    ymax <- max(x[, 2]) + yrange * margin
    xstep <- (xmax - xmin)/(nx - 1)
    ystep <- (ymax - ymin)/(ny - 1)
    xx <- xmin + (0:(nx - 1)) * xstep
    yy <- ymin + (0:(ny - 1)) * ystep
    # z must be nx by ny, as image() and contour() expect
    g <- matrix(0, nrow = nx, ncol = ny)
    n <- dim(x)[[1]]
    # add one product Gaussian kernel (sd = h) per observation
    for(i in 1:n) {
        coefx <- dnorm(xx - x[i, 1], mean = 0, sd = h)
        coefy <- dnorm(yy - x[i, 2], mean = 0, sd = h)
        g <- g + coefx %*% t(coefy)/n
    }
    return(list(x = xx, y = yy, z = g))
}
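For comparison with the hand-written estimator above, the MASS package (a recommended package shipped with standard R installations) provides kde2d(), which returns the same kind of list with components x, y and z. A short sketch with a simulated sample standing in for the MCMC output; the sample and the grid size of 50 are made up.

```r
library(MASS)

# Hypothetical stand-in for an MCMC sample: 500 draws in two columns
set.seed(1)
xy <- cbind(rnorm(500), rnorm(500, sd = 2))

# Kernel density estimate on a 50 x 50 grid, default bandwidths
dens <- kde2d(xy[, 1], xy[, 2], n = 50)

# Level curves of the estimated joint density
contour(dens, xlab = "x", ylab = "y")
```

The h argument of kde2d() plays the same role as h in dens2d, although it is scaled differently, so the values are not interchangeable.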