Re: [R] removed data is still there!
On Thu, 23 Sep 2010, Peter Ehlers wrote: On 2010-09-21 5:51, Nikhil Kaza wrote: example(factor) iris1$Species- factor(iris1$Species, drop=T) will get you what you need. Hmm, doesn't work for me. ?factor does not list a 'drop=' argument. I suspect iris1$Species - [iris1$Species, drop=TRUE] was meant. See ?`[.factor` . -Peter Ehlers Nikhil Kaza Asst. Professor, City and Regional Planning University of North Carolina nikhil.l...@gmail.com On Sep 21, 2010, at 7:41 AM, pdb wrote: I'm confused, hope someone can point out what is not obvious to me. I thought I was creating a new data frame by 'deleting' rows from an existing dataframe - I've tried 2 methods. But this new data frame seems to remember values from its parent - even though there are no occurences. Where does it get the values versicolor and virginica from and give then a count of 0? What am I missing? Thanks in advance. summary(iris$Species) setosa versicolor virginica 50 50 50 nrow(iris) [1] 150 iris1- iris[iris$Species == 'setosa',] nrow(iris1) [1] 50 summary(iris1$Species) setosa versicolor virginica 50 0 0 boxplot(Petal.Width ~ Species, data = iris1, plot=1) iris2- subset(iris, Species == 'setosa') nrow(iris2) [1] 50 summary(iris2$Species) setosa versicolor virginica 50 0 0 boxplot(Petal.Width ~ Species, data = iris2, plot=1) -- View this message in context: http://r.789695.n4.nabble.com/removed-data-is-still-there-tp2548440p2548440.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting multiple animal tracks against Date/Time
Great, thanks ! It works beautifully now. Dr. Juliane Struve Daphne Jackson Fellow Imperial College London Department of Life Sciences Silwood Park Campus Buckhurst Road Ascot, Berkshire, SL5 7PY, UK Tel: +44 (0)20 7594 2527 Fax: +44 (0)1344 874 957 http://www.aquaticresources.orghttp://www.aquaticresources.org/ From: Gabor Grothendieck [ggrothendi...@gmail.com] Sent: 23 September 2010 23:06 To: Struve, Juliane Cc: r-help@r-project.org Subject: Re: [R] plotting multiple animal tracks against Date/Time On Thu, Sep 23, 2010 at 5:53 PM, Struve, Juliane j.str...@imperial.ac.ukmailto:j.str...@imperial.ac.uk wrote: Hello, thank you very much for replying. I have tried this, but I get error message Error in .subset(x, j) : invalid subscript type 'list' after z1 - read.zoo(textConnection(Lines1), skip = 1, index = list(1, 2), FUN = dt) Regards, Juliane It works for me. You likely have an outdated version of zoo. Make sure you have zoo 1.6-4. Here is the output: packageDescription(zoo)$Version [1] 1.6-4 Lines1 - DateDistance [m] + 2006-08-18 22:05:15 1815.798 + 2006-08-18 22:06:35 1815.798 + 2006-08-18 22:08:33 1815.798 + 2006-08-18 22:09:49 1815.798 + 2006-08-18 22:12:50 1815.798 + 2006-08-18 22:16:26 1815.798 Lines2 - Date Distance [m] + 2006-08-18 09:53:20 0.0 + 2006-08-18 09:59:07 0.0 + 2006-08-18 10:09:20 0.0 + 2006-08-18 10:21:14 0.0 + 2006-08-18 10:34:18 0.0 + 2006-08-18 10:36:44100.2 library(zoo) library(chron) # read in data dt - function(date, time) as.chron(paste(date, time)) z1 - read.zoo(textConnection(Lines1), skip = 1, index = list(1, 2), FUN = dt) z2 - read.zoo(textConnection(Lines2), skip = 1, index = list(1, 2), FUN = dt) # create single zoo object z - na.approx(cbind(z1, z2), na.rm = FALSE) Warning messages: 1: closing unused connection 4 (Lines2) 2: closing unused connection 3 (Lines1) # plot -- remove screen=1 if you want separate panels plot(z, screen = 1) -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.comhttp://gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting multiple animal tracks against Date/Time
Hello, thank you very much for replying. I have tried this, but I get error message Error in .subset(x, j) : invalid subscript type 'list' after z1 - read.zoo(textConnection(Lines1), skip = 1, index = list(1, 2), FUN = dt) Regards, Juliane Dr. Juliane Struve Daphne Jackson Fellow Imperial College London Department of Life Sciences Silwood Park Campus Buckhurst Road Ascot, Berkshire, SL5 7PY, UK Tel: +44 (0)20 7594 2527 Fax: +44 (0)1344 874 957 http://www.aquaticresources.org From: Gabor Grothendieck [ggrothendi...@gmail.com] Sent: 23 September 2010 18:26 To: Struve, Juliane Cc: r-help@r-project.org Subject: Re: [R] plotting multiple animal tracks against Date/Time On Thu, Sep 23, 2010 at 9:50 AM, Struve, Juliane j.str...@imperial.ac.uk wrote: Dear list, I would like to create a time series plot in which the paths of several individuals are stacked above each other, with the x-axis being the total observation period of three years ( 1.1.2004 to 31.12.2007) and the y-axis being some defined range[min,max]. My data consist of Date/Time information and the paths of 45 individual as the distance from the location of release. An example data set for 2 individuals is given below.The observation period and frequency of observations varies between individuals. I believe stackplot() may be able to do this task, but I am not sure how to handle the variable time period and frequency of observations for different individuals. Could someone advise if stackplot() is suitable or if there is a better approach or package ? Thank you very much for your time and best wishes, Juliane Try this: Lines1 - DateDistance [m] 2006-08-18 22:05:15 1815.798 2006-08-18 22:06:35 1815.798 2006-08-18 22:08:33 1815.798 2006-08-18 22:09:49 1815.798 2006-08-18 22:12:50 1815.798 2006-08-18 22:16:26 1815.798 Lines2 - Date Distance [m] 2006-08-18 09:53:20 0.0 2006-08-18 09:59:07 0.0 2006-08-18 10:09:20 0.0 2006-08-18 10:21:14 0.0 2006-08-18 10:34:18 0.0 2006-08-18 10:36:44100.2 library(chron) dt - function(date, time) as.chron(paste(date, time)) library(zoo) library(chron) # read in data dt - function(date, time) as.chron(paste(date, time)) z1 - read.zoo(textConnection(Lines1), skip = 1, index = list(1, 2), FUN = dt) z2 - read.zoo(textConnection(Lines2), skip = 1, index = list(1, 2), FUN = dt) # create single zoo object z - na.approx(cbind(z1, z2), na.rm = FALSE) # plot -- remove screen=1 if you want separate panels plot(z, screen = 1) -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Data manipulation in R
If this has already been answered, my apologies in advance I am relatively new to this aspect of [R]. it is a bit of a basic question. I have 4 columns of data (site, Date, measurement type, value) in a tab delimited text file. Site is a site where measurements were collected, Date is a date in DD/MM/ format, measurement is a code for the type of measurement made, and value just the value observed. So each site has multiple dates on which it was sampled and each date has multiple measurement types (fortunately only one value per measurement type per day). I want to know how I can separate this into multiple columns by measurement type averaged over the range of dates available. The output would have a single averaged measurement value per site. Site, Measurement 1, measurement2, measurement3, etc. I have been reading it in as a matrix as.matrix(read.table(myfile.txt, headers=TRUE)), but I don't quite know what to do with it afterward. Thanks [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] multivariate multiple regression coefficient
hi, I considered multivariate multiple regression for respons, predictor, each 5 dimension. on going process, through initb - eigen(temp)$vectors[,1:3] temp - vhalf%*%Kproduct(Ir,initb) (herein 'vhalf' is root for inverse of covariance matrix and 'Kproduct' is Kronecker product) and then process regression. However, I have tried using the 'lm' function and i get the result, like this. ( 'xi' is 25 by 1 vector ) reg - lm(xi~temp) reg Call: lm(formula = xi ~ temp) Coefficients: (Intercept)temp1temp2temp3temp4 temp5temp6temp7 -3.704e-152.434e-024.087e-02 NA -5.241e-01 -1.213e-02 NA -1.960e-02 temp8temp9 temp10 temp11 temp12 temp13 temp14 temp15 6.516e-03 NA -6.651e-028.865e-02 NA 4.272e-02 -3.085e-02 NA Coefficients for multiple of three are 'NA'. I'd like get any value, so i directly tried to calculate using matrix. That result, ginv(t(temp)%*%temp)%*%t(temp)%*%xi [,1] [1,] 0.0239396255 [2,] 0.0402622882 [3,] -0.0058317229 [4,] -0.5216032391 [5,] -0.0083355651 [6,] 0.0362961561 [7,] -0.0195581411 [8,] 0.0065824700 [9,] 0.0006407087 [10,] -0.0668295201 [11,] 0.0881623108 [12,] -0.0046706822 [13,] 0.0427419205 [14,] -0.0308143260 [15,] 0.0003160107 why don't I get coefficient value using the 'lm' function? and what's different to two way? I really need your help. I waiting your answer ! [1][Nmn9OQFrSBLEKLcn6umQCQ00] [...@from=sluvsekrcpt=r%2Dhelp%40r%2Dproject%2Eorg%2Emsgid=%3C20100924155914% 2EHM%2EM0E5rHm%40sluvsek%2Ewwl1049%2Ehanmail%2Enet%3E] References 1. mailto:sluv...@hanmail.net __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] removed data is still there!
On Fri, 24 Sep 2010, Prof Brian Ripley wrote: On Thu, 23 Sep 2010, Peter Ehlers wrote: On 2010-09-21 5:51, Nikhil Kaza wrote: example(factor) iris1$Species- factor(iris1$Species, drop=T) will get you what you need. Hmm, doesn't work for me. ?factor does not list a 'drop=' argument. I suspect iris1$Species - [iris1$Species, drop=TRUE] I don't know what happened there: iris1$Species - iris1$Species[, drop=TRUE] is what I saw before sending. was meant. See ?`[.factor` . -Peter Ehlers Nikhil Kaza Asst. Professor, City and Regional Planning University of North Carolina nikhil.l...@gmail.com On Sep 21, 2010, at 7:41 AM, pdb wrote: I'm confused, hope someone can point out what is not obvious to me. I thought I was creating a new data frame by 'deleting' rows from an existing dataframe - I've tried 2 methods. But this new data frame seems to remember values from its parent - even though there are no occurences. Where does it get the values versicolor and virginica from and give then a count of 0? What am I missing? Thanks in advance. summary(iris$Species) setosa versicolor virginica 50 50 50 nrow(iris) [1] 150 iris1- iris[iris$Species == 'setosa',] nrow(iris1) [1] 50 summary(iris1$Species) setosa versicolor virginica 50 0 0 boxplot(Petal.Width ~ Species, data = iris1, plot=1) iris2- subset(iris, Species == 'setosa') nrow(iris2) [1] 50 summary(iris2$Species) setosa versicolor virginica 50 0 0 boxplot(Petal.Width ~ Species, data = iris2, plot=1) -- View this message in context: http://r.789695.n4.nabble.com/removed-data-is-still-there-tp2548440p2548440.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] change tick spacing on the axis of a plot.
Hello How can I change the spacing of tick marks on the axis a plot? What parameters should I use on base plot or on rgl? cheers -- View this message in context: http://r.789695.n4.nabble.com/change-tick-spacing-on-the-axis-of-a-plot-tp2553149p2553149.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to make point character thicker in the legend of xyplot?
On Fri, Sep 24, 2010 at 7:39 AM, Peter Ehlers ehl...@ucalgary.ca wrote: On 2010-09-23 17:57, array chip wrote: Yes, it does what I want. Thank you Peter! Just wondering what else grid.pars controls? not just the symbol in legend, right? John You can have a look at ?gpar (after loading package 'grid'). For example, your original request could be handled with setting lex=2, but that would thicken all lines including the axes and tick marks. I'm not sure that my advice was the best; it's just what occured to me at the moment. Perhaps Deepayan will weigh in with the definitive solution. There is nothing better at the moment, because trellis.par.get(superpose.symbol) etc. do not have a lwd component. I should probably add it. -Deepayan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lattice xyplot and groups
On Wed, Sep 22, 2010 at 12:21 AM, Axel axelg...@gmail.com wrote: Hi, I'm trying to plot many (x, y) data files using the xyplot function from the lattice package. Each file can be classified by set name (s1, s2,...) and data type (A, B, ...). Each data set contains a different number of files. If the data is grouped by type or set and visualized as line plot with xyplot(type='l'), the first and last point are joined into a closed line that traverses the whole plot from left to right. This is an example showing the problem: library(lattice) x1 - seq(-10, 10, 0.5) x2 - seq(-10, 10, 0.1) df - data.frame(x=x1, y=sin(x1), id='1a_s1', type='A', set='s1') df - rbind(df, data.frame(x=x1, y=cos(x1), id='1b_s1', type='B', set='s1')) df - rbind(df, data.frame(x=x1, y=3*sin(2*x1), id='2a_s1', type='A', set='s1')) df - rbind(df, data.frame(x=x1, y=3*cos(2*x1), id='2b_s1', type='B', set='s1')) df - rbind(df, data.frame(x=x2, y=sin(x2), id='1a_s1', type='A', set='s2')) df - rbind(df, data.frame(x=x2, y=cos(x2), id='1b_s1', type='B', set='s2')) df - rbind(df, data.frame(x=x2, y=3*sin(2*x2), id='2a_s1', type='A', set='s2')) df - rbind(df, data.frame(x=x2, y=3*cos(2*x2), id='2b_s1', type='B', set='s2')) p=xyplot(y~x|set, df, type='l', group=type, auto.key = list(points = FALSE, lines = TRUE, columns = 2)) print(p) I would really appreciate if you could suggest a way to keep the lines open, either by changing the plot command or by building the data frame in a better way. One solution would be to group by id, but then I don't know if it is possible to make the line color the same for a given data type. That would be the logically correct approach. Here are a couple of ways to specify color: xyplot(y~x|set, df, type='l', groups=id, auto.key = list(text = c(A, B), lines = TRUE, points = FALSE, columns = 2), par.settings = list(superpose.line = Rows(trellis.par.get(superpose.line), 1:2))) xyplot(y~x|set, df, type='l', groups=id, auto.key = list(text = c(A, B), lines = TRUE, points = FALSE, columns = 2), par.settings = simpleTheme(col = c(blue, red))) -Deepayan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Data manipulation in R
Hi: Please provide a minimal reproducible example that resembles your real data so that people can try it out and provide potential solutions for you. Show what you tried that failed, and what you expect. A number of people on this list are very adept in data summarization, but most of them are loath to provide abstract solutions unless the problem is crystal clear. On Thu, Sep 23, 2010 at 8:41 PM, Thomas Parr thomasbp...@gmail.com wrote: If this has already been answered, my apologies in advance I am relatively new to this aspect of [R]. it is a bit of a basic question. I have 4 columns of data (site, Date, measurement type, value) in a tab delimited text file. Site is a site where measurements were collected, Date is a date in DD/MM/ format, measurement is a code for the type of measurement made, and value just the value observed. So each site has multiple dates on which it was sampled and each date has multiple measurement types (fortunately only one value per measurement type per day). I want to know how I can separate this into multiple columns by measurement type averaged over the range of dates available. The output would have a single averaged measurement value per site. This suggests you may need to reshape your data first. Site, Measurement 1, measurement2, measurement3, etc. Matrices are OK, data frames are usually better. I have been reading it in as a matrix as.matrix(read.table(myfile.txt, headers=TRUE)), but I don't quite know what to do with it afterward. There are several functions/packages that are more than capable of solving your problem, but it would be a lot easier and more productive if you provide a concrete example. HTH, Dennis Thanks [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] delete d-jackknife with R ?
Hello, I would like to know if there exists a function in a library or code development in /R/ to do delete d-jackknife ? In fact, instead of leaving out one observation at a time as with jackknife, we leave out /d/ observations. Thank you very much for your answer ! Lise Bellanger -- Lise Bellanger Université de Nantes Département de Mathématiques, Laboratoire Jean Leray UMR CNRS 6629 2, Rue de la Houssinière BP 92208 - F-44322 Nantes Cedex 03 Tél. : (33|0) 2 51 12 59 00 (ou 43) - Fax : (33|0) 2 51 12 59 12 E-Mail : lise.bellan...@univ-nantes.fr URL : http://www.math.sciences.univ-nantes.fr/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] looking for a faster way to compare two columns of a matrix
Hello Gustavo, Not sure if I've got all the details of your metric, but what about this... xx - x[ , combn(5,2)] i - seq(2, ncol(xx), 2) colSums(xx[,i-1] xx[,i] xx[,i] 0) / colSums(xx[,i] 0) Michael On 24 September 2010 02:53, Gustavo Carvalho gustavo.bi...@gmail.com wrote: Please consider this matrix: x - structure(c(5, 4, 3, 2, 1, 6, 3, 2, 1, 0, 3, 2, 1, 0, 0, 2, 1, 1, 0, 0, 2, 0, 0, 0, 0), .Dim = c(5L, 5L)) For each pair of columns, I want to calculate the proportion of entries different than 0 in column j (i j) that have lower values than the entries in the same row in column i: x[, 1:2] sum((x[,1] x[,2]) (x[,2] 0))/sum(x[,2] 0) Thus, for this pair, 3 of the 4 entries in the second column are lower than the entries in the same row in the first column. When both columns of a given pair have the same number of cells different than 0, the value of the metric is 0. x[, 3:4] colSums(x[, 3:4] 0) The same if column j has more valid ( 0) entries. I've been doing this using this idea: combinations - combn(1:ncol(x), 2) values - numeric(ncol(combinations)) for (i in 1:ncol(combinations)) { pair - combinations[,i] first - x[, pair[1]] second - x[, pair[2]] if (sum(first 0) = sum(second 0)) next values[i] - sum(first - second 0 second 0) / sum(second 0) } values Anyway, I was wondering if there is a faster/better way. I've tried putting the code from the for loop into a function and passing it to combn but, as expected, it didn't help much. Any pointers to functions that I should be looking into will be greatly appreciated. Thank you very much, Gustavo. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with any()
Hi I have following line of code: any(as.integer(c(1, 3))) == 3 [1] FALSE Shouldn't I expect it is true? Thanks, __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with any()
Hello, try this way : any(as.integer(c(1, 3))==3) cheers, Jonas Christofer Bogaso a écrit : Hi I have following line of code: any(as.integer(c(1, 3))) == 3 [1] FALSE Shouldn't I expect it is true? Thanks, __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with any()
Hi Christofer, I have just repeated this and changed the code a little and it gives the correct result any(as.integer(c(1, 3)) == 3) [1] TRUE any(as.integer(c(1, 3)) == 2) [1] FALSE any(as.integer(c(1, 3)) == 1) [1] TRUE HTH Chris On 24 Sep 2010, at 10:07, Christofer Bogaso wrote: any(as.integer(c(1, 3))) == 3 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Update RMySQL and ... it's no more running
Hello, After an upgrade, RGui was crashing each time I was trying to connect to MySQL database. The error message was again : 'RMySQL was compiled with MySQL 5.0.67 but loading MySQL 5.1.44' I found an https://stat.ethz.ch/pipermail/r-sig-db/2009q1/000572.html old discussion giving this http://www.stat.berkeley.edu/users/spector/s133/RMySQL_windows.html link . I applied this recipe and it now works well. Have a nice week-end, Ptit bleu. -- View this message in context: http://r.789695.n4.nabble.com/Update-RMySQL-and-it-s-no-more-running-tp1561401p2553239.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Solving equations involving gamma functions
Shant Ch wrote: Hi, I want to find a value of n1. I used the following code but I am getting the error - Error in as.vector(x, mode) : cannot coerce type 'closure' to vector of type 'any' n=10 a_g-(1/(n*(n-1)))*((pi/3)*(n+1)+(2*sqrt(3)*(n-2))-4*n+6) a_s-function(n1) { t1=(n1-1)/2; (t1*(gamma(t1)/gamma(n1/2))^2)/2-1-a_g } xm-solve(a_s) Have you looked in the documentation for solve? Have you done ?solve You will see that solve is for solving systems of linear equations. First you should graph your function to get some idea where the zero (if there is one) is. For example curve(a_s,from=1.1,to=10) and then start narrowing down the range curve(a_s,from=1.3,to=2) and finally use uniroot to find a zero: ?uniroot uniroot(a_s,c(1.3,2)) I hope this wasn't homework. /Berend -- View this message in context: http://r.789695.n4.nabble.com/Solving-equations-involving-gamma-functions-tp2553058p2553240.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] boundary check
Dear R, I have a covariates matrix with 10 observations, e.g. X - matrix(rnorm(50), 10, 5) X [,1][,2][,3][,4] [,5] [1,] 0.24857135 0.30880745 -1.44118657 1.10229027 1.0526010 [2,] 1.24316806 0.36275370 -0.40096866 -0.24387888 -1.5324384 [3,] -0.33504014 0.42996246 0.03902479 -0.84778875 -2.4754644 [4,] 0.06710229 1.01950917 -0.09325091 -0.03222811 0.4127816 [5,] -0.13619141 1.33143821 -0.79958805 2.08274102 0.6901768 [6,] -0.45060357 0.19348831 -1.23793647 -0.72440163 0.5057326 [7,] -1.20740516 0.20231086 1.15584485 0.8170 -1.2719855 [8,] -1.81166284 -0.07913113 -0.91080581 -0.34774436 0.9552182 [9,] 0.19131383 0.14980569 -0.37458224 -0.09371273 -1.7667203 [10,] -0.85159276 -0.66679528 1.63019340 0.56920196 -2.4049600 And I define a boundary of X: The smallest ball that nests all the observations of X. I wish to check if a particular point x_i x_i - matrix(rnorm(5), 1, 5) x_i [,1] [,2] [,3] [,4] [,5] [1,] -0.1525543 0.4606419 -0.1011011 -1.557225 -1.035694 is inside the boundary of X or not. I know it's easy to do it with 1-D or 2-D, but I don't knot how to manage it when the dimension is large. Can someone give a hint? Thanks in advance! Feng -- Feng Li Department of Statistics Stockholm University 106 91 Stockholm, Sweden http://feng.li/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] loadlibrary failure
We have got this problem,: library (RMySQL) Error : .onLoad failed in loadNamespace() for 'RMySQL', details: call: inDL(x, as.logical(local), as.logical(now), ...) error: imposible cargar la biblioteca compartida 'C:/ARCHIV~1/R/R-211~1.1/library/RMySQL/libs/RMySQL.dll': LoadLibrary failure: El acceso a la dirección de memoria no es válido. Error: package/namespace load failed for 'RMySQL' There are a solution? Thank's __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] change tick spacing on the axis of a plot.
On 09/24/2010 05:42 PM, skan wrote: Hello How can I change the spacing of tick marks on the axis a plot? What parameters should I use on base plot or on rgl? Hi skan, The usual way this is done is to suppress the axes in the plot: plot(...,axes=FALSE) and then add the axes individually: axis(1,at=1:8,labels=1:8) axis(2,at=seq(0,70,by=10),labels=seq(0,70,by=10)) box() If you try to place the labels too closely, some of them will not be displayed. You can get around that by specifying empty labels: axis(1,at=1:10,labels=rep(,10)) and then using the mtext function to add the labels to the ticks. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Odp: Data manipulation in R
Hi r-help-boun...@r-project.org napsal dne 24.09.2010 05:41:30: If this has already been answered, my apologies in advance I am relatively new to this aspect of [R]. it is a bit of a basic question. I have 4 columns of data (site, Date, measurement type, value) in a tab delimited text file. Site is a site where measurements were collected, Date is a date in DD/MM/ format, measurement is a code for the type of measurement made, and value just the value observed. So each site has multiple dates on which it was sampled and each date has multiple measurement types (fortunately only one value per measurement type per day). I want to know how I can separate this into multiple columns by measurement type averaged over the range of dates available. The output would have a single averaged measurement value per site. Try to look at reshape function or reshape package. Site, Measurement 1, measurement2, measurement3, etc. I have been reading it in as a matrix as.matrix(read.table(myfile.txt, headers=TRUE)), but I don't quite know what to do with it afterward. Do not do that. read.table reads your data as data frame (correct) but you transform them to matrix and it wil result in ***character*** matrix as matrices can have values of only one type. See Intro manual which shall come with your installation of R (chapters 5 and 6). Well and if you are in reading you'd better go through thole document. With 100 pages it is not so big and you want to read only basics it is in first 30 pages. Regards Petr Thanks [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Inaccuracy of kummerU (fAsianOptions) (Tricomi function)
Hello, I need to use the confluent function of second kind, also known as Tricomi function. It is implemented as kummerU() function in fAsianOptions package, but I've found very inaccurate values, comparing with those provided by Mathematica. I think Mathematica values are OK because kummerU values leads to negative probabilities. For example, if you try the kummerU() function example, you obtain: x = c(0.001, 0.01, 0.1, 1, 10, 100, 1000) a = 1/3; b = 2/3 U = Re ( kummerU(x, a = a, b = b) ) cbind(x, U) x U Mathematica [1,] 1e-03 1.827638e+00 1.82764 [2,] 1e-02 1.659957e+00 1.65996 [3,] 1e-01 1.341021e+00 1.34102 [4,] 1e+00 8.745968e-01 0.874597 [5,] 1e+01 4.548890e-01 0.454805 [6,] 1e+02 1.731286e+35 0.214970 [7,] 1e+03 -1.318568e+76 0.0999778 I've added a third column in the table with Mathematica values, where you can see the great differences in the last values. I have a lot of examples of these differences, with non very strange parameters values. I've also tried to define the Tricomi function in relation to the usual confluent function given by genhypergeo(), but errors are even greater. Does anybody know another function or another way to define an accurate version of Tricomi function? -- Dr. Antonio José Sáez Castillo Dpto. de Estadística e Investigación Operativa Escuela Politécnica Superior de Linares Universidad de Jaén C/ Alfonso X El Sabio 28, 23700 Linares (Jaén) ESPAÑA Tlf. y FAX +34 953 648578 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] boundary check
Hello convex hulls in large numbers of dimensions are hard. For your problem, though, one can tell whether a given point is inside or outside by using linear programming: X - matrix(rnorm(50), 10, 5) x_i - matrix(rnorm(5), 1, 5) isin.chull function(candidate,p,plot=FALSE,give.answers=FALSE, ...){ if(plot){ plot(p,...) p(candidate[1],candidate[2], pch=16) } n - nrow(p) # number of points d - ncol(p) # number of dimensions p - t(sweep(p,2,candidate)) jj - simplex(a=rep(1,n),A3=rbind(p,1),b3=c(0*candidate,1)) if(give.answers){ return(jj) } else { return((jj$solved = 0) all(jj$soln1)) } } isin.chull(x_i,X) [1] FALSE (we can discuss offline; I'll summarize) HTH rksh On 24/09/10 10:44, Feng Li wrote: Dear R, I have a covariates matrix with 10 observations, e.g. X - matrix(rnorm(50), 10, 5) X [,1][,2][,3][,4] [,5] [1,] 0.24857135 0.30880745 -1.44118657 1.10229027 1.0526010 [2,] 1.24316806 0.36275370 -0.40096866 -0.24387888 -1.5324384 [3,] -0.33504014 0.42996246 0.03902479 -0.84778875 -2.4754644 [4,] 0.06710229 1.01950917 -0.09325091 -0.03222811 0.4127816 [5,] -0.13619141 1.33143821 -0.79958805 2.08274102 0.6901768 [6,] -0.45060357 0.19348831 -1.23793647 -0.72440163 0.5057326 [7,] -1.20740516 0.20231086 1.15584485 0.8170 -1.2719855 [8,] -1.81166284 -0.07913113 -0.91080581 -0.34774436 0.9552182 [9,] 0.19131383 0.14980569 -0.37458224 -0.09371273 -1.7667203 [10,] -0.85159276 -0.66679528 1.63019340 0.56920196 -2.4049600 And I define a boundary of X: The smallest ball that nests all the observations of X. I wish to check if a particular point x_i x_i - matrix(rnorm(5), 1, 5) x_i [,1] [,2] [,3] [,4] [,5] [1,] -0.1525543 0.4606419 -0.1011011 -1.557225 -1.035694 is inside the boundary of X or not. I know it's easy to do it with 1-D or 2-D, but I don't knot how to manage it when the dimension is large. Can someone give a hint? Thanks in advance! Feng -- Robin K. S. Hankin Uncertainty Analyst University of Cambridge 19 Silver Street Cambridge CB3 9EP 01223-764877 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] boundary check
Hello, If an N-dimensional convex hull fits your idea of a smallest ball then you could try the convhulln function in the geometry package. For testing if a new point is inside a previously derived hull, one brute force approach is to rbind the new point to your data, generate a new hull and see if it is the same as the previous one. I've only used convhulln in low dimensions so I don't know how efficient it is when N is large. Hope this helps. Michael On 24 September 2010 19:44, Feng Li feng...@stat.su.se wrote: Dear R, I have a covariates matrix with 10 observations, e.g. X - matrix(rnorm(50), 10, 5) X [,1] [,2] [,3] [,4] [,5] [1,] 0.24857135 0.30880745 -1.44118657 1.10229027 1.0526010 [2,] 1.24316806 0.36275370 -0.40096866 -0.24387888 -1.5324384 [3,] -0.33504014 0.42996246 0.03902479 -0.84778875 -2.4754644 [4,] 0.06710229 1.01950917 -0.09325091 -0.03222811 0.4127816 [5,] -0.13619141 1.33143821 -0.79958805 2.08274102 0.6901768 [6,] -0.45060357 0.19348831 -1.23793647 -0.72440163 0.5057326 [7,] -1.20740516 0.20231086 1.15584485 0.8170 -1.2719855 [8,] -1.81166284 -0.07913113 -0.91080581 -0.34774436 0.9552182 [9,] 0.19131383 0.14980569 -0.37458224 -0.09371273 -1.7667203 [10,] -0.85159276 -0.66679528 1.63019340 0.56920196 -2.4049600 And I define a boundary of X: The smallest ball that nests all the observations of X. I wish to check if a particular point x_i x_i - matrix(rnorm(5), 1, 5) x_i [,1] [,2] [,3] [,4] [,5] [1,] -0.1525543 0.4606419 -0.1011011 -1.557225 -1.035694 is inside the boundary of X or not. I know it's easy to do it with 1-D or 2-D, but I don't knot how to manage it when the dimension is large. Can someone give a hint? Thanks in advance! Feng -- Feng Li Department of Statistics Stockholm University 106 91 Stockholm, Sweden http://feng.li/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] randomForest - PartialPlot - reg
In a partial dependence plot, only the relative scale, not absolute scale, of the y-axis is meaningful. I.e., you can compare the range of the curves between partial dependence plots of two different variables, but not the actual numbers on the axis. The range is compressed compared to the original data because of averaging. For classification, the function is computed in the logit scale, so it's not necessarily positive. High does mean higher probability for the target class. Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Vijayan Padmanabhan Sent: Wednesday, September 22, 2010 11:47 PM To: r-help Subject: [R] randomForest - PartialPlot - reg Dear R Group I am not sure if this is the right forum to raise this query, but i would rather give it a try and aim for reaching the right person who might be a part of this group who can help. I have a query on interpretation of PartialPlot in package randomForest. In my earlier queries in this regard, I probably did not give sufficient explanation to elicit the intended details in the explanations being provided.. Hence I am resending the query with examples and bit more details. In a scenario where a set of continuous variables vs a class response is being modeled by RF, say the iris example.. using the following code, how do I interpret the partial plot that is generated? library(randomForest) data(iris) set.seed(543) iris.rf - randomForest(Species~., iris) partialPlot(iris.rf, iris, Sepal.Length, setosa) How is the y-axis values to be understood? A straight forward Textual interpretation of the output from the experts in this area, would help me understand this concept of marginal effect being plotted for the variable Sepal.Length on the which.class=setosa. Thanks for your help. Regards Vijayan Padmanabhan What is expressed without proof can be denied without proof - Euclide. Can you avoid printing this? Think of the environment before printing the email. -- - Please visit us at www.itcportal.com ** This Communication is for the exclusive use of the intended recipient (s) and shall not attach any liability on the originator or ITC Ltd./its Subsidiaries/its Group Companies. If you are the addressee, the contents of this email are intended for your use only and it shall not be forwarded to any third party, without first obtaining written authorisation from the originator or ITC Ltd./its Subsidiaries/its Group Companies. It may contain information which is confidential and legally privileged and the same shall not be used or dealt with by any third party in any manner whatsoever without the specific consent of ITC Ltd./its Subsidiaries/its Group Companies. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Notice: This e-mail message, together with any attachme...{{dropped:11}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] cv.binary
Dear List, I am looking to perform a cross validation on my glm using the cv.binary function in DAAG. However i am getting the following error - cv.binary(modelhe) Error in sample(nfolds, m, replace = TRUE) : invalid 'size' argument Any help would be greatly appreciated. Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] loadlibrary failure
Hello Eduardo, Maybe have a look at this thread : http://r.789695.n4.nabble.com/Update-RMySQL-and-it-s-no-more-running-td1561401.html#a1561401 Update RMySQL and ... it's no more running The last message gives a link to a procedure to configure RMySQL. It worked for me. Have a nice week-end, Ptit Bleu. -- View this message in context: http://r.789695.n4.nabble.com/Re-loadlibrary-failure-tp2553275p2553321.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Great Canadian Sportsman's September Newsletter
[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] glm binomial loglog (NOT cloglog) link
I was looking at different link functions for binomial glms recently for the same reason as you (more zeros than ones). I did a bit of reading up on the various link functions and IIRC, you can use the cloglog link on your data, just turn your 0's into 1's and vice versa. This was stated in the one or two references I looked at as to why only one of cloglog and loglog links is often provided in software __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] randomForest - PartialPlot - reg
Thanks for the elaborate detailing. I see sense now. Regards Vijayan Padmanabhan What is expressed without proof can be denied without proof - Euclide. Liaw, Andy andy_l...@merck.com 09/24/2010 04:31 PM To Vijayan Padmanabhan v.padmanab...@itc.in, r-help r-help@r-project.org cc Subject RE: [R] randomForest - PartialPlot - reg In a partial dependence plot, only the relative scale, not absolute scale, of the y-axis is meaningful. I.e., you can compare the range of the curves between partial dependence plots of two different variables, but not the actual numbers on the axis. The range is compressed compared to the original data because of averaging. For classification, the function is computed in the logit scale, so it's not necessarily positive. High does mean higher probability for the target class. Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Vijayan Padmanabhan Sent: Wednesday, September 22, 2010 11:47 PM To: r-help Subject: [R] randomForest - PartialPlot - reg Dear R Group I am not sure if this is the right forum to raise this query, but i would rather give it a try and aim for reaching the right person who might be a part of this group who can help. I have a query on interpretation of PartialPlot in package randomForest. In my earlier queries in this regard, I probably did not give sufficient explanation to elicit the intended details in the explanations being provided.. Hence I am resending the query with examples and bit more details. In a scenario where a set of continuous variables vs a class response is being modeled by RF, say the iris example.. using the following code, how do I interpret the partial plot that is generated? library(randomForest) data(iris) set.seed(543) iris.rf - randomForest(Species~., iris) partialPlot(iris.rf, iris, Sepal.Length, setosa) How is the y-axis values to be understood? A straight forward Textual interpretation of the output from the experts in this area, would help me understand this concept of marginal effect being plotted for the variable Sepal.Length on the which.class=setosa. Thanks for your help. Regards Vijayan Padmanabhan What is expressed without proof can be denied without proof - Euclide. Can you avoid printing this? Think of the environment before printing the email. -- - Please visit us at www.itcportal.com ** This Communication is for the exclusive use of the intended recipient (s) and shall not attach any liability on the originator or ITC Ltd./its Subsidiaries/its Group Companies. If you are the addressee, the contents of this email are intended for your use only and it shall not be forwarded to any third party, without first obtaining written authorisation from the originator or ITC Ltd./its Subsidiaries/its Group Companies. It may contain information which is confidential and legally privileged and the same shall not be used or dealt with by any third party in any manner whatsoever without the specific consent of ITC Ltd./its Subsidiaries/its Group Companies. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Notice: This e-mail message, together with any attachme...{{dropped:31}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] boundary check
Hi, I remember a discussion we had on this list a few months ago for a better way to decide if a point is inside a convex hull. It eventually lead to a R function in this post, http://tolstoy.newcastle.edu.au/R/e8/help/09/12/8784.html I don't know if it was included in the geometry package in the end. HTH, baptiste On 24 September 2010 12:44, Michael Bedward michael.bedw...@gmail.com wrote: Hello, If an N-dimensional convex hull fits your idea of a smallest ball then you could try the convhulln function in the geometry package. For testing if a new point is inside a previously derived hull, one brute force approach is to rbind the new point to your data, generate a new hull and see if it is the same as the previous one. I've only used convhulln in low dimensions so I don't know how efficient it is when N is large. Hope this helps. Michael On 24 September 2010 19:44, Feng Li feng...@stat.su.se wrote: Dear R, I have a covariates matrix with 10 observations, e.g. X - matrix(rnorm(50), 10, 5) X [,1] [,2] [,3] [,4] [,5] [1,] 0.24857135 0.30880745 -1.44118657 1.10229027 1.0526010 [2,] 1.24316806 0.36275370 -0.40096866 -0.24387888 -1.5324384 [3,] -0.33504014 0.42996246 0.03902479 -0.84778875 -2.4754644 [4,] 0.06710229 1.01950917 -0.09325091 -0.03222811 0.4127816 [5,] -0.13619141 1.33143821 -0.79958805 2.08274102 0.6901768 [6,] -0.45060357 0.19348831 -1.23793647 -0.72440163 0.5057326 [7,] -1.20740516 0.20231086 1.15584485 0.8170 -1.2719855 [8,] -1.81166284 -0.07913113 -0.91080581 -0.34774436 0.9552182 [9,] 0.19131383 0.14980569 -0.37458224 -0.09371273 -1.7667203 [10,] -0.85159276 -0.66679528 1.63019340 0.56920196 -2.4049600 And I define a boundary of X: The smallest ball that nests all the observations of X. I wish to check if a particular point x_i x_i - matrix(rnorm(5), 1, 5) x_i [,1] [,2] [,3] [,4] [,5] [1,] -0.1525543 0.4606419 -0.1011011 -1.557225 -1.035694 is inside the boundary of X or not. I know it's easy to do it with 1-D or 2-D, but I don't knot how to manage it when the dimension is large. Can someone give a hint? Thanks in advance! Feng -- Feng Li Department of Statistics Stockholm University 106 91 Stockholm, Sweden http://feng.li/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lattice xyplot and groups
That would be the logically correct approach. Here are a couple of ways to specify color: That's perfect! Thank you very much. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] boundary check
Thanks. I agree with you that the speed and memory issues might be (actually is) a big problem for big dimensions. It is interesting to know to solve this by using linear programming. Buy the way, it seems a potential bug in your function if you try this X - matrix(rnorm(50), 10, 5) x_i-X[1,,drop=FALSE] isin.chull(x_i,X) Error in A.out[, basic] - iden(M) : subscript out of bounds Or I must be wrong somewhere. Feng On Sep 24, 12:39 pm, Robin Hankin rk...@cam.ac.uk wrote: Hello convex hulls in large numbers of dimensions are hard. For your problem, though, one can tell whether a given point is inside or outside by using linear programming: X - matrix(rnorm(50), 10, 5) x_i - matrix(rnorm(5), 1, 5) isin.chull function(candidate,p,plot=FALSE,give.answers=FALSE, ...){ if(plot){ plot(p,...) p(candidate[1],candidate[2], pch=16) } n - nrow(p) # number of points d - ncol(p) # number of dimensions p - t(sweep(p,2,candidate)) jj - simplex(a=rep(1,n),A3=rbind(p,1),b3=c(0*candidate,1)) if(give.answers){ return(jj) } else { return((jj$solved = 0) all(jj$soln1)) } } isin.chull(x_i,X) [1] FALSE (we can discuss offline; I'll summarize) HTH rksh On 24/09/10 10:44, Feng Li wrote: Dear R, I have a covariates matrix with 10 observations, e.g. X - matrix(rnorm(50), 10, 5) X [,1] [,2] [,3] [,4] [,5] [1,] 0.24857135 0.30880745 -1.44118657 1.10229027 1.0526010 [2,] 1.24316806 0.36275370 -0.40096866 -0.24387888 -1.5324384 [3,] -0.33504014 0.42996246 0.03902479 -0.84778875 -2.4754644 [4,] 0.06710229 1.01950917 -0.09325091 -0.03222811 0.4127816 [5,] -0.13619141 1.33143821 -0.79958805 2.08274102 0.6901768 [6,] -0.45060357 0.19348831 -1.23793647 -0.72440163 0.5057326 [7,] -1.20740516 0.20231086 1.15584485 0.8170 -1.2719855 [8,] -1.81166284 -0.07913113 -0.91080581 -0.34774436 0.9552182 [9,] 0.19131383 0.14980569 -0.37458224 -0.09371273 -1.7667203 [10,] -0.85159276 -0.66679528 1.63019340 0.56920196 -2.4049600 And I define a boundary of X: The smallest ball that nests all the observations of X. I wish to check if a particular point x_i x_i - matrix(rnorm(5), 1, 5) x_i [,1] [,2] [,3] [,4] [,5] [1,] -0.1525543 0.4606419 -0.1011011 -1.557225 -1.035694 is inside the boundary of X or not. I know it's easy to do it with 1-D or 2-D, but I don't knot how to manage it when the dimension is large. Can someone give a hint? Thanks in advance! Feng -- Robin K. S. Hankin Uncertainty Analyst University of Cambridge 19 Silver Street Cambridge CB3 9EP 01223-764877 __ r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] boundary check
Hi, Below Baptiste's message I attach the R code and the .Rd documentation I treated as 'final', it may be slightly different from that in the Dec 2009 post. I did submit if for inclusion in the geometry package, but last time I checked it wasn't there. I have found (and others have reported via private e-mail) that high dimensional data sets cause crashes (R exits with no warnings or messages). I tentatively believe this is a 'bug' in convhulln, but tracking it takes me to a complicated package and compiled code. It isn't a problem for me, so I can't make time to follow it up. hth. Keith J baptiste auguie baptiste.aug...@googlemail.com wrote in message news:aanlktikf3+higwyhwyxeu2brwqs0x9mtywz9xzyq_...@mail.gmail.com... Hi, I remember a discussion we had on this list a few months ago for a better way to decide if a point is inside a convex hull. It eventually lead to a R function in this post, http://tolstoy.newcastle.edu.au/R/e8/help/09/12/8784.html I don't know if it was included in the geometry package in the end. HTH, baptiste - inhull - function(testpts, calpts, hull=convhulln(calpts), tol=mean(mean(abs(calpts)))*sqrt(.Machine$double.eps)) { # # R implementation of the Matlab code by John D'Errico 04 Mar 2006 (Updated 30 Oct 2006) # downloaded from http://www.mathworks.com/matlabcentral/fileexchange/10226-inhull # with some modifications and simplifications # # Efficient test for points inside a convex hull in n dimensions # #% arguments: (input) #% testpts - nxp array to test, n data points, in p dimensions #% If you have many points to test, it is most efficient to #% call this function once with the entire set. #% #% calpts - mxp array of vertices of the convex hull, as used by #% convhulln. #% #% hull - (OPTIONAL) tessellation (or triangulation) generated by convhulln #% If hull is left empty or not supplied, then it will be #% generated. #% #% tol - (OPTIONAL) tolerance on the tests for inclusion in the #% convex hull. You can think of tol as the distance a point #% may possibly lie outside the hull, and still be perceived #% as on the surface of the hull. Because of numerical slop #% nothing can ever be done exactly here. I might guess a #% semi-intelligent value of tol to be #% #% tol = 1.e-13*mean(abs(calpts(:))) #% #% In higher dimensions, the numerical issues of floating #% point arithmetic will probably suggest a larger value #% of tol. #% # In this R implementation default tol=mean(mean(abs(calpts)))*sqrt(.Machine$double.eps) # # VALUE: Matlab returns a vector of TRUE (inside/on) or FALSE (outside) # This R implementation returns an integer vector of length n # 1 = inside hull # -1 = inside hull # 0 = on hull (to precision indicated by tol) # # ensure arguments are matrices (not data frames) and get sizes calpts - as.matrix(calpts) testpts - as.matrix(testpts) p - ncol(calpts) # columns in calpts cx - nrow(testpts) # rows in testpts nt - nrow(hull)# number of simplexes in hull # find normal vectors to each simplex nrmls - matrix(NA, nt, p) # predefine each nrml as NA, degenerate degenflag - matrix(TRUE, nt, 1) for (i in 1:nt) { nullsp - t(Null(t(calpts[hull[i,-1],] - matrix(calpts[hull[i,1],],p-1,p, byrow=TRUE if (dim(nullsp)[1] == 1) { nrmls[i,] - nullsp degenflag[i] - FALSE}} # Warn of degenerate faces, and remove corresponding normals if(sum(degenflag) 0) warning(sum(degenflag), degenerate faces in convex hull) nrmls - nrmls[!degenflag,] nt - nrow(nrmls) # find center point in hull, and any (1st) point in the plane of each simplex center = colMeans(calpts) a - calpts[hull[!degenflag,1],] # scale normal vectors to unit length and ensure pointing inwards nrmls - nrmls/matrix(apply(nrmls, 1, function(x) sqrt(sum(x^2))), nt, p) dp - sign(apply((matrix(center, nt, p, byrow=TRUE)-a) * nrmls, 1, sum)) nrmls - nrmls*matrix(dp, nt, p) # if min across all faces of dot((x - a),nrml) is # +ve then x is inside hull # 0 then x is on hull # -ve then x is outside hull # Instead of dot((x - a),nrml) use dot(x,nrml) - dot(a, nrml) aN - diag(a %*% t(nrmls)) val - apply(testpts %*% t(nrmls) - matrix(aN, cx, nt, byrow=TRUE), 1,min) # code values inside 'tol' to zero, return sign as integer val[abs(val) tol] - 0 as.integer(sign(val)) } \name{inhull} \alias{inhull} %- Also NEED an '\alias' for EACH other topic documented here. \title{ Test if one set of N-D points is inside the convex hull defined by another set } \description{ R implementation of the Matlab code by John D'Errico 04 Mar 2006 (Updated 30 Oct 2006) downloaded from
[R] Error: unable to load shared library tcltk.so
Dear R users, when trying library(tcltk) i become following error: Loading Tcl/Tk interface ... Error : .onLoad failed in loadNamespace() for 'tcltk', details: call: dyn.load(file, DLLpath = DLLpath, ...) error: unable to load shared library '/usr/local/R/R-2.11.1/lib/R/library/tcltk/libs/tcltk.so': libtcl8.4.so.0: cannot open shared object file: No such file or directory Error: package/namespace load failed for 'tcltk' Will be very greatful for you help. sessionInfo() R version 2.11.1 (2010-05-31) i686-pc-linux-gnu Dr. Alla Bulashevska Freiburger Center for Data Modeling Germany __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] boundary check
Feng thanks for this. The problem you report is reproducible; it originates from simplex() of the boot packge. It is ultimately due to the fact that x_i is precisely *on* the convex hull, which is evidently causing problems. I'll investigate it. In the short term, you can break the degeneracy: isin.chull(X[1,]+1e-10,X) [1] FALSE but I don't know if that is acceptable to you. best rksh On 24/09/10 13:17, Feng Li wrote: Thanks. I agree with you that the speed and memory issues might be (actually is) a big problem for big dimensions. It is interesting to know to solve this by using linear programming. Buy the way, it seems a potential bug in your function if you try this X - matrix(rnorm(50), 10, 5) x_i-X[1,,drop=FALSE] isin.chull(x_i,X) Error in A.out[, basic] - iden(M) : subscript out of bounds Or I must be wrong somewhere. Feng On Sep 24, 12:39 pm, Robin Hankin rk...@cam.ac.uk wrote: Hello convex hulls in large numbers of dimensions are hard. For your problem, though, one can tell whether a given point is inside or outside by using linear programming: X - matrix(rnorm(50), 10, 5) x_i - matrix(rnorm(5), 1, 5) isin.chull function(candidate,p,plot=FALSE,give.answers=FALSE, ...){ if(plot){ plot(p,...) p(candidate[1],candidate[2], pch=16) } n - nrow(p) # number of points d - ncol(p) # number of dimensions p - t(sweep(p,2,candidate)) jj - simplex(a=rep(1,n),A3=rbind(p,1),b3=c(0*candidate,1)) if(give.answers){ return(jj) } else { return((jj$solved = 0) all(jj$soln1)) } } isin.chull(x_i,X) [1] FALSE (we can discuss offline; I'll summarize) HTH rksh On 24/09/10 10:44, Feng Li wrote: Dear R, I have a covariates matrix with 10 observations, e.g. X - matrix(rnorm(50), 10, 5) X [,1][,2][,3][,4] [,5] [1,] 0.24857135 0.30880745 -1.44118657 1.10229027 1.0526010 [2,] 1.24316806 0.36275370 -0.40096866 -0.24387888 -1.5324384 [3,] -0.33504014 0.42996246 0.03902479 -0.84778875 -2.4754644 [4,] 0.06710229 1.01950917 -0.09325091 -0.03222811 0.4127816 [5,] -0.13619141 1.33143821 -0.79958805 2.08274102 0.6901768 [6,] -0.45060357 0.19348831 -1.23793647 -0.72440163 0.5057326 [7,] -1.20740516 0.20231086 1.15584485 0.8170 -1.2719855 [8,] -1.81166284 -0.07913113 -0.91080581 -0.34774436 0.9552182 [9,] 0.19131383 0.14980569 -0.37458224 -0.09371273 -1.7667203 [10,] -0.85159276 -0.66679528 1.63019340 0.56920196 -2.4049600 And I define a boundary of X: The smallest ball that nests all the observations of X. I wish to check if a particular point x_i x_i - matrix(rnorm(5), 1, 5) x_i [,1] [,2] [,3] [,4] [,5] [1,] -0.1525543 0.4606419 -0.1011011 -1.557225 -1.035694 is inside the boundary of X or not. I know it's easy to do it with 1-D or 2-D, but I don't knot how to manage it when the dimension is large. Can someone give a hint? Thanks in advance! Feng -- Robin K. S. Hankin Uncertainty Analyst University of Cambridge 19 Silver Street Cambridge CB3 9EP 01223-764877 __ r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Robin K. S. Hankin Uncertainty Analyst University of Cambridge 19 Silver Street Cambridge CB3 9EP 01223-764877 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Standard Error for difference in predicted probabilities
Is there a way to estimate the standard error for the difference in predicted probabilities obtained from a logistic regression model? For example, this code gives the difference for the predicted probability of when x2==1 vs. when x2==0, holding x1 constant at its mean: y=rbinom(100,1,.4) x1=rnorm(100, 3, 2) x2=rbinom(100, 1, .7) mod=glm(y ~ x1 + x2, family=binomial) pred=predict(mod, newdata=data.frame(cbind(x1=rep(mean(x1), 100), x2)), type=response) diff=unique(pred)[1]-unique(pred)[2] diff I know that predict() will output SE's for each predicted value, but how do I generate a SE for the difference in those predicted values? Thanks in advance! Andrew Miles [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cv.binary
Hi Chris, On Fri, Sep 24, 2010 at 7:04 AM, Chris Mcowen chrismco...@gmail.com wrote: Dear List, I am looking to perform a cross validation on my glm using the cv.binary function in DAAG. However i am getting the following error - cv.binary(modelhe) Error in sample(nfolds, m, replace = TRUE) : invalid 'size' argument What is stored in m? R sample(1:10, 'a', replace=TRUE) Error in sample(1:10, a, replace = TRUE) : invalid 'size' argument -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] relaimpo
Hello, Does anyone know if relaimpo only applies to pure multiple linear regression models, i.e. - linear in the variables AND linear in the coefficients or is it safe to use it in models that are: - non-linear in the variables BUT linear in the coefficients? Thanks Jyotin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] 95% CI of Kaplan-Meier survival in package survival
Hi: My result using Kaplan-Meier estimate in survival package was inconsistent with that from Minitab. The survival probabilities are same, but their 95% CI are different from other calculation. Could you help me find what is wrong with the code? Thanks in advance. library(survival) raw.a - c(80,24,22,12,86,-74,48,41,39,38, 27,24,22,19,18,16,11,10,6.3,76, -21,17,12,11,10,9.8,9.5,8.2,6.4, 4.7,4.3,4.3,4.2,4.1,1.6,31) data.a - data.frame(time = abs(raw.a), event = raw.a = 0) fit.a - survfit(Surv(time, event) ~ 1, data=data.a) summary(fit.a) Call: survfit(formula = Surv(time, event) ~ 1, data = data.a) time n.risk n.event survival std.err lower 95% CI upper 95% CI 1.6 36 1 0.9722 0.0274 0.920001.000 4.1 35 1 0.9444 0.0382 0.872511.000 4.2 34 1 0.9167 0.0461 0.830691.000 4.3 33 2 0.8611 0.0576 0.755240.982 4.7 31 1 0.8333 0.0621 0.720070.964 6.3 30 1 0.8056 0.0660 0.686110.946 6.4 29 1 0.7778 0.0693 0.653170.926 8.2 28 1 0.7500 0.0722 0.621090.906 9.5 27 1 0.7222 0.0747 0.589780.884 9.8 26 1 0.6944 0.0768 0.559160.862 10.0 25 2 0.6389 0.0801 0.499770.817 11.0 23 2 0.5833 0.0822 0.442610.769 12.0 21 2 0.5278 0.0832 0.387490.719 16.0 19 1 0.5000 0.0833 0.360660.693 17.0 18 1 0.4722 0.0832 0.334320.667 18.0 17 1 0. 0.0828 0.308460.640 19.0 16 1 0.4167 0.0822 0.283090.613 22.0 14 2 0.3571 0.0805 0.229620.555 24.0 12 2 0.2976 0.0773 0.178890.495 27.0 10 1 0.2679 0.0751 0.154630.464 31.0 9 1 0.2381 0.0724 0.131200.432 38.0 8 1 0.2083 0.0692 0.108650.399 39.0 7 1 0.1786 0.0654 0.087110.366 41.0 6 1 0.1488 0.0609 0.066730.332 48.0 5 1 0.1190 0.0555 0.047730.297 76.0 3 1 0.0794 0.0492 0.023550.267 80.0 2 1 0.0397 0.0373 0.006280.251 86.0 1 1 0. NaN NA NA - A R learner. -- View this message in context: http://r.789695.n4.nabble.com/95-CI-of-Kaplan-Meier-survival-in-package-survival-tp2553696p2553696.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] color of lines while printing through for loop
I am trying to find a convenient way to control line colors when printing from a for loop using the lines command. Right now I have solved this by creating a colors vector that is refered to in the loop with index. However, the colors choosen here are just 1,2,3,4,5... I would like to get colors from the col = rainbow(x) that you can use in plot() and set the to be my number of lines (I think I know how to set it to my number of lines, but I don't know how to implement this into my loop. Is this possible? This is my script: __ ###creating two emty vectors and opening a graphics window x - vector() y - vector() plot(x,y, xlim = c(pheno.dt$year[1],pheno.dt$year[nrow(pheno.dt)]), ylim = c(min(pheno.dt$julian, na.rm = TRUE),max(pheno.dt$julian, na.rm = TRUE)), xlab = Year, ylab = Julian Day) ###setting up colors sp_pheno.unique - unique(pheno.dt$sp_pheno) colors - seq(1,200,1) printloop for (i in 1:length(unique(pheno.dt$year))) { data.sub - subset(pheno.dt, pheno.dt$sp_pheno==sp_pheno.unique[i]) x - seq(pheno.dt$year[1],pheno.dt$year[nrow(pheno.dt)]) y - data.sub$julian lines(x,y, col = colors[i]) } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Standard Error for difference in predicted probabilities
I think you can use the bootstrap to obtain the std error. My attempt for your problem and data is below. I would be interested if anyone can point out a problem with this approach. Darin y=rbinom(100,1,.4) x1=rnorm(100, 3, 2) x2=rbinom(100, 1, .7) diff - vector(mode=numeric, length=200) for (i in 1:200) { idx - sample(1:length(y), length(y), replace=T) mod=glm(y[idx] ~ x1[idx] + x2[idx], family=binomial) pred=predict(mod, newdata=data.frame(cbind(x1=rep(mean(x1[idx]), 100), x2=x2[idx])), type=response) diff[i]=unique(pred)[1]-unique(pred)[2] } sd(diff) On Fri, Sep 24, 2010 at 09:17:29AM -0400, Andrew Miles wrote: Is there a way to estimate the standard error for the difference in predicted probabilities obtained from a logistic regression model? For example, this code gives the difference for the predicted probability of when x2==1 vs. when x2==0, holding x1 constant at its mean: y=rbinom(100,1,.4) x1=rnorm(100, 3, 2) x2=rbinom(100, 1, .7) mod=glm(y ~ x1 + x2, family=binomial) pred=predict(mod, newdata=data.frame(cbind(x1=rep(mean(x1), 100), x2)), type=response) diff=unique(pred)[1]-unique(pred)[2] diff I know that predict() will output SE's for each predicted value, but how do I generate a SE for the difference in those predicted values? Thanks in advance! Andrew Miles [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Some questions about string processing
Hi all A couple of questions about string processing from someone who has only scratched the surface so far. 1) I am wanting to send some strings into a function to allow flexibility inside. My first idea has been e.g. auto_io - function( var_string, factors ) { # e.g. var_string sent as test_file.txt factors sent as x1 + x2 + x3 # input data_name - get( var_string ) # term_list - get( factors ) resp_data - read.table( data_name, header = TRUE ) # fit temp_model - lm( y ~ factors, data = resp_data ) etc... Neither the read.table() nor the lm() are working because in each case the string is not converted or understood properly. I'm sure this is possible (and in fact probably easy) but haven't yet seen the light. Could someone illuminate me here? 2) I will be wanting to process strings as lists of strings. For instance, I might instead want to send factors above as x1 x2 x3 and add the + when I need them, or perhaps * instead, and also look at sub-models by removing parts of the string etc. What functions should I be looking at here and are there any examples available? Thanks in advance. Feel free to CC me on your reply. Michael Hopkins Algorithm and Statistical Modelling Expert Upstream 23 Old Bond Street London W1S 4PZ Mob +44 0782 578 7220 DL +44 0207 290 1326 Fax +44 0207 290 1321 hopk...@upstreamsystems.com www.upstreamsystems.com IMPORTANT NOTICE The information in this e-mail and any attached files is...{{dropped:22}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Some questions about string processing
Michael - You're doing too much work half the time, and not enough the other half :-). Try this (untested): auto_io = function(data_name,factors){ resp_data = read.table(data_name, header = TRUE ) temp_model = lm(formula(paste('y',factors,sep='~')) data = resp_data ) . . . - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Fri, 24 Sep 2010, Michael Hopkins wrote: Hi all A couple of questions about string processing from someone who has only scratched the surface so far. 1) I am wanting to send some strings into a function to allow flexibility inside. My first idea has been e.g. auto_io - function( var_string, factors ) { # e.g. var_string sent as test_file.txt factors sent as x1 + x2 + x3 # input data_name - get( var_string ) # term_list - get( factors ) resp_data - read.table( data_name, header = TRUE ) # fit temp_model - lm( y ~ factors, data = resp_data ) etc... Neither the read.table() nor the lm() are working because in each case the string is not converted or understood properly. I'm sure this is possible (and in fact probably easy) but haven't yet seen the light. Could someone illuminate me here? 2) I will be wanting to process strings as lists of strings. For instance, I might instead want to send factors above as x1 x2 x3 and add the + when I need them, or perhaps * instead, and also look at sub-models by removing parts of the string etc. What functions should I be looking at here and are there any examples available? Thanks in advance. Feel free to CC me on your reply. Michael Hopkins Algorithm and Statistical Modelling Expert Upstream 23 Old Bond Street London W1S 4PZ Mob +44 0782 578 7220 DL +44 0207 290 1326 Fax +44 0207 290 1321 hopk...@upstreamsystems.com www.upstreamsystems.com IMPORTANT NOTICE The information in this e-mail and any attached files is...{{dropped:22}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] bptest
you have to load package {lmtest} first. That is, library(lmtest) bptest(modelCH, data=KP) -- View this message in context: http://r.789695.n4.nabble.com/bptest-tp2553506p2579815.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Splitting a time + duration into a series of periods
Hi all, Here's what I have. I have a list of log-in times for users and how long their sessions were. user, login_time, session_min alice, 2010/01/01 10:00, 145 bob, 2010/01/01 11:00, 30 What I want to do is create a bar chart showing, in 1/2 hour segments, the number of users logged in at the same time. For example: 10:00 1 10:30 1 11:00 2 11:30 1 The only way I can do this now is to send the data through a Perl script to generate raw data like: alice, 2010/01/01, 10:00 alice, 2010/01/01, 10:30 alice, 2010/01/01, 11:00 ... bob, 2010/01/01, 11:00 I've looked through the man pages for a couple hours now, and I can't find a better of way of doing this directly in R. Any help or pointers? Thanks in advance. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with any()
Is as.integer() redundant for this vector of integers? any(c(1, 3) == 3.0 ) -- View this message in context: http://r.789695.n4.nabble.com/Problem-with-any-tp2553226p2580659.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] performing script on multiple files
say that I have a script that uses read.csv to read a textfile in a certain directory, then transforming the data a bit and then spit out a new file in an output directory. Is it possible to make this universal so the same procedure is made for all files in a certain directory and also make it spit out files that changes names for each file that is made (so the same output.csv is not overwritten every time the script runs)? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Splitting a time + duration into a series of periods
Travers - R's date/time abilities are pretty powerful -- you shouldn't have to resort to outside programs. Here's how I'd approach the problem: dat = read.csv(textConnection('user, login_time, session_min + alice, 2010/01/01 10:00, 145 + bob, 2010/01/01 11:00, 30 + '),stringsAsFactors=FALSE) dat$login_time = as.POSIXct(dat$login_time) dat$login_end = dat$login_time + 60 * dat$session_min tms = seq(as.POSIXct('2010/01/01 10:00'), + as.POSIXct('2010/01/01 12:30'),by=30*60) data.frame(time=tms,count=sapply(tms, + function(tm)sum(dat$login_time = tm dat$login_end tm))) time count 1 2010-01-01 10:00:00 1 2 2010-01-01 10:30:00 1 3 2010-01-01 11:00:00 2 4 2010-01-01 11:30:00 1 5 2010-01-01 12:00:00 1 6 2010-01-01 12:30:00 0 Hope this helps. - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Fri, 24 Sep 2010, Travers Naran wrote: Hi all, Here's what I have. I have a list of log-in times for users and how long their sessions were. user, login_time, session_min alice, 2010/01/01 10:00, 145 bob, 2010/01/01 11:00, 30 What I want to do is create a bar chart showing, in 1/2 hour segments, the number of users logged in at the same time. For example: 10:00 1 10:30 1 11:00 2 11:30 1 The only way I can do this now is to send the data through a Perl script to generate raw data like: alice, 2010/01/01, 10:00 alice, 2010/01/01, 10:30 alice, 2010/01/01, 11:00 ... bob, 2010/01/01, 11:00 I've looked through the man pages for a couple hours now, and I can't find a better of way of doing this directly in R. Any help or pointers? Thanks in advance. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] performing script on multiple files
Jonas - Here's an (untested) idea: readwrite = function(file,outputdirectory){ dat = read.csv(file) . . . write.csv(dat,file = paste(outputdirectory,sub('\\.csv','.new',file),sep='/')) } sapply(list.files('.',pattern='\\.csv$'),readwrite,'/some/location') It will read all the .csv files in the current directory, and write out a file with the extension changed to .new in the directory /some/location . - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Fri, 24 Sep 2010, Jonas Josefsson wrote: say that I have a script that uses read.csv to read a textfile in a certain directory, then transforming the data a bit and then spit out a new file in an output directory. Is it possible to make this universal so the same procedure is made for all files in a certain directory and also make it spit out files that changes names for each file that is made (so the same output.csv is not overwritten every time the script runs)? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Splitting a time + duration into a series of periods
Try this: findInterval(do.call(seq, c(as.list(as.POSIXct(DF$login_time)), by = '30 mins')), as.POSIXct(DF$login_time)) On Fri, Sep 24, 2010 at 4:32 PM, Travers Naran tna...@gmail.com wrote: Hi all, Here's what I have. I have a list of log-in times for users and how long their sessions were. user, login_time, session_min alice, 2010/01/01 10:00, 145 bob, 2010/01/01 11:00, 30 What I want to do is create a bar chart showing, in 1/2 hour segments, the number of users logged in at the same time. For example: 10:00 1 10:30 1 11:00 2 11:30 1 The only way I can do this now is to send the data through a Perl script to generate raw data like: alice, 2010/01/01, 10:00 alice, 2010/01/01, 10:30 alice, 2010/01/01, 11:00 ... bob, 2010/01/01, 11:00 I've looked through the man pages for a couple hours now, and I can't find a better of way of doing this directly in R. Any help or pointers? Thanks in advance. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] delete d-jackknife with R ?
You may want to use combinations() in package {gtools} and write a function with a few lines to perform your leave-k-out procedure. -- View this message in context: http://r.789695.n4.nabble.com/delete-d-jackknife-with-R-tp2553192p2585783.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] boundary check
You did not originally define ball, the other respondents have discussed using a convex hull, but here is another approach: Use ball to mean sphere (or technically hypersphere) and find the sphere with the smallest radius that contains all the points, optim or other optimizers could be programmed to do this (or an approximation that may be good enough is to use the means as the center and the distance to the furthest point as the radius). Then finding if a new point is within the sphere is just a matter of computing the Euclidean distance from the new point to the center and comparing that to the radius. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Feng Li Sent: Friday, September 24, 2010 3:44 AM To: r-help@r-project.org Subject: [R] boundary check Dear R, I have a covariates matrix with 10 observations, e.g. X - matrix(rnorm(50), 10, 5) X [,1][,2][,3][,4] [,5] [1,] 0.24857135 0.30880745 -1.44118657 1.10229027 1.0526010 [2,] 1.24316806 0.36275370 -0.40096866 -0.24387888 -1.5324384 [3,] -0.33504014 0.42996246 0.03902479 -0.84778875 -2.4754644 [4,] 0.06710229 1.01950917 -0.09325091 -0.03222811 0.4127816 [5,] -0.13619141 1.33143821 -0.79958805 2.08274102 0.6901768 [6,] -0.45060357 0.19348831 -1.23793647 -0.72440163 0.5057326 [7,] -1.20740516 0.20231086 1.15584485 0.8170 -1.2719855 [8,] -1.81166284 -0.07913113 -0.91080581 -0.34774436 0.9552182 [9,] 0.19131383 0.14980569 -0.37458224 -0.09371273 -1.7667203 [10,] -0.85159276 -0.66679528 1.63019340 0.56920196 -2.4049600 And I define a boundary of X: The smallest ball that nests all the observations of X. I wish to check if a particular point x_i x_i - matrix(rnorm(5), 1, 5) x_i [,1] [,2] [,3] [,4] [,5] [1,] -0.1525543 0.4606419 -0.1011011 -1.557225 -1.035694 is inside the boundary of X or not. I know it's easy to do it with 1-D or 2-D, but I don't knot how to manage it when the dimension is large. Can someone give a hint? Thanks in advance! Feng -- Feng Li Department of Statistics Stockholm University 106 91 Stockholm, Sweden http://feng.li/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fitting GLMM models with glmer
Hi everybody: I´m trying to rewrite some routines originally written for SAS´s PROC NLMIXED into LME4's glmer. These examples came from a paper by Nelson et al. (Use of the Probability Integral Transformation to Fit Nonlinear Mixed-Models with Nonnormal Random Effects - 2006). Firstly the authors fit a Poisson model with canonical link and a single normal random effect bi ~ N(0;Sigma^2).The SAS routine was: log_s =log(SURVT) cens=1 proc nlmixed data=liver; parms logsig2 = 0 b0 = -2.8 btrt = -0.54 bhrt =0.18; xb= log_s + b0 + btrt * treat + bhrt * heart+ bi; lambda=exp(xb); model cens ~ poisson(lambda); random bi ~ normal(0,exp(logsig2)) subject=INST; run; I obtained the same results for parameters estimates and standarderrors, by using: glmer(cens ~ treat + heart + offset(log(SURVT) + (1 | INST), data=liver, family=poisson) After that, the authors present the same model, but now bi = log(gi) where gi has the following gamma distribution: f(gi | theta1) = gi ^ (1/theta1 - 1) exp(-gi/theta1)/ GAMMA(1/theta1)(theta1^(1/theta1) They used a set of transformations proposed in the paper, and with the following rotine fitted the model achieving estimates for the same fixed parameters and for theta1: proc nlmixed data=liver; parms theta1 = 1 b0 = -2:8 btrt = -0.54 bhrt =0:18; pi = CDF('NORMAL',ai); if(pi 0:99) then pi =0:99; gi2 = quantile('GAMMA',pi, 1/theta1); gi = theta1 * gi2; xb= log_s + b0 + btrt * treat + bhrt * heart+ log(gi); lambda=exp(xb); model cens ~ poisson(lambda); random ai ~ normal(0,1) subject=INST; run; This time I' m lost. Could anyone please give me a hint? The data set is: liver - as.data.frame(matrix(c(1,1,1,1,2,2,2,3,3,3,3,4,4,4,4, + 5,5,6,6,7,7,23.286,6.429,26.857,11.143,3.857,9.000,8.714, + 1.143,23.143,2.571,2.857,76.429,35.857,25.857,52.286,25.143, + 25.143,29.286,28.857,1.857,14.286,1,0,1,0,0,1,0,0,1,1,0,1, + 1,0,0,1,0,1,0,0,1,1,0,0,1,1,0,0,1,0,0,1,0,1,0,0,1,0,1,0,1,0), + ncol=4,dimnames=list(NULL,c(INST,SURVT,treat,heart Thanks Clódio Almeida Federal University of Minas Gerais - Brazil (55 31 33727884) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Object Browser
What's the best object browser? Dear all, I have tried all the popular R IDE or editors like Eclipse, Komodo, JGR, Revolution... They all have fancy fucntions like auto completion, syntax highlight BUT, I JUST WANT A OBJECT BROWSER! The easiest way to view objects in R console is fix(), but you have no global view of all the objects in the workspace. Revolution has the best object browser so far, but this thing is way too big... Eclipse all has automatically object browser, but you can't only see the basic summary info. and it's really tricky to configure StatET. Jgr is very handy, when you double click the object name, you can view the object in a spreadsheet like using fix(), but you have to press the Refresh button each time you want to see the updated objects... So, is there any thing like the combination of eclipse and Jgr? If not, I am interested to develope something to fulfill this simple but very important function. But right now I have no idea where to start. Any suggestions? Peter -- View this message in context: http://r.789695.n4.nabble.com/Object-Browser-tp2594912p2594912.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting multiple animal tracks against Date/Time
On Fri, Sep 24, 2010 at 4:16 PM, Struve, Juliane j.str...@imperial.ac.ukwrote: Hi again, when applying the code to my real data I need to deal with a large number of individuals and massive data sets. I am using the code below to read in the data for different individuals, and would like to create the Lines within the loop. But Lines needs to have the variable Fish_ID somehow included in the name, as otherwise the string will be overwritten on each execution. How can I create a different Lines for each Fish_ID ? Or is there a better way of doing this ? Sorry, this is surely a beginner's question. Thank you very much for your help. Regards, Juliane ReleaseDates - read.csv(file=ReleaseDates.csv,head=TRUE,sep=,) for (i in 1:length(ReleaseDates$Fish_ID)){ Fish_ID - ReleaseDates$Fish_ID[i] Data - read.csv(paste(Fish_ID)) Lines - paste(Data$Date,Data$Distance) print Lines } That was just an example. The purpose of Lines was to make the example self contained and reproducible. In reality you dont use Lines at all. Also your post suggested that they were in separate files since you had headings on each one. If that is not the case then you just read it all at once using read.zoo. z - read.zoo(ReleaseDates.csv, split = Fish_ID, header = TRUE, sep = ,, ...whatever...) There are three vignettes (pdf documents) that come with zoo. Read them all and read the help page of read.zoo. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to read this file into R.
Dear community, I have one file named ca_boost_feature.txt, Feature selection (Boosting:0.0025,5)! H.2.C C.1.D C.3.R E.0.N C.2.S C.0.G H.3.G log file: ep If I want to use the second line of this file, how to read it into R? varr-read.table(/home/cdu/operon/carbonic/ca_boost_feature.txt, sep= , skip=1, header=F, strip.white=TRUE, nrows=1) Warning message: In read.table(/home/cdu/operon/carbonic/ca_boost_feature.txt, : incomplete final line found by readTableHeader on '/home/cdu/operon/carbonic/ca_boost_feature.txt' I attached this file with this email. Thanks! -- Sincerely, Changbin -- Feature selection (Boosting:0.0025,5)! H.2.C C.1.D C.3.R E.0.N C.2.S C.0.G H.3.G__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to do a trig regression
Thank you all very much for your help. Unfortunately I do not have the data to test out the ideas suggested but I was able to find a good approximation using another software package. I will try to get the data for this problem so I can find out how to do it in R as I am very much interested in the process. @Dennis : Indeed, its the c that causes the problem. Without the c, solving this would be a very easy thing. On Mon, Sep 13, 2010 at 12:01 PM, Greg Snow greg.s...@imail.org wrote: Without the square term you can just use the rule for addition in sines: sin(a+b) = sin(a)cos(b) + cos(a)sin(b) So a regression of y= a + b* sin(2*pi/360*x + c) can be fit as: lm( y~ sin( 2*pi/360*x) + cos( 2*pi/360/x ) ) If you need the actual values of b and c then you will need to do a little algebra. The same idea may be sufficient for your formula (or at least a close approximation), or you could switch to nonlinear fits using the nls function and fit your formula directly. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Aaditya Nanduri Sent: Sunday, September 12, 2010 8:23 PM To: r-help@r-project.org Subject: [R] How to do a trig regression Hello All, I cant seem to do a trig regression in R. The equation is as follows : y = a+b*(sin((2*pi*x/360) - c))^2 a, b, c are coefs that I want. y, x are input vectors. The equation I put into R: lm(y ~ sin(2*pi*x/360)^2) This equation is missing the c and I dont get the right answer. Also, I dont know how to plot the lm over the x values instead of the indices. Any help is sincerely appreciated. Thank you all very much. -- Aaditya Nanduri aaditya.nand...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. -- Aaditya Nanduri aaditya.nand...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to read this file into R.
Changbin - If you want the entire line, use readLines('~/ca_boost_feature.txt',warn=FALSE)[2] [1] H.2.C C.1.D C.3.R E.0.N C.2.S C.0.G H.3.G If you want a vector with the contents of the line, use scan('~/ca_boost_feature.txt',skip=1,n=7,what='') Read 7 items [1] H.2.C C.1.D C.3.R E.0.N C.2.S C.0.G H.3.G Hope this helps. - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Fri, 24 Sep 2010, Changbin Du wrote: Dear community, I have one file named ca_boost_feature.txt, Feature selection (Boosting:0.0025,5)! H.2.C C.1.D C.3.R E.0.N C.2.S C.0.G H.3.G log file: ep If I want to use the second line of this file, how to read it into R? varr-read.table(/home/cdu/operon/carbonic/ca_boost_feature.txt, sep= , skip=1, header=F, strip.white=TRUE, nrows=1) Warning message: In read.table(/home/cdu/operon/carbonic/ca_boost_feature.txt, : incomplete final line found by readTableHeader on '/home/cdu/operon/carbonic/ca_boost_feature.txt' I attached this file with this email. Thanks! -- Sincerely, Changbin -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to read this file into R.
Thanks so much, Phil! On Fri, Sep 24, 2010 at 2:45 PM, Phil Spector spec...@stat.berkeley.eduwrote: Changbin - If you want the entire line, use readLines('~/ca_boost_feature.txt',warn=FALSE)[2] [1] H.2.C C.1.D C.3.R E.0.N C.2.S C.0.G H.3.G If you want a vector with the contents of the line, use scan('~/ca_boost_feature.txt',skip=1,n=7,what='') Read 7 items [1] H.2.C C.1.D C.3.R E.0.N C.2.S C.0.G H.3.G Hope this helps. - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Fri, 24 Sep 2010, Changbin Du wrote: Dear community, I have one file named ca_boost_feature.txt, Feature selection (Boosting:0.0025,5)! H.2.C C.1.D C.3.R E.0.N C.2.S C.0.G H.3.G log file: ep If I want to use the second line of this file, how to read it into R? varr-read.table(/home/cdu/operon/carbonic/ca_boost_feature.txt, sep= , skip=1, header=F, strip.white=TRUE, nrows=1) Warning message: In read.table(/home/cdu/operon/carbonic/ca_boost_feature.txt, : incomplete final line found by readTableHeader on '/home/cdu/operon/carbonic/ca_boost_feature.txt' I attached this file with this email. Thanks! -- Sincerely, Changbin -- -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 400 Rm 457 2800 Mitchell Dr Walnut Creet, CA 94598 Phone: 925-927-2856 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] color of lines while printing through for loop
On Fri, 24-Sep-2010 at 06:11PM +0200, Jonas Josefsson wrote: I am trying to find a convenient way to control line colors when printing from a for loop using the lines command. Right now I have solved this by creating a colors vector that is refered to in the loop with index. However, the colors choosen here are just 1,2,3,4,5... I would like to get colors from the col = rainbow(x) that you can use in plot() and set the to be my number of lines (I think I know how to set it to my number of lines, but I don't know how to implement this into my loop. Is this possible? Without knowing what pheno.dt looks like, I can't be sure, but I'd have thought it would work if you replaced col = colors[i] with col = rainbow(x)[i] HTH This is my script: __ ###creating two emty vectors and opening a graphics window x - vector() y - vector() plot(x,y, xlim = c(pheno.dt$year[1],pheno.dt$year[nrow(pheno.dt)]), ylim = c(min(pheno.dt$julian, na.rm = TRUE),max(pheno.dt$julian, na.rm = TRUE)), xlab = Year, ylab = Julian Day) ###setting up colors sp_pheno.unique - unique(pheno.dt$sp_pheno) colors - seq(1,200,1) printloop for (i in 1:length(unique(pheno.dt$year))) { data.sub - subset(pheno.dt, pheno.dt$sp_pheno==sp_pheno.unique[i]) x - seq(pheno.dt$year[1],pheno.dt$year[nrow(pheno.dt)]) y - data.sub$julian lines(x,y, col = colors[i]) } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___Patrick Connolly {~._.~} Great minds discuss ideas _( Y )_ Average minds discuss events (:_~*~_:) Small minds discuss people (_)-(_) . Eleanor Roosevelt ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] POSIXct: Extract the hour for a list of elements
Hi, I have a list/data.frame 'pk' of POSIXct dates, and I'd like to extract the hour for each row. I know that if I have an individual POSIXct object, I can extract the hour by converting to a new object with: new.lt - as.POSIXlt(single POSIXct object) new.lt$hour But I can't figure out how to apply this for a list of such dates in a vectorized form. I can write a loop, I guess and implement this, but I think I'm missing a way that takes advantage of vectorization. Here is my loop to just print the hour extracts: for (ct in pk) { lt - as.POSIXlt(ct, origin=1970-01-01) print(lt$hour) } So is there a shorter vectorization idiom that lets me do this? I can't figure out how to use 'lapply' to apply the '$' operator... Thanks, Matt [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] layout of lattice graphics
Hi all, I am trying to pickup lattice graphics in R and having difficulty with simple layout of plots. I have created a series of levelplots and would like to plot them to a single device, but need to reduce the margin areas. This is easily accomplished with par(oma) and par(mar) in the base graphics package but I am having problems finding the equivalent features in the lattice package. Ideally, I would like to reduce the amount of white space among plots in the following example. Thanks in advance. library(lattice) p1 - levelplot( matrix(c(1:25),nr=5,nc=5),row.values=1:5,column.values=1:5) p2 - levelplot(matrix (rnorm(25),nr=5,nc=5),row.values=1:5,column.values=1:5) p3 - levelplot( matrix(c(1:25),nr=5,nc=5),row.values=1:5,column.values=1:5) p4 - levelplot(matrix (rnorm(25),nr=5,nc=5),row.values=1:5,column.values=1:5) print(p1,split=c(1,1,2,2),more=T) print(p2,split=c(2,1,2,2),more=T) print(p3,split=c(1,2,2,2),more=T) print(p4,split=c(2,2,2,2)) Best Jonathan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] POSIXct: Extract the hour for a list of elements
Matthew - It's a bit simpler than you think: as.POSIXlt(pk)$hour should return what you want. (If not, please provide a reproducible example.) - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Fri, 24 Sep 2010, Matthew Pettis wrote: Hi, I have a list/data.frame 'pk' of POSIXct dates, and I'd like to extract the hour for each row. I know that if I have an individual POSIXct object, I can extract the hour by converting to a new object with: new.lt - as.POSIXlt(single POSIXct object) new.lt$hour But I can't figure out how to apply this for a list of such dates in a vectorized form. I can write a loop, I guess and implement this, but I think I'm missing a way that takes advantage of vectorization. Here is my loop to just print the hour extracts: for (ct in pk) { lt - as.POSIXlt(ct, origin=1970-01-01) print(lt$hour) } So is there a shorter vectorization idiom that lets me do this? I can't figure out how to use 'lapply' to apply the '$' operator... Thanks, Matt [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] POSIXct: Extract the hour for a list of elements
Thank you! That solution worked! I thought I'd tried something similar to that, but obviously I didn't. Here's a self-contained example, for posterity and completeness: z.df - data.frame(times=c(Sys.time(), Sys.time() + 3600)) as.POSIXlt(z.df[,1])$hour And this gives me what I want. Thank you again! Matt On Fri, Sep 24, 2010 at 5:50 PM, Phil Spector spec...@stat.berkeley.eduwrote: Matthew - It's a bit simpler than you think: as.POSIXlt(pk)$hour should return what you want. (If not, please provide a reproducible example.) - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Fri, 24 Sep 2010, Matthew Pettis wrote: Hi, I have a list/data.frame 'pk' of POSIXct dates, and I'd like to extract the hour for each row. I know that if I have an individual POSIXct object, I can extract the hour by converting to a new object with: new.lt - as.POSIXlt(single POSIXct object) new.lt$hour But I can't figure out how to apply this for a list of such dates in a vectorized form. I can write a loop, I guess and implement this, but I think I'm missing a way that takes advantage of vectorization. Here is my loop to just print the hour extracts: for (ct in pk) { lt - as.POSIXlt(ct, origin=1970-01-01) print(lt$hour) } So is there a shorter vectorization idiom that lets me do this? I can't figure out how to use 'lapply' to apply the '$' operator... Thanks, Matt [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Pardon him. Theodotus: he is a barbarian, and thinks that the customs of his tribe and island are the laws of nature. -- George Bernard Shaw, Caesar and Cleopatra [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] statistic term in boot function
Hi and thanks for your replies (I'm sorry to be a bit late..), I think I might be being a bit thick on this, but I truly would not be asking if I had not already gone through the manuals and examples in the web. I just want to be sure I got Jorge's example. the function function(v,index)mean(v[index]) is a generic function used to tell boot to take a random sample with replacement from the data, and the index vector is generated by the boot function and takes values 1:N (N being the e.g., length of the vector of the data being resampled). Is this correct? Thanks again for your help. Best, A -- View this message in context: http://r.789695.n4.nabble.com/statistic-term-in-boot-function-tp2550629p2553735.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting multiple animal tracks against Date/Time
Hi again, when applying the code to my real data I need to deal with a large number of individuals and massive data sets. I am using the code below to read in the data for different individuals, and would like to create the Lines within the loop. But Lines needs to have the variable Fish_ID somehow included in the name, as otherwise the string will be overwritten on each execution. How can I create a different Lines for each Fish_ID ? Or is there a better way of doing this ? Sorry, this is surely a beginner's question. Thank you very much for your help. Regards, Juliane ReleaseDates - read.csv(file=ReleaseDates.csv,head=TRUE,sep=,) for (i in 1:length(ReleaseDates$Fish_ID)){ Fish_ID - ReleaseDates$Fish_ID[i] Data - read.csv(paste(Fish_ID)) Lines - paste(Data$Date,Data$Distance) print Lines } Dr. Juliane Struve Daphne Jackson Fellow Imperial College London Department of Life Sciences Silwood Park Campus Buckhurst Road Ascot, Berkshire, SL5 7PY, UK Tel: +44 (0)20 7594 2527 Fax: +44 (0)1344 874 957 http://www.aquaticresources.orghttp://www.aquaticresources.org/ From: Gabor Grothendieck [ggrothendi...@gmail.com] Sent: 23 September 2010 23:06 To: Struve, Juliane Cc: r-help@r-project.org Subject: Re: [R] plotting multiple animal tracks against Date/Time On Thu, Sep 23, 2010 at 5:53 PM, Struve, Juliane j.str...@imperial.ac.ukmailto:j.str...@imperial.ac.uk wrote: Hello, thank you very much for replying. I have tried this, but I get error message Error in .subset(x, j) : invalid subscript type 'list' after z1 - read.zoo(textConnection(Lines1), skip = 1, index = list(1, 2), FUN = dt) Regards, Juliane It works for me. You likely have an outdated version of zoo. Make sure you have zoo 1.6-4. Here is the output: packageDescription(zoo)$Version [1] 1.6-4 Lines1 - DateDistance [m] + 2006-08-18 22:05:15 1815.798 + 2006-08-18 22:06:35 1815.798 + 2006-08-18 22:08:33 1815.798 + 2006-08-18 22:09:49 1815.798 + 2006-08-18 22:12:50 1815.798 + 2006-08-18 22:16:26 1815.798 Lines2 - Date Distance [m] + 2006-08-18 09:53:20 0.0 + 2006-08-18 09:59:07 0.0 + 2006-08-18 10:09:20 0.0 + 2006-08-18 10:21:14 0.0 + 2006-08-18 10:34:18 0.0 + 2006-08-18 10:36:44100.2 library(zoo) library(chron) # read in data dt - function(date, time) as.chron(paste(date, time)) z1 - read.zoo(textConnection(Lines1), skip = 1, index = list(1, 2), FUN = dt) z2 - read.zoo(textConnection(Lines2), skip = 1, index = list(1, 2), FUN = dt) # create single zoo object z - na.approx(cbind(z1, z2), na.rm = FALSE) Warning messages: 1: closing unused connection 4 (Lines2) 2: closing unused connection 3 (Lines1) # plot -- remove screen=1 if you want separate panels plot(z, screen = 1) -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.comhttp://gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] bptest
Hi I'm very new to R but have plenty of experience with statistics and other packages like SPSS, SAS etc. I have a dataset of around 20 columns and 200 rows. I'm trying to fit a very simple linear model between two variables. Having done so, I want to test the model for heteroscedasticity using the Breusch-Pagan test. Apparently this is easy in R by simply doing bptest(modelCH, data=KP) I've tried this but I'm told it cannot find function bptest. It's here where I'm struggling. I'm probably wrong but as far as I can see, bptest is part of the lm package which, as far as I know, I have installed. Irrespective of the fact I'm not sure how to tell bptest which is the dependent and explanatory variables - there's a more fundamental problem if it can't find the bptest function. I have searched the documentation - albeit briefly so if anyone could help I'd be very grateful Rob QBE Management -- View this message in context: http://r.789695.n4.nabble.com/bptest-tp2553506p2553506.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Reading in .aux (ESRI raster files) into R
Dear All, I'v tried to read in data in numerous ways including:- Spain10km-data.frame(readAsciiGrid(F://RMap//sp10kpointid1.aux)) Error in readAsciiGrid(F://RMap//sp10kpointid1.aux) : object 'cellsize' not found In addition: Warning message: In readLines(t, n = 6) : incomplete final line found on 'F://RMap//sp10kpointid1.aux' Spain10km-data.frame(readAsciiGrid(F:/GIS.LandcoverEuropeForRisk/Sept10kmmaps/Sp10KPointID.aux)) Error in readAsciiGrid(F:/GIS.LandcoverEuropeForRisk/Sept10kmmaps/Sp10KPointID.aux) : object 'cellsize' not found My original data in Arc GIS is have a cell size an i'mm curious as to how to make sure all the details are included. I also tried to use Spain10km-getRasterData(F://RMap//sp10kpointid1, ,band=NULL, offset=c(0,0),region.dim=dim(sp10kpointid1, output.dim=region.dim,interleave=c(0,0),as.is=FALSE)) Error in assertClass(dataset, GDALReadOnlyDataset) : Object is not a member of class GDALReadOnlyDataset Spain10km - system.file(F://RMap//sp10kpointid1, package=rgdal) #These lines were used to then try to get it to recognise sp10kpointid1 as a member og the GDALReadOnly Dataset. x - new(GDALReadOnlyDataset, Spain10km) Error in .local(.Object, ...) : empty file name x - new(sp10kpointid1, Spain10km) Which is the best way to read this file into R? Why didn't it include my classes? Thank you, Jo -- View this message in context: http://r.789695.n4.nabble.com/Reading-in-aux-ESRI-raster-files-into-R-tp2553544p2553544.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] optimizing lm with multiple very similar RHSs
Dear list, I would like to do something like: # Simulate some example data tmp - matrix(rnorm(50), ncol=5) colnames(tmp) - c(y,x1,x2, x3, x4, x5) # Fit a linear model with random noise added to x5 n times n - 100 replicate(n, lm(y ~ x1+x2+x3+x4+I(x5+rnorm(nrow(tmp))), data=as.data.frame(tmp))) I am wondering about ways to speed up this procedure (the data dimensions will be lot larger in my real examples, so each iteration does take a bit of time). The procedure is of course trivial to parallelize, but I'm also interested in ways to speed up the actual fitting of the linear model. I am aware of the fact that lm.fit is faster than lm as well as the fact that there are faster ways to to do the linear model program if you use Cholesky decomposition (as is nicely described in Douglas Bates Comparisons vignette for the Matrix package). What I would be very happy to get help and ideas about is if there are clever ways to use the fact that the RHS is almost the same in each iteration (I will always add noise to just one independent variable). Can this fact in any way be used to speed up the calculations? Or are there other ways in which the fits can be made faster? Thanks for any help! Best regards, Martin. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] kernlab:ksvm:eps-svr: bug?
Hi, A. In a nutshell: The training error, obtained as error (ret), from the return value of a ksvm () call for a eps-svr model is (likely) being computed wrongly. nu-svr and eps-bsvr suffer from this as well. I am attaching three files: (1) ksvm.R from the the kernlab package, un-edited, (2) ksvm_eps-svr.txt: (for easier reading) containing only eps-svr pertinent blocks of code from ksvm.R with line numbers prefixed to each line (I hope no licenses are being violated!) and (3) trace_output_concise.txt: relevant output from a trace () call for the example in C (my commands there are marked by NOTE THIS strings). B. In detail: (the line numbers refer to either ksvm.R or the prefixed numbers in ksvm_eps-svr.txt) (you may also refer to the trace_output in parallel) 1. the given response vector, y, is standardized at line 143: 142 if (is.numeric(y)(type(ret)!=C-svctype(ret)!=nu-svctype(ret)!=C-bsvctype(ret)!=spoc-svctype(ret)!=kbb-svc)) { 143 y - scale(y) 144 y.scale - attributes(y)[c(scaled:center,scaled:scale)] 145 y - as.vector(y) 146 } 2. fitted response, fitted(ret), is obtained at lines 826,827: 826 fitted(ret) - if (fit) 827predict(ret, x) else NULL (note: during the fitting process the return value of predict(ret,x) is _NOT_ scaled BACK to the original units, see lines 2832-2838) 3. fitted response is now scaled BACK to the original units of y at lines 839,840. So, when assigning error(ret) - at line 844, fitted(ret) is in the original units but y is in standardized units: 828 829 if(any(scaled)) 830 scaling(ret) - list(scaled = scaled, x.scale = x.scale, y.scale = y.scale) 831 832 if (fit){ ... 837 if(type(ret)==eps-svr||type(ret)==nu-svr||type(ret)==eps-bsvr){ 838 if (!is.null(scaling(ret)$y.scale)){ 839 scal - scaling(ret)$y.scale$scaled:scale 840 fitted(ret) - fitted(ret) * scaling(ret)$y.scale$scaled:scale + scaling(ret)$y.scale$scaled:center 841 } 842 else 843 scal - 1 844 error(ret) - drop((scal^2)*crossprod(fitted(ret) - y)/m) 845 } 846 } C. Finally, an example (taken from ?ksvm): require (kernlab) seed (1234) x - seq(-20,20,0.1); x - x[x != 0] y - sin(x)/x + rnorm(400,sd=0.03) regm - ksvm(x,y,epsilon=0.01,kpar=list(sigma=16),cross=3) te - crossprod (fitted(regm)-y)/400 s - (scaling(regm)$y.scale[[scaled:scale]])^2 error (regm) # 0.03891344 te# 0.0008958718 te * s# 6.37252e-05 te / s# 0.01259449 These numbers can also be seen in the trace_output_concise.txt. In particular, compare the output marked by ** NOTE THIS, ?? NOTE THIS, and ++ NOTE THIS there. Basically, compute the error either before scaling back fitted (ret) or scale back y as well!! Can anyone confirm / refute / agree / disagree? Thanks, -- Prasenjit 1 ## Support Vector Machines 2 ## author : alexandros karatzoglou 3 ## updated : 08.02.06 4 5 setGeneric(ksvm, function(x, ...) standardGeneric(ksvm)) 6 setMethod(ksvm,signature(x=formula), 7 function (x, data=NULL, ..., subset, na.action = na.omit, scaled = TRUE){ ... 38 }) 39 ... 47 setMethod(ksvm,signature(x=matrix), 48 function (x, 49y = NULL, 50scaled= TRUE, 51type = NULL, 52kernel= rbfdot, 53kpar = automatic, 54C = 1, 55nu= 0.2, 56epsilon = 0.1, 57prob.model = FALSE, 58class.weights = NULL, 59cross = 0, 60fit = TRUE, 61cache = 40, 62tol = 0.001, 63shrinking = TRUE, 64... 65,subset 66 ,na.action = na.omit) 67 { ... 74sparse - FALSE 75 76if(is.character(kernel)){ 77 kernel - match.arg(kernel,c(rbfdot,polydot,tanhdot,vanilladot,laplacedot,besseldot,anovadot,splinedot,matrix)) 78 79 if(kernel == matrix) 80if(dim(x)[1]==dim(x)[2]) 81 return(ksvm(as.kernelMatrix(x), y = y, type = type, C = C, nu = nu, epsilon = epsilon, prob.model = prob.model, class.weights = class.weights, cross = cross, fit = fit, cache = cache, tol = tol, shrinking = shrinking, ...)) 82else 83 stop( kernel matrix not square!) 84 85 if(is.character(kpar)) 86if((kernel == tanhdot || kernel == vanilladot || kernel == polydot|| kernel == besseldot || kernel== anovadot|| kernel==splinedot) kpar==automatic ) 87 { 88cat ( Setting default kernel parameters ,\n) 89kpar - list() 90 } 91} 92 93## subsetting and na-handling for matrices 94ret -
[R] why I could not reproduce the Mandelbrot plot demonstrated on R wiki
I am trying to reproduce the nice looking of Mandelbrot demonstrated by R wiki page by the following code: library(caTools)# external package providing write.gif function jet.colors = colorRampPalette(c(#7F, blue, #007FFF, cyan, #7FFF7F, yellow, #FF7F00, red, #7F)) m = 600 # define size C = complex( real=rep(seq(-1.8,0.6, length.out=m), each=m ), imag=rep(seq(-1.2,1.2, length.out=m), m ) ) C = matrix(C,m,m) # reshape as square matrix of complex numbers Z = 0 # initialize Z to zero X = array(0, c(m,m,20)) # initialize output 3D array for (k in 1:20) { # loop with 20 iterations Z = Z^2+C # the central difference equation X[,,k] = exp(-abs(Z)) # capture results } write.gif(X, Mandelbrot.gif, col=jet.colors, delay=100) however, the gif file created by this looks much worse than what is shown on R wiki page, see the comparison as follows (left one is what i created) http://r.789695.n4.nabble.com/file/n2591429/Picture1.png -- View this message in context: http://r.789695.n4.nabble.com/why-I-could-not-reproduce-the-Mandelbrot-plot-demonstrated-on-R-wiki-tp2591429p2591429.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Saving iterative components
Hi, I need help! I am trying to iterate an iterative process to do cross vadation and store the results each time. I have a Spatial data.frame, called Tmese str(Tmese) Formal class 'SpatialPointsDataFrame' [package sp] with 5 slots ..@ data :'data.frame': 14 obs. of 17 variables: .. ..$ ID : int [1:14] 73 68 49 62 51 79 69 77 57 53 ... .. ..$ Stazione: Factor w/ 29 levels ALIANO,BONIFATI,..: 2 3 4 5 10 11 12 16 17 19 ... .. ..$ X01_1994: num [1:14] 9.34 10.67 5.29 11.86 9.15 ... .. ..$ X01_1995: num [1:14] 7.07 9.22 2.32 9.3 6.66 ... .. ..$ X01_1996: num [1:14] 9.41 10.4 5.99 12.3 9.93 ... .. ..$ X01_1997: num [1:14] 10.67 10.65 5.76 12.82 10.1 ... .. ..$ X01_1998: num [1:14] 9.57 10.12 4.44 10.34 8.97 ... .. ..$ X01_1999: num [1:14] 8.96 10.21 3.23 10.83 7.74 ... .. ..$ X01_2000: num [1:14] 6.58 8.46 2.8 8.37 6.55 ... Now I want to do 14 cross validation and I wrote a function idw.cv- function(x){ tmp- krige.cv(x~1, Tmese, model=NULL) return(tmp) } But when I run it, it doesn't work, it says: cv_1994- idw.cv(X01_1994) Error in eval(expr, envir, enclos) : object 'X01_1994' not found WhyPlease Help me! -- View this message in context: http://r.789695.n4.nabble.com/Saving-iterative-components-tp2553676p2553676.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Odds ratio from Logistic model in R
Hi, I am new to R. Anyone can explain the following from R-help or anyone can direct me how to calculate odds ratio from logistic model in R. Thank you very much. Guoya Stefano stecalza at tiscalinet.it https://stat.ethz.ch/mailman/listinfo/r-help writes: Hi all. A simple question. Is there a function to compute the Odds Ratio and its confidence intervall, from a logistic model (glm(...,family=binomial). I've written my own, but certainly someone did a better job. I show a simple function to do this in my introductory notes available from: http://www.myatt.demon.co.uk basically it is: lreg.or - function(model) { lreg.coeffs - coef(summary(salex.lreg)) lci - exp(lreg.coeffs[ ,1] - 1.96 * lreg.coeffs[ ,2]) or - exp(lreg.coeffs[ ,1]) uci - exp(lreg.coeffs[ ,1] + 1.96 * lreg.coeffs[ ,2]) lreg.or - cbind(lci, or, uci) lreg.or } [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Survival graph and tables
Hi, I'm trying to make such a graphic : http://heart.bmj.com/content/96/9/662/F1.large.jpg The curve plot is not a problem but I don't know how to represent the little table underneath with the number of survivors at given periods. Do you have any hints to achieve this ? Thanks in advance Blaise T. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Multiple graph in one graph window
Dear R users; Could you tell me how to let R create multiple graphs in a graph window. Please note that I am talking about creating multiple plots in the same page of the graph window. For example: g1=c(1,2,3) g2=c(3,1,7) g3=c(4,2,11) g4=c(5,5,9) ... all in one graph window. Best Regards Reza __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to change the lengend in a DA figure
Dear all: I have encountered a problem recently: I used to plot figure, and using code main=.., xlab=.,ylab=. to change the legend. But recently I was using R to perform Discriminant Function analysis, and I want to change the group names also add a title to the figure, but I am not sure how to make it work, could you please help me with my problem? My code is: library(MASS) da.anti-lda(type~ct+rd+ht+ndf+adf+hc+ls+cs) da.anti par(family=serif) plot(da.anti) I appreciate your help. Thank you. Xi Chen Graduate Teaching Scholar Biological Science Department Texas Tech University 2500 Broadway, Lubbock TX 79410 Phone: 806-319-1513 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Power Function
Hi, at first, i´m from germany, so sorry for my bad english. but i need ur help in R to programm a power function and to make at last a graphik of it. i have already tried my best. but it doesn´t work.the topic is: the homogeneity test of correlation based entropy. so it means, that i have to check if all correlations of a bivariate random vectors are same or not. for that i saperate the n bivariate random vectors (x1,y1),...,(xn,yn) in blocks m so, so that i at first calculate the correlation in a block. n=m*k. the numbers of the blocks m are user-defined. the test value is the entropy. pls help me!!! set.seed(1000) n=100 m=5 k=n/m x=rnorm(n,0,0.5) y=rnorm(n,0,0.8) #alpha/2 Quantil q_1=qnorm(0.05,0,0.05) #1-alpha/2 Quantile q_2=qnorm(0.95,0,0.05) l=matrix(0,nrow=m,ncol=1) for(i in 1:m){ l[i]=print(cor((x[(((i-1)*k)+1):(((i-1)*k)+k)]), (y[(((i-1)*k)+1):(((i-1)*k)+k)]))) } güte=function(l){ p=matrix(0,nrow=m,ncol=1) for(i in 1:m){ p[i]=l[i]^2/sum(l^2) } H=log(m)-sum(p*log(p)) 1-mean(q_1=H H =q_2) } l=seq(0,1,len=10) plot(l,güte, type=o,pch=20,ylim=c(0,1),col=red) -- View this message in context: http://r.789695.n4.nabble.com/Power-Function-tp2631929p2631929.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Odds ratio from Logistic model in R
The output of logisitic procdure only gives you the log(odds-ratio) and the associated standard error of the log(odds-ratio). You need to exponentiate the log(odds-ratio) to get your odds ratio. The code tells you how to obtain the odds ratio from log(odds-ratio). -- View this message in context: http://r.789695.n4.nabble.com/Odds-ratio-from-Logistic-model-in-R-tp2630277p2651963.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Power Function
Hi, What part of your code does not work or does not do what you want? I am guessing that you want gute to return more than a single value (which is what it does in your code). Can you show us an example of the results you would like? Best regards, Josh I reworked some of your code to simplify, clean, and wrap it all in a function: myfun - function(n, m, alpha = .05, seeder = 1000) { set.seed(seeder) x - matrix(rnorm(n, 0, 0.5), ncol = m) y - matrix(rnorm(n, 0, 0.8), ncol = m) l - diag(cor(x, y)) cat(Correlations between two random variables \n, l, fill = TRUE) gute - function(x, m, alpha) { q_1 - qnorm(alpha, 0, 0.05) q_2 - qnorm(1 - alpha, 0, 0.05) p - (x^2)/sum(x^2) H - log(m) - sum(p * log(p), na.rm = TRUE) 1 - mean(q_1 = H H = q_2) } dat - seq(0, 1, length.out = 10) output - gute(x = dat, m = m, alpha = alpha) return(output) } This results in: myfun(100, 5) Correlations between two random variables -0.5887829 0.07042411 -0.06082638 0.2395193 -0.1038213 [1] 1 On Fri, Sep 24, 2010 at 5:26 PM, jethi kart...@hotmail.com wrote: Hi, at first, i´m from germany, so sorry for my bad english. but i need ur help in R to programm a power function and to make at last a graphik of it. i have already tried my best. but it doesn´t work.the topic is: the homogeneity test of correlation based entropy. so it means, that i have to check if all correlations of a bivariate random vectors are same or not. for that i saperate the n bivariate random vectors (x1,y1),...,(xn,yn) in blocks m so, so that i at first calculate the correlation in a block. n=m*k. the numbers of the blocks m are user-defined. the test value is the entropy. pls help me!!! set.seed(1000) n=100 m=5 k=n/m x=rnorm(n,0,0.5) y=rnorm(n,0,0.8) #alpha/2 Quantil q_1=qnorm(0.05,0,0.05) #1-alpha/2 Quantile q_2=qnorm(0.95,0,0.05) l=matrix(0,nrow=m,ncol=1) for(i in 1:m){ l[i]=print(cor((x[(((i-1)*k)+1):(((i-1)*k)+k)]), (y[(((i-1)*k)+1):(((i-1)*k)+k)]))) } güte=function(l){ p=matrix(0,nrow=m,ncol=1) for(i in 1:m){ p[i]=l[i]^2/sum(l^2) } H=log(m)-sum(p*log(p)) 1-mean(q_1=H H =q_2) } l=seq(0,1,len=10) plot(l,güte, type=o,pch=20,ylim=c(0,1),col=red) -- View this message in context: http://r.789695.n4.nabble.com/Power-Function-tp2631929p2631929.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] bptest
Hi Rob, I googled cran bptest and the first hit was for package lmtest. I installed lmtest, loaded it up, and bptest is available there. require(lmtest) Loading required package: lmtest Loading required package: zoo bptest function (formula, varformula = NULL, studentize = TRUE, data = list()) ... So see if you can get package lmtest. HTH Cheers Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of robm [r.malp...@ntlworld.com] Sent: September 24, 2010 6:53 AM To: r-help@r-project.org Subject: [R] bptest Hi I'm very new to R but have plenty of experience with statistics and other packages like SPSS, SAS etc. I have a dataset of around 20 columns and 200 rows. I'm trying to fit a very simple linear model between two variables. Having done so, I want to test the model for heteroscedasticity using the Breusch-Pagan test. Apparently this is easy in R by simply doing bptest(modelCH, data=KP) I've tried this but I'm told it cannot find function bptest. It's here where I'm struggling. I'm probably wrong but as far as I can see, bptest is part of the lm package which, as far as I know, I have installed. Irrespective of the fact I'm not sure how to tell bptest which is the dependent and explanatory variables - there's a more fundamental problem if it can't find the bptest function. I have searched the documentation - albeit briefly so if anyone could help I'd be very grateful Rob QBE Management -- View this message in context: http://r.789695.n4.nabble.com/bptest-tp2553506p2553506.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Multiple graph in one graph window
Which one do you want? library(lattice) d - data.frame(x = rep(1:3, 4), g = factor(rep(1:4, each = 3)), y = c(1, 2, 3, 3, 1, 7, 4, 2, 11, 5, 5, 9)) xyplot(y ~ x | g, data = d, pch = 16) xyplot(y ~ x, data = d, groups = g, type = c('p', 'l'), pch = 16) Both provide 'multiple plots' and both are on the 'same page'. HTH, Dennis On Fri, Sep 24, 2010 at 2:37 PM, mrdaw...@abo.fi wrote: Dear R users; Could you tell me how to let R create multiple graphs in a graph window. Please note that I am talking about creating multiple plots in the same page of the graph window. For example: g1=c(1,2,3) g2=c(3,1,7) g3=c(4,2,11) g4=c(5,5,9) ... all in one graph window. Best Regards Reza __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to change the lengend in a DA figure
Hi: It's a little hard to help on that front without a reproducible example but you could read the help page of plot.lda in the MASS package, where you would find that it takes both xlab and ylab arguments, and for which you can probably pass main as well since it contains a ... argument. Did you try it? If you want to change group names, it's generally easier to make the change in the data (add a new variable with modified values) and use that in the model call. HTH, Dennis On Fri, Sep 24, 2010 at 2:57 PM, Chen, Xi xi.c...@ttu.edu wrote: Dear all: I have encountered a problem recently: I used to plot figure, and using code main=.., xlab=.,ylab=. to change the legend. But recently I was using R to perform Discriminant Function analysis, and I want to change the group names also add a title to the figure, but I am not sure how to make it work, could you please help me with my problem? My code is: library(MASS) da.anti-lda(type~ct+rd+ht+ndf+adf+hc+ls+cs) da.anti par(family=serif) plot(da.anti) I appreciate your help. Thank you. Xi Chen Graduate Teaching Scholar Biological Science Department Texas Tech University 2500 Broadway, Lubbock TX 79410 Phone: 806-319-1513 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Odds ratio from Logistic model in R
Hi Guoya, Is this what you are after? # fit logistic model my.glm - glm(vs ~ mpg, data = mtcars, family = binomial(link = logit)) # look at a summary summary(my.glm) # view the coefficients as odds ratios exp(coef(my.glm)) Hope that helps, Josh On Fri, Sep 24, 2010 at 10:50 AM, Li, Guoya guoya...@bshsi.org wrote: Hi, I am new to R. Anyone can explain the following from R-help or anyone can direct me how to calculate odds ratio from logistic model in R. Thank you very much. Guoya Stefano stecalza at tiscalinet.it https://stat.ethz.ch/mailman/listinfo/r-help writes: Hi all. A simple question. Is there a function to compute the Odds Ratio and its confidence intervall, from a logistic model (glm(...,family=binomial). I've written my own, but certainly someone did a better job. I show a simple function to do this in my introductory notes available from: http://www.myatt.demon.co.uk basically it is: lreg.or - function(model) { lreg.coeffs - coef(summary(salex.lreg)) lci - exp(lreg.coeffs[ ,1] - 1.96 * lreg.coeffs[ ,2]) or - exp(lreg.coeffs[ ,1]) uci - exp(lreg.coeffs[ ,1] + 1.96 * lreg.coeffs[ ,2]) lreg.or - cbind(lci, or, uci) lreg.or } [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Odds ratio from Logistic model in R
Hi Guoya, One more option, using Joshua's example, would be: # install.packages('epicalc') require(epicalc) logistic.display(my.glm) HTH, Jorge On Fri, Sep 24, 2010 at 1:50 PM, Li, Guoya wrote: Hi, I am new to R. Anyone can explain the following from R-help or anyone can direct me how to calculate odds ratio from logistic model in R. Thank you very much. Guoya Stefano stecalza at tiscalinet.it https://stat.ethz.ch/mailman/listinfo/r-help writes: Hi all. A simple question. Is there a function to compute the Odds Ratio and its confidence intervall, from a logistic model (glm(...,family=binomial). I've written my own, but certainly someone did a better job. I show a simple function to do this in my introductory notes available from: http://www.myatt.demon.co.uk basically it is: lreg.or - function(model) { lreg.coeffs - coef(summary(salex.lreg)) lci - exp(lreg.coeffs[ ,1] - 1.96 * lreg.coeffs[ ,2]) or - exp(lreg.coeffs[ ,1]) uci - exp(lreg.coeffs[ ,1] + 1.96 * lreg.coeffs[ ,2]) lreg.or - cbind(lci, or, uci) lreg.or } [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.