Re: [R] Adding a year to existing date
On 17/11/11 17:33, arunkumar wrote:
> Hi, I need to add a year to a date field in a data frame. Please help me.
>
>   X  Date
>   1  2008-01-01
>   2  2008-02-01
>   3  2003-03-01

I can't find anything built in. This is probably because a year is an ill-defined unit; years vary in length in a somewhat peculiar fashion, so doing arithmetic with respect to years is frowned upon. However, you might try this:

`%+%` <- function(x, y) {
    if (!isTRUE(all.equal(y, round(y))))
        stop("Argument \"y\" must be an integer.\n")
    x <- as.POSIXlt(x)
    x$year <- x$year + y
    as.Date(x)
}

Then:

xxx <- as.Date(c("2008-01-01", "2008-02-01", "2003-03-01"))
xxx %+% 1
[1] "2009-01-01" "2009-02-01" "2004-03-01"

Dunno what dangers lurk; caveat utilitor.

cheers,
Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
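A hedged illustration of the kind of danger Rolf alludes to: because the function bumps the POSIXlt year component and then lets the conversion back to Date normalise the result, adding a year to a leap day produces a date that does not exist in the target year. This is a sketch to try on your own system, using the `%+%` operator exactly as defined in the reply above:

```r
`%+%` <- function(x, y) {
    if (!isTRUE(all.equal(y, round(y))))
        stop("Argument \"y\" must be an integer.\n")
    x <- as.POSIXlt(x)
    x$year <- x$year + y
    as.Date(x)
}

# 2008-02-29 plus one year would be the invalid date 2009-02-29;
# the conversion back to Date normalises it, so check whether the
# rolled-over result (early March 2009) is what you actually want
as.Date("2008-02-29") %+% 1
```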
Re: [R] Reading data/variables
Thanks Sarah. I have read about the problems with attach(), and I will try to avoid it. I have now found the line that's causing the problem is:

setwd("z:/homework")

With that line in place, either in a program or in Rprofile.site (?), the moment I run R and simply enter (before reading any data)

summary(mydata)

I get sample statistics for a dozen variables! Do not save the workspace? I thought the option to save/use a binary file was meant to be convenient. I like working in the same working directory, and I like .RData files. Does this sound hopeless? Thanks.

At 09:26 PM 11/15/2011, Sarah Goslee wrote:

Hi,

The obvious answer is: don't use attach() and you'll never have that problem. And see further comments inline.

On Tue, Nov 15, 2011 at 6:05 PM, Steven Yen s...@utk.edu wrote:

Can someone help me with this variable/data reading issue? I read a csv file and transform/create an additional variable (called y). The first set of commands below produced different sample statistics for hw11$y and y. In the second set of commands I rename/use the variable name yy, and sample statistics for hw11$yy and yy are identical. Using y <- yy fixed it, but I am not sure why I would need to do that. That y appeared to have come from a variable called y from another data frame (unrelated to the current run). Help!

setwd("z:/homework")
sink("z:/homework/hw11.our", append=TRUE, split=TRUE)
hw11 <- read.csv("ij10b.csv", header=TRUE)
hw11$y <- hw11$e3
attach(hw11)
The following object(s) are masked _by_ '.GlobalEnv': y

Look there. R even *told* you that it was going to use the y in the global environment rather than the one you were trying to attach.

The other solution: don't save your workspace. Your other email on this topic suggested to me that there is a .RData file in your preferred working directory that contains an object y, and that's what is interfering with what you think should happen.
Deleting that file, or using a different directory, or removing y before you attach the data frame would all work. But truly, the best possible strategy is to avoid using attach() so you don't have to worry about which object named y is really being used, because you specify it explicitly.

(n <- dim(hw11)[1])
[1] 13765
summary(hw11$y)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.  0.4500      1.  1.6726      2.    140.
length(hw11$y)
[1] 13765
summary(y)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    0.0     0.0     0.0 0.24958     0.0     1.0
length(y)
[1] 601

setwd("z:/homework")
sink("z:/homework/hw11.our", append=TRUE, split=TRUE)
hw11 <- read.csv("ij10b.csv", header=TRUE)
hw11$yy <- hw11$e3
attach(hw11)
hw11$yy <- hw11$e3
summary(hw11$yy)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.  0.4500      1.  1.6726      2.    140.
length(hw11$yy)
[1] 13765
summary(yy)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.  0.4500      1.  1.6726      2.    140.
length(yy)
[1] 13765

-- Sarah Goslee http://www.functionaldiversity.org

-- Steven T. Yen, Professor of Agricultural Economics, The University of Tennessee, http://web.utk.edu/~syen/
Re: [R] Numerical Format on axis
Thousand thanks to David and Don. Great help!

Mario
Aachen, Germany

From: MacQueen, Don macque...@llnl.gov
Cc: r-help@r-project.org; David Winsemius dwinsem...@comcast.net
Sent: Thursday, 17 November 2011, 2:30
Subject: Re: [R] Numerical Format on axis

To add to what David suggests, and since you're new to R, something like this:

plot(x, y, yaxt='n')
yticks <- pretty(y)
axis(2, at=yticks, labels=sprintf("%1.2f", yticks))

See the help page for par

?par

and look for the entry for 'xaxt' to see what the 'yaxt' arg to plot does.

-- Don MacQueen, Lawrence Livermore National Laboratory, 7000 East Ave., L-627, Livermore, CA 94550, 925-423-1062

On 11/16/11 6:35 AM, David Winsemius dwinsem...@comcast.net wrote:

On Nov 16, 2011, at 7:41 AM, Mario Giesel wrote:

Hello, list, I'm new to R and I'm trying to produce a chart with currency values on the y axis. Values should be e.g. 1,00, 1,50, 2,00, etc. In fact they are 1,0, 1,5, 2,0, etc. How do I get R to show two digits after the comma on that axis?

?sprintf
?format

On the left (geographic) side of the Atlantic, it might be:

sprintf("%1.2f", 1)
[1] "1.00"

I assume that your system is set up with different options() and that your punkts are going to be handled to your liking by sprintf.
-- David Winsemius, MD, West Hartford, CT
Re: [R] Contour on top of 2d histogram
Hi, thanks for the suggestion! I had tried it before, but it did not work - this was probably because I was using the image function to plot the 2d histogram. When I use hist2d directly and then contour with add=TRUE, it works. Thanks again,
Anna

From: R. Michael Weylandt [michael.weyla...@gmail.com]
Sent: Wednesday, 16 November 2011 18:43
To: Sramkova, Anna (IEE)
Cc: r-help@r-project.org
Subject: Re: [R] Contour on top of 2d histogram

Try the add = TRUE argument to contour.

Michael

On Wed, Nov 16, 2011 at 12:35 PM, Sramkova, Anna (IEE) anna.sramk...@iee.unibe.ch wrote:

Hi all, I would like to plot one data set as a 2d histogram and another one as a contour. I can do it separately with the hist2d and contour functions, but I wonder if there is a way to combine these two plots into a single one (the ranges of the two plots are the same). Any suggestions?

Thanks,
Anna
[R] RV: Reporting a conflict between ADMB and Rtools on Windows systems
From: Rubén Roa
Sent: Thursday, 17 November 2011, 9:53
To: 'us...@admb-project.org'
Subject: Reporting a conflict between ADMB and Rtools on Windows systems

Hi, I have to work under Windows; it's a company policy. I've just found that there is a conflict between the tools used to build R packages (Rtools) and ADMB, due to the need to put the location of the Rtools compiler in the PATH environment variable to make Rtools work.

On a Windows 7 64-bit machine with Rtools installed, I installed the latest ADMB-IDE version, and although I could translate ADMB code to cpp code, I could not build the cpp code into an executable via ADMB-IDE's compiler. On another machine, a Windows Vista 32-bit with Rtools installed, I also installed the latest ADMB-IDE, and this time it was not possible to create the .obj file on the way to building the executable with ADMB-IDE. On this machine I also have a previous ADMB version (6.0.1) that I used to run from the DOS shell. This ADMB also failed to build the .obj file.

Now, in PATH, the location info that makes Rtools work is:

c:\Rtools\bin;c:\Rtools\MinGW\bin;c:\Rtools\MinGW64\bin;C:\Program Files (x86)\MiKTeX 2.9\miktex\bin;

If from this list I remove the reference to the compiler, c:\Rtools\MinGW\bin, then ADMB works again. So beware of this conflict. Suggestions for a solution will be appreciated. Meanwhile, I run ADMB code on one computer and build R packages with Rtools on another computer.

Best,
Ruben

-- Dr. Ruben H. Roa-Ureta, Senior Researcher, AZTI Tecnalia, Marine Research Division, Txatxarramendi Ugartea z/g, 48395, Sukarrieta, Bizkaia, Spain
Re: [R] RV: Reporting a conflict between ADMB and Rtools on Windows systems
I assume you use a command window to build your packages. One possible solution might be to leave the path variables set for Rtools out of your global PATH and to create a separate shortcut to cmd for building R packages, where you set the path as needed by R CMD build/check. Something like:

cmd /K PATH c:\Rtools\bin;c:\Rtools\MinGW\bin;c:\Rtools\MinGW64\bin;C:\Program Files (x86)\MiKTeX 2.9\miktex\bin

(I haven't tried this, so it might need some tinkering to get it to actually work.)

HTH,
Jan

On 17-11-2011 9:54, Rubén Roa wrote:

From: Rubén Roa
Sent: Thursday, 17 November 2011, 9:53
To: 'us...@admb-project.org'
Subject: Reporting a conflict between ADMB and Rtools on Windows systems

Hi, I have to work under Windows; it's a company policy. I've just found that there is a conflict between the tools used to build R packages (Rtools) and ADMB, due to the need to put the location of the Rtools compiler in the PATH environment variable to make Rtools work. On a Windows 7 64-bit machine with Rtools installed, I installed the latest ADMB-IDE version, and although I could translate ADMB code to cpp code, I could not build the cpp code into an executable via ADMB-IDE's compiler. On another machine, a Windows Vista 32-bit with Rtools installed, I also installed the latest ADMB-IDE, and this time it was not possible to create the .obj file on the way to building the executable with ADMB-IDE. On this machine I also have a previous ADMB version (6.0.1) that I used to run from the DOS shell. This ADMB also failed to build the .obj file. Now, in PATH, the location info that makes Rtools work is:

c:\Rtools\bin;c:\Rtools\MinGW\bin;c:\Rtools\MinGW64\bin;C:\Program Files (x86)\MiKTeX 2.9\miktex\bin;

If from this list I remove the reference to the compiler, c:\Rtools\MinGW\bin, then ADMB works again. So beware of this conflict. Suggestions for a solution will be appreciated. Meanwhile, I run ADMB code on one computer and build R packages with Rtools on another computer.

Best,
Ruben

-- Dr. Ruben H. Roa-Ureta, Senior Researcher, AZTI Tecnalia, Marine Research Division, Txatxarramendi Ugartea z/g, 48395, Sukarrieta, Bizkaia, Spain
Re: [R] split list of characters in groups of 2
Hi: Here's one way:

apply(matrix(var.names, ncol = 2, byrow = TRUE), 1,
      function(x) paste(x[1], x[2], sep = ','))
[1] "a,b" "c,d" "e,f"

HTH,
Dennis

On Wed, Nov 16, 2011 at 9:46 PM, B77S bps0...@auburn.edu wrote:

hi, If I have a list of things, like this

var.names <- c("a", "b", "c", "d", "e", "f")

how can I get this:

"a,b" "c,d" "e,f"

thanks ahead of time.

-- View this message in context: http://r.789695.n4.nabble.com/split-list-of-characters-in-groups-of-2-tp4079031p4079031.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] permutation within rows of a matrix
On Wed, 2011-11-16 at 14:55 -0800, Peter Ehlers wrote:

I must be missing something. What's wrong with

t(apply(mat, 1, sample))

?

Only missing that I am either[*] (i) stupid, (ii) being too clever, or (iii) down on my coffee intake for the day.

G

[*] delete as applicable any that don't apply. ;-)

Peter Ehlers

On 2011-11-16 12:12, Gavin Simpson wrote:

On Wed, 2011-11-16 at 14:29 -0500, R. Michael Weylandt wrote:

Suppose your matrix is called X.

?sample
X[sample(nrow(X)), ]

That will shuffle the rows at random, not permute within the rows. Here is an alternative, first using one of my packages (permute - shameful promotion ;-) !):

mat <- matrix(sample(0:1, 100, replace = TRUE), ncol = 10)
require(permute)
perms <- shuffleSet(10, nset = 10)
## permute mat
t(sapply(seq_len(nrow(perms)),
         function(i, perms, mat) mat[i, perms[i, ]],
         mat = mat, perms = perms))

If you don't want to use permute, then you can do this via standard R functions:

perms <- t(replicate(nrow(mat), sample(ncol(mat))))
## permute mat
t(sapply(seq_len(nrow(perms)),
         function(i, perms, mat) mat[i, perms[i, ]],
         mat = mat, perms = perms))

HTH
G

Michael

On Wed, Nov 16, 2011 at 11:45 AM, Juan Antonio Balbuena balbu...@uv.es wrote:

Hello. This is probably a basic question but I am quite new to R. I need to permute elements within rows of a binary matrix, such as

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0    0    0    1    0    0    0    0     0
 [2,]    0    0    1    1    0    0    0    1    1     0
 [3,]    0    1    0    0    0    0    1    0    0     0
 [4,]    0    0    0    0    0    0    1    1    0     0
 [5,]    0    0    0    1    0    0    0    0    1     0
 [6,]    0    0    1    1    0    0    0    0    0     1
 [7,]    0    0    0    0    0    0    0    0    0     0
 [8,]    1    1    0    1    0    0    0    1    0     1
 [9,]    1    0    0    1    0    1    0    1    0     0
[10,]    0    0    0    0    0    0    0    1    0     1

That is, elements within each row are permuted freely and independently from the other rows. I see that it is workable by creating an array for each row, performing sample, and binding the arrays again, but I wonder whether there is a more efficient way of doing the trick. Any help will be much appreciated.
-- View this message in context: http://r.789695.n4.nabble.com/permutation-within-rows-of-a-matrix-tp4076989p4076989.html
Sent from the R help mailing list archive at Nabble.com.

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson            [t] +44 (0)20 7679 0522
ECRC, UCL Geography,         [f] +44 (0)20 7679 0565
Pearson Building,            [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London         [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Re: [R] Getting unique colours
On Wednesday, 16 November 2011 at 20:02 -0800, Quercus wrote:

Hey everyone, I am new to R, and I'm making a scatter plot where I have a bunch of points that fall into 9 unique categories. I want each category to have a unique colour; however, with the coding I have (below), the colour black is repeated for two of my plot types. Does anyone know a quick way to get 9 unique colours?

Coding:

plotba = plot(predictedba ~ actualba, col=as.numeric(ecosite),
              pch=19, cex=1.5,
              ylab="Predicted Basal Area (m2/ha-1)",
              xlab="Actual Basal Area (m2/ha-1)")

Thanks!

You can use col=rainbow(9). For more choice of colour palettes, also see the RColorBrewer package.

Here's what it looks like currently: http://r.789695.n4.nabble.com/file/n4078889/predicted_height.jpeg

Sorry, the link doesn't work.

Regards
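A minimal sketch of the suggested fix (untested without the data; predictedba, actualba, and ecosite are the poster's objects, and ecosite is assumed to be a factor with 9 levels): build the palette once and index it by the factor so each level gets its own colour.

```r
pal <- rainbow(9)                      # or RColorBrewer::brewer.pal(9, "Set1")
plot(predictedba ~ actualba,
     col = pal[as.numeric(ecosite)],   # one colour per ecosite level
     pch = 19, cex = 1.5,
     ylab = "Predicted Basal Area (m2/ha-1)",
     xlab = "Actual Basal Area (m2/ha-1)")
legend("topleft", legend = levels(ecosite), col = pal, pch = 19)
```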
Re: [R] split list of characters in groups of 2
On 17.11.2011 10:31, Dennis Murphy wrote:

Hi: Here's one way:

apply(matrix(var.names, ncol = 2, byrow = TRUE), 1,
      function(x) paste(x[1], x[2], sep = ','))
[1] "a,b" "c,d" "e,f"

Or, for short, and slightly faster for huge data, use column-wise operations as in:

apply(matrix(var.names, nrow = 2), 2, paste, collapse = ",")

Best,
Uwe Ligges

HTH,
Dennis

On Wed, Nov 16, 2011 at 9:46 PM, B77S bps0...@auburn.edu wrote:

hi, If I have a list of things, like this

var.names <- c("a", "b", "c", "d", "e", "f")

how can I get this:

"a,b" "c,d" "e,f"

thanks ahead of time.

-- View this message in context: http://r.789695.n4.nabble.com/split-list-of-characters-in-groups-of-2-tp4079031p4079031.html
Sent from the R help mailing list archive at Nabble.com.
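For completeness, a fully vectorised base-R sketch that avoids apply() altogether (assuming, as in the thread, that var.names has even length): recycle logical indices to pull the odd and even elements, then let paste() pair them up.

```r
var.names <- c("a", "b", "c", "d", "e", "f")
odd  <- var.names[c(TRUE, FALSE)]   # elements 1, 3, 5, ...
even <- var.names[c(FALSE, TRUE)]   # elements 2, 4, 6, ...
paste(odd, even, sep = ",")
## [1] "a,b" "c,d" "e,f"
```

Because paste() is vectorised, this scales to long vectors without an explicit loop.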
Re: [R] Spatial Statistics using R
Take a look at http://geodacenter.asu.edu/ , Training section.

On Thu, Nov 17, 2011 at 4:28 AM, vioravis viora...@gmail.com wrote:

I am looking for online courses to learn spatial statistics using R. Statistics.com is offering an online course in December on the same topic, but that schedule doesn't suit mine. Are there any other similar modes for learning spatial statistics using R? Can someone please advise?

Thank you.
Ravi

-- View this message in context: http://r.789695.n4.nabble.com/Spatial-Statistics-using-R-tp4079092p4079092.html
Sent from the R help mailing list archive at Nabble.com.

--
Best regards,
Raphael Saldanha
saldanha.plan...@gmail.com
[R] Exclude NA while summing
Dear R users, I am new to R and have a query. I have a dataset with binary outputs, 0s and 1s, but along with them it has NAs too. I want to sum across each row and get the total, but whenever there is an NA in a row, the sum of that row is returned as NA, so I am not able to sum up the values.

row.sums.m <- apply(dummy.curr.res.m, 1, sum)

It would be helpful if I could get some input on this.

Regards,
Vikram
Re: [R] Introducing \n's so that par.strip.text can produce multiline strips in lattice
Hi: This worked for me - I needed to modify some of the strip labels to improve the appearance a bit, and I also reduced the strip font size a bit to accommodate the lengths of the strings. The main thing was to change \\n to \n. First, I created a new variable called Indic as a character variable and then did some minor surgery on three of the strings:

Indic <- as.character(imports$Indicator)
Indic[3 + 6 * (0:5)] <- "Chemicals and related\n products imports"
Indic[4 + 6 * (0:5)] <- "Pearls, semiprecious \nprecious stones imports"
Indic[5 + 6 * (0:5)] <- "Metaliferrous ores \nmetal scrap imports"

# Read Indic into the imports data frame as a factor:
imports$Indic <- factor(Indic)

# Redo the plot:
barchart(X03/1000 ~ time | Indic,
         data = imports[which(imports$time != 1), ],
         horiz = FALSE,
         scales = list(x = list(rot = 45, labels = paste("Mar", 2007:2011))),
         par.strip.text = list(lineheight = 1, lines = 2, cex = 0.8))

Dennis

On Wed, Nov 16, 2011 at 11:25 PM, Ashim Kapoor ashimkap...@gmail.com wrote:

Dear all, I have the following data, which has \\n in place of \n. I introduced \n's in the csv file so that I could use them in barchart in lattice. When I did that and read it into R using read.csv, it read them as \\n. My question is: how do I introduce \n in the middle of a long string of quoted text so that lattice can make multiline strips? Hitting Enter, which is supposed to introduce \n's, doesn't work, because when I go to the middle of the line and press Enter, OpenOffice thinks that I am done editing my text and takes me to the next line.
dput(imports)
structure(list(Indicator = structure(c(5L, 4L, 2L, 12L, 8L, 7L,
5L, 4L, 2L, 12L, 8L, 7L, 5L, 4L, 2L, 12L, 8L, 7L, 5L, 4L, 2L,
12L, 8L, 7L, 5L, 4L, 2L, 12L, 8L, 7L, 5L, 4L, 2L, 12L, 8L, 7L),
.Label = c("", "Chemicals and related\\n products imports",
"Coal export", "Gold imports", "Gold silver imports",
"Iron ore export", "Iron steel imports",
"Metaliferrous ores metal scrap imports", "Mica export",
"Ores minerals\\nexport", "Other ores \\nminerals export",
"Pearls precious \\n semiprecious stones imports",
"Processed minerals\\n export"), class = "factor"),
Units = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("", "Rs.crore"),
class = "factor"),
Expression = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("", "Ival"),
class = "factor"),
time = c(7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9,
10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 1, 1, 1, 1, 1, 1),
X03 = c(66170.46, 65337.72, 62669.86, 33870.17, 36779.35, 27133.25,
71829.14, 67226.04, 75086.89, 29505.61, 31750.99, 32961.26,
104786.39, 95323.8, 134276.63, 76263, 36363.61, 41500.36,
140440.36, 135877.91, 111269.69, 76678.27, 36449.89, 36808.06,
162253.77, 154346.72, 124895.76, 142437.03, 42872.16, 43881.85,
109096.024, 103622.438, 101639.766, 71750.816, 36843.2, 36456.956),
id = c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L,
3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L,
1L, 2L, 3L, 4L, 5L, 6L)),
row.names = c("1.7", "2.7", "3.7", "4.7", "5.7", "6.7", "1.8",
"2.8", "3.8", "4.8", "5.8", "6.8", "1.9", "2.9", "3.9", "4.9",
"5.9", "6.9", "1.10", "2.10", "3.10", "4.10", "5.10", "6.10",
"1.11", "2.11", "3.11", "4.11", "5.11", "6.11", "1.1", "2.1",
"3.1", "4.1", "5.1", "6.1"),
.Names = c("Indicator", "Units", "Expression", "time", "X03", "id"),
class = "data.frame",
reshapeLong = structure(list(varying = structure(list(
X03 = c("X03.07", "X03.08", "X03.09", "X03.10", "X03.11", "X03.1")),
.Names = "X03", v.names = "X03", times = c(7, 8, 9, 10, 11, 1)),
v.names = "X03", idvar = "id", timevar = "time"),
.Names = c("varying", "v.names", "idvar", "timevar")))

On which I want to run:

barchart(X03/1000 ~ time | Indicator,
         data = imports[which(imports$time != 1), ],
         horiz = FALSE,
         scales = list(x = list(rot = 45, labels = paste("Mar", 2007:2011))),
         par.strip.text = list(lineheight = 1, lines = 2))

Many thanks,
Ashim.
[R] aov how to get the SST?
Hello, I currently run aov in the following way:

throughput.aov <- aov(log(Throughput) ~ No_databases + Partitioning + No_middlewares + Queue_size, data = throughput)
summary(throughput.aov)
                Df Sum Sq Mean Sq  F value    Pr(>F)
No_databases     1 184.68 184.675 136.6945   2.2e-16 ***
Partitioning     1  70.16  70.161  51.9321 2.516e-12 ***
No_middlewares   2  44.22  22.110  16.3654 1.395e-07 ***
Queue_size       1   0.40   0.395   0.2926    0.5888
Residuals      440 594.44   1.351
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In order to compute the fraction of variation, I need to know the total Sum Sq, and I assume it is like this:

SST = SS-No_databases + SS-Partitioning + SS-No_middlewares + SS-Queue_size
    = 184.68 + 70.16 + 44.22 + 0.40 = 299.46

So the fraction of variation explained by No_databases would be:

SS-No_databases / SST = 184.68 / 299.46 = 0.6167101

... and finally I can say that No_databases explains 61.7% of the variation in Throughput. Is this correct? If so, how can I do the same calculations using R? I haven't found a way to extract the Sum Sq out of the throughput.aov object. Is there a function to get the 0.6167101 and 61.7% results without having to do it manually? Even better if I can get a table containing all these fractions of variation. Since this is a 2^k experiment, I can't see how the Residuals fit into the formula. When I introduce replications (blocking factor), then I can also include an SSE term in the SST calculation.

TIA, Best regards, Giovanni
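No reply appears in this chunk, but as a hedged sketch (assuming the fitted object throughput.aov from above), the sums of squares can be pulled out of the first element of summary(), which is a data frame whose rows are the model terms plus Residuals:

```r
tab <- summary(throughput.aov)[[1]]   # the ANOVA table as a data frame
ss  <- tab[["Sum Sq"]]
names(ss) <- rownames(tab)
# fractions of the total; whether Residuals belongs in the denominator
# is the modelling question raised above, so both versions are shown
round(ss / sum(ss), 4)                             # including Residuals
round(ss[-length(ss)] / sum(ss[-length(ss)]), 4)   # model terms only
```

With the numbers quoted in the question, the terms-only version reproduces the manual 184.68/299.46 calculation for No_databases.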
Re: [R] Exclude NA while summing
Change to:

row.sums.m <- apply(dummy.curr.res.m, 1, sum, na.rm = TRUE)

Sent from my iPad

On Nov 17, 2011, at 5:18, Vikram Bahure economics.vik...@gmail.com wrote:

Dear R users, I am new to R and have a query. I have a dataset with binary outputs, 0s and 1s, but along with them it has NAs too. I want to sum across each row and get the total, but whenever there is an NA in a row, the sum of that row is returned as NA, so I am not able to sum up the values.

row.sums.m <- apply(dummy.curr.res.m, 1, sum)

It would be helpful if I could get some input on this.

Regards,
Vikram
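The same idea as a small self-contained sketch (the matrix m here is made up for illustration); note that rowSums() and colSums() also accept na.rm and are faster than apply() for plain sums:

```r
m <- matrix(c(1, 0, NA,
              0, 1, 1), nrow = 2, byrow = TRUE)
rowSums(m, na.rm = TRUE)   # per-row totals, NAs ignored: 1 2
colSums(m, na.rm = TRUE)   # per-column totals, NAs ignored: 1 1 1
```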
[R] How to resample one per group
Hello, I have got a data frame which looks like:

y <- c(1, 5, 6, 2, 5, 10)       # response
x <- c(2, 12, 8, 1, 16, 17)     # predictor
group <- factor(c(1, 2, 2, 3, 4, 4)) # group
df <- data.frame(y, x, group)

Now I'd like to resample that dataset: I want to get one row per group, so per total sample I get 4 rows into a new data frame. How can I do that? Is there any simple approach using an existing package? I looked at the function strata() from package sampling. I don't know if that is the right function, or if there is a simpler approach with sample(). What I unsuccessfully tried so far:

library(sampling)
strata(data = df, "group", size = rep(1, nlevels(group)))

Maybe you can help me to do this resampling...

Thank you,
Johannes
Re: [R] How to resample one per group
Something like this?

library(plyr)
ddply(df, .(group), function(x) {
    x[sample(nrow(x), 1), ]
})

Best regards,
Thierry

ir. Thierry Onkelinx, Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest, team Biometrics & Quality Assurance, Gaverstraat 4, 9500 Geraardsbergen, Belgium, tel. +32 54/436 185, thierry.onkel...@inbo.be, www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

-----Original message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Johannes Radinger
Sent: Thursday, 17 November 2011 12:37
To: r-help@r-project.org
Subject: [R] How to resample one per group

Hello, I have got a data frame which looks like:

y <- c(1, 5, 6, 2, 5, 10)       # response
x <- c(2, 12, 8, 1, 16, 17)     # predictor
group <- factor(c(1, 2, 2, 3, 4, 4)) # group
df <- data.frame(y, x, group)

Now I'd like to resample that dataset: I want to get one row per group, so per total sample I get 4 rows into a new data frame. How can I do that? Is there any simple approach using an existing package? I looked at the function strata() from package sampling. I don't know if that is the right function, or if there is a simpler approach with sample(). What I unsuccessfully tried so far:

library(sampling)
strata(data = df, "group", size = rep(1, nlevels(group)))

Maybe you can help me to do this resampling...
Thank you,
Johannes
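For reference, the same one-row-per-group draw can also be done in base R without extra packages (a sketch using the df defined in the question; the set.seed() call is added here only to make the draw reproducible):

```r
set.seed(1)  # reproducible draw; drop this for a fresh sample each time
one.per.group <- do.call(rbind,
    lapply(split(df, df$group),
           function(d) d[sample(nrow(d), 1), ]))
one.per.group  # four rows, one from each level of group
```

split() partitions the data frame by the grouping factor, lapply() draws one row from each piece, and do.call(rbind, ...) stacks the pieces back into a single data frame.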
Re: [R] Reading data/variables
Well, if your problem is that a workspace is being loaded automatically and you don't want that workspace, you have several options:

1. Use a different directory for each project, so that the file loaded by default is the correct one.
2. Don't save your workspace, but regenerate it each time.
3. Use R --vanilla (or your OS's equivalent) to start R without loading anything automatically, and use load() and save() to manually manage RData files.

Yes, it's convenient, but if you want to use a non-standard way of working you need to understand what you're doing.

Sarah

On Thu, Nov 17, 2011 at 3:10 AM, Steven Yen s...@utk.edu wrote:

Thanks Sarah. I have read about the problems with attach(), and I will try to avoid it. I have now found the line that's causing the problem is:

setwd("z:/homework")

With that line in place, either in a program or in Rprofile.site (?), the moment I run R and simply enter (before reading any data)

summary(mydata)

I get sample statistics for a dozen variables! Do not save the workspace? I thought the option to save/use a binary file was meant to be convenient. I like working in the same working directory, and I like .RData files. Does this sound hopeless? Thanks.

At 09:26 PM 11/15/2011, Sarah Goslee wrote:

Hi,

The obvious answer is: don't use attach() and you'll never have that problem. And see further comments inline.

On Tue, Nov 15, 2011 at 6:05 PM, Steven Yen s...@utk.edu wrote:

Can someone help me with this variable/data reading issue? I read a csv file and transform/create an additional variable (called y). The first set of commands below produced different sample statistics for hw11$y and y. In the second set of commands I rename/use the variable name yy, and sample statistics for hw11$yy and yy are identical. Using y <- yy fixed it, but I am not sure why I would need to do that. That y appeared to have come from a variable called y from another data frame (unrelated to the current run). Help!
setwd("z:/homework") sink("z:/homework/hw11.our", append=T, split=T) hw11 <- read.csv("ij10b.csv", header=T) hw11$y <- hw11$e3 attach(hw11) The following object(s) are masked _by_ '.GlobalEnv': y Look there. R even *told* you that it was going to use the y in the global environment rather than the one you were trying to attach. The other solution: don't save your workspace. Your other email on this topic suggested to me that there is a .RData file in your preferred working directory that contains an object y, and that's what is interfering with what you think should happen. Deleting that file, or using a different directory, or removing y before you attach the data frame would all work. But truly, the best possible strategy is to avoid using attach() so you don't have to worry about which object named y is really being used, because you specify it explicitly. (n <- dim(hw11)[1]) [1] 13765 summary(hw11$y) Min. 1st Qu. Median Mean 3rd Qu. Max. 0. 0.4500 1. 1.6726 2. 140. length(hw11$y) [1] 13765 summary(y) Min. 1st Qu. Median Mean 3rd Qu. Max. 0.0 0.0 0.0 0.24958 0.0 1.0 length(y) [1] 601 setwd("z:/homework") sink("z:/homework/hw11.our", append=T, split=T) hw11 <- read.csv("ij10b.csv", header=T) hw11$yy <- hw11$e3 attach(hw11) hw11$yy <- hw11$e3 summary(hw11$yy) Min. 1st Qu. Median Mean 3rd Qu. Max. 0. 0.4500 1. 1.6726 2. 140. length(hw11$yy) [1] 13765 summary(yy) Min. 1st Qu. Median Mean 3rd Qu. Max. 0. 0.4500 1. 1.6726 2. 140. length(yy) [1] 13765
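The masking problem above is easy to reproduce. This is a minimal sketch with made-up data (not the poster's files): an object named y already sitting in the global workspace -- for example, one silently restored from a saved .RData -- shadows the column you attach().

```r
# A small reproducible illustration of attach() masking (made-up data):
y <- rnorm(5)                  # pretend this was silently loaded from .RData
hw <- data.frame(y = 1:10)

attach(hw)                     # R warns: 'y' is masked _by_ .GlobalEnv
length(y)                      # 5 -- the global y wins, not the 10-row column
detach(hw)

# Safer alternatives that never depend on the search path:
length(hw$y)                   # 10
with(hw, length(y))            # 10
```

Being explicit with `hw$y` or `with()` sidesteps the search-path lookup entirely, which is why the advice in this thread is to avoid attach().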
[R] how to define the bound between parameters in nls()
Hi there, I have read the help page of nls(); there are lower and upper arguments for defining the bounds of parameters. For example, nls(y ~ 1 - a*exp(-k1*x) - (1-a)*exp(-k2*x), data=data.1, start=list(a=0.02, k1=0.01, k2=0.0004), upper=c(1,1,1), lower=c(0,0,0)) I hope to define k1 < k2, but I can't find a way. Any suggestions will be really appreciated. Regards, Jinsong
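One common workaround for an ordering constraint like k1 < k2 is to reparameterize rather than bound: write k2 = k1 + exp(d), which guarantees k2 > k1 for any unconstrained d. (Note also that nls() honours lower/upper only with algorithm = "port".) The sketch below uses synthetic data, not the poster's data.1, so treat the starting values as illustrative:

```r
# Reparameterize k2 = k1 + exp(d) so that k1 < k2 holds automatically.
set.seed(1)
x <- seq(0, 400, by = 4)
y <- 1 - 0.3 * exp(-0.05 * x) - 0.7 * exp(-0.002 * x) +
     rnorm(length(x), sd = 0.005)
dat <- data.frame(x = x, y = y)

fit <- nls(y ~ 1 - a * exp(-k1 * x) - (1 - a) * exp(-(k1 + exp(d)) * x),
           data = dat,
           start = list(a = 0.7, k1 = 0.002, d = log(0.048)))

est <- coef(fit)
k1.hat <- est[["k1"]]
k2.hat <- k1.hat + exp(est[["d"]])   # always larger than k1.hat
```

After fitting, report k1.hat and k2.hat rather than the raw d; standard errors for k2 would need the delta method or refitting.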
[R] how to read the text ?
hi, R users: I have such a text: num = 3 testco = 12 testno = 1;12;3 infp = test1;test2;test3 How can I read this text by readLines?
Re: [R] RV: Reporting a conflict between ADMB and Rtools on Windows systems
On Thu, Nov 17, 2011 at 3:54 AM, Rubén Roa r...@azti.es wrote: I've just found that there is a conflict between the tools used to build R packages (Rtools) and ADMB, due to the need to put the Rtools compiler's location in the PATH environment variable to make Rtools work. On a Windows 7 64-bit machine with Rtools installed, I installed the latest ADMB-IDE, and although I could translate ADMB code to cpp code, I could not build the cpp code into an executable via ADMB-IDE's compiler. On another machine, a Windows Vista 32-bit with Rtools installed, I also installed the latest ADMB-IDE, and this time it was not possible to create the .obj file on the way to building the executable. On this machine I also have a previous ADMB version (6.0.1) that I used to run from the DOS shell. This ADMB also failed to build the .obj file. Now, in PATH, the location info that makes Rtools work is: c:\Rtools\bin;c:\Rtools\MinGW\bin;c:\Rtools\MinGW64\bin;C:\Program Files (x86)\MiKTeX 2.9\miktex\bin; If from this list I remove the reference to the compiler, c:\Rtools\MinGW\bin, then ADMB works again. So beware of this conflict. Suggestions for a solution will be appreciated. Meanwhile, I run ADMB code on one computer and build R packages with Rtools on another. The batchfiles Rcmd.bat and Rgui.bat temporarily add R and Rtools to your path by looking them up in the registry and then calling Rcmd.exe or Rgui.exe, respectively. When R is finished, the path is restored to what it was before. By using those, it's not necessary to have either on your path. These are self-contained batch files with no dependencies, so they can simply be placed anywhere on the path in order to use them. For those and a few other batch files of interest to Windows users of R, see: http://batchfiles.googlecode.com -- Statistics Software Consulting GKX Group, GKX Associates Inc. 
tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
Re: [R] Spatial Statistics using R
Thanks, Raphael. Just checked their website. It appears that they currently do not have any online courses planned.
Re: [R] Vectorizing for weighted distance
The fastest is probably to just implement the matrix calculation directly in R with the %*% operator: (X1-X2) %*% W %*% (X1-X2) You don't need to worry about the transposing if you are passing R vectors X1, X2. If they are 1-d matrices, you might need to. Michael On Thu, Nov 17, 2011 at 1:30 AM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: Hi All, I am trying to convert the following piece of MATLAB code to R: XX1 = sum(w(:,ones(1,N1)).*X1.*X1,1); % square the elements of X1, weight them, and repeat this vector N1 times XX2 = sum(w(:,ones(1,N2)).*X2.*X2,1); % square the elements of X2, weight, and repeat this vector N2 times X1X2 = (w(:,ones(1,N1)).*X1)'*X2; % get the weighted 'covariance' term XX1T = XX1'; % transpose z = XX1T(:,ones(1,N2)) + XX2(ones(1,N1),:) - 2*X1X2; % get the squared weighted distance which is basically doing: z = (X1-X2)' W (X1-X2) What would be the best way (for SPEED) to do this? Or is vectorizing as above the best? Any hints, suggestions? Thanks, Sachin
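The MATLAB expansion above translates directly to R. A sketch with small made-up matrices (as in the MATLAB code, columns are observations and w holds one weight per dimension; the size names d, N1, N2 are illustrative):

```r
# Vectorized pairwise weighted squared distances via the expansion
# ||x1 - x2||^2_W = x1'Wx1 + x2'Wx2 - 2 x1'Wx2 for diagonal W = diag(w).
set.seed(42)
d <- 4; N1 <- 3; N2 <- 5
X1 <- matrix(rnorm(d * N1), d, N1)
X2 <- matrix(rnorm(d * N2), d, N2)
w  <- runif(d)

XX1  <- colSums(w * X1^2)                  # weighted squared norms, X1 columns
XX2  <- colSums(w * X2^2)                  # weighted squared norms, X2 columns
X1X2 <- t(w * X1) %*% X2                   # weighted cross terms
z    <- outer(XX1, XX2, "+") - 2 * X1X2    # z[i,j] = (X1[,i]-X2[,j])' W (X1[,i]-X2[,j])

# Spot-check one entry against the direct formula:
all.equal(z[2, 3], sum(w * (X1[, 2] - X2[, 3])^2))   # TRUE
```

Because `w` has length `nrow(X1)`, the product `w * X1` recycles the weights down each column, which is exactly the `w(:,ones(1,N1)).*X1` trick in the MATLAB original.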
Re: [R] Exclude NA while summing
Or, for this specific application: rowSums(XXX, na.rm = TRUE) Michael On Thu, Nov 17, 2011 at 5:51 AM, Jim Holtman jholt...@gmail.com wrote: Change to row.sums.m <- apply(dummy.curr.res.m, 1, sum, na.rm = TRUE) Sent from my iPad On Nov 17, 2011, at 5:18, Vikram Bahure economics.vik...@gmail.com wrote: Dear R users, I am new to R and have a query. I have a dataset with binary outputs, 0s and 1s, but along with them it has NAs too. I want to sum across each row, but whenever there is an NA in a row the sum of that row is returned as NA, so I am not able to sum up the values: row.sums.m <- apply(dummy.curr.res.m, 1, sum) It would be helpful if I could get some input on this. Regards Vikram
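A quick illustration of the na.rm behaviour with a made-up matrix:

```r
# One NA poisons a plain sum; na.rm = TRUE drops NAs before summing.
m <- matrix(c(1, NA, 3,
              4,  5, NA), nrow = 2, byrow = TRUE)
rowSums(m)                        # NA NA
rowSums(m, na.rm = TRUE)          # 4 9
apply(m, 1, sum, na.rm = TRUE)    # same answer; rowSums is just faster
```

rowSums() is implemented in C and avoids the per-row function-call overhead of apply(), which matters on large matrices.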
Re: [R] RV: Reporting a conflict between ADMB and Rtools on Windows systems
Thanks Gabor and Jan. The batch files solution seems the way to go. Will implement it! Rubén
Re: [R] Pairwise correlation
On Wed, Nov 16, 2011 at 11:22 PM, muzz56 musah...@gmail.com wrote: Thanks to everyone who replied to my post, I finally got it to work. I am however not sure how well it worked, since it ran so quickly, but it seems like I have a 2000 x 2000 data set. Behold the great and mighty power that is R! Don't worry -- on a decent machine the correlation of a 2k x 2k data set should be pretty fast. (It's about 9 seconds on my old-ish laptop with a bunch of other junk running.) My followup questions would be: how do I get only pairs with, say, a certain Pearson correlation value? Additionally, it seems like my output didn't retain the headers but instead replaced them with numbers, making it hard to know which gene pairs correlate. This is a little worrisome: R carries column names through cor(), so this would suggest you weren't using them. Were your headers listed as part of your data (instead of being names)? If so, they would have been taken as numbers. Take a look at dimnames(NAMEOFDATA) -- if your headers aren't there, then they are being treated as data instead of names. If they are, can you provide some reproducible code and we can debug more fully? The easiest way to send data is to use the dput() function to get a copy-pasteable plain text representation. It would also be great if you could restrict it to a subset of your data rather than the full 4M data points, but if that's hard to do, don't worry. You should have expected behavior like X <- matrix(1:9, 3) colnames(X) <- c("A", "B", "C") cor(X) # prints with labels Michael On 16 November 2011 17:11, Nordlund, Dan (DSHS/RDA) wrote: -----Original Message----- From: r-help-bounces@r-project.org On Behalf Of muzz56 Sent: Wednesday, November 16, 2011 12:28 PM Subject: Re: [R] Pairwise correlation Thanks Peter. 
I tried this after reading in the csv (read.csv) and converting the data to a matrix (as.matrix). But when I try the correlation, I keep getting the error (x must be numeric), yet when I view the data, it's numeric. What does R tell you if you execute the following? str(x) Just because the data looks like it is numeric when it prints doesn't mean it is. Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204
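The follow-up question -- extracting only the pairs above some correlation threshold -- was not answered in the thread. A sketch with random stand-in data and an arbitrary cutoff; upper.tri() ensures each pair is reported only once:

```r
# List variable pairs whose |correlation| exceeds a cutoff.
set.seed(1)
X <- matrix(rnorm(100 * 6), 100, 6)
colnames(X) <- paste0("gene", 1:6)   # hypothetical labels
cm <- cor(X)

cutoff <- 0.1
idx <- which(abs(cm) > cutoff & upper.tri(cm), arr.ind = TRUE)
pairs <- data.frame(var1 = rownames(cm)[idx[, 1]],
                    var2 = colnames(cm)[idx[, 2]],
                    r    = cm[idx],
                    stringsAsFactors = FALSE)
pairs   # one row per pair with |r| above the cutoff
```

For a 2000 x 2000 correlation matrix the same code works unchanged; only the cutoff and the size of `idx` differ.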
[R] R2 for a mixed-effects model with AR(1) error structure
Dear All, The following equation is a linear mixed-effects model with a linear trend and AR(1) error structure: y = B0 + B1*x + b0 + b1*x + e; e ~ AR(1) where y is the response, x is the predictor, B0 and B1 are fixed effects, and b0 and b1 are random effects. Could someone please advise me of a function to compute the R2 for the goodness of fit of the above model in an R package? And is the computation of R2 in a mixed model with AR(1) error structure similar to that in a mixed model with no error structure? thanks, Fir
Re: [R] Spatial Statistics using R
vioravis vioravis at gmail.com writes: Thanks, Raphael. Just checked their website. It appears that they currently do not have any online courses planned. You may find that this site: http://geostat-course.org/ has a wider range of possible courses.
Re: [R] Spatial Statistics using R
Hi Ravi, You would probably get more answers to this if you posted to the list r-sig-geo. The following course was advertised a week ago and might match your needs: http://www.itc.nl/personal/rossiter/teach/degeostats.html You might also find the videos from this year's GEOSTAT course in Landau interesting: http://www.archive.org/search.php?query=GEOSTAT%20Landau Cheers, Jon On 17-Nov-11 7:28, vioravis wrote: I am looking for online courses to learn Spatial Statistics using R. Statistics.com is offering an online course in December on the same topic, but that schedule doesn't suit mine. Are there any other similar modes for learning spatial statistics using R? Can someone please advise? Thank you. Ravi -- Jon Olav Skøien Joint Research Centre - European Commission Institute for Environment and Sustainability (IES) Global Environment Monitoring Unit Via Fermi 2749, TP 440, I-21027 Ispra (VA), ITALY jon.sko...@jrc.ec.europa.eu Tel: +39 0332 789206 Disclaimer: Views expressed in this email are those of the individual and do not necessarily represent official views of the European Commission.
Re: [R] Spatial Statistics using R
On 11/17/2011 06:28 AM, vioravis wrote: I am looking for online courses to learn Spatial Statistics using R. There is an online course by the ITC: http://www.itc.nl/Pub/Study/Courses/C12-GFM-DE-02 cheers, Paul -- Paul Hiemstra, Ph.D. Global Climate Division Royal Netherlands Meteorological Institute (KNMI) Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 P.O. Box 201 | 3730 AE | De Bilt tel: +31 30 2206 494 http://intamap.geo.uu.nl/~paul http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
Re: [R] Pairwise correlation
I think something like this should do it, but I can't test without data: rownames(mydata) <- mydata[, 1] # put the elements of the first column into the rownames mydata <- mydata[, -1] # drop the column that now holds the rownames Michael On Thu, Nov 17, 2011 at 9:23 AM, Musa Hassan musah...@gmail.com wrote: Hi Michael, Thanks for the response. I have noticed that the error occurred during my data read. It appears that the rownames (which, when the data is transposed, become my colnames) were converted to numbers instead of strings, as they should be. The original header names don't change, just the rownames. I have to figure out how to import the data and have the strings not converted. Right now I am using: mydata <- read.csv("mydata.csv", header = TRUE, stringsAsFactors = FALSE) then to convert the data frame to a matrix: mydata <- data.matrix(mydata) Then I just do the correlation as Peter suggested: expression <- cor(t(expression)) Thanks.
[R] how to read a freetext line ?
hi everyone. Here I have a text where there are some integer and string variables, but I cannot read them by readLines and scan. The text is: weight ;30;130 food:2;1;12 color:white;black The first column is the names of the variables and the others are their values. The columns in different lines are different. Can anyone help me? -- TANG Jie Email: totang...@gmail.com Tel: 0086-2154896104 Shanghai Typhoon Institute, China
Re: [R] how to read a freetext line ?
Hi, On Thu, Nov 17, 2011 at 9:37 AM, Jie TANG totang...@gmail.com wrote: hi everyone. Here I have a text where there are some integer and string variables, but I cannot read them by readLines and scan. I've seen this question several times this morning. If that's you, please do not post multiple times. If you haven't gotten an answer in a couple of days, then it's okay to ask again, but the trouble is usually with your question, like here. the text is: weight ;30;130 food:2;1;12 color:white;black the first column is the names of the variables and the others are their values. the columns in different lines are different. Can anyone help me? What have you tried? What format do you need? For instance, reading the lines in as single strings is easy. Using strsplit() to separate each string into several strings is easy. But without knowing what you are trying to achieve, there's really no way to help you beyond suggesting those two functions. Sarah -- Sarah Goslee http://www.functionaldiversity.org
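One hedged way to combine those two functions (the text is inlined here as a stand-in for readLines() on the poster's file, and the first line is assumed to follow the same `name:value;value` pattern as the others):

```r
# Split each line into a name and a value part on ":",
# then split the value part into fields on ";".
txt <- c("weight:30;130",
         "food:2;1;12",
         "color:white;black")       # stand-in for readLines("yourfile.txt")

parts <- strsplit(txt, ":", fixed = TRUE)
vals  <- lapply(parts, function(p) strsplit(p[2], ";", fixed = TRUE)[[1]])
names(vals) <- trimws(vapply(parts, `[`, "", 1))

vals$weight               # "30" "130" -- still character
as.numeric(vals$weight)   # 30 130
```

Everything comes back as character; convert the entries that are really numbers with as.numeric() afterwards, since lines like `color:white;black` cannot be coerced.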
[R] package installation
I believe the problem is a column of zeroes in my x matrix. I have tried the suggestions in the documentation, so now, to try to confirm the problem, I'd like to run debug. Here's where I think the problem is: ###~~ Fitting the model using the lmer function ~~### (fitmodel <- lmer(modelformula, data, family = binomial(link = "logit"), nAGQ = 1)) mtrace(fitmodel) I added the mtrace to catch the error, but get the following: Error in mtrace(fitmodel) : Can't find fitmodel How can I debug this? - Original Message - From: Rolf Turner rolf.tur...@xtra.co.nz To: Scott Raynaud scott.rayn...@yahoo.com Cc: r-help@r-project.org r-help@r-project.org Sent: Wednesday, November 16, 2011 6:04 PM Subject: Re: [R] package installation On 17/11/11 05:37, Scott Raynaud wrote: That might be an option if it weren't my most important predictor. I'm thinking my best bet is to use MLwiN for the estimation since it will properly set fixed effects to 0. All my other sample-size simulation programs use SAS PROC IML, which I don't have/can't afford. I like R since it's free, but I can't work around the problem I'm currently having. This is the ``push every possible button until you get a result and to hell with what anything actually means'' approach to statistics. The probability of getting a *meaningful* result from this approach is close to zero. Why don't you try to *understand* what is going on, rather than wildly throwing every possible piece of software at the problem until one such piece runs? cheers, Rolf Turner
Re: [R] List of lists to data frame?
I don't know if this is faster, but ... out <- do.call(rbind, lapply(s, function(x) data.frame(x$category, x$name, as.vector(x$series)))) ## You can then name the columns of out via names() Note: No fancy additional packages are required. -- Bert On Wed, Nov 16, 2011 at 6:39 PM, Kevin Burton rkevinbur...@charter.net wrote: Say I have the following data: s <- list() s[["A"]] <- list(name="first", series=ts(rnorm(50), frequency=10, start=c(2000,1)), category="top") s[["B"]] <- list(name="second", series=ts(rnorm(60), frequency=10, start=c(2000,2)), category="next") If I use unlist, since this is a list of lists, I don't end up with a data frame. And the number of rows in the data frame should equal the number of time series entries; in the sample above it would be 110. I would expect that the name and category strings would be recycled for each row. My brute-force code attempts to build the data frame by appending to the master data frame, but like I said it is *very* slow. Kevin -----Original Message----- From: R. Michael Weylandt [mailto:michael.weyla...@gmail.com] Sent: Wednesday, November 16, 2011 5:26 PM To: rkevinbur...@charter.net Cc: r-help@r-project.org Subject: Re: [R] List of lists to data frame? unlist(..., recursive = FALSE) Michael On Wed, Nov 16, 2011 at 6:20 PM, rkevinbur...@charter.net wrote: I would like to make the following faster: df <- NULL for(i in 1:length(s)) { df <- rbind(df, cbind(names(s[i]), time(s[[i]]$series), as.vector(s[[i]]$series), s[[i]]$category)) } names(df) <- c("name", "time", "value", "category") return(df) The s object is a list of lists. It is constructed like: s[[object]] <- list(. . . . . .) where object would be the name associated with this list. s[[i]]$series is a 'ts' object and s[[i]]$category is a name. Constructing this list is reasonably fast, but to do some more processing on the data it would be easier if it were converted to a data frame. Right now the above code is unacceptably slow at converting this list of lists to a data frame. 
Any suggestions on how to optimize this are welcome. Thank you. Kevin -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
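Putting Bert's suggestion together with the poster's toy data gives a self-contained sketch: build one small data.frame per list element and rbind them all in a single do.call(), instead of growing df inside a loop (which is what makes the original code slow -- each rbind copies everything built so far).

```r
# Convert a list of lists (each holding a 'ts' plus metadata) to one data frame.
s <- list()
s[["A"]] <- list(name = "first",
                 series = ts(rnorm(50), frequency = 10, start = c(2000, 1)),
                 category = "top")
s[["B"]] <- list(name = "second",
                 series = ts(rnorm(60), frequency = 10, start = c(2000, 2)),
                 category = "next")

df <- do.call(rbind, lapply(names(s), function(nm) {
  x <- s[[nm]]
  data.frame(name = nm,
             time = as.numeric(time(x$series)),
             value = as.vector(x$series),
             category = x$category,
             stringsAsFactors = FALSE)
}))
nrow(df)   # 110 -- the scalar columns recycle to each series' length
```

A side benefit over the cbind()-based loop: cbind() on mixed types coerces everything to character, while per-element data.frame() keeps time and value numeric.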
Re: [R] package installation
Why are you trying to take the matrix trace of a regression model? (That's the only hit for mtrace on my system at least) Perhaps you mean to use traceback() or, even more useful, options(error = recover) Michael On Thu, Nov 17, 2011 at 9:49 AM, Scott Raynaud scott.rayn...@yahoo.com wrote: I believe the problem is a column of zeroes in my x matrix. I have tried the suggestions in the documentation, so now to try to confirm the probelm I'd like to run debug. Here's where I think the problem is: ###~~ Fitting the model using lmer funtion ~~### (fitmodel - lmer(modelformula,data,family=binomial(link=logit),nAGQ=1)) mtrace(fitmodel) I added the mtrace to catch the error, but get the following: Error in mtrace(fitmodel) : Can't find fitmodel How can I debug this? - Original Message - From: Rolf Turner rolf.tur...@xtra.co.nz To: Scott Raynaud scott.rayn...@yahoo.com Cc: r-help@r-project.org r-help@r-project.org Sent: Wednesday, November 16, 2011 6:04 PM Subject: Re: [R] package installtion On 17/11/11 05:37, Scott Raynaud wrote: That might be an option if it weren't my most important predictor. I'm thinking my best bet is to use MLWin for the estimation since it will properly set fixed effects to 0. All my other sample size simulation programs use SAS PROC IML which I don't have/can't afford. I like R since it's free, but I can't work around the problem I'm currently having. This is the ``push every possible button until you get a result and to hell with what anything actually means'' approach to statistics. The probability of getting a *meaningful* result from this approach is close to zero. Why don't you try to *understand* what is going on, rather than wildly throwing every possible piece of software at the problem until one such piece runs? 
cheers, Rolf Turner
Re: [R] how to read the text?
See Sarah's reply here: http://www.mail-archive.com/r-help@r-project.org/msg152883.html Michael On Thu, Nov 17, 2011 at 7:54 AM, haohao Tsing haohaor...@gmail.com wrote: Hi, R users: I have such a text:

num = 3
testco = 12
testno = 1;12;3
infp = test1;test2;test3

How can I read this text by readLines?
Re: [R] modelling and R misconceptions; was: package installation
This is hopeless, since you never seem to listen to our advice, therefore this will be my very last try: So you actually need local advice, both for statistical concepts and R-related. No statistics software can estimate effects of variables that you observed to be constant (e.g. 0) all the time. If any software does, please delete it at once from your machine. Instead, ask a local statistician for advice on your problem. You certainly want to show the data and your model to the local expert - since you don't show us. And then you want to ask for a local R course, since reading the documentation seems not to help. Applying mtrace() on a non-existing object shows this straight away. Uwe Ligges On 17.11.2011 15:49, Scott Raynaud wrote: I believe the problem is a column of zeroes in my x matrix. I have tried the suggestions in the documentation, so now, to try to confirm the problem, I'd like to run debug. Here's where I think the problem is:

###~~ Fitting the model using the lmer function ~~###
(fitmodel <- lmer(modelformula, data, family = binomial(link = "logit"), nAGQ = 1))
mtrace(fitmodel)

I added the mtrace to catch the error, but get the following: Error in mtrace(fitmodel) : Can't find fitmodel. How can I debug this? - Original Message - From: Rolf Turner rolf.tur...@xtra.co.nz To: Scott Raynaud scott.rayn...@yahoo.com Cc: r-help@r-project.org Sent: Wednesday, November 16, 2011 6:04 PM Subject: Re: [R] package installation On 17/11/11 05:37, Scott Raynaud wrote: That might be an option if it weren't my most important predictor. I'm thinking my best bet is to use MLwiN for the estimation, since it will properly set fixed effects to 0. All my other sample-size simulation programs use SAS PROC IML, which I don't have/can't afford. I like R since it's free, but I can't work around the problem I'm currently having. This is the ``push every possible button until you get a result and to hell with what anything actually means'' approach to statistics.
The probability of getting a *meaningful* result from this approach is close to zero. Why don't you try to *understand* what is going on, rather than wildly throwing every possible piece of software at the problem until one such piece runs? cheers, Rolf Turner
Re: [R] how to read a freetext line?
Hi, Please copy your replies to r-help so others may participate in the discussion. 2011/11/17 Jie TANG totang...@gmail.com: Yes, I have tried readLines by

config <- readLines(configfile, ok = TRUE, n = -1)

but when strsplit is used as below

food <- unlist(strsplit(config[2], ":"))

here food is a vector, but the value of food in the text, e.g. 12, is still a string "12", not an integer 12. So I have to use strtoi and strtrim, but I cannot tell beforehand whether the value of food is one digit or more (e.g. 1 or 12), so

food <- strtoi(strtrim(strsplit(config[2]), ":")[2], 1))

The number of characters to convert cannot be known beforehand, since the file will be changed by other users. So how can I read the numerical values from my configure file? Thank you. So your problem is with strsplit(), and not with reading in the data? I did it in many steps so that you can see how each bit works:

string1 <- "food:2;1;12"  # one of your lines
string2 <- strsplit(string1, ":")  # separate name from values by :
varname <- string2[[1]][1]
varname
[1] "food"
values <- unlist(strsplit(string2[[1]][2], ";"))  # separate individual values by ;
values
[1] "2"  "1"  "12"
values <- as.numeric(values)  # convert to numbers
values
[1]  2  1 12

2011/11/17 Sarah Goslee sarah.gos...@gmail.com Hi, On Thu, Nov 17, 2011 at 9:37 AM, Jie TANG totang...@gmail.com wrote: Hi everyone. Here I have a text where there are some integer and string variables, but I cannot read them by readLines and scan. I've seen this question several times this morning. If that's you, please do not post multiple times. If you haven't gotten an answer in a couple of days, then it's okay to ask again, but the trouble is usually with your question, like here. The text is:

weight ;30;130
food:2;1;12
color:white;black

The first column holds the names of the variables and the others are their values. The columns in different lines are different. Can anyone help me? What have you tried? What format do you need?
For instance, reading them in as a single string is easy. Using strsplit() to separate that single string into several strings is easy. But without knowing what you are trying to achieve, there's really no way to help you beyond suggesting those two functions.
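Sarah's two functions are enough to turn the whole file into a named list. A minimal sketch, under the assumption that every line has the form name:value;value;... (so the first line is normalized to "weight:30;130", and the lines vector stands in for readLines("config.txt"), a hypothetical file):

```r
# Parse "name:val1;val2;..." lines into a named list,
# converting value fields to numeric where possible.
lines <- c("weight:30;130", "food:2;1;12", "color:white;black")
parts <- strsplit(lines, ":")                 # name vs. values
config <- lapply(parts, function(p) {
  vals <- unlist(strsplit(p[2], ";"))         # individual values
  nums <- suppressWarnings(as.numeric(vals))  # try numeric conversion
  if (anyNA(nums)) vals else nums             # fall back to strings
})
names(config) <- vapply(parts, function(p) p[1], "")
config$food   # numeric 2 1 12
config$color  # character "white" "black"
```

Fields that do not convert cleanly (like the colors) stay as character vectors, so mixed files are handled without errors.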
Re: [R] optim seems to be finding a local minimum
One more thing: trying to defend R's honor, I've run optimx instead of optim (after dividing the IV by its max - same as for optim). I did not use L-BFGS-B with lower bounds anymore. Instead, I've used Nelder-Mead (no bounds). First, it was faster: for a loop across 10 different IVs, BFGS took 6.14 sec and Nelder-Mead took just 3.9 sec. Second, the solution was better - the Nelder-Mead fits were ALL better than the L-BFGS-B fits and ALL better than Excel Solver's solutions. Of course, those were small improvements, but still, it's nice! Dimitri On Mon, Nov 14, 2011 at 5:26 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Just to provide some closure: I ended up dividing the IV by its max so that the input vector (IV) is now between zero and one. I still used optim:

myopt <- optim(fn = myfunc, par = c(1, 1), method = "L-BFGS-B", lower = c(0, 0))

I was able to get a great fit; in 3 cases out of 10 I beat Excel Solver, but in 7 cases I lost to Excel - but again, by really tiny margins (generally less than 1% of Excel's fit value). Thank you everybody! Dimitri On Fri, Nov 11, 2011 at 10:28 AM, John C Nash nas...@uottawa.ca wrote: Some tips: 1) Excel did not, as far as I can determine, find a solution. No point seems to satisfy the KKT conditions (there is a function kktc in optfntools on the R-forge project optimizer; it is called by optimx). 2) Scaling of the input vector is a good idea given the seeming wide range of values. That is, assuming this can be done. If the function depends on the relative values in the input vector rather than magnitude, this may explain the trouble with your function. That is, if the function depends on the relative change in the input vector and not its scale, then optimizers will have a lot of trouble if the scale factor for this vector is implicitly one of the optimization parameters.
3) If you can get the gradient function you will almost certainly be able to do better, especially in finding whether you have a minimum, i.e., a null gradient and positive-definite Hessian. When you have a gradient function, kktc uses the Jacobian of the gradient to get the Hessian, avoiding one level of digit cancellation. JN On 11/11/2011 10:20 AM, Dimitri Liakhovitski wrote: Thank you very much to everyone who replied! As I mentioned - I am not a mathematician, so sorry for stupid comments/questions. I intuitively understand what you mean by scaling. While the solution space for the first parameter (.alpha) is relatively compact (probably between 0 and 2), the second one (.beta) is all over the place - because it is a function of the IV (input vector). And that's, probably, my main challenge - that I am trying to write a routine for many different possible IVs that I might be facing (they may be in hundreds, in thousands, in millions). Should I be rescaling the IV somehow (e.g., by dividing it by its max) - or should I do something with the parameter .beta inside my function? So far, I've written a loop over many different starting points for both parameters. Then, I take the betas around the best solution so far, split the range into smaller steps for beta (as starting points), and optimize again for those starting points. What disappoints me is that even when I found a decent solution (the minimized value of 336), it was still worse than the Solver solution! And I am trying to prove to everyone here that we should do R, not Excel :-) Thanks again for your help, guys! Dimitri On Fri, Nov 11, 2011 at 9:10 AM, John C Nash nas...@uottawa.ca wrote: I won't requote all the other msgs, but the latest (and possibly a bit glitchy) version of optimx on R-forge 1) finds that some methods wander into domains where the user function fails try() (the new optimx runs try() around all function calls).
This includes L-BFGS-B. 2) reports that the scaling is such that you really might not expect to get a good solution, and then 3) actually gets a better result than the

xlf <- myfunc(c(0.888452533990788, 94812732.0897449))
xlf
[1] 334.607

with Kelley's variant of Nelder-Mead (from the dfoptim package), with

myoptx
  method                         par       fvalues fns  grs itns conv  KKT1  KKT2 xtimes
4 LBFGSB                      NA, NA 8.988466e+307  NA NULL NULL   NA    NA    NA   0.01
2 Rvmmin            0.1, 200186870.6      25593.83  20    1 NULL    0 FALSE FALSE   0.11
3 bobyqa 6.987875e-01, 2.001869e+08       1933.229  44   NA NULL    0 FALSE FALSE   0.24
1 nmkb   8.897590e-01, 9.470163e+07       334.1901 204   NA NULL    0 FALSE FALSE   1.08

But do note the terrible scaling. Hardly surprising that this function does not work. I'll have to delve deeper to see what the scaling setup should be, because of the nature of the function setup involving some of the data. (optimx includes parscale on all methods.) However, the original poster DID include code, so it was easy to do a quick check. Good for
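John's parenthetical remark - optimx (like optim) accepts parscale - is often all the rescaling that's needed, without touching the data. A toy sketch with an assumed stand-in objective whose second parameter lives near 1e8, roughly mimicking the poster's badly scaled .beta:

```r
# Tell the optimizer the typical magnitude of each parameter via
# control$parscale instead of rescaling the input vector by hand.
f <- function(p) (p[1] - 0.9)^2 + (p[2] / 1e8 - 0.95)^2  # hypothetical objective
fit <- optim(par = c(1, 1e8), fn = f, method = "BFGS",
             control = list(parscale = c(1, 1e8)))
fit$par  # close to c(0.9, 9.5e7)
```

Internally the optimizer then works on parameters of comparable magnitude, so the finite-difference gradient in the second coordinate is no longer swamped by the 1e8 scale.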
Re: [R] Adding a year to existing date
Here is an example that could probably be described as adding a year:

dates <- c('2008-01-01', '2009-03-02')
tmp <- as.POSIXlt(dates)
tmp$year <- tmp$year + 1
dates2 <- format(tmp)
dates
[1] "2008-01-01" "2009-03-02"
dates2
[1] "2009-01-01" "2010-03-02"

## to begin to understand how it works, give the command
## unclass(tmp)
## (and read the help pages)
## ?as.POSIXlt
## ?DateTimeClasses

Another example:

dates <- as.Date(c('2008-01-01', '2009-03-02'))
tmp <- as.POSIXlt(dates)
tmp$year <- tmp$year + 1
dates2 <- as.Date(tmp)

## ?as.Date
## ?Date

-Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 On 11/16/11 8:33 PM, arunkumar akpbond...@gmail.com wrote: Hi, I need to add a year to a date field in the dataframe. Please help me.

X Date
1 2008-01-01
2 2008-02-01
3 2003-03-01

Thanks in advance -- View this message in context: http://r.789695.n4.nabble.com/Adding-a-year-to-existing-date-tp4078930p4078930.html Sent from the R help mailing list archive at Nabble.com.
[R] read.table with double precision
Dear all, I have a txt file with the following contents:

1 50.790643000 6.063498
2 50.790738000 6.063471
3 50.791081000 6.063380
4 50.791189000 6.063552

I am using read.table('myfile.txt', sep=" "), which unfortunately returns only integers and not the doubles that are required to store the 50.790643000. What can I do to force it to store things as doubles? B.R Alex
Re: [R] Adding a year to existing date
Just looking at the ambiguity in adding a year:

dates <- as.Date(c('2007-03-01','2008-02-29'))
tmp <- as.POSIXlt(dates)
tmp$year <- tmp$year + 1
dates2 <- as.Date(tmp)
dates2
[1] "2008-03-01" "2009-03-01"
dates2 - dates
Time differences in days
[1] 366 366

KJ "MacQueen, Don" macque...@llnl.gov wrote in message news:caea785f.7cfdb%macque...@llnl.gov...
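KJ's example shows 2008-02-29 rolling over to 2009-03-01. If you would rather pin such dates to Feb 28, a small base-R helper can detect the rollover and pull it back a day. This is my own sketch (add_year is not from the thread), building on the same POSIXlt trick:

```r
# Add n years to a Date vector; a Feb 29 that lands in a non-leap
# year is clamped to Feb 28 instead of rolling over to Mar 1.
add_year <- function(x, n = 1) {
  lt <- as.POSIXlt(x)
  feb29 <- lt$mon == 1 & lt$mday == 29   # POSIXlt months are 0-based
  lt$year <- lt$year + n
  out <- as.Date(lt)                     # an invalid Feb 29 normalizes to Mar 1
  rolled <- feb29 & as.POSIXlt(out)$mon == 2
  out[rolled] <- out[rolled] - 1         # pull back to Feb 28
  out
}
add_year(as.Date(c("2008-02-29", "2007-03-01")))
# [1] "2009-02-28" "2008-03-01"
```

Which convention is right (Mar 1 or Feb 28) depends on the application; the point of KJ's post stands - "adding a year" is inherently ambiguous.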
Re: [R] read.table with double precision
I'm having trouble replicating/understanding why that would happen, since I do it all the time. The only thing that raises a hint of suspicion is the blank-space separator sep=" ", but I'm pretty sure that's fine. What does str() give? Possibly factors? If you are sure that's happening as described, can you send a sample .txt (so it won't get scrubbed) and your exact import code? Michael On Nov 17, 2011, at 11:49 AM, Alaios ala...@yahoo.com wrote: Dear all, I have a txt file with the following contents:

1 50.790643000 6.063498
2 50.790738000 6.063471
3 50.791081000 6.063380
4 50.791189000 6.063552

I am using read.table('myfile.txt', sep=" "), which unfortunately returns only integers and not the doubles that are required to store the 50.790643000. What can I do to force it to store things as doubles? B.R Alex
Re: [R] read.table with double precision
Hi, On Thu, Nov 17, 2011 at 11:49 AM, Alaios ala...@yahoo.com wrote: Dear all, I have a txt file with the following contents:

1 50.790643000 6.063498
2 50.790738000 6.063471
3 50.791081000 6.063380
4 50.791189000 6.063552

I am using read.table('myfile.txt', sep=" "), which unfortunately returns only integers and not the doubles that are required to store the 50.790643000. Using that exact file you included? If so, then I suspect you are confusing display and storage, though even then my default session doesn't show integers. How do you know that it is returning only integers? What is options()$digits set to?

myfile
  V1        V2       V3
1  1 50.790643 6.063498
2  2 50.790738 6.063471
3  3 50.791081 6.063380
4  4 50.791189 6.063552

sprintf("%2.10f", myfile[1, 2])
[1] "50.7906430000"

What can I do to force it to store things as doubles? What is str(myfile)? Sarah -- Sarah Goslee http://www.functionaldiversity.org
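If the columns really are coming in as something other than doubles (e.g. as factors because of a stray character in the file), colClasses forces the issue explicitly. A sketch using a temporary file standing in for the poster's myfile.txt:

```r
# Force double storage explicitly; any rounding seen in print() is
# display only, not a loss of stored precision.
tf <- tempfile(fileext = ".txt")
writeLines(c("1 50.790643000 6.063498",
             "2 50.790738000 6.063471"), tf)
d <- read.table(tf, colClasses = c("integer", "numeric", "numeric"))
str(d)                    # V2 and V3 are num (double)
sprintf("%.9f", d$V2[1])  # "50.790643000" -- full precision is stored
```

str() and sprintf() together distinguish what is stored from what the default print method happens to show, which is Sarah's point above.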
Re: [R] changelog for MASS?
On Thu, Nov 17, 2011 at 7:33 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: Hmmm... sorry -- the only thing I can suggest is maybe striking some sort of deal that you update when it gets however many months out of date: if you look here (http://cran.r-project.org/src/contrib/Archive/MASS/), you can see the last time each version of MASS was updated, and by seeing what you have, you can see about how out of date you are. In the context of MASS, I wouldn't worry so much: it's tied to a book, not active research, so I don't think it gets updated too often in big ways. The other thing is to actually compare differences in the source code, though that might be more trouble than it's worth. Michael

You can also get this info on crantastic (scroll to 'prev versions'): http://crantastic.org/packages/MASS Regards, Liviu

On Mon, Nov 14, 2011 at 4:30 PM, Xu Wang xuwang...@gmail.com wrote: Thanks Michael, But I can't see the dates on the NEWS, so I have no idea what changed from the last version, or from whichever version we actually have installed. Do you see what I mean? Thanks, Xu -- View this message in context: http://r.789695.n4.nabble.com/changelog-for-MASS-tp4034473p4040941.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] package installation
See my responses in brackets below. - Original Message - From: Rolf Turner rolf.tur...@xtra.co.nz To: Scott Raynaud scott.rayn...@yahoo.com Cc: r-help@r-project.org Sent: Wednesday, November 16, 2011 6:04 PM Subject: Re: [R] package installation On 17/11/11 05:37, Scott Raynaud wrote: That might be an option if it weren't my most important predictor. I'm thinking my best bet is to use MLwiN for the estimation, since it will properly set fixed effects to 0. All my other sample-size simulation programs use SAS PROC IML, which I don't have/can't afford. I like R since it's free, but I can't work around the problem I'm currently having. This is the ``push every possible button until you get a result and to hell with what anything actually means'' approach to statistics [Well, I'm simply echoing the simulation software instructions in planning to use MLwiN. I assume the approach is validated. In the meantime, I'd like to have a deeper understanding of why R isn't working. I have a hunch, but don't know how to confirm it]. The probability of getting a *meaningful* result from this approach is close to zero [You're most certainly right if there is no sound rationale behind the method. In this case there is, and the probability is much higher than you state. That's not to say I haven't made an error somewhere. Maybe further investigation of the sort I endeavor to pursue will reveal that]. Why don't you try to *understand* what is going on [Precisely what I'm trying to do. However, I need help, which I hope I can find here.], rather than wildly throwing every possible piece of software at the problem [It's not wildly throwing every piece of software at the problem. It's simply a matter of understanding what works and what doesn't] until one such piece runs?
cheers, Rolf Turner
Re: [R] modelling and R misconceptions; was: package installation
My responses are in brackets below, plus a final note after the main text. - Original Message - From: Uwe Ligges lig...@statistik.tu-dortmund.de To: Scott Raynaud scott.rayn...@yahoo.com Cc: r-help@r-project.org Sent: Thursday, November 17, 2011 9:16 AM Subject: Re: [R] modelling and R misconceptions; was: package installation This is hopeless [That's a matter of perception - even concentration camp prisoners found a way to hope (see Viktor Frankl)], since you never [never is a strong word and many times leads to cognitive errors] seem to listen to our advice [It's possible that I misunderstood your recommendations (more likely), or that you communicated poorly (less likely)], therefore this will be my very last try: So you actually need local advice [Yes, I need advice - that's why I post here!], both for statistical concepts and R-related [I don't claim to be a statistical genius, but I can hold my own. Now, R is a different matter]. No statistics software can estimate effects of variables that you observed to be constant (e.g. 0) all the time [I think you misunderstood my intentions - I never wanted to estimate effects that are 0 all of the time]. If any software does, please delete it at once from your machine. Instead, ask a local statistician for advice on your problem. You certainly want to show the data and your model to the local expert - since you don't show us. [I gave a detailed explanation in a previous post, which I repeat here: |OK, I'm using William Browne's MLPowSim to create an R script which will simulate samples for estimation of sample size in mixed models. I have subjects | nested in hospitals, with hospitals treated as random and all of my covariates at level 1. My outcome is death, so it's binary, and I'll have a fixed and |random intercept. My interest is in the relation of the covariates to the outcome. | |My most important variable is gestational age (GA), which my investigators divide thusly: 23-24, 25-26, 27-28, 29-30 and 31-32.
I have recoded the | dummies for GA in the script according to the MLPowSim instructions to a random multinomial variable:
|
| macpred <- rmultinom(n2, 1, c(.1031, .1482, .2385, .4404, .0698))
| x[,3] <- macpred[1,][l2id]
| x[,4] <- macpred[2,][l2id]
| x[,5] <- macpred[3,][l2id]
| x[,6] <- macpred[4,][l2id]
|
|GA 23-24 is the reference with p=.0698. I started with a structured sampling scheme of 20, 60, 100, 120 and 140 level-2 units. My level-2 units have |different sizes. So at 20 I had 5 hospitals with 100 patients, 4 with 280, 3 with 460, 3 with 640, 3 with 820 and 2 with 1000. Thus, at 60 hospitals, I have 15, |12, 9, 9, 9, 6 with the same cell sample sizes. | |According to the MLPowSim documentation, with small probabilities it's possible to have a column of zeroes in the X matrix if there are not many units in |the random factor. R will choke on this, but MLwiN sets the associated fixed effects to 0. When R choked, I increased from 20 to 60 as my minimum, as |suggested in the MLPowSim documentation. Still no luck. Since this is a simulation, I assume once in a while that by chance a coefficient could be 0. In fact, Browne mentions as much in his documentation. There is a bit more to my simulation, but I thought I'd try to keep it as simple as possible, at least at the outset.] And then you want to ask for a local R course, since reading the documentation seems not to help [You got that right!]. Applying mtrace() on a non-existing object shows this straight away. Uwe Ligges Apparently I misunderstood the purpose of mtrace after reading the documentation - I thought it was to debug problems of the sort I've encountered. Michael Weylandt provided appropriate direction in the previous post, for which I am grateful. Not all of us can be intellectual superstars. That's why we ask for help. This much I did read and understand from the R posting guide: Responding to other posts: * Rudeness and ad hominem comments are not acceptable. Brevity is OK. It's a good lesson to learn.
On 17.11.2011 15:49, Scott Raynaud wrote: I believe the problem is a column of zeroes in my x matrix. I have tried the suggestions in the documentation, so now, to try to confirm the problem, I'd like to run debug. Here's where I think the problem is:

###~~ Fitting the model using the lmer function ~~###
(fitmodel <- lmer(modelformula, data, family = binomial(link = "logit"), nAGQ = 1))
mtrace(fitmodel)

I added the mtrace to catch the error, but get the following: Error in mtrace(fitmodel) : Can't find fitmodel. How can I debug this? - Original Message - From: Rolf Turner rolf.tur...@xtra.co.nz To: Scott Raynaud scott.rayn...@yahoo.com Cc: r-help@r-project.org Sent: Wednesday, November 16, 2011 6:04 PM Subject: Re: [R] package installation On 17/11/11 05:37, Scott Raynaud wrote:
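Scott's hunch - a simulated all-zero dummy column choking lmer - can be verified before fitting, with no debugger at all. A sketch with a hypothetical simulated design matrix (the column names are invented for illustration):

```r
# Detect constant (e.g. all-zero) columns in a simulated design matrix,
# as can happen when rmultinom() never draws a rare category.
set.seed(1)
x <- cbind(rbinom(50, 1, 0.5),  # a category that does get drawn
           rep(0, 50))          # a rare category never drawn this run
colnames(x) <- c("ga2526", "ga2324")
is_const <- apply(x, 2, function(col) all(col == col[1]))
is_const                              # ga2324 is flagged
x_ok <- x[, !is_const, drop = FALSE]  # drop before building the formula
colnames(x_ok)
```

Inside a simulation loop, dropping (or re-simulating) such columns before each lmer() call avoids the rank-deficiency failure mid-run.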
Re: [R] Help with error: no acceptable C compiler found in $PATH
Hmm, strange... if possible, this might be solvable by simply updating to the release version, R 2.14. If it's at all possible, I'd start there. Can you find the object it's unhappy about? On my machine, I do the following: 1) Open Finder 2) Macintosh HD -> Library -> Frameworks -> R.framework -> Versions -> 2.13 -> Resources -> library -> RCurl -> libs -> x86_64 -> RCurl.so Going the other way, are you sure you have curl on your system? I'm pretty sure it's standard on all Macs, but you never know... follow some of the instructions given here: http://www.omegahat.org/RCurl/FAQ.html You should be able to type curl-config in the terminal and get a meaningful response if it is. Did you change something on the OS level recently? I don't really know why this would have all fallen apart; I just re-reinstalled RCurl on R 2.13.2 / OS X 10.5.8 with no problem at all. Michael On Wed, Nov 16, 2011 at 11:14 AM, Hari Easwaran hariharan...@gmail.com wrote: Hi Michael, Thanks for your response. Using the binary seems to solve it partially. I am able to install (I think!) RCurl, but not able to load the library. Below is the info you required and the error while loading RCurl.

sessionInfo()
R version 2.13.2 (2011-09-30) Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale: [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages: [1] stats graphics grDevices utils datasets methods base

install.packages("RCurl")
trying URL 'http://watson.nci.nih.gov/cran_mirror/bin/macosx/leopard/contrib/2.13/RCurl_1.7-0.tgz' Content type 'application/octet-stream' length 680511 bytes (664 Kb) opened URL == downloaded 664 Kb The downloaded packages are in /var/folders/a6/a60JdPfrHC0ZAizZWyNM-E+++TI/-Tmp-//RtmpYE7JLJ/downloaded_packages

library(RCurl)
Loading required package: bitops Error in dyn.load(file, DLLpath = DLLpath, ...)
: unable to load shared object '/Library/Frameworks/R.framework/Versions/2.13/Resources/library/RCurl/libs/x86_64/RCurl.so': dlopen(/Library/Frameworks/R.framework/Versions/2.13/Resources/library/RCurl/libs/x86_64/RCurl.so, 6): Library not loaded: @rpath/R.framework/Versions/2.13/Resources/lib/libR.dylib Referenced from: /Library/Frameworks/R.framework/Versions/2.13/Resources/library/RCurl/libs/x86_64/RCurl.so Reason: image not found Error: package/namespace load failed for 'RCurl' Warning: dependency ‘Rcompression’ is not available also installing the dependency ‘XML’ Seems like now I need 'Rcompression'. I googled this and found that the Rcompression package needs 'zlib' (http://www.omegahat.org/Rcompression/). However, the site says that zlib is already included as part of Mac OS X. I am wondering what to do? Puzzlingly, the previous R version was oblivious to these issues! Really appreciate any help. Sincerely, Hari On Tue, Nov 15, 2011 at 11:33 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote: Yes, you probably need some sort of C compiler, but why can't you just download the appropriate binary directly? I just did on OS X 10.5.8 (admittedly for R 2.13.2, not 2.14) with no problems. The output of sessionInfo() and install.packages("RCurl"), if you don't mind, please. Thanks, Michael On Tue, Nov 15, 2011 at 2:12 PM, Hari Easwaran hariharan...@gmail.com wrote: Dear all, I am trying to install a package from Bioconductor (biomaRt) for which I need the RCurl package. I get the following main error message when I try to install RCurl (and its dependencies): configure: error: no acceptable C compiler found in $PATH See `config.log' for more details. ERROR: configuration failed for package ‘RCurl’ I searched for possible solutions and read in some online mailing list that I might have to install Xcode to install the gcc compiler.
I am not sure if I should do this because I have installed RCurl in previous versions of R without any problems (on this same computer). I upgraded to the latest R (R version 2.14.0) and faced this problem. So I downgraded to R version 2.13.2 and still cannot install RCurl. I think my last successful installation of RCurl was with R version 2.11. Following is the complete error message and my R version details. I really appreciate any help or suggestions. Sincerely, Hari trying URL ' http://watson.nci.nih.gov/cran_mirror/src/contrib/XML_3.4-3.tar.gz' Content type 'application/octet-stream' length 906364 bytes (885 Kb) opened URL == downloaded 885 Kb trying URL ' http://watson.nci.nih.gov/cran_mirror/src/contrib/RCurl_1.7-0.tar.gz' Content type 'application/octet-stream' length 813252 bytes (794 Kb) opened URL == downloaded 794 Kb * installing *source* package ‘XML’ ... checking for gcc... no checking for cc... no checking for cl.exe... no
Re: [R] Non-finite finite-difference value error in eha's aftreg
This kind of error seems to surprise R users. It surprises me that it doesn't happen much more frequently. The BFGS method of optim(), from the 1990 Pascal version of my book, was called the Variable Metric method, as per Fletcher's 1970 paper from which it was drawn. It really works much better with analytic gradients, and the Rvmmin package, an all-R version that adds bounds and masks, is set up to generate a warning if they are not available. Even with bounds, the finite-difference derivative code can step over a cliff edge with

del <- (f(x + h) - f(x))/h

i.e., bounds may not be checked within the numerical derivative functions. And BFGS is not set up with bounds; L-BFGS-B, which has them, is actually a rather different method. If you get such error messages, why not capture the parameter vector and check the function computation at those parameters and nearby? Yes, a bit tedious, but rarely have I found it a waste of time. For information, there should be a small function available shortly on R-forge (project optimizer, likely in the optfntools package) to do an axial search around a set of parameters and generate some information about the functional surface. I still have to prepare documentation and examples, but if anxious, contact me off-list. JN Message: 21 Date: Wed, 16 Nov 2011 15:06:00 +0100 From: Milan Bouchet-Valat nalimi...@club.fr To: r-help r-help@r-project.org Subject: [R] Non-finite finite-difference value error in eha's aftreg Message-ID: 1321452360.13624.2.camel@milan Content-Type: text/plain; charset=UTF-8 Hi list! I'm getting an error message when trying to fit an accelerated failure time parametric model using the aftreg() function from package eha: Error in optim(beta, Fmin, method = "BFGS", control = list(trace = as.integer(printlevel)), : non-finite finite-difference value [2] This only happens when adding four specific covariates at the same time in the model (see below).
I understand that kind of problem can come from too-high correlation between my covariates, but is there anything I can do to avoid it? Does something need to be improved in aftreg.fit? My data set consists of 34,505 observations (years) of 2,717 individuals, which seems reasonable to me to fit a complex model like that (covariates are all factors with fewer than 10 levels). I can send it by private mail if somebody wants to help debug this. The details of the model and errors follow, but feel free to ask for more testing. I'm using R 2.13.1 (x86_64-redhat-linux-gnu), eha 2.0-5 and survival 2.36-9. Thanks for your help!
m <- aftreg(Surv(start, end, event) ~ homo1 + sexego + dipref1 + t.since.school.q,
            data = ms, dist = "loglogistic", id = ident)
Error in optim(beta, Fmin, method = "BFGS", control = list(trace = as.integer(printlevel)), : non-finite finite-difference value [2]
Calls: aftreg -> aftreg.fit -> aftp0 -> optim
traceback()
4: optim(beta, Fmin, method = "BFGS", control = list(trace = as.integer(printlevel)), hessian = TRUE)
3: aftp0(printlevel, ns, nn, id, strata, Y, X, offset, dis, means)
2: aftreg.fit(X, Y, dist, strats, offset, init, shape, id, control, center)
1: aftreg(Surv(start, end, event) ~ homo1 + sexego + dipref1 + t.since.school.q, data = ms, dist = "loglogistic", id = ident)
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
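A small illustration of JN's "check the function computation at those parameters and nearby" advice. The names are hypothetical (fn stands for the objective and par for the failing parameter vector captured from the optimizer); this is a sketch, not part of the thread.

```r
## Evaluate the objective at par and at small axial steps around it;
## a non-finite value flags the direction in which the surface falls off
## (the "cliff edge" the finite-difference code can step over).
probe <- function(fn, par, h = 1e-4) {
  base  <- fn(par)
  steps <- sapply(seq_along(par), function(i) {
    p <- par
    p[i] <- p[i] + h   # one coordinate at a time, as in a forward difference
    fn(p)
  })
  c(at.par = base, steps)
}
```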
Re: [R] How to Fit Inflated Negative Binomial
Tyler Rinker tyler_rinker at hotmail.com writes: try: library(pscl) There's a zeroinfl() for zero-inflated neg. binom. Tyler Dear All, I am trying to fit some data both as a negative binomial and a zero-inflated negative binomial. For the first case, I have no particular problems; see the small snippet below.
set.seed(123)  # to have reproducible results
## You don't actually need MASS::rnegbin; rnbinom in base R works fine
## (different parameter names)
x6 <- c(rep(0, 100), rnbinom(500, mu = 5, size = 4))
## sample() is irrelevant, it just permutes the results
library(pscl)
zz <- zeroinfl(x6 ~ 1 | 1, dist = "negbin")
exp(coef(zz)[1])     ## mu
zz$theta             ## theta
plogis(coef(zz)[2])  ## zprob
Alternatively you can use fitdistr() (from MASS) with the dzinbinom() function from the emdbook package:
library(MASS)     # fitdistr() lives here
library(emdbook)
fitdistr(x6, dzinbinom, start = list(mu = 4, size = 5, zprob = 0.2))
The pscl solution is likely to be much more robust. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] hierachical code system
Hi, Thanks for your reply. Based on your suggestions, I managed to simplify the code, but only a little. I don't see how I could do without a loop, given the nestedness of the hierarchy. See the code below, which is working, but I'd like to simplify it.
# sample data
theCodes <- c('STAT.01', 'STAT.01.01', 'STAT.01.01.01', 'STAT.01.01.02', 'STAT.01.01.03', 'STAT.01.01.04', 'STAT.01.01.05', 'STAT.01.01.06', 'STAT.01.01.06.01', 'STAT.01.01.06.02', 'STAT.01.01.06.03', 'STAT.01.01.06.04', 'STAT.01.01.06.05', 'STAT.01.01.06.06', 'STAT.01.02', 'STAT.01.02.01', 'STAT.01.02.02', 'STAT.01.02.03', 'STAT.01.02.03.01', 'STAT.01.02.03.02', 'STAT.01.02.03.03', 'STAT.01.02.03.04', 'STAT.01.02.03.05', 'STAT.01.03')
theValues <- as.numeric(c(NA, NA, 15074.23366, 4882.942034, 1619.59628, 1801.722877, 1019.973666, NA, 503.9239317, 917.2189347, 6018.830465, 1944.11311, 1427.575402, 1965.725428, NA, 5857.293612, 5933.770263, NA, 6077.089518, 1427.180073, 455.9387993, 859.766603, 1002.983331, 2225.328211))
df <- as.data.frame(cbind(code = theCodes, value = theValues))
df$value <- as.numeric(df$value)
# actual code
getDepth <- function(df) {
  df$diepte <- do.call(rbind, lapply(strsplit(df$code, "\\."), length)) - 1
  return(df)
}
getParents <- function(df) {
  df$parent <- substr(df$code, 1, 4 + (df$diepte - 1) * 3)
  return(df)
}
getTotals <- function(df, depth) {
  s <- subset(df, diepte == depth)
  if (!"parent" %in% names(df)) s <- getParents(s)
  agg <- aggregate(s["value"], s["parent"], FUN = sum, na.rm = TRUE)
  merged <- merge(df, agg, by.x = "code", by.y = "parent", all = TRUE, suffixes = c("", "_summed"))
  isSum <- !is.na(merged$value_summed)
  merged[isSum, "value"] <- merged[isSum, "value_summed"]
  merged$value_summed <- merged$parent <- NULL
  return(merged)
}
#library(debug)
#mtrace(getTotals)
df <- getDepth(df)
for (depth in max(df$diepte):2) {
  if (depth == max(df$diepte)) {
    x <- getTotals(df, depth)
  } else {
    x <- getTotals(x, depth)
  }
}
Cheers!! 
Albert-Jan ~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~ From: ONKELINX, Thierry thierry.onkel...@inbo.be To: Albert-Jan Roskam fo...@yahoo.com; R Mailing List r-help@r-project.org Sent: Wednesday, November 16, 2011 2:34 PM Subject: RE: [R] hierachical code system Dear Albert-Jan, The easiest way is to create extra variables with the corresponding aggregation level. substr() and strsplit() can be your friends. Once you have those variables you can use aggregate() or any other aggregating function. You don't need loops. Best regards, Thierry -Original message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Albert-Jan Roskam Sent: Wednesday, 16 November 2011 14:28 To: R Mailing List Subject: [R] hierachical code system Hi, I have a hierarchical code system such as the example below (the printed data are easiest to read). I would like to write a function that returns an 'imputed' data frame, i.e. one where the parent values are calculated as the sum of the child values. So, for instance, STAT.01.01.06 is the sum of STAT.01.01.06.01 through STAT.01.01.06.06. The code I have written uses two for loops, and, moreover, does not work as intended. My starting point was to determine the code depth by counting the dots in the variable 'code' (using strsplit), then iterate over the tree from deep to shallow. Does anybody have a good idea as to how to approach this in R? 
theCodes <- c('STAT.01', 'STAT.01.01', 'STAT.01.01.01', 'STAT.01.01.02', 'STAT.01.01.03', 'STAT.01.01.04', 'STAT.01.01.05', 'STAT.01.01.06', 'STAT.01.01.06.01', 'STAT.01.01.06.02', 'STAT.01.01.06.03', 'STAT.01.01.06.04', 'STAT.01.01.06.05', 'STAT.01.01.06.06', 'STAT.01.02', 'STAT.01.02.01', 'STAT.01.02.02', 'STAT.01.02.03', 'STAT.01.02.03.01', 'STAT.01.02.03.02', 'STAT.01.02.03.03', 'STAT.01.02.03.04', 'STAT.01.02.03.05', 'STAT.01.03')
theValues <- c('NA', 'NA', '15074.23366', '4882.942034', '1619.59628', '1801.722877', '1019.973666', 'NA', '503.9239317', '917.2189347', '6018.830465', '1944.11311', '1427.575402', '1965.725428', 'NA', '5857.293612', '5933.770263', '6077.089518', 'NA', '1427.180073', '455.9387993', '859.766603', '1002.983331', '2225.328211')
df <- as.data.frame(cbind(code = theCodes, value = theValues))
print(df)
code value 1 STAT.01 NA 2 STAT.01.01 NA 3 STAT.01.01.01 15074.23366 4 STAT.01.01.02 4882.942034 5 STAT.01.01.03 1619.59628 6 STAT.01.01.04 1801.722877 7 STAT.01.01.05 1019.973666 8 STAT.01.01.06 NA 9 STAT.01.01.06.01 503.9239317 10 STAT.01.01.06.02
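A loop-free sketch of Thierry's substr()/strsplit()/aggregate() idea (my own code, not from the thread), assuming theValues is numeric as in the follow-up message. It derives each code's parent by dropping the last segment, then sums child values into NA parents; this fills one level of the tree, and repeating it from the deepest level upward completes the imputation.

```r
## Parent of "STAT.01.01.06.01" is "STAT.01.01.06": drop the final ".xx".
parent <- sub("\\.[^.]+$", "", theCodes)

## Sum all children sharing a parent, then fill NA parents from those sums.
sums   <- tapply(theValues, parent, sum, na.rm = TRUE)
filled <- ifelse(is.na(theValues) & theCodes %in% names(sums),
                 sums[theCodes], theValues)
```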
[R] how to include a factor or class Variable
Hi, How do I include a factor or class variable in the fixed effects of the lmer function? When I include it, it throws an error. Please help. My code:
data <- read.delim("C:/TestData/data.txt")
Mon = as.factor(data$Month)
lmerform = Y ~ X2 + X3 + Month:Mon + (1|State) + (1 + X5|State)
lmerfit = lmer(formula = lmerform, data = data)
summary(lmerfit)
My data:
State Year Month    Y    X2   X3   X4   X5   X6
GA    1960     1 27.8 397.5 42.2 50.7 78.3 65.8
FA    1960     2 29.9 413.3 38.1 52.0 79.2 66.9
GA    1961     3 29.8 439.2 40.3 54.0 79.2 67.8
FA    1961     4 30.8 459.7 39.5 55.3 79.2 69.6
GA    1962     1 31.2 492.9 37.3 54.7 77.4 68.7
FA    1962     2 33.3 528.6 38.1 63.7 80.2 73.6
GA    1963     3 35.6 560.3 39.3 69.8 80.4 76.3
-- View this message in context: http://r.789695.n4.nabble.com/how-to-include-a-factor-or-class-Variable-tp4079991p4079991.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
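The thread carries no answer, but one hedged guess at the intended model: the term Month:Mon crosses a numeric variable with its own factor version, which is rarely what is wanted, and Mon lives outside the data frame passed to lmer(). Putting the factor itself in the fixed part, inside the data frame, would look like this (a sketch under that assumption, not a confirmed fix):

```r
## Assumes Mon should simply enter as a fixed-effect factor.
library(lme4)
data$Mon <- as.factor(data$Month)   # keep the factor inside the data frame
fit <- lmer(Y ~ X2 + X3 + Mon + (1 | State) + (1 + X5 | State), data = data)
summary(fit)
```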
[R] set random numbers seed for different cpu's
Hi, I'm running the same R script (through a Linux shell) on several CPUs. This R program uses random numbers and the result should be different every time. But if I put jobs (through Torque) on several CPUs I get the same result. My program saves numbers in files with randomly generated names. It works like a charm on one CPU, but I get the same result from different CPUs. So my question is, how can I resolve this? How do I set the pseudo-random number seed so that different CPUs produce different results? Thank you in advance. -- View this message in context: http://r.789695.n4.nabble.com/set-random-numbers-seed-for-different-cpu-s-tp4080165p4080165.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Log-transform and specifying Gamma
Dear R help, I am trying to work out if I am justified in log-transforming data and specifying Gamma in the same glm. Does it have to be one or the other? I have attached an R script and the datafile to show what I mean. Also, I cannot find a mixed model that allows Gamma errors (so I cannot find a way of including random effects). What should I do? Many thanks, Pete
# trying to solve the question 'can you log-transform and specify Gamma in the same model'
ToadsBd <- read.table(file.choose(), header = T)
list(ToadsBd)
# first see how well treatment group predicts Bd score with non-log-transformed data
mod1 <- glm(Bd ~ factor(group))
summary(mod1)
# massively overdispersed. Are the data non-normal?
shapiro.test(Bd)
W = 0.3652, p-value = 5.666e-13
# yes, definitely non-normal
# try log-transforming data and see if that helps
plot(qqnorm(Bd), log = "y")
# log plot straightens it out, almost, so yes log-transform helps
# try model again with log-transformed Bd score
mod2 <- glm(logBd ~ factor(group))
summary(mod2)
# a big improvement but still overdispersed
# other options - specify an error family? Looks like original data are Gamma distributed
# should test if variance increases or remains constant with mean on the scale of the original, non-logged data
par(mfrow = c(2, 2))
plot(mod1)
# can you tell this from a diagnostic plot? Not sure how. If not, how do you assess this?
# in the meantime, assume it does and try Gamma (using default link = reciprocal) with non-logged data
mod3 <- glm(Bd ~ factor(group), family = Gamma)
summary(mod3)
# mod3 is a major improvement on mod1 and less dispersed than mod2 but has a much larger AIC than mod2
# is it valid to specify Gamma in a model where the data have been log-transformed?
# or does it have to be a choice between transformation or Gamma?
# if I specify both, the model is quite good, but it may not be valid. Please help! 
mod4 <- glm(logBd ~ factor(group), family = Gamma)
summary(mod4)
# residual deviance now well below df, not overdispersed, and the effect of group on Bd is significant
# I would also like to include assessment of the effect of site, but this is a random effect requiring a mixed model
# I cannot find a mixed model that works with Gamma errors. What can I do?
toad group      Bd  logBd startg site
   1     1     0.5  0.405   13.6    0
   2     1     0.3  0.262   15.9    0
   3     1     0.3  0.262   14.4    0
   4     1     0.4  0.336   15.3    0
   5     1     6.5  2.015   15.1    0
   6     1     0.1  0.095   15.7    0
   7     1     0.2  0.182   20.2    0
   8     1    17.7  2.929   17.3    0
   9     1     0.6  0.470   18.7    0
  10     1     0.1  0.095   24.6    1
  11     1     0.6  0.470   20      1
  12     1     9    2.303   16.3    1
  13     1     1.6  0.956   19.4    1
  14     1     3.4  1.482   12.8    1
  15     1     6.3  1.988   19.7    1
  16     2     1.3  0.833   12.6    0
  17     2    63.3  4.164   22.6    0
  18     2     0.7  0.531   18.3    0
  19     2    33.2  3.532   15.5    0
  20     2     2.2  1.163   13.2    0
  21     2   479    6.174   16.4    0
  22     2     0.1  0.095   19.1    0
  23     2    47.6  3.884   16.1    0
  24     2   195.6  5.281   14.1    0
  25     2    41    3.738   16.3    0
  26     2  1984.2  7.593   13.7    1
  27     2     6.3  1.988   13.9    1
  28     2   126.7  4.850   22      1
  29     2   105.1  4.664   12.7    1
  30     2  6747.8  8.817   18.2    1
  31     2   282.6  5.648   15.8    1
  32     3     1.6  0.956   18.6    0
  33     3  2576.3  7.854   15.3    0
  34     3 11240    9.327   17.4    0
  35     3   678.1  6.521   18.8    0
  36     3  9926.8  9.203   17.5    0
  37     3   103.4  4.648   16.1    0
  38     3  2401.7  7.784   15.5    0
  39     3  2616.4  7.870   16.5    0
  40     3    35.3  3.592   18.9    0
  41     3   174.7  5.169   22.7    0
  42     3   362    5.894   17.5    1
  43     3  2765.7  7.925   13.8    1
  44     3 29033.8 10.276   16.5    1
  45     3    34    3.555   21.1    1
  46     3   258.4  5.558   15.9    1
  47     3    10.1  2.407   14.9    1
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
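For the final question — a mixed model with Gamma errors — one possibility (my suggestion, not from the thread) is glmmPQL() in MASS, which fits GLMMs by penalized quasi-likelihood and accepts any glm family. Using a log link also sidesteps the double log-transform-plus-Gamma issue, since the transformation then lives in the link rather than in the response.

```r
## Sketch: random intercept for site, Gamma errors on the original Bd scale.
library(MASS)   # glmmPQL(); it calls lme() from the nlme package internally
fit <- glmmPQL(Bd ~ factor(group), random = ~ 1 | site,
               family = Gamma(link = "log"), data = ToadsBd)
summary(fit)
```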
Re: [R] Pairwise correlation
Hi Michael, Here is a sample of the data. Gene Array1 Array2 Array3 Array4 Array5 Array6 Array7 Array8 Array9 Array10 Array11 Fth1 26016.01 23134.66 17445.71 39856.04 27245.45 23622.98 37887.75 49857.46 25864.73 21852.51 29198.4 B2m 7573.64 7768.52 6608.24 8571.65 6380.78 6242.76 6903.92 7330.63 7256.18 5678.21 10937.05 Tmsb4x 6192.44 4277.22 5024.59 4851.51 3062.55 4562.43 7948.1 5018.58 3200.17 2855.77 6139.23 H2-D1 3141.41 3986.06 3328.62 4726.6 3589.89 2885.95 7509.88 5257.62 4742.26 3431.33 5300.72 Prdx5 3935.7 3938.9 3401.68 4193.14 4028.95 3438.19 6640.15 5486.61 4424.57 3368.83 5265.92 I want to retain the gene names in the data. What you've proposed will take them out, and I'll have to append them back to the results after the cor(). On 17 November 2011 09:33, Michael Weylandt [via R] ml-node+s789695n4080177...@n4.nabble.com wrote: I think something like this should do it, but I can't test without data: rownames(mydata) <- mydata[,1] # Put the elements in the first column as rownames mydata <- mydata[,-1] # drop the things that are now rownames Michael On Thu, Nov 17, 2011 at 9:23 AM, Musa Hassan [hidden email]http://user/SendEmail.jtp?type=nodenode=4080177i=0 wrote: Hi Michael, Thanks for the response. I have noticed that the error occurred during my data read. It appears that the rownames (which, when the data is transposed, become my colnames) were converted to numbers instead of the strings they should be. The original header names don't change, just the rownames. I have to figure out how to import the data and have the strings not converted. Right now I am using: mydata <- read.csv("mydata.csv", header = TRUE, stringsAsFactors = FALSE) then to convert the data frame to a matrix: mydata <- data.matrix(mydata) Then I just do the correlation as Peter suggested: expression <- cor(t(expression)) Thanks. On 17 November 2011 08:51, R. 
Michael Weylandt [hidden email]http://user/SendEmail.jtp?type=nodenode=4080177i=1 wrote: On Wed, Nov 16, 2011 at 11:22 PM, muzz56 [hidden email]http://user/SendEmail.jtp?type=nodenode=4080177i=2 wrote: Thanks to everyone who replied to my post, I finally got it to work. I am however not sure how well it worked since it ran so quickly, but it seems like I have a 2000 x 2000 data set. Behold the great and mighty power that is R! Don't worry -- on a decent machine the correlation of a 2k x 2k data set should be pretty fast. (It's about 9 seconds on my old-ish laptop with a bunch of other junk running) My followup questions would be: how do I get only pairs with, say, a certain Pearson correlation value? Additionally, it seems like my output didn't retain the headers but instead replaced them with numbers, making it hard to know which gene pairs correlate. This is a little worrisome: R carries column names through cor(), so this would suggest you weren't using them. Were your headers listed as part of your data (instead of being names)? If so, they would have been taken as numbers. Take a look at dimnames(NAMEOFDATA) -- if your headers aren't there, then they are being treated as data instead of names. If they are, can you provide some reproducible code and we can debug more fully. The easiest way to send data is to use the dput() function to get a copy-pasteable plain text representation. It would also be great if you could restrict it to a subset of your data rather than the full 4M data points, but if that's hard to do, don't worry. 
You should have expected behavior like
X <- matrix(1:9, 3)
colnames(X) <- c("A", "B", "C")
cor(X)  # Prints with labels
Michael On 16 November 2011 17:11, Nordlund, Dan (DSHS/RDA) [via R] [hidden email] http://user/SendEmail.jtp?type=nodenode=4080177i=3 wrote: -Original Message- From: [hidden email]http://user/SendEmail.jtp?type=nodenode=4078114i=0 [mailto: r-help-bounces@r- project.org] On Behalf Of muzz56 Sent: Wednesday, November 16, 2011 12:28 PM To: [hidden email]http://user/SendEmail.jtp?type=nodenode=4078114i=1 Subject: Re: [R] Pairwise correlation Thanks Peter. I tried this after reading in the csv (read.csv) and converting the data to a matrix (as.matrix). But when I try the correlation, I keep getting the error ('x' must be numeric), yet when I view the data, it's numeric. What does R tell you if you execute the following? str(x) Just because the data looks like it is numeric when it prints doesn't mean it is. Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 __ [hidden email] http://user/SendEmail.jtp?type=nodenode=4078114i=2mailing list https://stat.ethz.ch/mailman/listinfo/r-help
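For the still-open follow-up in this thread — extracting only pairs above a given Pearson correlation — here is a sketch with hypothetical object names ("expression" as in the earlier message, cutoff 0.9 chosen arbitrarily):

```r
## cm is the gene-by-gene correlation matrix (genes in rows of expression);
## upper.tri() keeps one copy of each pair and drops the diagonal.
cm   <- cor(t(expression))
hits <- which(abs(cm) > 0.9 & upper.tri(cm), arr.ind = TRUE)
data.frame(gene1 = rownames(cm)[hits[, 1]],
           gene2 = colnames(cm)[hits[, 2]],
           r     = cm[hits])
```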
[R] Combining data
Hi all; It seemed to be easy at first, but I didn't manage to find the answer through a Google search. I have a set of data for every second of the experiment, but I don't need such a high resolution for my analysis. I want to replace every 30 rows of my data with their average value, and then save the new data set in a new csv file to be able to have a smaller Excel data sheet. What is the command for combining a certain number of rows into their average value? Thank you -- Nasrin Pak MSc Student in Environmental Physics University of Calgary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
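A minimal sketch of one answer (file names hypothetical, all columns assumed numeric): index each row by its 30-row block, then average within blocks and write the result back out.

```r
x   <- read.csv("experiment.csv")          # hypothetical input file
grp <- (seq_len(nrow(x)) - 1) %/% 30       # 0,0,...,0,1,1,... block index
avg <- aggregate(x, by = list(block = grp), FUN = mean)
write.csv(avg, "experiment_30s_means.csv", row.names = FALSE)
```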
[R] lme contrast Error in `$<-.data.frame`(`*tmp*`, df, value = numeric(0)) :
I am trying to run an lme model and some contrasts for a matrix.
lnY
[1] 10.911628 11.198557 11.316971 11.464869 11.575233 11.612101 11.755903 11.722035 11.757705 11.863744 11.846515 11.852721 11.866936 11.838452 11.946680 11.885509
[17] 11.583309 11.750082 11.756005 11.630797 11.705536 11.566722 11.679448 11.703521 NA 11.570949 11.716919 11.573343 11.733770 11.720801 11.804124 11.775074
[33] 11.801669 11.856955 11.875859 11.851852 11.830149 11.920156 11.954247 11.880917 11.806162 7.823646 11.909182 NA NA 11.912386 12.048816 11.958284
[49] 11.929021 11.986062 11.968418 11.967999 11.911608
plate
[1] 2 1 2 2 1 1 1 2 2 1 2 1 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7 7 8 8 8 8 8 9 9 9 9 9 9
Levels: 1 2 3 4 5 6 7 8 9
gb
[1] tSac tSac tAceK tAceK cDMSO cDMSO tAceK tSac cDMSO tAceK cDMSO tSac cDMSO cDMSO tSac tSac tAceK tAceK tSac cDMSO tSac tAceK cDMSO tAceK tSac cDMSO tAceK cDMSO
[29] tAceK tSac cDMSO cDMSO tSac tAceK tSac tAceK tSac tAceK cDMSO cDMSO tAceK tSac tAceK tSac cDMSO tAceK tSac tSac cDMSO tAceK tSac tAceK cDMSO
Levels: cDMSO tAceK tSac
time
[1] 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 15m 15m 15m 15m 15m 15m
[43] 15m 15m 15m 15m 15m 15m 15m 15m 15m 15m 15m
Levels: 15m 1hr 4hr
metab2 <- data.frame(plate, lnY, gb, time)
fm1 <- lme(lnY ~ time*gb, random = ~1|plate, metab2, na.action = na.omit)
t1 <- contrast(fm1, a = list(gb = "cDMSO", time = "15m"), b = list(gb = "tAceK", time = "15m"))
t2 <- contrast(fm1, a = list(gb = "cDMSO", time = "15m"), b = list(gb = "tSac", time = "15m"))
I am doing similar contrasts at the 1hr and 4hr times. Result:
t1
lme model parameter contrast
  Contrast      S.E.      Lower     Upper      t df Pr(>|t|)
0.01466447 0.3880718 -0.7459424 0.7752713 0.0439        0.97
t2
lme model parameter contrast
 Contrast     S.E.      Lower    Upper    t df Pr(>|t|)
0.8007098 0.401809 0.01317859 1.588241 1.99 39   0.0533
but it doesn't work when my lnY is
lnY
[1] 14.08164 14.03683 15.23784 14.86681 15.69648 15.62681 15.38057 13.79152 15.59356 15.26301 15.49928 14.02714 15.54317 15.44776 14.51406 14.26436 14.76043 15.01506
[19] 13.75356 15.36528 13.86303 14.40074 15.39995 14.34945 14.32001 15.41146 14.43210 15.87487 14.31152 13.75980 15.44153 15.72775 13.83677 14.35888 14.08998 14.40057
[37] 15.25646 15.21430 15.21883 15.09338 15.24249 15.15223 15.19692 15.10101 15.16232 15.81154 15.30002 15.31443 15.25059 15.10284 15.38775 15.28618 15.38108
I am able to fit the model, i.e. I am getting my fm1, but then
t1
lme model parameter contrast
Error in `$<-.data.frame`(`*tmp*`, df, value = numeric(0)) : replacement has 0 rows, data has 1
[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] function sum for array
I'm looking for a function that allows me to sum the elements of an array along a dimension that can be different from the classical ones (rows or columns). Let's suppose for example that: - A is an array with dimensions 2 x 3 x 4 - I want to compute B, a 2 x 3 matrix with elements equal to the sum of the corresponding elements across the 4 strata. I've tried to use apply(A,3,sum) but the result is a vector, not a matrix. Another solution is the less elegant
B = matrix(rep(0,6), ncol=3)
for (t in 1:4) B = B + A[ , , t]
Can anybody help? S -- --- Simone Salvadei Faculty of Economics Department of Financial and Economic Studies and Quantitative Methods University of Rome Tor Vergata e-mail: simone.salva...@uniroma2.it federico.belo...@uniroma2.it url: http://www.economia.uniroma2.it/phd/econometricsempiricaleconomics/ http://www.econometrics.it/ --- [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
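One way to get the 2 x 3 matrix (not shown in the post itself) is to keep the first two margins in apply(), so the summation runs across the third dimension only:

```r
A <- array(1:24, dim = c(2, 3, 4))
B <- apply(A, c(1, 2), sum)   # 2 x 3 matrix: sums across the 4 strata
## equivalent to the loop: A[,,1] + A[,,2] + A[,,3] + A[,,4]
```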
Re: [R] hierarchical clustering within a size limit
You can print out the nodes and their corresponding clusters into a file with this: write.table(hc, file = "hc_40clusters.csv", quote = FALSE, sep = " ") -- View this message in context: http://r.789695.n4.nabble.com/hierarchical-clustering-within-a-size-limit-tp3515354p4080551.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Time Series w/ Unequal Time Steps
Hi. I am new to R and actually have several questions related to this topic. A row in my data looks like the following: 418 12 6/21/2010 9:37:12 40.7219593 -73.9962579 1.3406345525960568 0.019682641058810173 In order, the columns are id, week, date, time, latitude, longitude, heading and displacement (no actual header though). I would like to read in the date, time, heading and displacement from a file. I would like to combine date and time into a single DateTime object. Then, I would like to do two things: (1) view a 3d scatterplot of DateTime, heading and displacement and (2) do an autoregression for heading indexed by DateTime. Thanks for any help. Regards, Keith -- View this message in context: http://r.789695.n4.nabble.com/Time-Series-w-Unequal-Time-Steps-tp4080562p4080562.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
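A hedged sketch of the DateTime step (the file and column names are guesses from the sample row, not from any reply in the thread):

```r
d <- read.table("tracks.txt")   # hypothetical file name, no header row
names(d) <- c("id", "week", "date", "time", "lat", "lon",
              "heading", "displacement")
## Combine date and time into one POSIXct column, matching "6/21/2010 9:37:12".
d$DateTime <- as.POSIXct(paste(d$date, d$time),
                         format = "%m/%d/%Y %H:%M:%S")
```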
[R] R help
Hello: I have some trouble making a prediction from an AR(p) model. After I have the AR(p) model fitted, I want to use a new data set to make predictions. But I get the error: Error in newdata - object$x.mean : non-numeric argument to binary operator. A small version of my original data looks like:
X1 X2 X3 X4
40813.65 1 10 41.86755
40813.65 1 8 41.86755
40813.66 1 8 41.86755
40813.66 1 8 41.86755
40813.66 1 8 41.86755
40813.67 1 8 41.86755
40813.67 1 6 41.86755
40813.67 1 6 41.86755
40813.68 1 6 41.86755
40813.68 1 6 41.86755
40813.73 1 4 41.86755
Sh <- read.table("C:\\ Desktop\\Sh.txt", sep = ",", header = TRUE)
model <- ar.yw(Sh[,3])
My new data looks like:
X3
10 8 8 8 8 8 6 6 6 6 4 4 4 4 4 4 4 4 4 4 5 5
me <- read.table("C:\\Users\\351240\\Desktop\\me.txt", sep = ",", header = TRUE)
predict(model, me, n.ahead = 1)
Then I get the error: Error in newdata - object$x.mean : non-numeric argument to binary operator. Can someone help me please? Thanks, Ana Lucia [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
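A hedged guess at the fix (no reply appears in the digest): predict.ar() subtracts object$x.mean from newdata, so newdata must be a numeric series, but me is a one-column data frame. Passing the column itself should avoid the "non-numeric argument to binary operator" error:

```r
## me$X3 is the numeric vector inside the data frame read above.
predict(model, me$X3, n.ahead = 1)
```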
Re: [R] Difference between two time series
Hello Michael, Thanks again for your reply. Actually, I am working with wind data. I have some sample data for the actual load.
scan("/home/sam/Desktop/tt.dat") -> tt  ## This is the input for the actual output of the generation
t = ts(tt, start=8, end=24, frequency=1)
I have another random sequence for the Generator Dispatch:
scan("/home/sam/Desktop/ss.dat") -> ss  ## Input for the Generator Dispatch
s = ts(ss, start=10, end=22, frequency=1)
What I want to do now is take the max and min difference of these two sequences (t and s) over a fixed time interval. Something like:
X = max(t-s, start=10, end=12)  # I have an error here; I want the difference between the two over an interval
Y = min(t-s, start=10, end=12)
Then predict the max and min error between time t and t+1 on the basis of the information that I have at t-1. Thanks again. Sam -- View this message in context: http://r.789695.n4.nabble.com/Difference-between-two-time-series-tp819843p4080672.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
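One way (my suggestion, not from the thread) to take extrema over a fixed interval is to restrict each series with window() first; ordinary max() and min() then apply, and the two windows are guaranteed to align:

```r
## t and s are the ts objects defined above; window() clips both to [10, 12].
X <- max(window(t, start = 10, end = 12) - window(s, start = 10, end = 12))
Y <- min(window(t, start = 10, end = 12) - window(s, start = 10, end = 12))
```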
Re: [R] White lines on persp plots in pdf format
On Nov 17, 2011, at 10:13 AM, Miguel Lacerda wrote: Hi, I am using the persp function to plot 3D surfaces, but the plots have little white lines when I print them to a pdf file (visible in Acrobat, Foxit, Evince, Xpdf and Gimp). This does not happen when I create png or tiff images. Here is some sample code:
pdf("test.pdf")
x <- seq(0, 1, length = 101)
f <- dnorm(x, 0, 0.25)
z <- c()
for (i in 1:100) z <- cbind(z, f)
persp(z, col = "red", theta = 40, phi = 10, shade = 1.5, d = 4, border = NA)
dev.off()
The resulting graph is attached. Anyone know how to get rid of the little white lines? Thanks! Miguel See: http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-are-there-unwanted-borders and the Note section of ?pdf. HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] set random numbers seed for different cpu's
http://search.dilbert.com/comic/Random%20Number%20Generator In all seriousness, you could set the seed differently on each machine after putting jobs through Torque (i.e., as part of the batch script, maybe using some piece of hardware id you can get through system() somehow or other: possibly network id?) and you're very,very,very,very likely to get different results. Michael On Thu, Nov 17, 2011 at 9:30 AM, fantomas tomas.iesman...@gmail.com wrote: Hi I'm running the same R script (throuth linux shell) of several cpu's. This R program uses random numbers and the result should be different every time. But if put jobs (through Torque) for several cpu's I get the same result. As a resealt my program saves numbers in file with randomly generated names. works like a charm on one cpu, but I get the same result from different cpu's. So my question is, how can I resolve this? How to set pseudo random number seed so that different cpu's would produce different results? Thank you in advance. -- View this message in context: http://r.789695.n4.nabble.com/set-random-numbers-seed-for-different-cpu-s-tp4080165p4080165.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] set random numbers seed for different cpu's
Sorry -- that came off as very muddled. What I meant to say: To make it (almost) certain you will get different results on each machine, you can reset the PRNG seed on each machine in some way unique to that machine. What immediately came to mind was the IP address, which you can access with something like this:
x <- system("curl -s http://checkip.dyndns.org | sed 's/[a-zA-Z/ :]//g'", intern = TRUE)  # Note you might have to tweak it for your OS
Michael On Thu, Nov 17, 2011 at 3:31 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote: http://search.dilbert.com/comic/Random%20Number%20Generator In all seriousness, you could set the seed differently on each machine after putting jobs through Torque (i.e., as part of the batch script, maybe using some piece of hardware id you can get through system() somehow or other: possibly network id?) and you're very, very, very, very likely to get different results. Michael On Thu, Nov 17, 2011 at 9:30 AM, fantomas tomas.iesman...@gmail.com wrote: Hi I'm running the same R script (throuth linux shell) of several cpu's. This R program uses random numbers and the result should be different every time. But if put jobs (through Torque) for several cpu's I get the same result. As a resealt my program saves numbers in file with randomly generated names. works like a charm on one cpu, but I get the same result from different cpu's. So my question is, how can I resolve this? How to set pseudo random number seed so that different cpu's would produce different results? Thank you in advance. -- View this message in context: http://r.789695.n4.nabble.com/set-random-numbers-seed-for-different-cpu-s-tp4080165p4080165.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
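A simpler alternative to the IP-address trick (a sketch of my own, not from the thread): seed each worker from the wall clock plus its own process id, so jobs launched at the same instant on the same node still diverge.

```r
## Each Torque job gets a distinct pid, so simultaneous starts differ too.
set.seed(as.integer(Sys.time()) %% 100000L + Sys.getpid())
```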
Re: [R] Pairwise correlation
I can't see how it's stored like that once the email servers garble it up. Use dput() to create a plain text representation and paste that back in. Thanks, Michael On Thu, Nov 17, 2011 at 9:37 AM, muzz56 musah...@gmail.com wrote: Hi Michael, Here is a sample of the data. Gene Array1 Array2 Array3 Array4 Array5 Array6 Array7 Array8 Array9 Array10 Array11 Fth1 26016.01 23134.66 17445.71 39856.04 27245.45 23622.98 37887.75 49857.46 25864.73 21852.51 29198.4 B2m 7573.64 7768.52 6608.24 8571.65 6380.78 6242.76 6903.92 7330.63 7256.18 5678.21 10937.05 Tmsb4x 6192.44 4277.22 5024.59 4851.51 3062.55 4562.43 7948.1 5018.58 3200.17 2855.77 6139.23 H2-D1 3141.41 3986.06 3328.62 4726.6 3589.89 2885.95 7509.88 5257.62 4742.26 3431.33 5300.72 Prdx5 3935.7 3938.9 3401.68 4193.14 4028.95 3438.19 6640.15 5486.61 4424.57 3368.83 5265.92 I want to retain the gene names in the data. What you've proposed will take them out and I'll have to append them back to the results after the cor() On 17 November 2011 09:33, Michael Weylandt [via R] ml-node+s789695n4080177...@n4.nabble.com wrote: I think something like this should do it, but I can't test without data: rownames(mydata) <- mydata[,1] # Put the elements in the first column as rownames mydata <- mydata[,-1] # drop the things that are now rownames Michael On Thu, Nov 17, 2011 at 9:23 AM, Musa Hassan [hidden email]http://user/SendEmail.jtp?type=nodenode=4080177i=0 wrote: Hi Michael, Thanks for the response. I have noticed that the error occurred during my data read. It appears that the rownames (which when the data is transposed become my colnames) were converted to numbers instead of strings as they should be. The original header names don't change, just the rownames. I have to figure out how to import the data and have the strings not converted. 
Right now I am using:

mydata <- read.csv("mydata.csv", header=TRUE, stringsAsFactors=FALSE)

then, to convert the data frame to a matrix:

mydata <- data.matrix(mydata)

Then I just do the correlation as Peter suggested:

expression <- cor(t(expression))

Thanks.

On 17 November 2011 08:51, R. Michael Weylandt [hidden email] wrote:

On Wed, Nov 16, 2011 at 11:22 PM, muzz56 [hidden email] wrote:

Thanks to everyone who replied to my post, I finally got it to work. I am however not sure how well it worked since it ran so quickly, but it seems like I have a 2000 x 2000 data set.

Behold the great and mighty power that is R! Don't worry -- on a decent machine the correlation of a 2k x 2k data set should be pretty fast. (It's about 9 seconds on my old-ish laptop with a bunch of other junk running.)

My follow-up questions would be: how do I get only pairs with, say, a certain Pearson correlation value? Additionally, it seems like my output didn't retain the headers but instead replaced them with numbers, making it hard to know which gene pairs correlate.

This is a little worrisome: R carries column names through cor(), so this would suggest you weren't using them. Were your headers listed as part of your data (instead of being names)? If so, they would have been taken as data. Take a look at dimnames(NAMEOFDATA) -- if your headers aren't there, then they are being treated as data instead of names. If they are, can you provide some reproducible code and we can debug more fully. The easiest way to send data is to use the dput() function to get a copy-pasteable plain text representation. It would also be great if you could restrict it to a subset of your data rather than the full 4M data points, but if that's hard to do, don't worry.
You should have expected behavior like:

X <- matrix(1:9, 3)
colnames(X) <- c("A", "B", "C")
cor(X)  # Prints with labels

Michael

On 16 November 2011 17:11, Nordlund, Dan (DSHS/RDA) [via R] [hidden email] wrote:

-----Original Message-----
From: [hidden email] [mailto:r-help-bounces@r-project.org] On Behalf Of muzz56
Sent: Wednesday, November 16, 2011 12:28 PM
To: [hidden email]
Subject: Re: [R] Pairwise correlation

Thanks Peter. I tried this after reading in the csv (read.csv) and converting the data to a matrix (as.matrix). But when I tried the correlation, I keep getting the error ('x' must be numeric), yet when I view the data, it's numeric.

What does R tell you if you execute the following?

str(x)

Just because the data looks like it is numeric when it prints doesn't mean it is.

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability Research
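Tying the thread together: cor() carries dimnames through, so keeping the gene names as rownames (Michael's suggestion) is all that is needed. A minimal, self-contained sketch, using the sample values quoted above (only the first three arrays, purely for brevity):

```r
## Sketch: when gene names are rownames rather than a data column,
## cor() keeps them as dimnames. Values are the first three columns
## of the sample rows quoted in the thread.
expr <- matrix(c(26016.01, 23134.66, 17445.71,
                  7573.64,  7768.52,  6608.24,
                  6192.44,  4277.22,  5024.59),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("Fth1", "B2m", "Tmsb4x"),
                               c("Array1", "Array2", "Array3")))
gene.cor <- cor(t(expr))   # gene-by-gene correlation matrix, 3 x 3
rownames(gene.cor)         # gene labels retained through cor()
```

When reading from a file, the equivalent is read.csv(..., row.names = 1) so the first column becomes rownames instead of data.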
Re: [R] function sum for array
It might not be as general as you have in mind, but this works:

X <- array(1:24, c(2,3,4))
rowSums(X, dims = 2)

Combined with aperm() it's pretty powerful.

Michael

On Thu, Nov 17, 2011 at 11:24 AM, Simone Salvadei simone.salva...@gmail.com wrote:

I'm looking for a function that allows one to sum the elements of an array along a dimension that can be different from the classical ones (rows or columns). Let's suppose, for example, that:
- A is an array with dimensions 2 x 3 x 4
- I want to compute B, a 2 x 3 matrix with elements equal to the sum of the corresponding elements on each of the 4 strata.

I've tried to use apply(A,3,sum) but the result is a vector, not a matrix. Another solution is the less elegant:

B <- matrix(rep(0,6), ncol=3)
for(t in 1:4) B <- B + A[ , , t]

May anybody help?

S

--
Simone Salvadei
Faculty of Economics
Department of Financial and Economic Studies and Quantitative Methods
University of Rome Tor Vergata
e-mail: simone.salva...@uniroma2.it, federico.belo...@uniroma2.it
url: http://www.economia.uniroma2.it/phd/econometricsempiricaleconomics/
     http://www.econometrics.it/
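A quick check, on the same 2 x 3 x 4 shape as the question, that the rowSums(), apply() and aperm() routes agree:

```r
## rowSums(A, dims = 2) sums over every dimension after the second,
## i.e. over the 4 strata; apply() over margins c(1, 2) gives the same
## values (as integer rather than double).
A <- array(1:24, c(2, 3, 4))
B.rowsums <- rowSums(A, dims = 2)      # 2 x 3 matrix
B.apply   <- apply(A, c(1, 2), sum)    # same values
## Summing over a different dimension: move it last with aperm(),
## then rowSums again -- e.g. a 3 x 4 matrix of sums over dimension 1:
B.dim1 <- rowSums(aperm(A, c(2, 3, 1)), dims = 2)
```

Note that apply(A, 3, sum) gives a length-4 vector of per-stratum totals, which is why Simone saw a vector; the margin to keep is c(1, 2), not 3.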
Re: [R] how to include a factor or class Variable
Please post this to r-sig-mixed-models instead. Or, better yet, consult your local statistician, as your question indicates a profound lack of understanding that may require more back-and-forth discussion than can occur on an internet help site.

-- Bert

On Thu, Nov 17, 2011 at 5:36 AM, arunkumar akpbond...@gmail.com wrote:

Hi. How do I include a factor or class variable in the fixed effects of the lmer function? When I include it, it throws an error. Please help.

My code:

data <- read.delim("C:/TestData/data.txt")
Mon <- as.factor(data$Month)
lmerform <- Y ~ X2 + X3 + Month:Mon + (1|State) + (1 + X5|State)
lmerfit <- lmer(formula=lmerform, data=data)
summary(lmerfit)

My data:

State Year Month    Y    X2   X3   X4   X5   X6
GA    1960     1 27.8 397.5 42.2 50.7 78.3 65.8
FA    1960     2 29.9 413.3 38.1 52   79.2 66.9
GA    1961     3 29.8 439.2 40.3 54   79.2 67.8
FA    1961     4 30.8 459.7 39.5 55.3 79.2 69.6
GA    1962     1 31.2 492.9 37.3 54.7 77.4 68.7
FA    1962     2 33.3 528.6 38.1 63.7 80.2 73.6
GA    1963     3 35.6 560.3 39.3 69.8 80.4 76.3

-- View this message in context: http://r.789695.n4.nabble.com/how-to-include-a-factor-or-class-Variable-tp4079991p4079991.html
Sent from the R help mailing list archive at Nabble.com.

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] Combining data
There is no single command to do all of what you want. Read the posting guide for advice on how to ask questions that are more likely to receive helpful answers. The mean() function is a command for combining a certain number of data into their average value. The write.csv() function will create a new csv file. The aggregate() function may help.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 11/17/11 7:37 AM, Nasrin Pak astronas...@gmail.com wrote:

Hi all; It seemed to be easy at first, but I didn't manage to find the answer through a Google search. I have a set of data for every second of the experiment, but I don't need such a high resolution for my analysis. I want to replace every 30 rows of my data with their average value, and then save the new data set in a new csv file to be able to have a smaller Excel data sheet. What is the command for combining a certain number of data into their average value? Thank you

--
Nasrin Pak
MSc Student in Environmental Physics
University of Calgary
Re: [R] Combining data
Well, for "What is the command for combining a certain number of data into their average value?", one way would be (calling the data vector x):

colMeans(matrix(x[seq_len(30 * floor(length(x)/30))], nrow=30))

Note that this will leave out the mean of any values with indices beyond the largest multiple of 30 that is less than or equal to the length of x. There are probably 87 other ways to do this, many of which might be better, simpler, faster, or slicker.

-- Bert

On Thu, Nov 17, 2011 at 1:15 PM, MacQueen, Don macque...@llnl.gov wrote:

There is no single command to do all of what you want. Read the posting guide for advice on how to ask questions that are more likely to receive helpful answers. The mean() function is a command for combining a certain number of data into their average value. The write.csv() function will create a new csv file. The aggregate() function may help.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 11/17/11 7:37 AM, Nasrin Pak astronas...@gmail.com wrote:

Hi all; It seemed to be easy at first, but I didn't manage to find the answer through a Google search. I have a set of data for every second of the experiment, but I don't need such a high resolution for my analysis. I want to replace every 30 rows of my data with their average value, and then save the new data set in a new csv file to be able to have a smaller Excel data sheet. What is the command for combining a certain number of data into their average value? Thank you

--
Nasrin Pak
MSc Student in Environmental Physics
University of Calgary
--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
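Bert's trick is easiest to see on a small vector, averaging in blocks of 3 instead of 30:

```r
## Reshape the first 9 elements into a 3-row matrix so each column is
## one block of 3; colMeans() then gives the block averages. The
## leftover x[10] is dropped, as Bert notes.
x <- 1:10
block <- 3
n.used <- block * floor(length(x) / block)   # 9: largest full-block length
block.means <- colMeans(matrix(x[seq_len(n.used)], nrow = block))
block.means                                  # 2 5 8
```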
Re: [R] Vectorizing for weighted distance
I'm not quite sure of what you mean by not worry if it's 1d R matrices. X1 and X2 are both n by d matrices and W is d by d. Thanks for the help though. Any other ideas? Thanks Sachin On Friday, November 18, 2011, R. Michael Weylandt michael.weyla...@gmail.com wrote: The fastest is probably to just implement the matrix calculation directly in R with the %*% operator. (X1-X2) %*% W %*% (X1-X2) You don't need to worry about the transposing if you are passing R vectors X1,X2. If they are 1-d matrices, you might need to. Michael On Thu, Nov 17, 2011 at 1:30 AM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: Hi All, I am trying to convert the following piece of matlab code to R: XX1 = sum(w(:,ones(1,N1)).*X1.*X1,1); #square the elements of X1, weight it and repeat this vector N1 times XX2 = sum(w(:,ones(1,N2)).*X2.*X2,1); #square the elements of X2, weigh and repeat this vector N2 times X1X2 = (w(:,ones(1,N1)).*X1)'*X2; #get the weighted 'covariance' term XX1T = XX1'; #transpose z = XX1T(:,ones(1,N2)) + XX2(ones(1,N1),:) - 2*X1X2;#get the squared weighted distance which is basically doing: z=(X1-X2)' W (X1-X2) What would the best way (for SPEED) to do this? or is vectorizing as above the best? Any hints, suggestions? Thanks, Sachin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] merging corpora and metadata
Greetings! I lose all my metadata after concatenating corpora. This is an example of what happens:

meta(corpus.1)
   MetaID cid fid selfirst selend fname
1       0   1  11     2169   2518 WCPD-2001-01-29-Pg217.scrb
2       0   1  14     9189   9702 WCPD-2003-01-13-Pg39.scrb
3       0   1  14     2109   2577 WCPD-2003-01-13-Pg39.scrb
...
17      0   1 114    17863  18256 WCPD-2007-04-30-Pg515.scrb

meta(corpus.2)
   MetaID cid fid selfirst selend fname
1       0   2   2    11016  11600 DCPD-200900595.scrb
2       0   2   6    19510  20098 DCPD-201000636.scrb
3       0   2   6    23935  24573 DCPD-201000636.scrb
...
94      0   2 127    16225  17128 WCPD-2009-01-12-Pg22-3.scrb

tot.corpus <- c(corpus.1, corpus.2)
meta(tot.corpus)
    MetaID
1        0
2        0
3        0
...
111      0

This is from the structure of corpus.1:

 ..$ MetaData:List of 2
 .. ..$ create_date: POSIXlt[1:1], format: "2011-11-17 21:09:57"
 .. ..$ creator    : chr "henk"
 ..$ Children: NULL
 ..- attr(*, "class")= chr "MetaDataNode"
 - attr(*, "DMetaData")='data.frame': 17 obs. of 6 variables:
  ..$ MetaID  : num [1:17] 0 0 0 0 0 0 0 0 0 0 ...
  ..$ cid     : int [1:17] 1 1 1 1 1 1 1 1 1 1 ...
  ..$ fid     : int [1:17] 11 14 14 17 46 80 80 80 91 91 ...
  ..$ selfirst: num [1:17] 2169 9189 2109 8315 9439 ...
  ..$ selend  : num [1:17] 2518 9702 2577 8881 10102 ...
  ..$ fname   : chr [1:17] "WCPD-2001-01-29-Pg217.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2004-05-17-Pg856.scrb" ...
 - attr(*, "class")= chr [1:3] "VCorpus" "Corpus" "list"

Any idea on what I could do to keep the metadata in the merged corpus?

Thanks, Henri-Paul

--
Henri-Paul Indiogine
Curriculum & Instruction
Texas A&M University
TutorFind Learning Centre
Email: hindiog...@gmail.com
Skype: hindiogine
Website: http://people.cehd.tamu.edu/~sindiogine
Re: [R] Vectorizing for weighted distance
I fail to see why you would need another idea: you asked how to multiply matrices efficiently, and I told you how to multiply matrices efficiently. If you want to calculate (X1-X2)' times W times (X1-X2), then simply do so:

X1 <- matrix(1:6, 3)
X2 <- matrix(7:12, 3)
W <- matrix(runif(9), 3)
t(X1-X2) %*% W %*% (X1-X2)

which gives

         [,1]     [,2]
[1,] 142.7789 142.7789
[2,] 142.7789 142.7789

You could squeeze out one iota more of speed with

crossprod(X1-X2, W) %*% (X1-X2)

to get the same result, but unless you are doing massive-scale linear processing, I'm not sure it's worth the loss of clarity. I was only giving you a heads up on the sometimes confusing difference between matrix multiplication in MATLAB and in R, by which a vector is not a 1-d matrix and so does not require explicit transposition.

Michael

On Thu, Nov 17, 2011 at 4:35 PM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote:

I'm not quite sure what you mean by "not worry if it's 1-d R matrices". X1 and X2 are both n by d matrices and W is d by d. Thanks for the help though. Any other ideas?

Thanks, Sachin

On Friday, November 18, 2011, R. Michael Weylandt michael.weyla...@gmail.com wrote:

The fastest is probably to just implement the matrix calculation directly in R with the %*% operator:

(X1-X2) %*% W %*% (X1-X2)

You don't need to worry about the transposing if you are passing R vectors X1, X2. If they are 1-d matrices, you might need to.
Michael On Thu, Nov 17, 2011 at 1:30 AM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: Hi All, I am trying to convert the following piece of matlab code to R: XX1 = sum(w(:,ones(1,N1)).*X1.*X1,1); #square the elements of X1, weight it and repeat this vector N1 times XX2 = sum(w(:,ones(1,N2)).*X2.*X2,1); #square the elements of X2, weigh and repeat this vector N2 times X1X2 = (w(:,ones(1,N1)).*X1)'*X2; #get the weighted 'covariance' term XX1T = XX1'; #transpose z = XX1T(:,ones(1,N2)) + XX2(ones(1,N1),:) - 2*X1X2; #get the squared weighted distance which is basically doing: z=(X1-X2)' W (X1-X2) What would the best way (for SPEED) to do this? or is vectorizing as above the best? Any hints, suggestions? Thanks, Sachin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Spatial Statistics using R
These might get you started:

"Analysing spatial point patterns in R" by Adrian Baddeley, CSIRO and University of Western Australia
http://www.csiro.au/files/files/p10ib.pdf

"Spatial Regression Analysis in R: A Workbook" by Luc Anselin, Spatial Analysis Laboratory
http://geodacenter.asu.edu/system/files/rex1.pdf

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of vioravis
Sent: Thursday, November 17, 2011 12:29 AM
To: r-help@r-project.org
Subject: [R] Spatial Statistics using R

I am looking for online courses to learn spatial statistics using R. Statistics.com is offering an online course in December on the same topic, but that schedule doesn't suit mine. Are there any other similar ways to learn spatial statistics using R? Can someone please advise? Thank you.

Ravi

-- View this message in context: http://r.789695.n4.nabble.com/Spatial-Statistics-using-R-tp4079092p4079092.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Vectorizing for weighted distance
Hi Michael,

Thanks for that. X1 and X2 are typically 1000 by 3 matrices, and I am hoping to scale up to much larger dimensions (say 20,000 by 3). I do appreciate your help, and it seems like this is the best way to do this; I was just wondering if I could squeeze out just a bit more performance, that's all. Anyway, thanks again, much appreciated.

Thanks, Sachin

On Fri, Nov 18, 2011 at 9:15 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote:

I fail to see why you would need another idea: you asked how to multiply matrices efficiently, and I told you how to multiply matrices efficiently. If you want to calculate (X1-X2)' times W times (X1-X2), then simply do so:

X1 <- matrix(1:6, 3)
X2 <- matrix(7:12, 3)
W <- matrix(runif(9), 3)
t(X1-X2) %*% W %*% (X1-X2)

which gives

         [,1]     [,2]
[1,] 142.7789 142.7789
[2,] 142.7789 142.7789

You could squeeze out one iota more of speed with crossprod(X1-X2, W) %*% (X1-X2) to get the same result, but unless you are doing massive-scale linear processing, I'm not sure it's worth the loss of clarity. I was only giving you a heads up on the sometimes confusing difference between matrix multiplication in MATLAB and in R, by which a vector is not a 1-d matrix and so does not require explicit transposition.

Michael

On Thu, Nov 17, 2011 at 4:35 PM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote:

I'm not quite sure what you mean by "not worry if it's 1-d R matrices". X1 and X2 are both n by d matrices and W is d by d. Thanks for the help though. Any other ideas?

Thanks, Sachin

On Friday, November 18, 2011, R. Michael Weylandt michael.weyla...@gmail.com wrote:

The fastest is probably to just implement the matrix calculation directly in R with the %*% operator:

(X1-X2) %*% W %*% (X1-X2)

You don't need to worry about the transposing if you are passing R vectors X1, X2. If they are 1-d matrices, you might need to.
Michael On Thu, Nov 17, 2011 at 1:30 AM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: Hi All, I am trying to convert the following piece of matlab code to R: XX1 = sum(w(:,ones(1,N1)).*X1.*X1,1); #square the elements of X1, weight it and repeat this vector N1 times XX2 = sum(w(:,ones(1,N2)).*X2.*X2,1); #square the elements of X2, weigh and repeat this vector N2 times X1X2 = (w(:,ones(1,N1)).*X1)'*X2; #get the weighted 'covariance' term XX1T = XX1'; #transpose z = XX1T(:,ones(1,N2)) + XX2(ones(1,N1),:) - 2*X1X2;#get the squared weighted distance which is basically doing: z=(X1-X2)' W (X1-X2) What would the best way (for SPEED) to do this? or is vectorizing as above the best? Any hints, suggestions? Thanks, Sachin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Vectorizing for weighted distance
I'm starting to get a clearer idea of what you mean: there are two (possibly three) routes you can go:

1) If your matrices are sparse (mostly zero), there's some specialized work on multiplying them quickly.

2) You can look at the RcppArmadillo package, which interfaces to a very high quality linear algebra backend. I think this one is likely to give a very nice speedup without requiring too much additional work.

3) (This one is the most technically difficult, but it can be pretty powerful if done correctly.) You can recompile R using a BLAS (Basic Linear Algebra Subprograms) library that's optimized for your machine, rather than the generic one that most computers come with. Something like this: http://math-atlas.sourceforge.net/

Michael

On Thu, Nov 17, 2011 at 5:51 PM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote:

Hi Michael,

Thanks for that. X1 and X2 are typically 1000 by 3 matrices, and I am hoping to scale up to much larger dimensions (say 20,000 by 3). I do appreciate your help, and it seems like this is the best way to do this; I was just wondering if I could squeeze out just a bit more performance, that's all. Anyway, thanks again, much appreciated.

Thanks, Sachin

On Fri, Nov 18, 2011 at 9:15 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote:

I fail to see why you would need another idea: you asked how to multiply matrices efficiently, and I told you how to multiply matrices efficiently. If you want to calculate (X1-X2)' times W times (X1-X2), then simply do so:

X1 <- matrix(1:6, 3)
X2 <- matrix(7:12, 3)
W <- matrix(runif(9), 3)
t(X1-X2) %*% W %*% (X1-X2)

which gives

         [,1]     [,2]
[1,] 142.7789 142.7789
[2,] 142.7789 142.7789

You could squeeze out one iota more of speed with crossprod(X1-X2, W) %*% (X1-X2) to get the same result, but unless you are doing massive scale linear processing, I'm not sure it's worth the loss of clarity.
I was only giving you a heads up on the sometimes confusing difference between matrix multiplication in MATLAB and in R by which a vector is not a 1d matrix and so does not require explicit transposition. Michael On Thu, Nov 17, 2011 at 4:35 PM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: I'm not quite sure of what you mean by not worry if it's 1d R matrices. X1 and X2 are both n by d matrices and W is d by d. Thanks for the help though. Any other ideas? Thanks Sachin On Friday, November 18, 2011, R. Michael Weylandt michael.weyla...@gmail.com wrote: The fastest is probably to just implement the matrix calculation directly in R with the %*% operator. (X1-X2) %*% W %*% (X1-X2) You don't need to worry about the transposing if you are passing R vectors X1,X2. If they are 1-d matrices, you might need to. Michael On Thu, Nov 17, 2011 at 1:30 AM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: Hi All, I am trying to convert the following piece of matlab code to R: XX1 = sum(w(:,ones(1,N1)).*X1.*X1,1); #square the elements of X1, weight it and repeat this vector N1 times XX2 = sum(w(:,ones(1,N2)).*X2.*X2,1); #square the elements of X2, weigh and repeat this vector N2 times X1X2 = (w(:,ones(1,N1)).*X1)'*X2; #get the weighted 'covariance' term XX1T = XX1'; #transpose z = XX1T(:,ones(1,N2)) + XX2(ones(1,N1),:) - 2*X1X2; #get the squared weighted distance which is basically doing: z=(X1-X2)' W (X1-X2) What would the best way (for SPEED) to do this? or is vectorizing as above the best? Any hints, suggestions? Thanks, Sachin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
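To make the equivalence discussed in this thread concrete, here is a small check; the matrices are random, so the point is only that the two forms agree:

```r
## t(D) %*% W %*% D versus crossprod(D, W) %*% D -- crossprod(A, B)
## computes t(A) %*% B without materialising the transpose.
set.seed(1)
X1 <- matrix(rnorm(12), nrow = 4)   # 4 x 3
X2 <- matrix(rnorm(12), nrow = 4)
W  <- matrix(runif(16), nrow = 4)   # conformable with nrow(X1)
D  <- X1 - X2
z1 <- t(D) %*% W %*% D
z2 <- crossprod(D, W) %*% D
all.equal(z1, z2)                   # the two forms give the same 3 x 3 result
```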
Re: [R] return only pairwise correlations greater than given value
This is probably not the prettiest or most efficient function ever, but this seems to do what I wanted:

spec.cor <- function(dat, r, ...){
  require(reshape)
  d1 <- data.frame(cor(dat))
  d2 <- melt(d1)
  d2[,3] <- rep(rownames(d1), nrow(d2)/length(unique(d2[,1])))
  d2 <- d2[, c("variable", "V3", "value")]
  colnames(d2) <- c("V1", "V2", "value")
  d2 <- d2[with(d2, which(V1 != V2, arr.ind=TRUE)), ]
  d2 <- d2[which(d2[,3] >= r | d2[,3] <= -r, arr.ind=TRUE), ]
  d2[,1:2] <- t(apply(d2[,1:2], MARGIN=1, function(x) sort(x)))
  d2 <- unique(d2)
  return(d2)
}

data(mtcars)
spec.cor(mtcars[,2:5], .6)

Using  as id variables
     V1   V2      value
2   cyl disp  0.9020329
3   cyl   hp  0.8324475
4   cyl drat -0.6999381
7  disp   hp  0.7909486
8  disp drat -0.7102139

I'm not sure how to make melt() quit giving the "Using ... as id variables" message, but I don't really care either.

B77S wrote:

Thanks Michael, I just started on the following code (below), and realized I should ask, as this likely exists already. Basically what I'd like is for the function to return (basically) what you just suggested, plus the names of the two variables (I suppose pasted together would be good). I hope that is clear; obviously I didn't get so far as to add the names to the output.

sig.cor <- function(dat, r, ...){
  cv2 <- data.frame(cor(dat))
  var.names <- rownames(cv2)
  list.cv2 <- which(cv2 >= r | cv2 <= -r, arr.ind=TRUE)
  cor.r <- cv2[list.cv2[which(list.cv2[,"row"] != list.cv2[,"col"]), ]]
  cor.names <- var.names[list.cv2[which(list.cv2[,"row"] != list.cv2[,"col"]), ]]
  return(cor.r)
}

data(mtcars)
sig.cor(mtcars[,2:5], .90)
# [1] 0.9020329 0.9020329
# Ideally this would look like this: cyl-disp 0.9020329

Michael Weylandt wrote:

What exactly do you mean "returns them"? More generally, I suppose, what do you have in mind to do with this? You could do something like this:

BigCorrelation <- function(X){
  return(which(abs(cor(X)) > 0.9, arr.ind = TRUE))
}

but it hardly seems worth its own function call.
On Thu, Nov 17, 2011 at 12:42 AM, B77S lt;bps0002@gt; wrote: Hello, I would like to find out if a function already exists that returns only pairwise correlations above/below a certain threshold (e.g, -.90, .90) Thank you. -- View this message in context: http://r.789695.n4.nabble.com/return-only-pairwise-correlations-greater-than-given-value-tp4079028p4079028.html Sent from the R help mailing list archive at Nabble.com. __ R-help@ mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@ mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://r.789695.n4.nabble.com/return-only-pairwise-correlations-greater-than-given-value-tp4079028p4081534.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] return only pairwise correlations greater than given value
Hi Brad,

You do not really need to reshape the correlation matrix. This seems to do what you want:

spec.cor <- function(dat, r, ...) {
  x <- cor(dat, ...)
  x[upper.tri(x, TRUE)] <- NA
  i <- which(abs(x) >= r, arr.ind = TRUE)
  data.frame(matrix(colnames(x)[as.vector(i)], ncol = 2), value = x[i])
}

spec.cor(mtcars[, 2:5], .6)

Cheers, Josh

On Wed, Nov 16, 2011 at 9:58 PM, B77S bps0...@auburn.edu wrote:

Thanks Michael, I just started on the following code (below), and realized I should ask, as this might exist. Basically what I'd like is for the function to return (basically) what you just suggested, plus the names of the two variables (I suppose pasted together would be good). I hope that is clear.

sig.cor <- function(dat, r, ...){
  cv2 <- data.frame(cor(dat))
  var.names <- rownames(cv2)
  list.cv2 <- which(cv2 >= r | cv2 <= -r, arr.ind=TRUE)
  cor.r <- cv2[list.cv2[which(list.cv2[,"row"] != list.cv2[,"col"]), ]]
  cor.names <- var.names[list.cv2[which(list.cv2[,"row"] != list.cv2[,"col"]), ]]
  return(cor.r)
}

Michael Weylandt wrote:

What exactly do you mean "returns them"? More generally, I suppose, what do you have in mind to do with this? You could do something like this:

BigCorrelation <- function(X){
  return(which(abs(cor(X)) > 0.9, arr.ind = TRUE))
}

but it hardly seems worth its own function call.

On Thu, Nov 17, 2011 at 12:42 AM, B77S <bps0002@> wrote:

Hello, I would like to find out if a function already exists that returns only pairwise correlations above/below a certain threshold (e.g., -.90, .90). Thank you.

-- View this message in context: http://r.789695.n4.nabble.com/return-only-pairwise-correlations-greater-than-given-value-tp4079028p4079028.html
Sent from the R help mailing list archive at Nabble.com.
-- View this message in context: http://r.789695.n4.nabble.com/return-only-pairwise-correlations-greater-than-given-value-tp4079028p4079044.html
Sent from the R help mailing list archive at Nabble.com.

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
Re: [R] Combining data
On Nov 17, 2011, at 10:37 AM, Nasrin Pak wrote:

Hi all; It seemed to be easy at first, but I didn't manage to find the answer through a Google search. I have a set of data for every second of the experiment, but I don't need such a high resolution for my analysis. I want to replace every 30 rows of my data with their average value, and then save the new data set in a new csv file to be able to have a smaller Excel data sheet. What is the command for combining a certain number of data into their average value?

This aggregates mean values in groups of ten:

aggregate(data.frame(a=rnorm(100)), list(rep(1:10, each=10)), FUN=mean)

   Group.1           a
1        1 -0.59492893
2        2  0.20087525
3        3 -0.06310919
4        4 -0.60778424
5        5 -0.01435818
6        6 -0.01159243
7        7  0.05921309
8        8 -0.04881492
9        9  0.43796040
10      10 -0.02968688

Thank you

--
Nasrin Pak
MSc Student in Environmental Physics
University of Calgary

David Winsemius, MD
West Hartford, CT
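Adapting David's aggregate() call to the question's block size of 30 and writing the result to a smaller csv (a sketch; the filename "averaged.csv" is made up here):

```r
## 300 one-second readings -> 10 rows of 30-second block means,
## then saved to a new, smaller csv file.
dat <- data.frame(a = rnorm(300))
grp <- rep(seq_len(nrow(dat) / 30), each = 30)          # 1,1,...,2,2,... in 30s
out <- aggregate(dat, by = list(block = grp), FUN = mean)
write.csv(out, "averaged.csv", row.names = FALSE)
```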
Re: [R] return only pairwise correlations greater than given value
Excellent; thanks Josh.

Joshua Wiley-2 wrote: Hi Brad, You do not really need to reshape the correlation matrix. This seems to do what you want:

spec.cor <- function(dat, r, ...) {
  x <- cor(dat, ...)
  x[upper.tri(x, TRUE)] <- NA
  i <- which(abs(x) >= r, arr.ind = TRUE)
  data.frame(matrix(colnames(x)[as.vector(i)], ncol = 2), value = x[i])
}
spec.cor(mtcars[, 2:5], .6)

Cheers, Josh

On Wed, Nov 16, 2011 at 9:58 PM, B77S bps0002@... wrote: Thanks Michael, I had just started on the following code (below), and realized I should ask, as this might already exist. Basically what I'd like is for the function to return (basically) what you just suggested, plus the names of the two variables (I suppose pasted together would be good). I hope that is clear.

sig.cor <- function(dat, r, ...) {
  cv2 <- data.frame(cor(dat))
  var.names <- rownames(cv2)
  list.cv2 <- which(cv2 >= r | cv2 <= -r, arr.ind = TRUE)
  cor.r <- cv2[list.cv2[which(list.cv2[, "row"] != list.cv2[, "col"]), ]]
  cor.names <- var.names[list.cv2[which(list.cv2[, "row"] != list.cv2[, "col"]), ]]
  return(cor.r)
}

Michael Weylandt wrote: What exactly do you mean by "returns them"? More generally, what do you have in mind to do with this? You could do something like this:

BigCorrelation <- function(X) {
  return(which(abs(cor(X)) > 0.9, arr.ind = TRUE))
}

but it hardly seems worth its own function call.

On Thu, Nov 17, 2011 at 12:42 AM, B77S bps0002@... wrote: Hello, I would like to find out if a function already exists that returns only pairwise correlations above/below a certain threshold (e.g., -.90, .90). Thank you.
-- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/
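Josh's helper is self-contained, so it can be checked against a built-in data set. A quick run on mtcars columns 2:5 (cyl, disp, hp, drat), with arrows and quoting restored:

```r
# spec.cor(): list variable pairs whose absolute correlation is >= r.
spec.cor <- function(dat, r, ...) {
  x <- cor(dat, ...)
  x[upper.tri(x, TRUE)] <- NA               # keep each pair once, drop diagonal
  i <- which(abs(x) >= r, arr.ind = TRUE)   # which() silently skips the NAs
  data.frame(matrix(colnames(x)[as.vector(i)], ncol = 2), value = x[i])
}
res <- spec.cor(mtcars[, 2:5], 0.6)
res  # all pairs except hp-drat clear the 0.6 threshold
```

The returned data frame has one row per qualifying pair, with both variable names and the correlation value.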
Re: [R] merging corpora and metadata
Hi Henri-Paul, This can be rather tricky. It would really help if you could give us a reproducible example. In this case, because you are dealing with non-standard data structures (or at least added attributes), that means the data exactly as R sees it: either (A) code to create some data that demonstrates your problem, or (B) the output of calling dput(corpus.1) (see ?dput for what it does and what to do). One possibility (though it does not concatenate per se):

combined <- list(corpus.1, corpus.2)

*If* (there are attributes only in corpus.1 OR corpus.2) OR (the attribute names in corpus.1 and corpus.2 are unique), then you could do:

combined <- c(corpus.1, corpus.2)
attributes(combined) <- c(attributes(corpus.1), attributes(corpus.2))

but note that it is *very* likely that at least the names attributes overlap, so you would need to address that somehow. If attributes overlap, you need to somehow merge them, and what an appropriate way to do that is, I have no idea without knowing more about the data and what is expected by the functions that work with it. Best regards, Josh

On Thu, Nov 17, 2011 at 1:43 PM, Henri-Paul Indiogine hindiog...@gmail.com wrote: Greetings! I lose all my metadata after concatenating corpora. This is an example of what happens:

meta(corpus.1)
   MetaID cid fid selfirst selend                      fname
1       0   1  11     2169   2518 WCPD-2001-01-29-Pg217.scrb
2       0   1  14     9189   9702  WCPD-2003-01-13-Pg39.scrb
3       0   1  14     2109   2577  WCPD-2003-01-13-Pg39.scrb
17      0   1 114    17863  18256 WCPD-2007-04-30-Pg515.scrb

meta(corpus.2)
   MetaID cid fid selfirst selend                       fname
1       0   2   2    11016  11600         DCPD-200900595.scrb
2       0   2   6    19510  20098         DCPD-201000636.scrb
3       0   2   6    23935  24573         DCPD-201000636.scrb
94      0   2 127    16225  17128 WCPD-2009-01-12-Pg22-3.scrb

tot.corpus <- c(corpus.1, corpus.2)
meta(tot.corpus)
    MetaID
1        0
2        0
3        0
111      0

This is from the structure of corpus.1:

..$ MetaData:List of 2
.. ..$ create_date: POSIXlt[1:1], format: "2011-11-17 21:09:57"
.. ..$ creator    : chr "henk"
..$ Children: NULL
..- attr(*, "class")= chr "MetaDataNode"
- attr(*, "DMetaData")='data.frame': 17 obs. of 6 variables:
..$ MetaID  : num [1:17] 0 0 0 0 0 0 0 0 0 0 ...
..$ cid     : int [1:17] 1 1 1 1 1 1 1 1 1 1 ...
..$ fid     : int [1:17] 11 14 14 17 46 80 80 80 91 91 ...
..$ selfirst: num [1:17] 2169 9189 2109 8315 9439 ...
..$ selend  : num [1:17] 2518 9702 2577 8881 10102 ...
..$ fname   : chr [1:17] "WCPD-2001-01-29-Pg217.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2004-05-17-Pg856.scrb" ...
- attr(*, "class")= chr [1:3] "VCorpus" "Corpus" "list"

Any idea on what I could do to keep the metadata in the merged corpus? Thanks, Henri-Paul -- Henri-Paul Indiogine Curriculum & Instruction Texas A&M University TutorFind Learning Centre Email: hindiog...@gmail.com Skype: hindiogine Website: http://people.cehd.tamu.edu/~sindiogine

-- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/
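Josh's "merge the attributes yourself" idea can be sketched generically. This is not tm's API, and whether tm's meta() accessors would accept the result is untested; the names below (the "DMetaData" attribute, mirroring the str() output, and the toy "demo" class) are assumptions for illustration only:

```r
# Combine two list-like objects and row-bind one shared data.frame
# attribute, reattaching it after c() strips attributes.
merge_with_meta <- function(a, b, attr_name = "DMetaData") {
  out <- c(unclass(a), unclass(b))   # plain list concatenation
  attr(out, attr_name) <- rbind(attr(a, attr_name), attr(b, attr_name))
  class(out) <- class(a)
  out
}

a <- structure(list("doc1", "doc2"),
               DMetaData = data.frame(cid = 1, fid = 1:2), class = "demo")
b <- structure(list("doc3"),
               DMetaData = data.frame(cid = 2, fid = 3), class = "demo")
m <- merge_with_meta(a, b)
```

The combined object keeps all three documents and a four-column metadata frame with one row per document.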
Re: [R] merging corpora and metadata
What package is all this from? You might check whether it provides a special rbind/cbind method. I don't think you can easily change the behavior of c(). Michael

On Nov 17, 2011, at 4:43 PM, Henri-Paul Indiogine hindiog...@gmail.com wrote: Greetings! I lose all my metadata after concatenating corpora. This is an example of what happens:

meta(corpus.1)
   MetaID cid fid selfirst selend                      fname
1       0   1  11     2169   2518 WCPD-2001-01-29-Pg217.scrb
2       0   1  14     9189   9702  WCPD-2003-01-13-Pg39.scrb
3       0   1  14     2109   2577  WCPD-2003-01-13-Pg39.scrb
17      0   1 114    17863  18256 WCPD-2007-04-30-Pg515.scrb

meta(corpus.2)
   MetaID cid fid selfirst selend                       fname
1       0   2   2    11016  11600         DCPD-200900595.scrb
2       0   2   6    19510  20098         DCPD-201000636.scrb
3       0   2   6    23935  24573         DCPD-201000636.scrb
94      0   2 127    16225  17128 WCPD-2009-01-12-Pg22-3.scrb

tot.corpus <- c(corpus.1, corpus.2)
meta(tot.corpus)
    MetaID
1        0
2        0
3        0
111      0

This is from the structure of corpus.1:

..$ MetaData:List of 2
.. ..$ create_date: POSIXlt[1:1], format: "2011-11-17 21:09:57"
.. ..$ creator    : chr "henk"
..$ Children: NULL
..- attr(*, "class")= chr "MetaDataNode"
- attr(*, "DMetaData")='data.frame': 17 obs. of 6 variables:
..$ MetaID  : num [1:17] 0 0 0 0 0 0 0 0 0 0 ...
..$ cid     : int [1:17] 1 1 1 1 1 1 1 1 1 1 ...
..$ fid     : int [1:17] 11 14 14 17 46 80 80 80 91 91 ...
..$ selfirst: num [1:17] 2169 9189 2109 8315 9439 ...
..$ selend  : num [1:17] 2518 9702 2577 8881 10102 ...
..$ fname   : chr [1:17] "WCPD-2001-01-29-Pg217.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2004-05-17-Pg856.scrb" ...
- attr(*, "class")= chr [1:3] "VCorpus" "Corpus" "list"

Any idea on what I could do to keep the metadata in the merged corpus?
Thanks, Henri-Paul -- Henri-Paul Indiogine Curriculum & Instruction Texas A&M University TutorFind Learning Centre Email: hindiog...@gmail.com Skype: hindiogine Website: http://people.cehd.tamu.edu/~sindiogine
[R] calling self written R functions
Hi All, I have written a function (say) called foo, saved in a file called foo.R. Going by Matlab syntax, I usually just change my folder path and can then call it at will. In R, what is the usual way of calling/loading it? R doesn't seem to automatically find the function from a folder (which might be a stupid thing to attempt in the first place). Thanks, Sachin
Re: [R] merging corpora and metadata
Hi Michael,

require(sos)
findFn("{meta}", sortby = "Function")
## see that only two functions have the exact name 'meta'
## one is titled "Meta Data Management" in the package 'tm'
## seems a pretty likely choice

Also, the fact that it is a truly terrible idea does not mean it is not easy:

mvir <- new.env()
mvir$c <- function(x, ...) { cat("sure you can!\n"); mean(x, ...) }
attach(mvir)
c(x = 1:10)
detach(mvir)
rm(mvir)

Cheers, Josh

On Thu, Nov 17, 2011 at 5:25 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote: What package is all this from? You might check whether it provides a special rbind/cbind method. I don't think you can easily change the behavior of c(). Michael

On Nov 17, 2011, at 4:43 PM, Henri-Paul Indiogine hindiog...@gmail.com wrote: Greetings! I lose all my metadata after concatenating corpora. This is an example of what happens:

meta(corpus.1)
   MetaID cid fid selfirst selend                      fname
1       0   1  11     2169   2518 WCPD-2001-01-29-Pg217.scrb
2       0   1  14     9189   9702  WCPD-2003-01-13-Pg39.scrb
3       0   1  14     2109   2577  WCPD-2003-01-13-Pg39.scrb
17      0   1 114    17863  18256 WCPD-2007-04-30-Pg515.scrb

meta(corpus.2)
   MetaID cid fid selfirst selend                       fname
1       0   2   2    11016  11600         DCPD-200900595.scrb
2       0   2   6    19510  20098         DCPD-201000636.scrb
3       0   2   6    23935  24573         DCPD-201000636.scrb
94      0   2 127    16225  17128 WCPD-2009-01-12-Pg22-3.scrb

tot.corpus <- c(corpus.1, corpus.2)
meta(tot.corpus)
    MetaID
1        0
2        0
3        0
111      0

This is from the structure of corpus.1:

..$ MetaData:List of 2
.. ..$ create_date: POSIXlt[1:1], format: "2011-11-17 21:09:57"
.. ..$ creator    : chr "henk"
..$ Children: NULL
..- attr(*, "class")= chr "MetaDataNode"
- attr(*, "DMetaData")='data.frame': 17 obs. of 6 variables:
..$ MetaID  : num [1:17] 0 0 0 0 0 0 0 0 0 0 ...
..$ cid     : int [1:17] 1 1 1 1 1 1 1 1 1 1 ...
..$ fid     : int [1:17] 11 14 14 17 46 80 80 80 91 91 ...
..$ selfirst: num [1:17] 2169 9189 2109 8315 9439 ...
..$ selend  : num [1:17] 2518 9702 2577 8881 10102 ...
..$ fname   : chr [1:17] "WCPD-2001-01-29-Pg217.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2004-05-17-Pg856.scrb" ...
- attr(*, "class")= chr [1:3] "VCorpus" "Corpus" "list"

Any idea on what I could do to keep the metadata in the merged corpus? Thanks, Henri-Paul
Re: [R] calling self written R functions
Hi Sachin, Nope, R does not work that way. You do have several options, though. For a function or two, consider creating/editing an .Rprofile file; https://www.google.com/?q=Rprofile should bring up a fair number of pages describing this, and you might look at a few. If you find yourself accumulating a little collection of functions, with some of them possibly depending on each other, and/or wanting documentation for your function(s), it is time to write a package. There was a nice video tutorial on this at an LA R User Group meeting not too long ago; you can find it here: http://www.youtube.com/watch?v=TER-rQoVs0k You can also see the official manual on extensions for how to write packages: http://cran.r-project.org/doc/manuals/R-exts.html Cheers! Josh

On Thu, Nov 17, 2011 at 5:26 PM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: Hi All, I have written a function (say) called foo, saved in a file called foo.R. Going by Matlab syntax, I usually just change my folder path and can then call it at will. In R, what is the usual way of calling/loading it? R doesn't seem to automatically find the function from a folder (which might be a stupid thing to attempt in the first place). Thanks, Sachin
Re: [R] calling self written R functions
?source

source("/path/to/foo.R")

will load it into R. Sarah

On Thu, Nov 17, 2011 at 8:26 PM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: Hi All, I have written a function (say) called foo, saved in a file called foo.R. Going by Matlab syntax, I usually just change my folder path and can then call it at will. In R, what is the usual way of calling/loading it? R doesn't seem to automatically find the function from a folder (which might be a stupid thing to attempt in the first place). Thanks, Sachin

-- Sarah Goslee http://www.functionaldiversity.org
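Sarah's source() call generalizes to a whole folder of helper files, which gets close to the Matlab workflow the poster described. A sketch; the directory name "~/Rfunctions" is a placeholder, not from the thread:

```r
# Source every .R file found in a folder of personal helper functions.
# If the folder does not exist, list.files() returns character(0) and
# the loop simply does nothing.
r_files <- list.files("~/Rfunctions", pattern = "\\.R$", full.names = TRUE)
for (f in r_files) source(f)
```

Putting such a loop in an .Rprofile file (as Josh suggested) makes the helpers available at startup.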
Re: [R] merging corpora and metadata
Hi Josh, You're absolutely right. I suppose one could set up some sort of S3 thing for Henri's problem:

c <- function(..., recursive = FALSE) UseMethod("c")
c.default <- base::c
c.corpus <- function(..., recursive = FALSE) {
  ans <- c.default(...)
  attributes(ans) <- do.call(c, lapply(list(...), attributes))
  ans
}

But agreed, it seems deeply risky. Cheers, Michael

On Thu, Nov 17, 2011 at 9:01 PM, Joshua Wiley jwiley.ps...@gmail.com wrote: Hi Michael,

require(sos)
findFn("{meta}", sortby = "Function")
## see that only two functions have the exact name 'meta'
## one is titled "Meta Data Management" in the package 'tm'
## seems a pretty likely choice

Also, the fact that it is a truly terrible idea does not mean it is not easy:

mvir <- new.env()
mvir$c <- function(x, ...) { cat("sure you can!\n"); mean(x, ...) }
attach(mvir)
c(x = 1:10)
detach(mvir)
rm(mvir)

Cheers, Josh

On Thu, Nov 17, 2011 at 5:25 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote: What package is all this from? You might check whether it provides a special rbind/cbind method. I don't think you can easily change the behavior of c(). Michael

On Nov 17, 2011, at 4:43 PM, Henri-Paul Indiogine hindiog...@gmail.com wrote: Greetings! I lose all my metadata after concatenating corpora. This is an example of what happens:

meta(corpus.1)
   MetaID cid fid selfirst selend                      fname
1       0   1  11     2169   2518 WCPD-2001-01-29-Pg217.scrb
2       0   1  14     9189   9702  WCPD-2003-01-13-Pg39.scrb
3       0   1  14     2109   2577  WCPD-2003-01-13-Pg39.scrb
17      0   1 114    17863  18256 WCPD-2007-04-30-Pg515.scrb

meta(corpus.2)
   MetaID cid fid selfirst selend                       fname
1       0   2   2    11016  11600         DCPD-200900595.scrb
2       0   2   6    19510  20098         DCPD-201000636.scrb
3       0   2   6    23935  24573         DCPD-201000636.scrb
94      0   2 127    16225  17128 WCPD-2009-01-12-Pg22-3.scrb

tot.corpus <- c(corpus.1, corpus.2)
meta(tot.corpus)
    MetaID
1        0
2        0
3        0
111      0

This is from the structure of corpus.1:

..$ MetaData:List of 2
.. ..$ create_date: POSIXlt[1:1], format: "2011-11-17 21:09:57"
.. ..$ creator    : chr "henk"
..$ Children: NULL
..- attr(*, "class")= chr "MetaDataNode"
- attr(*, "DMetaData")='data.frame': 17 obs. of 6 variables:
..$ MetaID  : num [1:17] 0 0 0 0 0 0 0 0 0 0 ...
..$ cid     : int [1:17] 1 1 1 1 1 1 1 1 1 1 ...
..$ fid     : int [1:17] 11 14 14 17 46 80 80 80 91 91 ...
..$ selfirst: num [1:17] 2169 9189 2109 8315 9439 ...
..$ selend  : num [1:17] 2518 9702 2577 8881 10102 ...
..$ fname   : chr [1:17] "WCPD-2001-01-29-Pg217.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2004-05-17-Pg856.scrb" ...
- attr(*, "class")= chr [1:3] "VCorpus" "Corpus" "list"

Any idea on what I could do to keep the metadata in the merged corpus? Thanks, Henri-Paul
Re: [R] Pairwise correlation
Here's a function Josh Wiley provided in another thread:

spec.cor <- function(dat, r, ...) {
  x <- cor(dat, ...)
  x[upper.tri(x, TRUE)] <- NA
  i <- which(abs(x) >= r, arr.ind = TRUE)
  data.frame(matrix(colnames(x)[as.vector(i)], ncol = 2), value = x[i])
}

Michael

On Thu, Nov 17, 2011 at 4:08 PM, Musa Hassan musah...@gmail.com wrote: Hi Michael, I was able to solve this. I used the WGCNA library, which allows stringsAsFactors to be defined in the workspace, so everything stored as strings remains strings. My problem now is parsing through the results to pull out only significant correlations, defined by a certain Pearson correlation value, say 0.8.

On 17 November 2011 15:32, R. Michael Weylandt michael.weyla...@gmail.com wrote: I can't see how it's stored like that; the email servers garble it up. Use dput() to create a plain-text representation and paste that back in. Thanks, Michael

On Thu, Nov 17, 2011 at 9:37 AM, muzz56 musah...@gmail.com wrote: Hi Michael, Here is a sample of the data.

Gene     Array1   Array2   Array3   Array4   Array5   Array6   Array7   Array8   Array9  Array10  Array11
Fth1   26016.01 23134.66 17445.71 39856.04 27245.45 23622.98 37887.75 49857.46 25864.73 21852.51  29198.4
B2m     7573.64  7768.52  6608.24  8571.65  6380.78  6242.76  6903.92  7330.63  7256.18  5678.21 10937.05
Tmsb4x  6192.44  4277.22  5024.59  4851.51  3062.55  4562.43   7948.1  5018.58  3200.17  2855.77  6139.23
H2-D1   3141.41  3986.06  3328.62   4726.6  3589.89  2885.95  7509.88  5257.62  4742.26  3431.33  5300.72
Prdx5    3935.7   3938.9  3401.68  4193.14  4028.95  3438.19  6640.15  5486.61  4424.57  3368.83  5265.92

I want to retain the gene names in the data.
What you've proposed will take them out, and I'll have to append them back to the results after cor().

On 17 November 2011 09:33, Michael Weylandt [via R] wrote: I think something like this should do it, but I can't test without data:

rownames(mydata) <- mydata[, 1]  # put the elements in the first column as rownames
mydata <- mydata[, -1]           # drop the things that are now rownames

Michael

On Thu, Nov 17, 2011 at 9:23 AM, Musa Hassan [hidden email] wrote: Hi Michael, Thanks for the response. I have noticed that the error occurred during my data read. It appears that the rownames (which become my colnames when the data is transposed) were converted to numbers instead of strings, as they should be. The original header names don't change, just the rownames. I have to figure out how to import the data without the strings being converted. Right now I am using:

mydata <- read.csv("mydata.csv", header = TRUE, stringsAsFactors = FALSE)

then, to convert the data frame to a matrix:

mydata <- data.matrix(mydata)

Then I just do the correlation as Peter suggested:

expression <- cor(t(expression))

Thanks.

On 17 November 2011 08:51, R. Michael Weylandt [hidden email] wrote: On Wed, Nov 16, 2011 at 11:22 PM, muzz56 [hidden email] wrote: Thanks to everyone who replied to my post; I finally got it to work. I am however not sure how well it worked, since it ran so quickly -- it seems like I have a 2000 x 2000 data set. Behold the great and mighty power that is R! Don't worry -- on a decent machine the correlation of a 2k x 2k data set should be pretty fast.
(It's about 9 seconds on my old-ish laptop with a bunch of other junk running.)

My followup questions would be: how do I get only pairs with, say, a certain Pearson correlation value? Additionally, it seems like my output didn't retain the headers but instead replaced them with numbers, making it hard to know which gene pairs correlate.

This is a little worrisome: R carries column names through cor(), so this would suggest you weren't using them. Were your headers listed as part of your data (instead of being names)? If so, they would have been taken as numbers. Take a look at dimnames(NAMEOFDATA) -- if your headers aren't there, then they are being treated as data instead of names. If they are, can you provide some reproducible code so we can debug more fully? The easiest way to send data is to use the dput() function to get a copy-pasteable plain-text representation. It would also be great if you could restrict it to a subset of your data rather than the full 4M data points, but if that's hard to do, don't worry. You should have expected behavior like:

X <- matrix(1:9, 3)
colnames(X) <- c("A", "B", "C")
cor(X)  # prints with labels

Michael

On 16 November 2011 17:11, Nordlund, Dan (DSHS/RDA) [via R] [hidden email] wrote:
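Keeping identifiers out of the numeric data is the crux of Michael's point. A sketch with made-up gene/array names (not the poster's file) showing that names survive cor() when they live in dimnames, plus a threshold filter like the 0.8 cutoff asked about:

```r
# Identifiers belong in dimnames, not in the data, so cor() keeps them.
set.seed(7)
X <- matrix(rnorm(40), nrow = 4,
            dimnames = list(paste0("gene", 1:4), paste0("array", 1:10)))
cc <- cor(t(X))                                  # gene-by-gene correlations
hits <- which(abs(cc) >= 0.8 & upper.tri(cc), arr.ind = TRUE)
pairs <- data.frame(gene1 = rownames(cc)[hits[, 1]],
                    gene2 = rownames(cc)[hits[, 2]],
                    r = cc[hits])
```

upper.tri() keeps each pair once and excludes the diagonal, so self-correlations never appear in the result.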
Re: [R] calling self written R functions
Looks like the function I was looking for was source(). Thanks also, Joshua -- I certainly do need to make a package once I finish this set of re-coding from Matlab into R. Fingers crossed the effort is worth it. Thanks, Sachin

On Fri, Nov 18, 2011 at 1:34 PM, Sarah Goslee sarah.gos...@gmail.com wrote: ?source

source("/path/to/foo.R")

will load it into R. Sarah

On Thu, Nov 17, 2011 at 8:26 PM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: Hi All, I have written a function (say) called foo, saved in a file called foo.R. Going by Matlab syntax, I usually just change my folder path and can then call it at will. In R, what is the usual way of calling/loading it? R doesn't seem to automatically find the function from a folder (which might be a stupid thing to attempt in the first place). Thanks, Sachin

-- Sarah Goslee http://www.functionaldiversity.org
Re: [R] Log-transform and specifying Gamma
Peter Minting peter_minting at hotmail.com writes:

Dear R help, I am trying to work out if I am justified in log-transforming data and specifying Gamma in the same glm. Does it have to be one or the other?

No, but I've never seen it done.

I have attached an R script and the datafile to show what I mean. Also, I cannot find a mixed model that allows Gamma errors (so I cannot find a way of including random effects). What should I do? Many thanks, Pete

ToadsBd <- read.table("Bd.txt", header = TRUE,
                      colClasses = c(rep("factor", 2), rep("numeric", 3), "factor"))
with(ToadsBd, table(group, site))  ## 47 toads, 3 groups per site
library(ggplot2)
library(mgcv)
## plot points, add linear regressions per group/site
ggplot(ToadsBd, aes(x = startg, y = logBd, colour = group)) +
  geom_point() + facet_grid(. ~ site) + geom_smooth(method = "lm")
## not much going on with startg ... PERHAPS
## similar slopes across sites?
ggplot(ToadsBd, aes(x = site:group, y = logBd, colour = site)) +
  geom_boxplot() + geom_point()
## I'm curious -- I thought the groups were just blocking
## factors, but maybe not? The patterns of groups 1, 2, 3
## are consistent across sites ...
## take a quick look at the raw data ...
ggplot(ToadsBd, aes(x = site:group, y = Bd, colour = site)) +
  geom_boxplot() + geom_point()
mod1 <- lm(logBd ~ group*site*startg, data = ToadsBd)
summary(mod1)
oldpar <- par(mfrow = c(2, 2))
plot(mod1)
par(oldpar)
## we definitely have to take care of the heteroscedasticity ...
library(MASS)
boxcox(mod1)
## square root transform ... ??
ToadsBd <- transform(ToadsBd, sqrtLogBd = sqrt(logBd))
mod2 <- lm(sqrtLogBd ~ group*site*startg, data = ToadsBd)
oldpar <- par(mfrow = c(2, 2))
plot(mod2)
par(oldpar)
## still not perfect, but perhaps OK
library(coefplot2)
coefplot2(mod2)
mod3 <- update(mod2, . ~ . - group:site:startg)
coefplot2(mod3)
drop1(mod3, test = "F")
mod4 <- update(mod3, . ~ group + site + startg)
coefplot2(mod4)
## look at results on the new (transformed) scale
ggplot(ToadsBd, aes(x = site:group, y = sqrtLogBd, colour = site)) +
  geom_boxplot() + geom_point()
## Conclusions:
## don't mess around with random effects for only three groups in two sites
## I have done a fair amount of stepwise selection, so the p-values
## really can't be taken seriously, but it was clear from the
## beginning that there were differences among groups, which *seem*
## to be consistent among sites (which really makes me wonder what
## the groups are). (The weak effect of site might well go away
## once one took the effect of snooping into account ...)
## sqrt(log(x)) seems to be adequate to get reasonably
## homogeneous variances, but it really is a very strong transformation.
## It makes the results somewhat hard to interpret. Alternatively,
## you could just look at a nonparametric test (e.g. Kruskal-Wallis
## on site:group), but nonparametric tests make it hard to
## add lots of structure to the model
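On the original question: a Gamma GLM with a log link is the usual alternative to log-transforming the response first, since the link puts the mean structure on the log scale while the response stays untransformed. A sketch on simulated data (not the poster's toad data):

```r
# Gamma errors, log link: the fitted mean is exp(linear predictor).
# Data are simulated purely for illustration.
set.seed(1)
x <- runif(100)
mu <- exp(1 + 2 * x)                        # true mean on the response scale
y <- rgamma(100, shape = 5, rate = 5 / mu)  # Gamma with mean mu
fit <- glm(y ~ x, family = Gamma(link = "log"))
coef(fit)                                   # intercept near 1, slope near 2
```

For the random-effects part of the question, lme4's glmer() reportedly accepts family = Gamma(link = "log"), though such fits can be delicate; that is worth checking against the current lme4 documentation.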
[R] S4 : defining [<- using inheritance from 2 classes
Hi the list, I define a class 'C' that inherits from two classes 'A' and 'B'. 'A' and 'B' have no slots with similar names.

setClass(
  Class = "C",
  contains = c("A", "B")
)

To define the get operator '[' for class C, I simply use the get of A or B (the constant 'SLOT_OF_A' is a character vector holding the names of all the slots of A):

setMethod("[", "C", function(x, i, j, drop) {
  if (i %in% SLOT_OF_A) {
    x <- as(x, "A")
  } else {
    x <- as(x, "B")
  }
  return(x[i, j])
})

Is it possible to do something similar for the set operator '[<-'? Thanks Christophe
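Yes: the assignment analogue is defined with setReplaceMethod() (equivalently, setMethod() on "[<-"). A minimal toy sketch; the classes below are stand-ins for the poster's real A, B, and C, and the direct slot() access bypasses whatever '[' logic the real parent classes implement, so the body would need adapting to delegate via as() as in the getter:

```r
library(methods)

# Toy stand-ins for the poster's classes.
setClass("A", representation(a = "numeric"))
setClass("B", representation(b = "character"))
setClass("C", contains = c("A", "B"))

# Getter: index by slot name.
setMethod("[", "C", function(x, i, j, ..., drop = TRUE) slot(x, i))

# Setter: "[<-" defined via setReplaceMethod().
setReplaceMethod("[", "C", function(x, i, j, ..., value) {
  slot(x, i) <- value
  validObject(x)
  x
})

obj <- new("C", a = c(1, 2, 3), b = "hi")
obj["a"] <- c(9, 8, 7)
```

Note the replacement method's signature must end in `value`, matching the "[<-" generic.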
[R] Drawing ticks in the 3rd and 4th row of a lattice
Dear all, I want to draw ticks on the 3rd and 4th rows of panels in a lattice plot. How do I do this? In my search of the help, I discovered the parameter 'alternating', which controls where the tick labels go but does not suffice for me. I am running this command:

barchart(X03/1000 ~ time | Company, data = df1[which(df1$time != 1), ],
         horiz = FALSE,
         scales = list(x = list(rot = 45,
                                labels = paste("Mar", c("07", "08", "09", "10", "11")))),
         par.strip.text = list(lineheight = 1, lines = 2, cex = .75),
         ylab = "In Rs. Million", xlab = "", layout = c(3, 4), as.table = TRUE,
         between = list(y = 1))

where my data are:

dput(df1)
structure(list(Company = structure(c(9L, 7L, 1L, 6L, 8L, 4L,
2L, 5L, 11L, 10L, 9L, 7L, 1L, 6L, 8L, 4L, 2L, 5L, 11L, 10L,
9L, 7L, 1L, 6L, 8L, 4L, 2L, 5L, 11L, 10L, 9L, 7L, 1L, 6L, 8L,
4L, 2L, 5L, 11L, 10L, 9L, 7L, 1L, 6L, 8L, 4L, 2L, 5L, 11L, 10L,
9L, 7L, 1L, 6L, 8L, 4L, 2L, 5L, 11L, 10L), .Label = c("Bharat Petroleum Corpn. Ltd.",
"Chennai Petroleum Corpn. Ltd.", "Company Name", "Essar Oil Ltd.",
"Hindalco Industries Ltd.", "Hindustan Petroleum Corpn. Ltd.",
"Indian Oil Corpn. Ltd.", "Mangalore Refinery & Petrochemicals Ltd.",
"Reliance Industries Ltd.", "Steel Authority Of India Ltd.",
"Sterlite Industries (India) Ltd."), class = "factor"), time = c(7,
7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9,
9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 10, 10,
10, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1), X03 = c(722931.1, 751620.5, 304456.3, 294868.9,
192712.6, 36695.4, 188313.4, 98954.9, 100088.7, 72379.9, 848517.5,
864562.2, 347310.9, 301022.1, 253514.5, 165661.6, 206377.7,
108897, 109336.3, 71207.6, 1003504.6, 1145993.8, 392261.5, 341086,
289737.4, 359837.2, 252964.3, 90036.2, 90474.8, 127623.2, 1411082.1,
907480.4, 364637.5, 290915.7, 255397.4, 328557.2, 202855.3,
118725.4, 116647.6, 106254.9, 1772254.7, 1204856.9, 469935.6,
313527.6, 320131.1, 384323.5, 260813.9, 137403.3, 137238.5,
136888.4, 1151658, 974902.76, 375720.36, 308284.06, 262298.6,
255014.98, 64.92, 110803.36, 110757.18, 102870.8)), row.names = c("Reliance Industries Ltd..7",
"Indian Oil Corpn. Ltd..7", "Bharat Petroleum Corpn. Ltd..7",
"Hindustan Petroleum Corpn. Ltd..7", "Mangalore Refinery & Petrochemicals Ltd..7",
"Essar Oil Ltd..7", "Chennai Petroleum Corpn. Ltd..7", "Hindalco Industries Ltd..7",
"Sterlite Industries (India) Ltd..7", "Steel Authority Of India Ltd..7",
"Reliance Industries Ltd..8", "Indian Oil Corpn. Ltd..8", "Bharat Petroleum Corpn. Ltd..8",
"Hindustan Petroleum Corpn. Ltd..8", "Mangalore Refinery & Petrochemicals Ltd..8",
"Essar Oil Ltd..8", "Chennai Petroleum Corpn. Ltd..8", "Hindalco Industries Ltd..8",
"Sterlite Industries (India) Ltd..8", "Steel Authority Of India Ltd..8",
"Reliance Industries Ltd..9", "Indian Oil Corpn. Ltd..9", "Bharat Petroleum Corpn. Ltd..9",
"Hindustan Petroleum Corpn. Ltd..9", "Mangalore Refinery & Petrochemicals Ltd..9",
"Essar Oil Ltd..9", "Chennai Petroleum Corpn. Ltd..9", "Hindalco Industries Ltd..9",
"Sterlite Industries (India) Ltd..9", "Steel Authority Of India Ltd..9",
"Reliance Industries Ltd..10", "Indian Oil Corpn. Ltd..10", "Bharat Petroleum Corpn. Ltd..10",
"Hindustan Petroleum Corpn. Ltd..10", "Mangalore Refinery & Petrochemicals Ltd..10",
"Essar Oil Ltd..10", "Chennai Petroleum Corpn. Ltd..10", "Hindalco Industries Ltd..10",
"Sterlite Industries (India) Ltd..10", "Steel Authority Of India Ltd..10",
"Reliance Industries Ltd..11", "Indian Oil Corpn. Ltd..11", "Bharat Petroleum Corpn. Ltd..11",
"Hindustan Petroleum Corpn. Ltd..11", "Mangalore Refinery & Petrochemicals Ltd..11",
"Essar Oil Ltd..11", "Chennai Petroleum Corpn. Ltd..11", "Hindalco Industries Ltd..11",
"Sterlite Industries (India) Ltd..11", "Steel Authority Of India Ltd..11",
"Reliance Industries Ltd..1", "Indian Oil Corpn. Ltd..1", "Bharat Petroleum Corpn. Ltd..1",
"Hindustan Petroleum Corpn. Ltd..1", "Mangalore Refinery & Petrochemicals Ltd..1",
"Essar Oil Ltd..1", "Chennai Petroleum Corpn. Ltd..1", "Hindalco Industries Ltd..1",
"Sterlite Industries (India) Ltd..1", "Steel Authority Of India Ltd..1"),
.Names = c("Company", "time", "X03"), reshapeLong = structure(list(
varying = structure(list(X03 = c("X03.07", "X03.08", "X03.09",
"X03.10", "X03.11", "X03.1")), .Names = "X03", v.names = "X03",
times = c(7, 8, 9, 10, 11, 1)), v.names = "X03", idvar = "Company",
timevar = "time"), .Names = c("varying", "v.names", "idvar",
"timevar")), class = "data.frame")
Re: [R] merging corpora and metadata
Hi Joshua!

2011/11/17 Joshua Wiley jwiley.ps...@gmail.com:
> One possibility (though it does not concatenate per se):
> combined <- list(corpus.1, corpus.2)

Thanks, I will look into it.

> *if* (there are only attributes in corpus.1 OR corpus.2) OR (the attribute names in corpus.1 and corpus.2 are unique), then you could do:

Unfortunately this is not the case. In the meanwhile I rewrote the code that generates the corpus so that the documents are combined into a single corpus _before_ the metadata are added. That solved the problem.

Thanks for your feedback and suggestions.

Henri-Paul

--
Henri-Paul Indiogine
Curriculum & Instruction, Texas A&M University
TutorFind Learning Centre
Email: hindiog...@gmail.com
Skype: hindiogine
Website: http://people.cehd.tamu.edu/~sindiogine
[R] calculating symmetric matrix
Hi All,

I need to do the calculation W %*% d, and I know that the result is symmetric (since W = t(d) %*% w). My question: given the symmetry, can I compute only the lower/upper triangle (n(n+1)/2 elements) rather than all n^2 elements of the matrix? Is there a way to do this efficiently? My 'n' can be quite large (up to 10,000 or more).

Thanks,
Sachin
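For reference, one standard answer (a sketch, not from the thread): R's crossprod() computes t(x) %*% y in a single BLAS call and avoids forming an explicit transpose, so a product of the shape the poster describes, t(d) %*% w %*% d with w symmetric, can be written as crossprod(d, w %*% d). The matrix sizes below are small illustrative stand-ins.

```r
set.seed(1)
n <- 200
d <- matrix(rnorm(n * n), n, n)
w <- crossprod(matrix(rnorm(n * n), n, n))  # a symmetric weight matrix

# Direct form of the product: t(d) %*% w %*% d
direct <- t(d) %*% w %*% d

# crossprod() form: the same symmetric result, with no explicit transpose copy
fast <- crossprod(d, w %*% d)

all.equal(direct, fast)
```

For the special case w = identity, crossprod(d) alone computes t(d) %*% d and can exploit the symmetry of the result directly.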
Re: [R] Introducing \n's so that par.strip.text can produce multiline strips in lattice
Dear Dennis,

Many thanks. I was wondering if there was a way to edit the variable and put \n's in it. Is there?

Thank you,
Ashim

On Thu, Nov 17, 2011 at 4:07 PM, Dennis Murphy djmu...@gmail.com wrote:

Hi:

This worked for me - I needed to modify some of the strip labels to improve the appearance a bit and also reduced the strip font size a bit to accommodate the lengths of the strings. The main thing was to change \\n to \n. Firstly, I created a new variable called Indic as a character variable and then did some minor surgery on three of the strings:

Indic <- as.character(imports$Indicator)
Indic[3 + 6 * (0:5)] <- "Chemicals and related\n products imports"
Indic[4 + 6 * (0:5)] <- "Pearls, semiprecious \nprecious stones imports"
Indic[5 + 6 * (0:5)] <- "Metaliferrous ores \nmetal scrap imports"

# Read Indic into the imports data frame as a factor:
imports$Indic <- factor(Indic)

# Redo the plot:
barchart(X03/1000 ~ time | Indic,
         data = imports[which(imports$time != 1), ],
         horiz = FALSE,
         scales = list(x = list(rot = 45, labels = paste("Mar", 2007:2011))),
         par.strip.text = list(lineheight = 1, lines = 2, cex = 0.8))

Dennis

On Wed, Nov 16, 2011 at 11:25 PM, Ashim Kapoor ashimkap...@gmail.com wrote:

Dear all,

I have the following data, which has \\n in place of \n. I introduced \n's in the csv file so that I could use it in barchart in lattice. When I did that and read it into R using read.csv, it read it as \\n. My question is how do I introduce \n in the middle of a long string of quoted text so that lattice can make multiline strips. Hitting Enter, which is supposed to introduce \n's, doesn't work because when I go to the middle of the line and press Enter, Open Office thinks that I am done editing my text and takes me to the next line.
dput(imports) structure(list(Indicator = structure(c(5L, 4L, 2L, 12L, 8L, 7L, 5L, 4L, 2L, 12L, 8L, 7L, 5L, 4L, 2L, 12L, 8L, 7L, 5L, 4L, 2L, 12L, 8L, 7L, 5L, 4L, 2L, 12L, 8L, 7L, 5L, 4L, 2L, 12L, 8L, 7L ), .Label = c(, Chemicals and related\\n products imports, Coal export, Gold imports, Gold silver imports, Iron ore export, Iron steel imports, Metaliferrous ores metal scrap imports, Mica export, Ores minerals\\nexport, Other ores \\nminerals export, Pearls precious \\n semiprecious stones imports, Processed minerals\\n export ), class = factor), Units = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c(, Rs.crore), class = factor), Expression = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c(, Ival), class = factor), time = c(7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 1, 1, 1, 1, 1, 1), X03 = c(66170.46, 65337.72, 62669.86, 33870.17, 36779.35, 27133.25, 71829.14, 67226.04, 75086.89, 29505.61, 31750.99, 32961.26, 104786.39, 95323.8, 134276.63, 76263, 36363.61, 41500.36, 140440.36, 135877.91, 111269.69, 76678.27, 36449.89, 36808.06, 162253.77, 154346.72, 124895.76, 142437.03, 42872.16, 43881.85, 109096.024, 103622.438, 101639.766, 71750.816, 36843.2, 36456.956), id = c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L)), row.names = c(1.7, 2.7, 3.7, 4.7, 5.7, 6.7, 1.8, 2.8, 3.8, 4.8, 5.8, 6.8, 1.9, 2.9, 3.9, 4.9, 5.9, 6.9, 1.10, 2.10, 3.10, 4.10, 5.10, 6.10, 1.11, 2.11, 3.11, 4.11, 5.11, 6.11, 1.1, 2.1, 3.1, 4.1, 5.1, 6.1 ), .Names = c(Indicator, Units, Expression, time, X03, id), class = data.frame, reshapeLong = structure(list(varying = structure(list( X03 = c(X03.07, X03.08, 
X03.09, X03.10, X03.11, X03.1)), .Names = X03, v.names = X03, times = c(7, 8, 9, 10, 11, 1)), v.names = X03, idvar = id, timevar = time), .Names = c(varying, v.names, idvar, timevar)))

On which I want to run:

barchart(X03/1000 ~ time | Indicator,
         data = imports[which(imports$time != 1), ],
         horiz = FALSE,
         scales = list(x = list(rot = 45, labels = paste("Mar", 2007:2011))),
         par.strip.text = list(lineheight = 1, lines = 2))

Many thanks,
Ashim.
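On the follow-up question of editing the variable in place rather than the csv file: a minimal sketch (with toy levels standing in for imports$Indicator) that converts the literal two-character "\\n" that read.csv preserves into a real newline directly in the factor levels:

```r
# Toy stand-in for imports$Indicator: levels contain a literal backslash + n
Indicator <- factor(c("Ores minerals\\nexport", "Processed minerals\\n export"))

# Replace the literal backslash-n with a real newline in every level.
# The pattern "\\\\n" is the regex \\n, i.e. backslash followed by 'n'.
levels(Indicator) <- gsub("\\\\n", "\n", levels(Indicator))
```

Editing levels() touches each distinct label once, so every row of the factor is fixed at no extra cost; gsub(.., fixed = TRUE) with pattern "\\n" would work equally well.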
[R] Need Help
Hi,

I need to make a subset of my species abundance matrix containing only the species (columns) with a total abundance (column sum) greater than 0.5, to do ordination in the vegan package. I used the following code but it is not working. Can you please give me a solution?

gl1 <- subset(grassland[,5:44], colSums > 0.05, select=2)

gl1 is my new matrix, and grassland[,5:44] is my original matrix.

thanks,
Dilshan
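For reference, subset()'s second argument filters rows, which is one reason the call above fails; selecting columns by their sums needs logical column indexing instead. A minimal sketch with a made-up stand-in for the grassland data (the 0.5 threshold follows the prose; the real data frame is assumed to hold species abundances in columns 5:44):

```r
set.seed(42)
grassland <- as.data.frame(matrix(runif(10 * 44), nrow = 10))  # toy stand-in

abund <- grassland[, 5:44]                        # the species columns
gl1 <- abund[, colSums(abund) > 0.5, drop = FALSE]  # keep abundant species only
```

drop = FALSE keeps the result a data frame even if only one species passes the threshold.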
[R] building biOps on macports, and configure--vars
All, the MacOSX binary build of the biOps package is broken on CRAN, so I am trying to compile from source. I am very close; the trick is apparently that this package depends on fftw3, libjpeg and libtiff. My fftw3 is in /usr/local/, but my libjpeg and libtiff are in /opt/local/ since I got them through MacPorts. In general, I'd like to keep as many of my third-party libraries as I can in the MacPorts system, which means that I need to point R CMD INSTALL to /opt/local.

Here is the problem I am running into. This is my current command line:

sudo R CMD INSTALL --configure-vars='LIBS=-L/opt/local/lib CPPFLAGS=-I/opt/local/include/' --configure-args='--includedir=/opt/local/include --libdir=/opt/local/include' biOps_0.2.1.1.tar.gz

I found that by using the --configure-vars argument I can get the 'configure' script to finish (without it, the 'configure' script of course fails to find libjpeg and libtiff, and exits). However, after 'configure' runs, when the installer actually tries to build the biOps package, it fails as follows:

gcc-4.2 -arch x86_64 -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/x86_64 -I/usr/local/include -fPIC -g -O2 -c jpegio.c -o jpegio.o
jpegio.c:36:21: error: jpeglib.h: No such file or directory
jpegio.c: In function ‘read_jpg_img_info’:
jpegio.c:52: error: storage size of ‘cinfo’ isn’t known
jpegio.c:53: error: storage size of ‘jerr’ isn’t known
[[... about 30 lines of this follow...]]
make: *** [jpegio.o] Error 1
ERROR: compilation failed for package ‘biOps’

The problem is, obviously, that the gcc call is missing -I/opt/local/include, even though I put that in both --configure-vars and --configure-args! I also note that '--configure-args' seems to have no effect at all; it's '--configure-vars' that allows the 'configure' script to finish successfully.
I've searched and read various docs, but I don't know what else I can do at this point (I'd like to avoid copying my libraries into /usr/local if at all possible, and it does seem that the point of the --configure-* arguments is to let me do that). Am I missing a necessary argument? Is it a bug in the biOps package? Is there any solution/workaround that doesn't involve copying stuff into /usr/local?

Thanks for any help.

Yours,
Timothy Teravainen
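One workaround worth trying (an assumption on my part, not something confirmed in this thread): --configure-vars only affects the configure script, whereas flags placed in a user-level ~/.R/Makevars file are picked up by R's build system for the compilation step itself, so the failing gcc line above would gain the MacPorts paths.

```make
## ~/.R/Makevars -- hypothetical sketch; paths assume a MacPorts layout
CPPFLAGS = -I/opt/local/include
LDFLAGS  = -L/opt/local/lib
```

This keeps the MacPorts libraries where they are and avoids copying anything into /usr/local.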
[R] Delete Rows Dynamically Within a Loop
Ok guys, as requested, I will add more info so that you understand why a simple vector operation is not possible. It's not easy to explain in few words, but let's see.

I have a huge number of points over a 2D space. I divide my space into a grid with a given resolution, say, 100 m. The main loop, which I am not sure is mandatory or not (any alternative is welcome), goes through EACH cell/pixel that contains at least 2 points (right now I am using the method quadratcount within the package spatstat). Inside this loop, i.e. for each of these non-empty cells, I have to find and keep only a maximum of 10 male-female pairs that are within 3 meters of each other. The 3-meter buffer can be done using the disc function within spatstat. To select points falling inside a buffer you can use the method pnt.in.poly within the SDMTools package. All that because pixels have a maximum capacity that cannot be exceeded.

Since in each cell there can be hundreds or thousands of points, I am trying to find a smart way to use another loop/similar method to:
1) go through each point at a time
2) create a buffer and select points of the opposite sex
3) save the closest male-female (0-1) pair in another dataframe (called new_colonies)
4) remove those points from the dataframe so that it shrinks and I don't have to consider them anymore
5) as soon as that new dataframe reaches 10 rows, stop everything and go to the next cell (thus skipping all remaining points).
Here is the code that I developed to be run within each cell (right now it takes too long):

head(df, 20):
           X       Y Sex ID
2   583058.2 2882774   1  1
3   582915.6 2883378   0  2
4   582592.8 2883297   1  3
5   582793.0 2883410   1  4
6   582925.7 2883397   1  5
7   582934.2 2883277   0  6
8   582874.7 2883336   0  7
9   583135.9 2882773   1  8
10  582955.5 2883306   1  9
11  583090.2 2883331   0 10
12  582855.3 2883358   1 11
13  582908.9 2883035   1 12
14  582608.8 2883715   0 13
15  582946.7 2883488   1 14
16  582749.8 2883062   0 15
17  582906.4 2883317   0 16
18  582598.9 2883390   0 17
19  582890.2 2883413   0 18
20  582752.8 2883361   0 19
21  582953.1 2883230   1 20

for(i in 1:dim(df)[1]){
  new_colonies <- data.frame(ID1=0, ID2=0, X=0, Y=0)
  discbuff <- disc(radius, centre=c(df$X[i], df$Y[i]))
  # define the points and polygon
  pnts = cbind(df$X[-i], df$Y[-i])
  polypnts = cbind(x = discbuff$bdry[[1]]$x, y = discbuff$bdry[[1]]$y)
  out = pnt.in.poly(pnts, polypnts)
  out$ID <- df$ID[-i]
  if (any(out$pip == 1)) {
    pnt.inBuffID <- out$ID[which(out$pip == 1)]
    cond <- df$Sex[i] != df$Sex[pnt.inBuffID]
    if (any(cond)){
      eucdist <- sqrt((df$X[i] - df$X[pnt.inBuffID][cond])^2 +
                      (df$Y[i] - df$Y[pnt.inBuffID][cond])^2)
      IDvect <- pnt.inBuffID[cond]
      new_colonies_temp <- data.frame(
        ID1=df$ID[i],
        ID2=IDvect[which(eucdist==min(eucdist))],
        X=(df$X[i] + df$X[pnt.inBuffID][cond][which(eucdist==min(eucdist))]) / 2,
        Y=(df$Y[i] + df$Y[pnt.inBuffID][cond][which(eucdist==min(eucdist))]) / 2)
      new_colonies <- rbind(new_colonies, new_colonies_temp)
      if (dim(new_colonies)[1] == maxdensity) break
    }
  }
}
new_colonies <- new_colonies[-1,]

--
View this message in context: http://r.789695.n4.nabble.com/Delete-Rows-Dynamically-Within-a-Loop-tp4081777p4081777.html
Sent from the R help mailing list archive at Nabble.com.
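On step 4 specifically, a minimal sketch (toy data and hypothetical IDs, not the poster's df) of removing a matched pair so the working data frame shrinks; note that iterating with while (nrow(df) > 0) over the shrinking frame avoids the stale indices that a precomputed 1:dim(df)[1] produces once rows start disappearing:

```r
# Toy stand-in: six points with alternating sexes
df <- data.frame(ID = 1:6, Sex = c(1, 0, 1, 0, 1, 0))

pair <- c(2, 3)                   # IDs of a matched female/male pair (made up)
df <- df[!(df$ID %in% pair), ]    # drop both rows; df shrinks by two
```

Indexing by ID rather than by row position keeps the removal correct even after earlier deletions have renumbered the rows.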
[R] Obtaining a derivative of nls() SSlogis function
Hello,

I am wondering if someone can help me. I have the following function that I derived using nls() SSlogis. I would like to find its derivative. I thought I had done this using deriv(), but for some reason it isn't working out for me. Here is the function:

asym <- 84.951
xmid <- 66.90742
scal <- -6.3
x.seq <- seq(1, 153,, 153)
nls.fn <- asym/((1+exp((xmid-x.seq)/scal)))

# try #1
deriv(nls.fn)
# Error in .Internal(deriv.default(expr, namevec, function.arg, tag, hessian)) :
#   'namevec' is missing

# try #2
deriv(nls.fn, namevec=c("asym", "xmid", "scal"))
# this doesn't seem to give me the expression, and the gradients are zero.

I've tried to do this with Ryacas as well, but I'm lost. Can anyone help?

Thank you,
Katrina
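A likely cause (my reading, not a reply from the thread): deriv() wants an unevaluated expression or formula, while nls.fn above is already a numeric vector, so there is nothing symbolic left to differentiate. A sketch of the two usual forms, assuming the derivative wanted is either with respect to x or with respect to the parameters:

```r
asym <- 84.951
xmid <- 66.90742
scal <- -6.3
x.seq <- seq(1, 153, length.out = 153)

# Derivative of the logistic with respect to the input variable x:
dfdx <- D(expression(asym / (1 + exp((xmid - x) / scal))), "x")
grad.x <- eval(dfdx, list(asym = asym, xmid = xmid, scal = scal, x = x.seq))

# Gradients with respect to the parameters, as in try #2, but passing a
# formula so deriv() sees the expression rather than its evaluated value:
g <- deriv(~ asym / (1 + exp((xmid - x) / scal)),
           namevec = c("asym", "xmid", "scal"),
           function.arg = c("asym", "xmid", "scal", "x"))
vals <- g(asym, xmid, scal, x.seq)
grad.params <- attr(vals, "gradient")  # 153 x 3 gradient matrix
```

The function.arg argument makes deriv() return a ready-to-call function whose result carries the gradient as an attribute.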