[R] smoothScatter problems
Hello, I'm having some trouble getting a good result from smoothScatter. I have some data that I want to plot on a log axis, but when I use smoothScatter the result is not correct. The problem seems to be that with the log="x" argument smoothScatter still calculates the bins linearly, so the plot is skewed towards the right. See for example:

load(url("http://dckd.nl/~jeroen/drop/example.rdata"))
smoothScatter(d, log="x")
smoothScatter(log(d$x), d$y)

I could use the latter call to produce the result, but I would like the x-axis to show the original units, not the log units. I also want to use these plots in a publication, but the PDFs saved from them do not look good. In Acrobat there are lots of blocks, and in other viewers there are many white lines running through the blue colours. Thanks, Jeroen. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
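One possible workaround for the axis question, offered only as a sketch: smooth the log-transformed values, suppress the default x-axis, and draw an axis whose labels are in the original units. The data below are simulated stand-ins (the real example.rdata is not reproduced here), and the tick choice is an assumption.

```r
# Simulated stand-in for the poster's data frame 'd' (hypothetical values)
set.seed(1)
d <- data.frame(x = rlnorm(5000, meanlog = 3, sdlog = 1))
d$y <- log10(d$x) + rnorm(5000, sd = 0.5)

# Bin and smooth on the log10 scale, but hide the default x-axis
lx <- log10(d$x)
smoothScatter(lx, d$y, xaxt = "n",
              xlab = "x (original units, log scale)", ylab = "y")

# Put ticks at whole powers of ten and label them in original units
ticks <- seq(floor(min(lx)), ceiling(max(lx)))
axis(1, at = ticks, labels = 10^ticks)
```

For the PDF artefacts, R versions newer than the one discussed here let image() draw a raster, and smoothScatter passes extra arguments on to image(), so adding useRaster = TRUE to the call may remove the white lines; that is untested against the poster's setup.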
[R] ROC curve using epicalc (after logistic regression) (re-sent)
Dear R-help, I am resending as I believe I screwed up the e-mail address to R-help earlier. Sorry for my lack of attention to detail, and for any inconvenience. I have also sent the question to the package maintainer, as suggested in the posting guide. Regards, Cliff -- Forwarded message -- From: Clifford Long gnolff...@gmail.com Date: Sun, Jul 26, 2009 at 8:46 PM Subject: Fwd: ROC curve using epicalc (after logistic regression) To: cvira...@medicine.psu.ac.th Dear Virasakdi Chongsuvivatwong, After sending the message below to the R-help mailing list, it occurred to me that I probably should also have sent a copy to you, per R posting guidance. I would be interested in any thoughts or suggestions that you might have regarding my difficulty using the ROCR routine in the epicalc package. (I've used this before, and find it to be a very helpful package ... thanks.) Is my issue related to the way the data is structured for the glm routine - meaning not with individual cases, but instead by counts (per DOE treatment) of pass, fail, and total? Or perhaps I've made another error? I'll understand if you don't have the time to look this over. In case you do, any direction/guidance will be appreciated. Thank you for your time, and for this excellent package. Regards, Cliff Long -- Forwarded message -- From: Clifford Long gnolff...@gmail.com Date: Sun, Jul 26, 2009 at 3:52 PM Subject: ROC curve using epicalc (after logistic regression) To: R-help@r-project.org Dear R-help list, I'm attempting to use the ROC routine from the epicalc package after performing a logistic regression analysis. My code is included after the sessionInfo() result. The datafile (GasketMelt1.csv) is attached. I updated both R and the epicalc packages and tried again before sending this request. 
sessionInfo result:

R version 2.9.1 (2009-06-26) i386-pc-mingw32
locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
attached base packages:
[1] splines stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] caret_4.19 lattice_0.17-25 epicalc_2.9.1.2 survival_2.35-4
[5] foreign_0.8-36
loaded via a namespace (and not attached):
[1] grid_2.9.1 tools_2.9.1

Header information from package 'epicalc':
Package: epicalc
Version: 2.9.1.2
Date: 2009-07-14

My code ...

#
# Logistic Regression (the model result is as expected)
#
dfile = 'GasketMelt1.csv'
gmelt.df = read.csv(dfile, header = TRUE, as.is = TRUE)
names(gmelt.df)
gmelt.df$p = gmelt.df$Pass / gmelt.df$Total
gmelt.glm = glm(p ~ Time + Temperature + Depth + Time*Temperature + Time*Depth + Temperature*Depth, family = binomial(link = "logit"), data = gmelt.df, weights = Total)
summary(gmelt.glm)
#
# ROC
#
library(epicalc)
lroc(gmelt.glm, graph = TRUE, line.col = "red")

The error message:

> lroc(gmelt.glm, graph = TRUE, line.col = "red")
Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent

Have I overlooked something? Many thanks to anyone who might have a suggestion. Cliff
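The thread does not confirm whether the aggregated (pass/fail count) layout is what trips up lroc. One way to test that idea, sketched here under that assumption, is to expand the counts to one row per unit, refit with a 0/1 response, and compute the ROC points by hand in base R. The data frame below is a made-up stand-in for GasketMelt1.csv, and the model is reduced to two terms to keep the example short.

```r
# Hypothetical aggregated data standing in for GasketMelt1.csv
agg <- data.frame(
  Time        = c(1, 1, 2, 2),
  Temperature = c(100, 120, 100, 120),
  Pass        = c(2, 5, 6, 9),
  Total       = c(10, 10, 10, 10)
)
agg$Fail <- agg$Total - agg$Pass

# Repeat each treatment row once per unit, tagging the outcome as 1 or 0
expand <- function(df, count, outcome) {
  out <- df[rep(seq_len(nrow(df)), df[[count]]), c("Time", "Temperature")]
  out$y <- outcome
  out
}
long <- rbind(expand(agg, "Pass", 1), expand(agg, "Fail", 0))

# Same kind of logistic model, now on individual cases
fit <- glm(y ~ Time + Temperature, family = binomial(link = "logit"), data = long)

# ROC by hand: true and false positive rates at each fitted-probability cutoff
p    <- fitted(fit)
cuts <- sort(unique(p), decreasing = TRUE)
tpr  <- sapply(cuts, function(k) mean(p[long$y == 1] >= k))
fpr  <- sapply(cuts, function(k) mean(p[long$y == 0] >= k))

plot(c(0, fpr), c(0, tpr), type = "l", col = "red",
     xlab = "1 - specificity", ylab = "sensitivity")
abline(0, 1, lty = 2)
```

If lroc runs on a model fitted to the expanded data, that would point to the data layout rather than to the model itself.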
[R] [R-pkgs] Version 0.7 of package tsDyn, nonlinear time series
Hi Version 0.7 of package tsDyn presented at useR! 2009 is now on CRAN, extended with several new features. The package tsDyn is aimed at estimating nonlinear time series models which exhibit regime specific properties. The regime switching dynamics can either be described by smooth transition (STAR and LSTAR) or threshold effects (SETAR). The package furthermore offers nonlinear models such as neural networks (NNET), and additive autoregressive (AAR) models. The version 0.7 enhances the functionalities for the threshold autoregression models (SETAR) by extending the estimation techniques, providing a complete testing framework and finally allowing its use in a multivariate framework (generalizing VAR and VECM models). Those new features are particularly interesting in economic applications as they generalize the concept of cointegration: with threshold cointegration, variables are still meant to share a long-run relationship, but their adjustment need not occur instantaneously but only after the deviation exceeds some critical threshold. This allows the model to take into account possible effects of transaction costs or stickiness of prices. Further, it permits us to capture asymmetries in the adjustment process, where positive or negative deviations are not corrected to the same extent. A second vignette available at http://code.google.com/p/tsdyn/wiki/ThresholdCointegration offers an overview of the threshold cointegration framework and describes the implemented functions. It can serve as an ideal introduction for people interested in discovering this field. 
Main new features include:
- added possibility to have two thresholds and hence three regimes in setar and selectSETAR (arg nthresh)
- new functions for unit root tests: KapShinTest() and BBCTest()
- new function for estimating VAR and VECM: lineVar
- new function for estimating TVECM: TVECM()
- new function for estimating TVAR: TVAR()
- new function to test for setar: setarTest()
- new function to test for TVAR: TVAR.LRtest()
- new functions to test for TVECM: TVECM.SeoTest() and HanSeo_TVECM()
- new function to simulate/bootstrap a TVAR: TVAR.sim()
- new function to simulate/bootstrap a TVECM: TVECM.sim()
- new function to simulate/bootstrap a setar: setar.sim()
- new function to estimate regime-specific variance in setar: resVar()
- new function to extend a bootstrap replication in setarTest: extendBoot()
- added in selectSETAR() and setar() the following args: include, common, model, trim, MM, ML, MH, model, restriction
- added in selectSETAR(): criterion SSR (sum of squared residuals) and argument max.iter
- extended arg th in selectSETAR to search inside an interval, around a point, or on the whole grid

Note that minor fixes/improvements are expected soon; those will only be reported on the package's own mailing list: ts...@googlegroups.com

Any comments, remarks, suggestions and bug reports are welcome!

Matthieu Stigler ___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages
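To make the threshold idea in the announcement concrete, here is a base-R sketch that does not use tsDyn at all: it simulates a two-regime SETAR(1) series and recovers the threshold by a grid search over the sum of squared residuals. The coefficients, sample size, and 15% trimming are assumptions chosen for illustration; tsDyn's own estimators are more general.

```r
set.seed(42)
n <- 500
th_true <- 0            # assumed true threshold
x <- numeric(n)
for (t in 2:n) {
  # the AR coefficient depends on which side of the threshold the last value was
  phi  <- if (x[t - 1] <= th_true) 0.8 else -0.5
  x[t] <- phi * x[t - 1] + rnorm(1)
}

y    <- x[-1]           # x_t
lag1 <- x[-n]           # x_{t-1}, also used as the threshold variable

# Candidate thresholds: inner quantiles of the threshold variable (15% trimming)
cand <- quantile(lag1, probs = seq(0.15, 0.85, by = 0.01))

# For each candidate, fit one AR(1) slope per regime and record the SSR
ssr <- sapply(cand, function(th) {
  low    <- lag1 <= th
  x_low  <- lag1 * low
  x_high <- lag1 * (!low)
  sum(resid(lm(y ~ 0 + x_low + x_high))^2)
})

th_hat     <- cand[which.min(ssr)]          # estimated threshold
ssr_linear <- sum(resid(lm(y ~ 0 + lag1))^2) # plain AR(1) for comparison
```

Comparing min(ssr) with ssr_linear gives a rough feel for how much the regime split helps; a formal test needs a bootstrap, which is what the announcement says setarTest() provides.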
[R] (no subject)
Hi, I am Niveen Samy. I am interested in linear programming problems, and I would like a program that solves them in any language, such as Java, Pascal, C++, the Miranda functional programming language, or any other language I can learn. If you already have a solution to this problem, please send it to me and explain how I can use it. I work on Linux and Windows. Thank you very much. Please reply to me. With best wishes
Re: [R] FW: Qury Related With R
Hi Romain, Attached is my R script. I put the script into the R workspace and call it with the source("RScriptToCallJava.R") command, and my Java application is executed. Is this the proper way to call the Java application? If not, can you please explain the directory structure that a Java application needs? In Java I use a property file, but when I load the property file with the .jaddClassPath("D:/R_BTE_Jar/BTE/app.properties") command, it is not loaded. Thanks, Romain, for your response. Bed Singh

-Original Message- From: Romain Francois [mailto:romain.franc...@dbmail.com] Sent: Saturday, July 25, 2009 6:32 PM To: bed.si...@oracle.com Cc: r-help@r-project.org Subject: Re: [R] FW: Qury Related With R

Hi, The file did not make it through the mailing list. Maybe you are looking for ?read.dcf. Can you describe the way your application interacts with R? Romain

On 07/25/2009 10:35 AM, bed.si...@oracle.com wrote: Hi, I am using R-2.9.1 with Windows XP. Queries: 1. I am running a Java application which needs to load a property file in R. Can you please tell me how I can load my property file in the R session so that my application can find it? Attached is my property file as a sample. 2. Is there any directory structure required for a Java application in R? Thanks Regards, Bed Singh

From: ericdov...@gmail.com [mailto:ericdov...@gmail.com] On Behalf Of Eric Doviak Sent: Friday, July 24, 2009 8:42 PM To: bed.si...@oracle.com Subject: Re: Qury Related With R

Hi Bed, I'm sorry. I simply don't know. Your best bet would be to ask on R-help: r-help@r-project.org Good luck, - Eric

bed.si...@oracle.com wrote: Hi Eric, I am using R-2.9.1 with Windows XP. Queries: I am running a Java application which needs to load the app.property file in R. Can you please tell me how I can load my property file in the R session so that my application can find it? Attached is my property file.
Is there any directory structure required for a Java application in R? Please help me. Thanks in advance, Eric. Regards, Bed Singh Oracle Financial Services PrimeSourcing Mumbai, India

Romain Francois Independent R Consultant +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr
[R] [R-pkgs] Beta Version of tikzDevice Released!
The tikzDevice package provides a new graphics device for R which enables direct output of graphics in a LaTeX-friendly way. The device output consists of files containing instructions for the TikZ graphics language and may be imported directly into LaTeX documents using the \input{} command.

The beta version of tikzDevice is now available here: https://r-forge.r-project.org/R/?group_id=440
An additional location for downloading source tarballs and Windows binaries is: http://github.com/Sharpie/RTikZDevice/downloads

There are many significant improvements compared to the alpha version:

Features:
- Rd documentation
- A vignette
- Proper string placement (because of string width and character metric calculations via LaTeX)
- Custom LaTeX headers, footers and typesetting engines
- R-level annotation of graphics with TikZ commands (see http://www.texample.net for great examples of using TikZ commands)

Limitations:
- ASCII character support only
- No recognition of the R symbol font (i.e. no plotmath symbols)
- A bevy of other quirks and personality traits that will make themselves known in time

The device requires a working installation of LaTeX and the TikZ package in order to function. This is because font metrics are currently calculated through direct calls to the LaTeX compiler. Unfortunately, this results in some significant computational overhead: it may take several seconds to create a plot that contains a lot of text. In an attempt to offset this behavior, the tikzDevice uses the filehash package to store font metrics that it has already computed. Hopefully the more the device is used, the faster it will be. We suggest reviewing the package vignette, especially the section "R Options That Affect Package Behavior", for more information on how the caching process works.

We think the package is quite usable as it is, but there are surely many bugs that we don't know about.
We welcome bug reports at our R-Forge tracker: https://r-forge.r-project.org/tracker/?group_id=440 Enjoy! - The tikzDevice Team
Re: [R] FW: Qury Related With R
Hi, I think you can read the properties into R and then pass them through the parameters argument of the .jinit function. Something like this perhaps:

props <- readLines( "app.properties" )
props <- strsplit( gsub( "\\\t", "", grep( "=", props, value = TRUE ) ), "=" )
params <- sapply( props, function(x){ sprintf( "-D%s=%s", x[1], x[2] ) } )
library(rJava)
.jinit(classpath="D:/R_BTE_Jar/BTE/BackTestingApp.jar", parameters= c( "-Xmx512m", params ) )

Let me know if this works. Romain

On 07/27/2009 06:02 AM, bed.si...@oracle.com wrote: [earlier messages in this thread, quoted in full above, trimmed]

-- Romain Francois Independent R Consultant +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/tlNb : RGG#155, 156 and 157 |- http://tr.im/rw0p : useR! slides `- http://tr.im/rw0b : RGG#154: demo of atomic functions
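The parsing step in the suggestion above can be tried without Java. The sketch below is a small variation, not Romain's exact code: it skips comment lines and splits each entry at the first "=" only, so values that themselves contain "=" survive. The property names and values are invented for the demonstration, and the rJava call is left as a comment because it needs the poster's jar.

```r
# Write a throwaway properties file (contents are made up for illustration)
prop_file <- tempfile(fileext = ".properties")
writeLines(c("# sample settings",
             "db.host=localhost",
             "db.url=jdbc:x://h?a=b",
             "",
             "mode = test"), prop_file)

lines <- readLines(prop_file)
# Keep only key=value lines; drop blanks and '#' or '!' comments
lines <- lines[grepl("=", lines, fixed = TRUE) & !grepl("^\\s*[#!]", lines)]

# Split at the FIRST '=' so that values may contain '=' themselves
pos  <- as.integer(regexpr("=", lines, fixed = TRUE))
keys <- trimws(substr(lines, 1, pos - 1))
vals <- trimws(substr(lines, pos + 1, nchar(lines)))

# Turn each entry into a JVM system-property flag
params <- sprintf("-D%s=%s", keys, vals)
print(params)

# With rJava installed, these would be handed to the JVM at start-up, e.g.:
# library(rJava)
# .jinit(classpath = "D:/R_BTE_Jar/BTE/BackTestingApp.jar",
#        parameters = c("-Xmx512m", params))
```

On the Java side the values would then be read with System.getProperty() rather than from the file. If the application insists on loading the file itself from the classpath, adding the directory that contains the file (rather than the file) to the classpath is the usual Java convention, though the thread does not say how the application looks it up.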
[R] help about package reldist (Relative Distribution)
Dear R users: I am trying to use the reldist package to measure wage distributions. In reldist, y is the sample from the comparison distribution and yo is the sample from the reference distribution. I would like to compare more than two years (fifteen years in total: 1979, 1981, 1983, ... to 2007). How should I change my program so that I can compare the wage distributions of all fifteen years?

fig2b <- reldist(y=mu1981$b1, yo=mu1979$b1, ci=FALSE, smooth=0.4, yowgt=mu1979$weight2, ywgt=mu1981$weight2, bar=TRUE, yolabs=seq(-1,3,by=0.5), ylim=c(0,2.5), cex=0.8, ylab="Relative Density", xlab="Proportion of the Original Cohort")
title(main="Fig2(b)", cex=0.6)

Any help would be much appreciated! Regards, Hsieh
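No reply is recorded for this question. One way to approach it, sketched in base R so that it runs without the reldist package, is to fix 1979 as the reference and loop over the other years. The sketch uses the definition of relative data (the reference CDF evaluated at the comparison values) and a plain histogram; it ignores the survey weights and the smoothing that reldist() applies, and the wage samples are simulated placeholders.

```r
set.seed(7)
years <- seq(1979, 2007, by = 2)     # the fifteen survey years

# Simulated wage samples, one vector per year (placeholders for mu1979$b1 etc.)
wages <- lapply(seq_along(years),
                function(i) rlnorm(400, meanlog = 2 + 0.02 * i, sdlog = 0.5))
names(wages) <- years

# Reference distribution: 1979. Relative data = reference CDF at comparison values
F0  <- ecdf(wages[["1979"]])
rel <- lapply(wages[-1], F0)

# One panel per comparison year; a flat histogram at height 1 means "no change"
op <- par(mfrow = c(3, 5), mar = c(3, 3, 2, 1))
for (yr in names(rel)) {
  hist(rel[[yr]], breaks = seq(0, 1, by = 0.1), freq = FALSE,
       main = paste(yr, "vs 1979"), xlab = "", ylab = "")
  abline(h = 1, lty = 2)
}
par(op)
```

The same loop shape should carry over to the poster's own call: keeping the yearly data frames in a named list and calling reldist() inside the loop with y and ywgt taken from the current year and yo and yowgt from 1979.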
Re: [R] smoothScatter problems
My humble apologies for double posting. I made an error with my subscription and erroneously thought that my message was not sent to the list. Jeroen.
Re: [R] 64 bit compiled version of R on windows
Did anybody try REvolution R Enterprise 2.0? Is it worthwhile to buy for a 64-bit Windows OS? Thanks. Huang

On Tue, Mar 31, 2009 at 2:43 AM, Duncan Murdoch murd...@stats.uwo.ca wrote:

On 3/30/2009 12:46 PM, Vadlamani, Satish {FLNA} wrote: Hi: 1) Does anyone have experience with a 64-bit compiled version of R on Windows? Is this available, or does one have to compile it oneself? 2) If we do compile the source in 64 bit, would we then need to compile any additional modules in 64 bit as well?

R for Windows is compiled using the MinGW port of gcc, and the 64-bit version of that compiler is not really ready for general use yet, so compiling for 64 bits is not completely straightforward. Revolution Computing has announced on the R-devel list that they are beta testing a build, with some information at http://www.revolution-computing.com/products/windows-64bit.php The page says it is scheduled for release at the end of March, so there should be something available soon. Duncan Murdoch

I am just trying to prepare for the time when I will get larger datasets to analyze. Each of the datasets is about 1 GB in size and I will try to bring about 16 of them into memory at the same time. At least that is the plan. I asked a related question in the past and someone recommended the product RevolutionR; I am looking into this also. If you can think of any other options, please mention them. I have not done low-level programming for a while now, so compiling R myself on Windows would be the least preferable option (and then I would have to worry about how to compile any modules that I need). Thanks. Satish
Re: [R] problems hist() and density
Thank you. I'm sorry for my question. Of course, I have to integrate a density function to get the probabilities... I didn't notice the narrow bins, and that's why I was confused. I expected a histogram with probabilities for the realizations.

On Sunday, 26.07.2009, at 14:27 -0400, jim holtman wrote: sums to one I should have said.

On Sun, Jul 26, 2009 at 7:43 AM, Jan Teichmann jan.teichm...@googlemail.com wrote: Hello, I have a problem with the hist() function and showing densities. The densities sum to 50 and not to 1! I use R version 2.9.1 (2009-06-26) and I load the seqinR library. My data is the following vector:

[1] 0.140 0.200 0.220 0.2828283 0.160 0.160 0.360
[8] 0.160 0.220 0.260 0.200 0.300 0.220 0.2342342
[15] 0.180 0.220 0.160 0.230 0.200 0.220 0.240
[22] 0.200 0.220 0.220 0.260 0.200 0.160 0.220
[29] 0.2342342 0.200 0.220 0.200 0.200 0.140 0.180
[36] 0.220 0.160 0.160 0.140 0.220 0.200 0.2871287
[43] 0.290 0.200 0.1836735 0.200 0.200 0.290 0.240
[50] 0.220 0.280 0.200 0.2745098 0.220 0.230 0.180
[57] 0.230 0.180 0.260 0.220 0.222 0.220 0.260
[64] 0.220 0.220 0.260 0.220 0.200 0.220

I use the following command:

tmp <- hist(data, freq=FALSE, plot=FALSE)

and that's the result:

$breaks
[1] 0.14 0.16 0.18 0.20 0.22 0.24 0.26 0.28 0.30 0.32 0.34 0.36
$counts
[1] 10 4 15 19 8 5 2 5 0 0 1
$intensities
[1] 7.2463754 2.8985507 10.8695652 13.7681159 5.7971014 3.6231884
[7] 1.4492754 3.6231884 0.000 0.000 0.7246377
$density
[1] 7.2463754 2.8985507 10.8695652 13.7681159 5.7971014 3.6231884
[7] 1.4492754 3.6231884 0.000 0.000 0.7246377
$mids
[1] 0.15 0.17 0.19 0.21 0.23 0.25 0.27 0.29 0.31 0.33 0.35
$xname
[1] "data"
$equidist
[1] TRUE
attr(,"class")
[1] "histogram"
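The resolution reached in this thread can be checked with the numbers from the post itself. The short sketch below rebuilds the density values from the reported counts and breaks: each density is count / (n * bin width), so the densities add up to 1 / 0.02 = 50, while density times bin width adds up to 1.

```r
# Counts and breaks as reported in the hist() output above
counts <- c(10, 4, 15, 19, 8, 5, 2, 5, 0, 0, 1)
breaks <- seq(0.14, 0.36, by = 0.02)
widths <- diff(breaks)                 # every bin is 0.02 wide

# A density is a count divided by (sample size * bin width)
dens <- counts / (sum(counts) * widths)

sum(dens)            # adds up to 1 / 0.02, not to 1
sum(dens * widths)   # the area under the histogram is what adds up to 1

# Per-bin probabilities, which is what the poster was expecting to see
probs <- counts / sum(counts)
```

The first element of dens is 10 / (69 * 0.02), which matches the 7.2463754 shown in the output.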
[R] Strange Memory issue
Hi, I am testing out some things with the kernlab library. The data frame has 22,000 rows and 32 columns. The command I execute is:

model <- ksvm(label ~ ., data = traindata, type = "C-svc", kernel = "rbfdot", class.weights = c("0" = 1, "1" = 3), kpar = "automatic", C = 10, cross = 3, prob.model = TRUE)

I have both a Macintosh and a Linux machine running Fedora. The Macintosh has 2 GB of RAM, and there this command runs and completes without any error. The Linux machine has 4 GB of RAM, and there I get a memory error:

Error: cannot allocate vector of size 988.7 Mb

Can anybody help me figure out why? Thanks.
Re: [R] [Rd] How to create a permanent dataset in R.
Hi, I use a saved R file, stand.R, which looks like this:

stand <- list(
b110 = c(49.45000, 21.46, 11.468333, 24.33500, 28.112240),
b120 = c(49.77333, 19.386667, 7.736667, 20.87500, 21.753788),
b130 = c(49.60833, 18.365000, 5.708333, 19.23167, 17.265409),
b140 = c(50.61333, 15.528333, 2.02, 15.66167, 7.419889),
b160 = c(51.25833, 14.026667, 0.50, 14.03667, 2.032885),
b180 = c(55.10167, 9.076667, -3.281667, 9.66000, -19.887711),
b222 = c(56.14500, 14.52, 3.425000, 14.91667, 13.271665),
b225 = c(53.43833, 16.168333, 4.695000, 16.8, 16.189711),
e7898 = c(50.44200, 17.496000, 5.238000, 18.26400, 16.66),
lab.nk = c(50.7, 20.29000, 14.04, 24.67, 34.68),
mapico = c(65.89, 13.065, 21.07, 24.79, 58.195),
tp100.pr = c(47.88250, 21.09500, 12.697500),
tp100old = c(49.61, 22.71, 13.93, 26.64, 31.52),
tp100 = c(49.08, 22.24, 11.19, 24.89402335, 26.71196071),
tp200 = c(49.46, 18.5, 8.88, 20.52082844, 25.64100582),
tp200ro = c(49.32, 18.60, 9.85, 21.05, 27.90),
tp302 = c(49.5, 19.7, 10.8, 22.47, 28.73),
tp303.00 = c(49.41000, 16.76000, 6.39, 17.9368, 20.8701),
tp303.40 = c(48.72000, 15.09000, 5.43))

It lives in my personal function package, and I call data(stand) in Rprofile.site to recover it in each session. Regards, Petr

r-help-boun...@r-project.org wrote on 24.07.2009 20:06:57: (redirecting to r-help; it seems more appropriate for such a question) Hello,

On Fri, Jul 24, 2009 at 8:11 AM, Albert EINstEIN sateeshvar...@gmail.com wrote: We know that if we create a dataset in R, it is gone once the session is closed. I tried to create a dataset permanently in R, but I am having trouble doing so. Is it possible to create a permanent dataset?

I am not sure what exactly you want to do, but you could use save.image() to save R's workspace and re-load it when re-opening R via load(). Otherwise you can use Rcmdr or JGR to perform these operations graphically, that is, via point-and-click.
Liviu
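The save-and-reload route mentioned in this thread can be sketched for a single object with save() and load(), which avoids saving the whole workspace. The file location below is a temporary directory chosen only so the example cleans up after itself; a real setup would use a fixed path, and the values are a shortened, illustrative version of the list above.

```r
# A small dataset to keep between sessions (shortened, illustrative values)
stand <- list(b110 = c(49.45, 21.46),
              b120 = c(49.77333, 19.386667))

# Save just this object to a file (tempdir() is used here only for the demo)
f <- file.path(tempdir(), "stand.RData")
save(stand, file = f)

# Simulate a fresh session: the object is gone ...
rm(stand)

# ... and comes back under its original name after load()
load(f)
str(stand)

# To have it available in every session, a line such as
#   load("path/to/stand.RData")
# could go into Rprofile.site or a personal .Rprofile (the path is hypothetical)
```

save.image() does the same for every object in the workspace at once, which is convenient but makes it easier to carry stale objects from one session to the next.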
[R] Open/Close Console
Dear list, one of my procedures is highly memory intensive (time-dependent, spatially distributed data). Unfortunately, the only way I have found to tackle this is the crude idea of opening a new console window, running the procedure there, and closing the new window again.

#previous procedure
#open new console
gui_path <- "C:/Programs/R/R-2.9.0/bin/Rgui.exe"
shell.exec(gui_path)
memory.limit(size = 4000)
#run the memory intensive procedure
quit(save = "no")
#subsequent procedure

It works when I send my code line by line from Tinn-R to R. But it fails (it executes the code in the old window) when I send the entire code at once. I am grateful for any hints! Regards, Haeru
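No answer is recorded for this post. An alternative to opening a second GUI window, offered as a sketch rather than a tested fix for the poster's setup, is to run the heavy step in a separate non-interactive R process with Rscript and pass the result back through a file. The child's memory is released when it exits, and the calling script simply waits for it. The computation in the child script is a trivial placeholder, and saveRDS()/readRDS() require a newer R than the 2.9.0 mentioned above.

```r
# Script for the child process; the sum is a stand-in for the heavy procedure
script      <- tempfile(fileext = ".R")
result_file <- tempfile(fileext = ".rds")
writeLines(c(
  "args <- commandArgs(trailingOnly = TRUE)",
  "x <- sum(as.numeric(1:1e6))   # placeholder for the memory-intensive step",
  "saveRDS(x, args[1])           # hand the result back via a file"
), script)

# Launch a separate R process and wait until it has finished
rscript <- file.path(R.home("bin"), "Rscript")
status  <- system2(rscript, c(shQuote(script), shQuote(result_file)))

# Back in the original session: pick up the result and carry on
result <- readRDS(result_file)
print(result)
```

Because system2() blocks until the child exits, the subsequent procedure only starts once the result file exists, which sidesteps the timing problem of sending code to two consoles.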
Re: [R] labelling points plotted in a 2D plan
Hi

r-help-boun...@r-project.org wrote on 25.07.2009 23:15:04: Thanks for the answer Tal! But I can't get it to work correctly! :( Please bear with me, this is the first time I am using R, and I am in a rush to correct a paper, in fact on the plane. I am plotting a table:

fullpointed=read.table("fullpoints_backup.txt",h=F)
plot(range(-2.5,0.95),range(0.00,1.00),type="n",axes=TRUE)

Most probably you do not want the axes labelled with range(-2.5,.95). Look at the xlab and ylab parameters and also at the xlim and ylim parameters:

plot(0, 0, xlim=range(-2.5,0.95), ylim=range(0.00,1.00), xlab="bla", ylab="blabla", type="n", axes=TRUE)

and in this table there are 300 points. I want to label the first 175 points with A and the others with S. I couldn't figure out how to configure labels.to.plot <- sample(c("A","B"), 100, replace = T) and text(x, y, labels = labels.to.plot) correctly. For instance:
0,48875 0,142857143 the point plotted will be labelled a
0,409 0,142857143 the point plotted will be labelled a
0,45611 0,25 labelled a
0,49833 0,2 labelled a

It is hard to tell what type of data you really have. Output from str(fullpointed) could help. In case you have numeric values,

text(fullpointed[,1], fullpointed[,2], labels=c(rep("A", 175), rep("S", 125)))

(untested) could do what you want. But maybe I am completely wrong, because I do not know what "first 175 points" means. Regards, Petr

#the first 175
0,61158 0,125 labelled S
0,5709 0,125 labelled S
0,53266 0,125 labelled S
# the remaining
Regards

On Sat, Jul 25, 2009 at 5:32 PM, Tal Galili tal.gal...@gmail.com wrote: Sure, here is an example:

# get some random data to play with
x <- runif(100)
y <- runif(100)
labels.to.plot <- sample(c("A","B"), 100, replace = T)
# set up the window, play them one by one to see what they do
plot.window(ylim = c(0,1), xlim = c(0,1))
plot.new()
axis(1)
axis(2)
box()
# plot the things you wished to plot, where you wanted them plotted
text(x, y, labels = labels.to.plot)

Cheers, Tal

On Sat, Jul 25, 2009 at 7:20 PM, Khaled OUANES koua...@gmail.com wrote: Hey, thanks for the answer, but I couldn't achieve it. Would you explain a bit more? I have about 300 points to label! Thanks
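Putting the two replies together, a complete version might look like the sketch below. The coordinates are random stand-ins for the 300 rows of fullpoints_backup.txt, and "first 175" is taken to mean the first 175 rows of the file, which is an assumption the thread never settles.

```r
# Random stand-in for the 300 points in the poster's file
set.seed(1)
pts <- data.frame(x = runif(300, -2.5, 0.95),
                  y = runif(300, 0, 1))

# First 175 rows get "A", the remaining 125 get "S"
labs <- rep(c("A", "S"), times = c(175, nrow(pts) - 175))

# Empty plot with the intended axis limits, then letters instead of symbols
plot(pts$x, pts$y, type = "n",
     xlim = c(-2.5, 0.95), ylim = c(0, 1),
     xlab = "x", ylab = "y")
text(pts$x, pts$y, labels = labs,
     col = ifelse(labs == "A", "blue", "red"), cex = 0.7)
```

If the real file uses decimal commas, as the quoted sample rows suggest, read.table() would need dec = "," to read the columns as numbers; otherwise they come in as text and the plot calls fail.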
Re: [R] [Rd] How to create a permanent dataset in R.
On 7/27/09, Albert EINstEIN sateeshvar...@gmail.com wrote: can we create our own packages in R. It would be very helpful for us, if you provide any information regarding this. Yes. See this [1]. Liviu [1] http://cran.r-project.org/doc/manuals/R-exts.pdf
[R] can we create our own packages in R?
Hi, can someone provide me with information on the following? At the moment we reload the data every time we start R. R ships with some default packages that contain datasets. Can we store a dataset permanently in a package so that it is available without reloading, like creating a dataset in a permanent library in SAS? Can we create our own packages in R? It would be very helpful if someone could provide any information on this. Thank you in advance
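As a starting point for the package question, base R's package.skeleton() can generate a package directory that already contains a dataset. The sketch below writes into a temporary directory; the package name standdata and the data values are invented for the example.

```r
# Put the dataset into its own environment so only it ends up in the package
e <- new.env()
e$stand <- data.frame(sample = c("b110", "b120"),
                      L      = c(49.45, 49.77333))

# Create the skeleton in a throwaway directory (a real package would live elsewhere)
pkg_root <- tempfile("pkgdemo")
dir.create(pkg_root)
package.skeleton(name = "standdata", list = "stand",
                 environment = e, path = pkg_root)

# Inspect what was generated: DESCRIPTION, data/stand.rda, man/ stubs, ...
list.files(file.path(pkg_root, "standdata"), recursive = TRUE)
```

After filling in the generated DESCRIPTION and help stubs, the directory can be built and installed with R CMD build and R CMD INSTALL; the dataset is then available in any session via library(standdata) followed by data(stand). The Writing R Extensions manual linked elsewhere in this thread covers the details.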
Re: [R] [Rd] How to create a permanent dataset in R.
Hi, Thank you very much, all of you, for giving me a valuable solution.
[R] creating and populating an environment
Hi, I often work with R by writing long(ish) Excel-VBA macros interspersed with calls to R via RExcel. A typical example of this would be:

    Sub VBAMacro()
        'fetch some data from an excel sheet
        'do some basic stuff on said data
        'transfer data from vba to R
        'run some R statements
        'get data back to vba
        'show results on the excel sheet
        'clean R by deleting all vars that were created:
        rrun rm(a,b,c,)
    End Sub

This has two obvious disadvantages, as I have to make sure:

1) not to use R variable names which may already exist
2) to remove all variables (garbage collection)

In order to overcome these issues I was wondering if I should execute all R statements inside the R macro in a separate namespace. I have looked at new.env() but am not really sure how it is supposed to be used. If I type temp <- new.env(), how do I make sure that all variables declared from then on end up in the temp environment? Once I am done, is rm(temp) sufficient to get rid of all its content? Basically, I would like to replace the above example with:

    Sub VBAMacro()
        rrun A <- new.env()
        'fetch some data from an excel sheet
        'do some basic stuff on said data
        'transfer data from vba to R
        'run some R statements
        'get data back to vba
        'show results on the excel sheet
        rrun rm(A)
    End Sub

Thanks
Christian Prinoth
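A possible way to approach the question above, as a minimal sketch rather than an RExcel-specific recipe: create an environment, evaluate statements inside it with assign() or evalq(), and remove it at the end. The names scratch, a, b, b2 and result are made up for illustration, and how the statements are sent from VBA is not shown.

```r
# Create a scratch environment to hold all temporary variables.
scratch <- new.env()

# Put data into it explicitly ...
assign("a", 1:10, envir = scratch)

# ... or evaluate whole statements inside it; assignments made with <-
# inside evalq() land in 'scratch', not in the global workspace.
evalq({
  b  <- mean(a)
  b2 <- b * 2
}, envir = scratch)

ls(scratch)                          # names that live in the environment
result <- get("b", envir = scratch)  # copy a value out before cleaning up

# Removing the only reference makes the environment and its contents
# unreachable, so the garbage collector can reclaim them.
rm(scratch)
```

Note that plain assignments typed at top level still go to the global workspace; only code wrapped in evalq(), local() or with() is evaluated in the environment.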
[R] Forecasting Inflation
Dear All, I want to forecast inflation for the Indian economy. Please tell me which techniques should be used after variable selection (WPI, CPI, money supply, IIP, interest rate and so on), and how I can use R for this.
[R] read binary file seek()
I want to read in a binary file using the readBin() function. In order to skip uninformative parts of the file I use the seek() function, but I need to specify the number of bits to skip rather than the number of bytes. E.g.

    seek(to.read, origin = "current", blockSize)

with blockSize giving the number of bits. Does anybody know if this works? Any help would be highly appreciated.

Best, A.
[R] dumping data objects
I am using R version 2.9.1 (2009-06-26) on Windows. I am trying to dump some data objects from R so that I can subsequently import them into S-PLUS (version 6.2). Using dump("foo") appends Ls to integers, as explained in the documentation for deparseOpts. Here is a simple example. I tried using

    .deparseOpts(control="S_compatible")
    [1] 128
    dump("foo")

But still get:

    C:\wqm\Run> cat dumpdata.R
    foo <- structure(list(Kilocycles = c(94L, 96L, 99L), Status = structure(c(2L, 2L, 1L), .Label = c("Censored", "Failed"), class = "factor"), Weight = c(1L, 1L, 2L)), .Names = c("Kilocycles", "Status", "Weight"), class = "data.frame", row.names = c("1", "2", "3"))

I have also tried a number of different combinations of options to .deparseOpts, but the resulting file always has the appended Ls. What am I missing?

Bill Meeker

William Q. Meeker
Department of Statistics, 2109 Snedecor Hall
Iowa State University, Ames, Iowa 50011
www.public.iastate.edu/~wqmeeker
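One thing that may be going on: .deparseOpts() only translates option names into a number, so calling it on its own does not change what dump() writes. dump() has its own control argument, and leaving "keepInteger" out of it should drop the L suffix. A sketch under that assumption; whether S-PLUS 6.2 accepts the resulting file has not been checked here, and out_file is just a temporary file for illustration.

```r
foo <- data.frame(Kilocycles = c(94L, 96L, 99L),
                  Status = factor(c("Failed", "Failed", "Censored")),
                  Weight = c(1L, 1L, 2L))

out_file <- tempfile(fileext = ".R")

# Without "keepInteger" in 'control', integer vectors are written as plain
# numbers (no L suffix); "showAttributes" keeps class and factor levels.
dump("foo", file = out_file, control = "showAttributes")

dumped <- readLines(out_file)
cat(dumped, sep = "\n")
```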
Re: [R] can we create our own packages in R?
On Mon, Jul 27, 2009 at 02:32:24AM -0700, Albert EINstEIN wrote: can we create our own packages in R. It would be very helpful for us, if someone provide any information regarding this.

Yes, you can. There is an entire manual documenting this: http://cran.r-project.org/doc/manuals/R-exts.html

cu Philipp

-- Dr. Philipp Pagel, Lehrstuhl für Genomorientierte Bioinformatik, Technische Universität München, Wissenschaftszentrum Weihenstephan, 85350 Freising, Germany http://webclu.bio.wzw.tum.de/~pagel/
[R] Conversion a ts time to another class.
Dear R colleagues, I am trying to convert a ts time such as 2009.004 to a string or POSIX class such as 2009-01-01. Is there any function or method to do it? Thank you beforehand. Chuse.
Re: [R] read binary file seek()
readBin reads in bytes. If you want to read starting at a variable 'bit' location, then you will have to write a function that reads in the bytes and then shifts the data by the corresponding number of bits. Exactly what are you reading in, and what processing do you want to do on it?

-- Jim Holtman, Cincinnati, OH
What is the problem that you are trying to solve?
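The approach described in the reply can be sketched as follows: seek() positions are in bytes, so a bit offset is split into a whole-byte part for seek() and a remainder that is dropped after reading. This is only a sketch, assuming bits are numbered most-significant-bit first within each byte; read_bits and demo_file are hypothetical names, not part of base R.

```r
read_bits <- function(path, skip_bits, n_bits) {
  byte_offset <- skip_bits %/% 8          # whole bytes to skip with seek()
  bit_offset  <- skip_bits %% 8           # leftover bits to drop after reading
  n_bytes     <- ceiling((bit_offset + n_bits) / 8)

  con <- file(path, "rb")
  on.exit(close(con))
  seek(con, where = byte_offset, origin = "start")
  bytes <- readBin(con, what = "raw", n = n_bytes)

  # rawToBits() returns each byte least-significant bit first;
  # reverse the rows so that bits run most-significant first.
  bits <- matrix(as.integer(rawToBits(bytes)), nrow = 8)[8:1, , drop = FALSE]
  as.vector(bits)[bit_offset + seq_len(n_bits)]
}

# Demo file with bytes AB CD EF: skipping 12 bits and reading 8 bits
# yields the bit pattern of 0xDE.
demo_file <- tempfile()
writeBin(as.raw(c(0xAB, 0xCD, 0xEF)), demo_file)
bits  <- read_bits(demo_file, skip_bits = 12, n_bits = 8)
value <- sum(bits * 2^(7:0))
```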
Re: [R] normal mixture model
Hi Cindy,

you need the summary function

    mclustsummary <- summary(mclustBICoutputobject, data)

to get all the information. Some of it (like the best model) is shown if you just print the summary object. Other information (like the estimated parameter values) is accessible as components of the summary object, like mclustsummary$parameters$... Try str(mclustsummary) to see what's there (unfortunately this is not fully documented). For more detail see the help pages.

Hope this helps, Christian

On Sun, 26 Jul 2009, cindy Guo wrote:

Hi, Christian, Thank you for the reply. I just tried. Does the function mclustBIC only give the best model, or does it also do EM to get the cluster means and variances according to the best model it picks? I didn't find it. Is there a way to automatically select the best number of components and do EM? I need to fit the normal mixture model in a loop (one EM per iteration), so I want it to do everything automatically. Thanks, Cindy

On Sun, Jul 26, 2009 at 3:46 PM, Christian Hennig chr...@stats.ucl.ac.uk wrote:

You can use mclustBIC in package mclust (uses the BIC for deciding about the number of components and hierarchical clustering for initialisation). Christian

On Sun, 26 Jul 2009, cindy Guo wrote:

Hi, All, I want to fit a normal mixture model. Which package in R is best for this? I was using the package 'mixdist', but I need to group the data into groups before fitting the model, and different groupings seem to lead to different results. What other package can I use which is stable? And are there packages that can automatically determine the number of components? Thank you, Cindy

*** --- ***
Christian Hennig, University College London, Department of Statistical Science, Gower St., London WC1E 6BT, phone +44 207 679 1698, chr...@stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche
[R] Question about rpart decision trees (being used to predict customer churn)
-- begin included message ---

Hi, I am using rpart decision trees to analyze customer churn. I am finding that the decision trees created are not effective because they are not able to recognize factors that influence churn. I have created an example situation below. What do I need to do for rpart to build a tree with the variable experience? My guess is that this would happen if rpart used the loss matrix while creating the tree.

    experience <- as.factor(c(rep("good",90), rep("bad",10)))
    cancel <- as.factor(c(rep("no",85), rep("yes",5), rep("no",5), rep("yes",5)))
    table(experience, cancel)
              cancel
    experience no yes
          bad   5   5
          good 85   5
    rpart(cancel ~ experience)
    n= 100
    node), split, n, loss, yval, (yprob)
          * denotes terminal node
    1) root 100 10 no (0.900 0.100) *

I tried the following commands with no success.

    rpart(cancel ~ experience, control=rpart.control(cp=.0001))
    rpart(cancel ~ experience, parms=list(split='information'))
    rpart(cancel ~ experience, parms=list(split='information'), control=rpart.control(cp=.0001))
    rpart(cancel ~ experience, parms=list(loss=matrix(c(0,1,1,0), nrow=2, ncol=2)))

--- end inclusion

The program works fine with rpart(as.numeric(cancel) ~ experience), which does a fit to try and predict the probability of cancellation rather than a YES/NO decision for each node. I usually find this more informative, particularly for early analysis. Breiman et al in the original CART book refer to this as odds regression. In this analysis, if a split leads to one child with 30% cancel and another with 5% cancellation, the split is successful. When using a factor as the y variable, this split is scored as useless, since the parent and both children are scored as NO.

By adjusting the losses to be just right you can get your data to split. You need to make them such that 85/5 is predicted as 'no cancel' and 5/5 as 'yes cancel'; 1:2 losses would suffice. In the example where you set losses to 1:1 both nodes are scored as a 'yes'.

Terry Therneau
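The two suggestions in the reply can be tried on the example data. This is a sketch, assuming the rpart convention that rows of the loss matrix are the true classes and columns the predicted classes, with levels ordered no, yes; the 2:1 weighting is just one choice that makes a missed cancellation cost twice as much as a false alarm, and fit_loss and fit_rate are illustrative names.

```r
library(rpart)

experience <- factor(c(rep("good", 90), rep("bad", 10)))
cancel <- factor(c(rep("no", 85), rep("yes", 5), rep("no", 5), rep("yes", 5)))

# Filled column by column: loss[2, 1] = 2 (true "yes" predicted "no"),
# loss[1, 2] = 1 (true "no" predicted "yes"); the diagonal is zero.
loss <- matrix(c(0, 2, 1, 0), nrow = 2)

fit_loss <- rpart(cancel ~ experience, parms = list(loss = loss))
print(fit_loss)

# The regression-style fit from the reply: treat the 1/2 coding of the
# factor as a number, so splits are judged on the cancellation rate.
fit_rate <- rpart(as.numeric(cancel) ~ experience)
print(fit_rate)
```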
Re: [R] Conversion a ts time to another class.
try this:

    x <- 2009.004
    # get the number of days in the year
    days <- unclass(as.POSIXct(paste(floor(x) + 1, '-1-1', sep = '')) -
        as.POSIXct(paste(floor(x), '-1-1', sep = '')))
    # now compute the number of seconds from start of year
    current <- as.POSIXct(paste(floor(x), '-1-1', sep = '')) + days * (x %% 1) * 86400
    current
    [1] "2009-01-02 11:02:23 EST"

-- Jim Holtman, Cincinnati, OH
What is the problem that you are trying to solve?
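The same calculation can be wrapped in a small function. This is a sketch that assumes the fractional part of the ts time is the proportion of the calendar year elapsed and that UTC is an acceptable time zone; decimal_year_to_posix is a made-up name, not an existing R function.

```r
decimal_year_to_posix <- function(x, tz = "UTC") {
  year  <- floor(x)
  start <- as.POSIXct(paste0(year, "-01-01"), tz = tz)      # start of this year
  end   <- as.POSIXct(paste0(year + 1, "-01-01"), tz = tz)  # start of next year
  secs  <- as.numeric(difftime(end, start, units = "secs")) # handles leap years
  start + (x - year) * secs
}

stamp <- decimal_year_to_posix(2009.004)
as.Date(stamp)          # the calendar date
format(stamp, "%Y-%m-%d")  # the same date as a character string
```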
[R] what to do about face 1 at size 16 could not be loaded
Hi, on some machines (all Linux, same behaviour with versions 2.9.0 and 2.9.1) I get errors such as

    plot(E2)
    Error in text.default(x, y, txt, cex = cex, font = font) :
      X11 font -adobe-helvetica-%s-%s-*-*-%d-*-*-*-*-*-*-*, face 1 at size 16 could not be loaded

but not on others. The R installation is always the same (from sources), and the Linux distributions should also be similar; the above failure occurs on Suse 11.0. What can be done about such errors? Is it possible to tell R to ignore them? (Likely this is only about some text, but I did not succeed in getting it to work with options like ann=F; E2 above is, by the way, a data frame.) Or could I make a complete installation of R which includes all the fonts etc. that R expects to find? (Apparently there is no standard for which fonts should be there, so are failures in this area to be expected?) Thanks for your consideration. Oliver
Re: [R] [Rd] How to create a permanent dataset in R.
Thank you very much for your reply.

Yes. See this [1]. Liviu

[1] http://cran.r-project.org/doc/manuals/R-exts.pdf
[R] create dataset permanently in package (i.e. default or our own package)
Hi, when we open the R console or R Commander we see some packages, like car and datasets. These packages come with default datasets, for example women and Prestige. Now I have created a sales dataset by importing it from an Excel, XML or text file, and I want to store that dataset permanently in one of the packages mentioned above (car or datasets). Then I close my R session. When I open the R console or R Commander again some time later, I do not want to create the sales dataset again; it should be available when I load one of those packages. If possible, please give me the code; it would be very helpful for us. Thanks in advance.
Re: [R] FW: Qury Related With R
Hi Romain, this is exactly the script I need. Thanks a lot for helping. Romain, is there any way to modify a particular line in the property file through an R script? If yes, please explain how. Cheers! BS

-Original Message-
From: Romain Francois [mailto:romain.franc...@dbmail.com]
Sent: Monday, July 27, 2009 12:53 PM
To: bed.si...@oracle.com
Cc: r-help@r-project.org
Subject: Re: [R] FW: Qury Related With R

Hi, I think you can just read the parameters into R and then use the parameters argument of the .jinit function. Something like this perhaps:

    props <- readLines("app.properties")
    props <- strsplit(gsub("\\\t", "", grep("=", props, value = TRUE)), "=")
    params <- sapply(props, function(x) { sprintf("-D%s=%s", x[1], x[2]) })
    library(rJava)
    .jinit(classpath = "D:/R_BTE_Jar/BTE/BackTestingApp.jar", parameters = c("-Xmx512m", params))

Let me know if this works. Romain

On 07/27/2009 06:02 AM, bed.si...@oracle.com wrote:

Hi Romain, attached is my R script. I put the script into the R workspace and call it with source("RScriptToCallJava.R"), and my Java application is executed. Is this the proper way to call the Java application? If not, can you please explain the directory structure that a Java application needs? In Java I use the property file, but when I try to load the property file with .jaddClassPath("D:/R_BTE_Jar/BTE/app.properties"), it is not loaded. Thanks, Romain, for your response. Bed Singh

-Original Message-
From: Romain Francois [mailto:romain.franc...@dbmail.com]
Sent: Saturday, July 25, 2009 6:32 PM
To: bed.si...@oracle.com
Cc: r-help@r-project.org
Subject: Re: [R] FW: Qury Related With R

Hi, the file did not make it through the mailing list. Maybe you are looking for ?read.dcf. Can you describe the way your application interacts with R? Romain

On 07/25/2009 10:35 AM, bed.si...@oracle.com wrote:

Hi, I am using R-2.9.1 with Windows XP. Queries:

1. I am running a Java application which needs to load a property file in R. Can you please tell me how I can load my property file in the R session so that my application can find it? Attached is my property file as a sample.
2. Is there any directory structure required for a Java application in R?

Thanks and regards, Bed Singh

From: ericdov...@gmail.com [mailto:ericdov...@gmail.com] On Behalf Of Eric Doviak
Sent: Friday, July 24, 2009 8:42 PM
To: bed.si...@oracle.com
Subject: Re: Qury Related With R

Hi Bed, I'm sorry. I simply don't know. Your best bet would be to ask on R-help: r-help@r-project.org Good luck, - Eric

-- Romain Francois, Independent R Consultant, http://romainfrancois.blog.free.fr
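On the question of changing one line of the property file from R: a possible approach with base R only is to read the file as text, replace the matching key=value line and write it back. This is a sketch; set_property, prop_file and the keys db.user and db.port are invented for illustration and do not come from the application discussed above.

```r
set_property <- function(path, key, value) {
  lines  <- readLines(path)
  has_eq <- grepl("=", lines, fixed = TRUE)
  keys   <- trimws(sub("=.*$", "", lines))   # text before the first "="
  hit    <- has_eq & keys == key
  if (!any(hit)) stop("key not found: ", key)
  lines[hit] <- paste0(key, "=", value)      # rewrite only the matching line(s)
  writeLines(lines, path)
  invisible(lines)
}

# Demo on a throw-away file with made-up keys.
prop_file <- tempfile(fileext = ".properties")
writeLines(c("# sample", "db.user=alice", "db.port=1521"), prop_file)
set_property(prop_file, "db.port", "1522")
readLines(prop_file)
```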
[R] nnet library and FANN package'm
Hello! I'd like to know which of the FANN package networks the R nnet network corresponds to. In more detail: what is the nnet activation function, and what is the training algorithm (rprop, quickprop, ...)? Also, it seems that the decay parameter in nnet corresponds to the learning_rate parameter in FANN. Is that correct? Many thanks in advance! Luc Moulinier, IGBMC Strasbourg
Re: [R] downsampling
Dear Philipp and R-Users,

thank you very much for the help. However, both approx() and spline() seem to select the required number of data points from the original data (at the correct positions, of course) and ignore the remaining data points, as the following example demonstrates:

    a = c(1,0,2,1,0)
    approx(a, n=3)
    $x
    [1] 1 3 5
    $y
    [1] 1 2 0

Essentially, what approx has done (spline does the same) is to simply select the first, third, and fifth entry (as we want to downsample a 5 point vector into a three point vector). The second and fourth data points are completely ignored. This can result in quite dramatic changes to your data if the data points selected by approx() or spline() happen to be outliers and if you downsample by a rather strong factor.

Best, Jan

Philipp Pagel wrote:

On Fri, Jul 24, 2009 at 03:16:58AM -0600, Warren Young wrote:

Michael Knudsen wrote:

On Fri, Jul 24, 2009 at 9:32 AM, Jan Wiener jan.wie...@tuebingen.mpg.de wrote:

x=sample(1:5, 115, replace=TRUE) How do I downsample this vector to 100 entries? Are there any R functions or packages that provide such functionality.

What exactly do you mean by downsampling?

It means that the original 115 points should be treated as a continuous function of x, or t, or whatever the horizontal axis is, with new values coming from this function at 100 evenly-spaced points along this function.

There probably is a proper function for that and some expert will point it out. Until then I'll share my thoughts:

    # make up some data
    foo <- data.frame(x = 1:115, y = jitter(sin(1:115/10), 1000))
    plot(foo)
    # use approx for interpolation
    bar <- approx(foo, n=30)
    lines(bar, col='red', lwd=2)
    # or use spline for interpolation
    bar <- spline(foo, n=30)
    lines(bar, col='green', lwd=2)
    # or fit a loess curve
    # had to play with span to make it look ok
    model <- loess(y~x, foo, span=1/2)
    x <- seq(1, 115, length.out=30)
    bar <- predict(model, newdata=data.frame(x=x, y=NA))
    lines(x, bar, col='blue', lwd=2)

Jan, does that help a little?

cu Philipp

-- Dr. Jan M. Wiener, Centre for Cognitive Science, University of Freiburg, Institute of Computer Science and Social Research (IIG), Friedrichstr. 50, D-79098 Freiburg, GERMANY, url: www.jan-wiener.net
[R] numbers on barplot
Hello all, I have this simple barplot code:

    ifn <- "id.dat"
    dat <- read.table(ifn)
    ofn <- "id.png"
    bitmap(ofn, type = "png256", width = 30, height = 30, pointsize = 30, bg = "white", res = 50)
    par(mar=c(5, 5, 3, 2), lwd=5)
    par(cex.main=1.6, cex.lab=1.6, cex.axis=1.6)
    names(dat) <- c("NumberOfPeople", "Average")
    Graph <- barplot(dat$Average)
    dev.off()

and here is the data (id.dat):

    15 0.08
    6 0.09
    7 0.37

I want to write the NumberOfPeople on top of each of the bars. Can anybody help me with this? Thanks, Mohsen
Re: [R] numbers on barplot
The only thing you're missing is the midpoints of the bars. Since you specified

    Graph <- barplot(dat$Average)

you can get the midpoints from the Graph object. So to put the number on top of each bar you might use something like:

    text(Graph, dat$Average, dat$Average)
Re: [R] downsampling
On Mon, Jul 27, 2009 at 02:42:33PM +0200, Jan M. Wiener wrote:

However, both approx() and spline() seem to select the number of required data points from the original data (at the correct positions, of course) and ignore the remaining data points, as the following example demonstrates:

    a = c(1,0,2,1,0)
    approx(a, n=3)
    $x
    [1] 1 3 5
    $y
    [1] 1 2 0

Essentially, what approx has done (spline does the same) is to simply select the first, third, and fifth entry (as we want to downsample a 5 point vector into a three point vector). The second and fourth data point are completely ignored.

That seems to be what Warren described as the 'degenerate case' where approx will 'just throw away every other sample'. If you choose a different n (e.g. n=4), interpolation does happen.

This can result in quite dramatic changes of your data, if the data points selected by approx() or spline() happen to be outliers and if you downsample data by a rather strong factor.

Yes, that could affect your downsampled data. For more robustness it would probably be better to fit a proper model (if you have one) or a lowess curve (or smooth.spline) and go from there.

cu Philipp

-- Dr. Philipp Pagel, Lehrstuhl für Genomorientierte Bioinformatik, Technische Universität München, Wissenschaftszentrum Weihenstephan, 85350 Freising, Germany http://webclu.bio.wzw.tum.de/~pagel/
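Both points in the reply can be illustrated with base R. This is a sketch, not a recommendation of a particular smoother: smooth.spline with its default settings stands in for "a lowess curve (or smooth.spline)", and the random data and the names fit, new_x and down are invented for the example.

```r
a <- c(1, 0, 2, 1, 0)
approx(a, n = 3)$y   # grid hits positions 1, 3, 5 exactly: points 2 and 4 unused
approx(a, n = 4)$y   # grid falls between the points, so values are interpolated

# Smooth first, then evaluate the smooth curve on the coarser grid, so that
# every original point contributes to the downsampled values.
set.seed(1)
y <- sample(1:5, 115, replace = TRUE)
x <- seq_along(y)
fit   <- smooth.spline(x, y)
new_x <- seq(1, 115, length.out = 100)
down  <- predict(fit, new_x)$y
```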
[R] offset and poisson regression
Not sure that the list is the best place for this question, but we are going mad with this... We are trying to fit a poisson regression to count data, e.g. the number of fledged young of blue tits (NPe) as a function of the clutch size (GPc) and other environment variables. Here are the original data (dumped) (we just omit the environment variables to simplify):

    tab <- structure(list(NPe = c(3L, 5L, 2L, 6L, NA, 4L, 4L, 4L, 3L, NA, NA, 4L, 5L, 2L, 0L, 5L, NA, 1L, NA, 2L, 5L, 4L, 0L, 4L, NA, NA, 6L, 4L, 0L, 4L, 4L, 0L, 6L, 5L, 6L, 3L, NA, 6L, 5L, 3L, 6L, 7L, NA, 7L, 6L, 4L, NA, 1L, NA, NA, 7L, 6L, NA, 5L, NA, NA, NA, 0L, 0L, NA, NA, 5L, NA, 3L, NA, NA, NA, 5L, NA, NA, 6L, NA, NA, NA, 0L, 6L, NA, NA, NA, NA, 5L, 5L, 4L, NA, 4L, 0L, 4L, 5L, 5L, 4L, 0L, 0L, 5L, 6L, 5L, 1L, NA, 0L, 7L, 0L, 0L, 3L, 3L, 7L, NA, 0L, 6L, 4L, 4L, 5L, 0L, 5L, 4L, 7L, 4L, 7L, 5L, 5L, 0L, NA, 5L, 7L, NA, 8L, 7L, 5L, 0L),
    GPc = c(5L, 6L, 6L, 7L, NA, 5L, 6L, 5L, 6L, 6L, 4L, 5L, 5L, 6L, 6L, 6L, 4L, 4L, 4L, 3L, 5L, 6L, 3L, 5L, 5L, 7L, 6L, 5L, 5L, 5L, 4L, 5L, 6L, 5L, 6L, 5L, 5L, 7L, 6L, 4L, 7L, 8L, 9L, 7L, 7L, 7L, 4L, 5L, 5L, 4L, 7L, 6L, 5L, 5L, 6L, 2L, 7L, 6L, 8L, NA, NA, 7L, 6L, 6L, NA, 6L, 6L, 5L, 5L, 5L, 7L, 7L, 6L, 6L, 6L, 6L, 7L, 5L, 5L, 7L, 7L, 6L, 6L, 8L, 6L, 7L, 5L, 5L, 8L, 8L, 7L, 7L, 6L, 7L, 6L, 5L, 6L, 7L, 8L, 6L, 7L, 7L, 5L, 7L, 6L, 5L, 9L, 5L, 4L, 7L, 6L, 6L, 5L, 8L, 5L, 7L, 6L, 7L, 7L, 7L, 6L, 7L, 5L, 8L, 7L, 7L, 6L)),
    .Names = c("NPe", "GPc"), class = "data.frame", row.names = c(NA, -127L))

It seems logical to insert clutch size as an offset term, since we are actually interested in the ratio fledged young/clutch size. However, the final results are quite surprising:

    modsr0 <- glm(NPe~offset(GPc), family=poisson, data=tab)

If we compute the predictions, we get numbers which look like a gross overestimation of reality (e.g. 14.6, 39.7, etc.), including the fact that it implies that one can have more fledged young than eggs!

     [1]  0.7  2.0  2.0  5.4  0.7  2.0  0.7  2.0  0.7  0.7  2.0  2.0  2.0  0.3  0.1  0.7  2.0
    [18]  0.1  0.7  2.0  0.7  0.7  0.7  0.3  0.7  2.0  0.7  2.0  0.7  5.4  2.0  0.3  5.4 14.6
    [35]  5.4  5.4  5.4  0.7  5.4  2.0  0.7  2.0 14.6  5.4  2.0  0.7  5.4  2.0  2.0  5.4  2.0
    [52]  2.0  2.0  5.4  0.7  0.7 14.6 14.6  5.4  5.4  2.0  5.4  2.0  0.7  5.4 14.6  2.0  5.4
    [69]  5.4  0.7  5.4  0.7 39.7  0.7  0.3  5.4  2.0  2.0  0.7 14.6  0.7  5.4  2.0  5.4  5.4
    [86]  2.0  5.4 14.6  5.4  5.4  2.0

Otherwise, if clutch size is inserted as a variable (and not as an offset), predictions are much more realistic, with no extreme values:

    modsr0 <- glm(NPe~GPc, family=poisson, data=tab)
    round(exp(predict(modsr0)), 1)
     [1] 3.2 3.7 3.7 4.4 3.2 3.7 3.2 3.7 3.2 3.2 3.7 3.7 3.7 2.7 2.2 3.2 3.7 2.2 3.2 3.7 3.2 3.2
    [23] 3.2 2.7 3.2 3.7 3.2 3.7 3.2 4.4 3.7 2.7 4.4 5.3 4.4 4.4 4.4 3.2 4.4 3.7 3.2 3.7 5.3 4.4
    [45] 3.7 3.2 4.4 3.7 3.7 4.4 3.7 3.7 3.7 4.4 3.2 3.2 5.3 5.3 4.4 4.4 3.7 4.4 3.7 3.2 4.4 5.3
    [67] 3.7 4.4 4.4 3.2 4.4 3.2 6.2 3.2 2.7 4.4 3.7 3.7 3.2 5.3 3.2 4.4 3.7 4.4 4.4 3.7 4.4 5.3
    [89] 4.4 4.4 3.7

Can any sound statistician provide a hint about what to do or how to interpret this? Thanks in advance, Renaud and Patrick
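One thing that may explain the inflated predictions: with family=poisson the link is log, and an offset is added on the log scale. offset(GPc) therefore multiplies the expected count by exp(GPc), whereas modelling the ratio fledged/clutch size calls for offset(log(GPc)). This fits the predictions listed above, which step up by a factor of about e (0.7, 2.0, 5.4, 14.6, 39.7). A sketch using only the first 20 rows of the posted data; the binomial fit is shown as one possible alternative, not as the authors' model, and tab20, fit_rate, fit_bin are illustrative names.

```r
# First 20 rows of the posted data (NPe = fledged young, GPc = clutch size).
tab20 <- data.frame(
  NPe = c(3, 5, 2, 6, NA, 4, 4, 4, 3, NA, NA, 4, 5, 2, 0, 5, NA, 1, NA, 2),
  GPc = c(5, 6, 6, 7, NA, 5, 6, 5, 6, 6, 4, 5, 5, 6, 6, 6, 4, 4, 4, 3)
)

# Offset on the log scale: E[NPe] = rate * GPc.
fit_rate <- glm(NPe ~ offset(log(GPc)), family = poisson, data = tab20)
rate <- unname(exp(coef(fit_rate)))      # estimated fledged young per egg

complete <- complete.cases(tab20)
pred <- rate * tab20$GPc[complete]       # predictions stay below clutch size

# Alternative: successes out of a known number of trials (eggs).
fit_bin  <- glm(cbind(NPe, GPc - NPe) ~ 1, family = binomial, data = tab20)
p_fledge <- unname(plogis(coef(fit_bin)))
```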
Re: [R] numbers on barplot
names(dat) <- c("NumberOfPeople", "Average")
Graph <- barplot(dat$Average)
barplot(dat$Average, ylim = c(0, max(dat[, 2] + .2)))
text(Graph, dat[, 2], dat[, 1], pos = 3)

The reason for the ylim is so that the number for the right-hand bar does not go outside the plot area.

--- On Mon, 7/27/09, Mohsen Jafarikia jafari...@gmail.com wrote:

From: Mohsen Jafarikia jafari...@gmail.com
Subject: [R] numbers on barplot
To: r-h...@stat.math.ethz.ch
Received: Monday, July 27, 2009, 10:01 AM

Hello all,

I have this simple barplot code:

ifn <- "id.dat"
dat <- read.table(ifn)
ofn <- "id.png"
bitmap(ofn, type = "png256", width = 30, height = 30, pointsize = 30, bg = "white", res = 50)
par(mar = c(5, 5, 3, 2), lwd = 5)
par(cex.main = 1.6, cex.lab = 1.6, cex.axis = 1.6)
names(dat) <- c("NumberOfPeople", "Average")
Graph <- barplot(dat$Average)
dev.off()

and here is the data (id.dat):

15 0.08
6 0.09
7 0.37

I want to write down the NumberOfPeople on top of each of the bars. Can anybody help me on this?

Thanks,
Mohsen
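The labelling approach in this reply can be sketched as a self-contained script. This is only an illustration: the data frame is typed in from the three rows quoted in the post rather than read from id.dat, the name mids is ours, and pdf(NULL) is used so that it runs without a display. barplot() returns the bar midpoints, so a single call with ylim already set should be enough.

```r
# Toy data typed in from the id.dat rows quoted in the post.
dat <- data.frame(NumberOfPeople = c(15, 6, 7),
                  Average = c(0.08, 0.09, 0.37))

pdf(NULL)  # null device: nothing is written to disk
# barplot() returns the x positions of the bar midpoints.
mids <- as.vector(barplot(dat$Average,
                          ylim = c(0, max(dat$Average) + 0.2)))
# pos = 3 places each label just above the given (x, y) point.
text(mids, dat$Average, labels = dat$NumberOfPeople, pos = 3)
invisible(dev.off())
```

The extra 0.2 of head room in ylim plays the same role as in the reply: it keeps the label of the tallest bar inside the plot region.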
[R] pairs plot
Hi all,

I want to use pairs() to plot a matrix with 4 columns. I want three plots in one graph: columns 2, 3 and 4 on the y axis, each against column 1 on the x axis. That is, one figure with three graphs, omitting the pair of column 1 with itself, and showing only the pairs of column 1 with the other columns, not all pairs. For example, with this matrix:

4177 289390  8740 17220
3907 301510  8530 17550
3975 316970  8640 17650
3651 364220  9360 21420
3031 387390  9960 23410
2912 430180 11040 25820
3018 499930 12240 27620
2685 595010 13800 31670
2884 661870 14760 37170

Thanks in advance.
[R] How should i change the SAS Codes into R Codes?
Dear R users,

I have some SAS code with several loops in it, and I hope to use R to do the same task. The SAS code is as follows:

/* to generate the dataset */
DATA Single_Simulation;
DO se=0 to 1 by 0.01;
  DO sp=0 to 1 by 0.01;
    DO DR=0 to 1 by 0.01;
      TR=(DR+sp-1)/(se+sp-1+1.0e-12);
      Adjust_Factor=TR/(DR+1.0e-12);
      OUTPUT;
    END;
  END;
END;
RUN;
/* to select some data */
DATA sampledata;
SET Single_Simulation;
IF DR=0.02 & sp=1;
RUN;

# I tried the following code with R, but it failed
num <- seq(0, 1, by = 0.01)
for (se in num) {
  for (sp in num) {
    for (DR in num) {
      TR=(DR+sp-1)/(se+sp-1+1.0e-12)
      Adjust_Factor=TR/(DR+1.0e-12)
    }
  }
}

My questions are:
1. What is the correct R code to do a similar task?
2. Sometimes the simulated datasets are very large, so I need to put the generated dataset in a file on a large disk, not in memory. How can I do that? SAS can do it with the 'libname' argument; does R have a similar method?

Thanks a lot.
Re: [R] How should i change the SAS Codes into R Codes?
singleSim <- expand.grid(se = 0:100/100, sp = 0:100/100, DR = 0:100/100)
singleSim <- within(singleSim, {
  TR <- (DR+sp-1)/(se+sp-1+1.0e-12)
  AdjustFactor <- TR/(DR+1.0e-12)
})
sampleData <- subset(singleSim, DR == .02 & sp == 1)
write.csv(sampleData, "output.csv")

Hope this helps

Matt
mangosolutions - R and S consultants
Tel +44 (0)1249 767700

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of zhijie zhang
Sent: 27 July 2009 16:04
To: r-h...@stat.math.ethz.ch
Subject: [R] How should i change the SAS Codes into R Codes?
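A variation on the expand.grid() approach, offered only as a sketch: the grid is built with seq() as in the original R attempt, and the selection uses a small tolerance instead of ==, because decimal steps such as 0.01 are not exactly representable in floating point. The helper near() and its tolerance are our own assumptions, not part of any reply.

```r
vals <- seq(0, 1, by = 0.01)                 # 101 grid points per variable
sim <- expand.grid(se = vals, sp = vals, DR = vals)

# Same formulas as the SAS data step, vectorised over all rows.
sim$TR <- (sim$DR + sim$sp - 1) / (sim$se + sim$sp - 1 + 1.0e-12)
sim$Adjust_Factor <- sim$TR / (sim$DR + 1.0e-12)

# Hypothetical helper: equality up to a small absolute tolerance.
near <- function(x, y, tol = 1e-9) abs(x - y) < tol

# Equivalent of "IF DR=0.02 & sp=1;", one row per value of se.
sampleData <- sim[near(sim$DR, 0.02) & near(sim$sp, 1), ]
```

For results too large to hold in memory, one option is to generate the grid in slices (for example one value of se at a time) and write each slice with write.table(..., append = TRUE), so that only one slice is in memory at once.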
Re: [R] offset and poisson regression
You should use offset(log(GPc)) instead of offset(GPc):

options(width = 65)
fm <- glm(NPe ~ 1 + offset(log(GPc)), family = poisson, data = tab)
fitted(fm)
       1        2        3        4        6        7        8
3.181818 3.818182 3.818182 4.454545 3.181818 3.818182 3.181818
       9       12       13       14       15       16       18
3.818182 3.181818 3.181818 3.818182 3.818182 3.818182 2.545455
      20       21       22       23       24       27       28
1.909091 3.181818 3.818182 1.909091 3.181818 3.818182 3.181818
      29       30       31       32       33       34       35
3.181818 3.181818 2.545455 3.181818 3.818182 3.181818 3.818182
      36       38       39       40       41       42       44
3.181818 4.454545 3.818182 2.545455 4.454545 5.090909 4.454545
      45       46       48       51       52       54       58
4.454545 4.454545 3.181818 4.454545 3.818182 3.181818 3.818182
      59       62       64       68       71       75       76
5.090909 4.454545 3.818182 3.181818 4.454545 3.818182 3.818182
      81       82       83       85       86       87       88
4.454545 3.818182 3.818182 3.818182 4.454545 3.181818 3.181818
      89       90       91       92       93       94       95
5.090909 5.090909 4.454545 4.454545 3.818182 4.454545 3.818182
      96       98       99      100      101      102      103
3.181818 4.454545 5.090909 3.818182 4.454545 4.454545 3.181818
     104      106      107      108      109      110      111
4.454545 3.181818 5.727273 3.181818 2.545455 4.454545 3.818182
     112      113      114      115      116      117      118
3.818182 3.181818 5.090909 3.181818 4.454545 3.818182 4.454545
     119      121      122      124      125      126      127
4.454545 3.818182 4.454545 5.090909 4.454545 4.454545 3.818182

All the best,
Renaud

2009/7/27 Renaud Scheifler renaud.scheif...@univ-fcomte.fr:

> Not sure that the list is the best place for this question, but we are going mad with this... We are trying to fit a poisson regression to count data, eg the number of fledged youngs of blue tits (NPe) as a function of the clutch size (GPc) and other environment variables.
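Why the log matters: a Poisson GLM uses a log link, so the model is log(mu) = b0 + offset. With offset(log(GPc)) this gives mu = GPc * exp(b0), i.e. a constant number of fledglings per egg, whereas offset(GPc) gives mu = exp(b0) * exp(GPc), which grows exponentially with clutch size. The sketch below checks this on a small subset of the posted data (the first eight complete rows, typed in by hand); the names rate and fit are ours.

```r
# First eight complete rows of the posted data, typed in by hand.
tab <- data.frame(NPe = c(3L, 5L, 2L, 6L, 4L, 4L, 4L, 3L),
                  GPc = c(5L, 6L, 6L, 7L, 5L, 6L, 5L, 6L))

fm <- glm(NPe ~ 1 + offset(log(GPc)), family = poisson, data = tab)

rate <- unname(exp(coef(fm)))  # estimated fledglings per egg
fit  <- unname(fitted(fm))     # fitted counts, equal to rate * GPc
```

In this intercept-only model the estimated rate is simply sum(NPe) / sum(GPc), which is why the fitted values in the reply are all multiples of one constant.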
Re: [R] create dataset permanently in package (i.e. default or our own package)
Mr. Einstein,

On Jul 27, 2009, at 7:27 AM, Albert EINstEIN wrote:

> Hi, actually while opening R console and R commander we see some packages like car and datasets. In these packages default datasets are available, for example women and prestige. Now I created a sales dataset by importing from an Excel, XML or text file. I want to store that dataset permanently in one of the packages I mentioned above (car or datasets). Suppose I close my R session and later open R console and R commander again. I do not want to create the sales dataset again. When clicking on one of the packages, the sales dataset should be found. If possible please give me the code; it will be very helpful for us.

I'm not sure if this is exactly what you're asking, but this is the question I'm going to answer: How can I include/bundle a dataset with a package that I am developing?

Answer: Simply save your dataset into an *.rda/*.RData file and place it in the data directory of the package you're developing. So, for example, let's say you've done so and the name of your file is TheoryOfRelativity.rda. Once the package is installed (and loaded(?)), you can load the data file by calling:

data(TheoryOfRelativity)

I'm not sure that saving your own data file into *some random* package's data directory is a good idea, or what you're asking, but you can do that, too. Just specify the package you need to load it from. For example, if I saved 'TheoryOfRelativity.rda' in the 'nnet' package's data directory (for some weird reason), I could load it like so:

data(TheoryOfRelativity, package='nnet')

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
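The "save it as an .rda file" step can be sketched as below. Everything here is illustrative: the sales data frame is made up, and a directory under tempdir() stands in for the data directory of a package under development (the name mypkg is hypothetical). In a plain session, load() restores the object from the file; data() plays that role once the file ships inside an installed package.

```r
# Made-up dataset standing in for the imported sales data.
sales <- data.frame(region = c("north", "south"), units = c(10L, 20L))

# Stand-in for <package source>/data; "mypkg" is a hypothetical name.
pkg_data_dir <- file.path(tempdir(), "mypkg", "data")
dir.create(pkg_data_dir, recursive = TRUE, showWarnings = FALSE)

rda <- file.path(pkg_data_dir, "sales.rda")
save(sales, file = rda)   # writes the object under its own name
rm(sales)                 # simulate a fresh session

loaded <- load(rda)       # restores 'sales'; returns the object names
```

If the goal is only to avoid re-importing the data in each session, keeping the .rda file in a project directory and calling load() on it may be all that is needed.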
[R] Creating new data frame using loop
Hi all,

I sent a request round last week asking for help with using a for loop to read and separate a large dataset. The response I got worked great, but now I have another problem with using my loop.

Basically I have a number of different files containing columned data. There are 132 datasets, named in the form

precip_colxxx.txt

where xxx is a number ranging from 1 to 132. What I want to do is read in every 13th table and extract the third column, and then place this in a new dataset. The new dataset will thus be composed of 11 columns of data. I have written the following bit of script to read in every 13th table separately, however I'm not sure how to do the next step of creating a new data frame and dumping the third column of my tables into this data frame. Is there a chance I will have to do a nested loop?

for (i in seq(1,120,13)) {
  nm <- sprintf('precip_col%03d.txt', i)
  precip <- read.table(nm, header=T)
}

Thanks very much in advance.

Andy
Re: [R] numbers on barplot
Unless you are intentionally trying to distort your data and make the graph harder to read (you don't want to do that), it is better to put the numbers in the margin rather than at the top of the bars. Try the following line after the barplot:

mtext(dat$NumberOfPeople, side = 1, line = 1.5, at = Graph, cex = 1.6)

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mohsen Jafarikia
Sent: Monday, July 27, 2009 8:02 AM
To: r-h...@stat.math.ethz.ch
Subject: [R] numbers on barplot
Re: [R] pairs plot
Look at the pairs2 function in the TeachingDemos package.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jose Narillos de Santos
Sent: Monday, July 27, 2009 9:02 AM
To: r-help@r-project.org
Subject: [R] pairs plot
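For readers without TeachingDemos installed, the requested layout (column 1 on the x axis against each of columns 2, 3 and 4) can also be approximated in base graphics. This is a sketch of an alternative, not what pairs2 does internally; the matrix is typed in from the question, and the axis labels are placeholders of ours.

```r
# The 9 x 4 matrix from the question, filled row by row.
m <- matrix(c(4177, 289390,  8740, 17220,
              3907, 301510,  8530, 17550,
              3975, 316970,  8640, 17650,
              3651, 364220,  9360, 21420,
              3031, 387390,  9960, 23410,
              2912, 430180, 11040, 25820,
              3018, 499930, 12240, 27620,
              2685, 595010, 13800, 31670,
              2884, 661870, 14760, 37170),
            ncol = 4, byrow = TRUE)

pdf(NULL)                      # null device so the sketch runs anywhere
op <- par(mfrow = c(1, 3))     # one row, three panels
for (j in 2:4) {
  plot(m[, 1], m[, j], xlab = "column 1", ylab = paste("column", j))
}
par(op)                        # restore the previous layout
invisible(dev.off())
```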
Re: [R] How should i change the SAS Codes into R Codes?
I see Matt Aldridge has given you the answers to your specific questions. If you are used to using SAS you might find Bob Muenchen's book useful:

Muenchen, R. A. (2008). R for SAS and SPSS Users (1st ed.). Springer.

A shorter version is available as a pdf at http://rforsasandspssusers.com/. The pdf is a very useful guide for anyone beginning to use R.

--- On Mon, 7/27/09, zhijie zhang rusers...@gmail.com wrote:

From: zhijie zhang rusers...@gmail.com
Subject: [R] How should i change the SAS Codes into R Codes?
To: r-h...@stat.math.ethz.ch
Received: Monday, July 27, 2009, 11:03 AM
Re: [R] normal mixture model
Hi, Christian,

Yes, it works. Thank you very much. It's really helpful.

Cindy

On Mon, Jul 27, 2009 at 5:39 AM, Christian Hennig chr...@stats.ucl.ac.uk wrote:

Hi Cindy,

you need the summary function

mclustsummary <- summary(mclustBICoutputobject, data)

to get all the information. Some (like the best model) is given if you just print out the summary object. Some other information (like estimated parameter values) is accessible as components of the summary object, like mclustsummary$parameters$... Try str(mclustsummary) to see what's there (unfortunately this is not fully documented). For more detail see the help pages.

Hope this helps,
Christian

On Sun, 26 Jul 2009, cindy Guo wrote:

Hi, Christian,

Thank you for the reply. I just tried. Does the function mclustBIC only give the best model, or does it also do EM to get the cluster means and variances according to the best model it picks? I didn't find it. Is there a way to automatically select the best number of components and do EM? Because I need to do the normal mixture model in a loop (one EM at an iteration), so I want it to do everything automatically.

Thanks,
Cindy

On Sun, Jul 26, 2009 at 3:46 PM, Christian Hennig chr...@stats.ucl.ac.uk wrote:

You can use mclustBIC in package mclust (uses the BIC for deciding about the number of components and hierarchical clustering for initialisation).

Christian

On Sun, 26 Jul 2009, cindy Guo wrote:

Hi, All,

I want to fit a normal mixture model. Which package in R is best for this? I was using the package 'mixdist', but I need to group the data into groups before fitting the model, and different groupings seem to lead to different results. What other package can I use which is stable? And are there packages that can automatically determine the number of components?

Thank you,
Cindy

*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chr...@stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche
Re: [R] Creating new data frame using loop
Hi Andy,

On Jul 27, 2009, at 12:18 PM, Andrew Aldersley wrote:

> Hi all, I sent a request round last week asking for help with using a for loop to read and separate a large dataset. The response I got worked great, but now I have another problem with using my loop. Basically I have a number of different files containing columned data. There are 132 datasets, named such that I have something in the form... precip_colxxx.txt ...where xxx is a number ranging from 1 to 132. What I want to do is read in every 13th table and extract the third column, and then place this in a new dataset. The new dataset will thus compose of 11 columns of data. I have written the following bit of script to read in every 13th table separately, however I'm not sure how to do the next step of creating a new data frame and dumping the third column of my tables into this data frame. Is there are chance I will have to do a nested loop?
>
> for (i in seq(1,120,13)) {
>   nm <- sprintf('precip_col%03d.txt', i)
>   precip <- read.table(nm, header=T)
> }

You can do this by building up your columns into a list, then using a combo of do.call and cbind. For example:

mydata <- list()
for (i in seq(1,120,13)) {
  nm <- sprintf('precip_col%03d.txt', i)
  precip <- read.table(nm, header=T)
  mydata[[i]] <- precip[,3]
}
mydata <- do.call(cbind, mydata)

The first param in do.call is the function you want to call, the second param is a *list* of parameters you'd like to pass into the function.

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
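A variant of the same idea using lapply(), sketched under a few assumptions: small stand-in files are written to a temporary directory so the example runs anywhere, the column names a, b and rain are invented, and the index runs to 132 rather than 120. Note that seq(1, 120, 13) yields only ten file numbers (1 to 118), while seq(1, 132, 13) yields the eleven the question expects.

```r
work_dir <- tempfile("precip")   # temporary stand-in for the data folder
dir.create(work_dir)

idx <- seq(1, 132, by = 13)      # 1, 14, ..., 131: eleven files

# Write small fake tables so the sketch is self-contained.
for (i in idx) {
  fake <- data.frame(a = 1:3, b = 4:6, rain = i + 0:2)
  write.table(fake, file.path(work_dir, sprintf("precip_col%03d.txt", i)),
              row.names = FALSE)
}

# Read each file and keep only its third column.
cols <- lapply(idx, function(i) {
  precip <- read.table(file.path(work_dir, sprintf("precip_col%03d.txt", i)),
                       header = TRUE)
  precip[, 3]
})
names(cols) <- sprintf("col%03d", idx)  # label columns by file number

mydata <- as.data.frame(cols)           # one column per file
```

Naming the list elements before conversion keeps track of which file each column came from, which a bare cbind of unnamed vectors does not.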
[R] Should nlme's augPred work with a factor covariate?
I'm having difficulty getting the augPred function from the nlme package to work when the primary covariate is a factor. I don't know if it is intended to work in these situations, but I can't immediately see anything in the documentation that forbids this. Ideally I'd like to be able to plot the results just like I can for numeric primary covariates.

Many thanks - Gavin Kelly (Cancer Research UK, Bioinformatics & Biostatistics)

library(nlme)
fm <- lme(ergoStool)
augPred(fm)
Error in Summary.factor(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, : min not meaningful for factors

sessionInfo()
R version 2.9.1 (2009-06-26)
i386-pc-mingw32

locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] nlme_3.1-92

loaded via a namespace (and not attached):
[1] grid_2.9.1 lattice_0.17-25
Re: [R] Determine the dimension-names of an element in an array in R
Hi there,

Thanks again for your reply. I know a for loop is always a solution to my problem, and I had already coded it using a for loop. But the number of levels for each dimension is large in the actual problem, and hence it was time-consuming. So I was just wondering if there is any alternative way of solving my problem. That's why I tried the apply functions (sapply), assuming that this might work out faster, even fractionally, compared to a for loop.

Cheers,
Sauvik

On Mon, Jul 27, 2009 at 12:28 AM, Poersching poerschin...@web.de wrote:

Sauvik De wrote:

Hi: Lots of thanks for your valuable time! But I am not sure how you would like to use the function in this situation. As I had mentioned, the first element of my output array should be like:

cor(DataArray_1[dimnames(Correl)[[1]][1],dimnames(Correl)[[2]][1],dimnames(Correl)[[4]][1],],DataArray_2[dimnames(Correl)[[1]][1],dimnames(Correl)[[3]][1],dimnames(Correl)[[4]][1],],use="pairwise.complete.obs")

in my code below, and the output array of correlations I wish to get using sapply is as follows:

Correl = sapply(Correl,function(d) cor(DataArray_1[...],DataArray_2[...], use="pairwise.complete.obs"))

So it would be of great help if you could kindly specify how to utilise your function findIndex in the "..." part. Apologies for all this!

Thanks & Regards,
Sauvik

Hey, sorry, I hadn't understood your problem last time, but now this solution should solve your problem, so I hope. :-) It's only a for loop, but an apply function may work too. I will think about this, but for now... ;-)

la <- length(a)
lb <- length(b)
lc <- length(c)
ld <- length(d)
for (ia in 1:la) {
  for (ib in 1:lb) {
    for (ic in 1:lc) {
      for (id in 1:ld) {
        Correl[ia,ib,ic,id] <- cor(
          DataArray_1[dimnames(Correl)[[1]][ia], dimnames(Correl)[[2]][ib], dimnames(Correl)[[4]][id],],
          DataArray_2[dimnames(Correl)[[1]][ia], dimnames(Correl)[[3]][ic], dimnames(Correl)[[4]][id],],
          use="pairwise.complete.obs")
      }
    }
  }
}

## with function findIndex you can find the dimensions with
## i.e. cor values greater 0.5 or smaller -0.5, like:
findIndex(Correl,Correl[Correl > 0.5])
findIndex(Correl,Correl[Correl < (-0.5)])

I have changed the code of the function findIndex in the line which contains:

el[j] <- which(is.element(data,element[j]))

Regards,
Christian

On Sun, Jul 26, 2009 at 3:54 PM, Poersching poerschin...@web.de wrote:

Sauvik De wrote:

Hi Gabor: Many thanks for your prompt reply! The code is fine. But I need it in more general form as I had mentioned that I need to input any 0 to find its dimension-names. Actually, I was using sapply to calculate correlation and this idea was required in the middle of the correlation calculation. I am providing the way I tried my calculation.

a = c("A1","A2","A3","A4","A5")
b = c("B1","B2","B3")
c = c("C1","C2","C3","C4")
d = c("D1","D2")
e = c("E1","E2","E3","E4","E5","E6","E7","E8")
DataArray_1 = array(c(rnorm(240)), dim=c(length(a),length(b),length(d),length(e)), dimnames=list(a,b,d,e))
DataArray_2 = array(c(rnorm(320)), dim=c(length(a),length(c),length(d),length(e)), dimnames=list(a,c,d,e))

# Defining an empty array which will contain the correlation values (output array)
Correl = array(NA, dim=c(length(a),length(b),length(c),length(d)), dimnames=list(a,b,c,d))

# Calculating correlation between attributes b & c over values of e
Correl = sapply(Correl,function(d) cor(DataArray_1[...],DataArray_2[...], use="pairwise.complete.obs"))

This is where I get stuck. In the above, d is acting as an element in the Correl array. Hence I need to get the dimension-names for d.

# The first element of Correl will be:
cor(DataArray_1[dimnames(Correl)[[1]][1],dimnames(Correl)[[2]][1],dimnames(Correl)[[4]][1],],DataArray_2[dimnames(Correl)[[1]][1],dimnames(Correl)[[3]][1],dimnames(Correl)[[4]][1],],use="pairwise.complete.obs")

So my problem boils down to extracting the dim-names in terms of the element (d) and not in terms of Correl (that I have mentioned as "..." in the above code).

My sincere thanks for your valuable time & suggestions.

Many Thanks & Kind Regards,
Sauvik

On Sun, Jul 26, 2009 at 5:26 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote:

Try this:

ix <- c(1, 3, 4, 2)
mapply("[", dimnames(mydatastructure), ix)
[1] "S1" "T3" "U4" "V2"

On Sat, Jul 25, 2009 at 5:12 PM, Sauvik De sauvik.s...@gmail.com wrote:

Hi: How can I extract the dimension-names of a pre-defined element in a multidimensional array in R? A toy example is provided below: I have a 4-dimensional array with each dimension having a certain length. In the example below, mydatastructure explains the structure of my data.

mydatastructure = array(0, dim=c(length(b),length(z),length(x),length(d)), dimnames=list(b,z,x,d))

where, b=c("S1","S2","S3","S4","S5")
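The underlying question, recovering the dimension names that belong to a given element position, can be sketched with base R's arrayInd(), which converts a linear (column-major) index into one subscript per dimension. The helper name dim_names_of and the small array below are assumptions for illustration; the thread's own findIndex function is not shown in the messages, so this is not a reconstruction of it.

```r
a  <- c("A1", "A2")
b  <- c("B1", "B2", "B3")
cc <- c("C1", "C2")
d  <- c("D1", "D2")

Correl <- array(NA_real_,
                dim = c(length(a), length(b), length(cc), length(d)),
                dimnames = list(a, b, cc, d))

# Hypothetical helper: dimension names of the k-th element of arr.
dim_names_of <- function(arr, k) {
  idx <- arrayInd(k, dim(arr))   # 1 x ndim matrix of subscripts
  unname(mapply(function(nms, i) nms[i], dimnames(arr), idx[1, ]))
}

first <- dim_names_of(Correl, 1)               # names of element 1
last  <- dim_names_of(Correl, length(Correl))  # names of the last element
```

With such a helper, a loop or vapply() over seq_along(Correl) can look up the names for each position and index the two data arrays by name. Whether that is faster than the nested for loops would need to be measured; most of the time is likely spent in cor() either way.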
[R] plotting a PNG from an in-memory object
Hi,

I have code which, via rJava, can bring up a JFrame to display an image. What I'd like to be able to do is capture that image and make an R plot out of it (analogous to plotting a PNG file, but not from an actual file). I can write Java code that could be called from R to take a snapshot of the window, but is it at all possible to somehow return that data to the R side to display via plot()?

Any pointers would be appreciated.

--
Rajarshi Guha
[R] probability on a barplot
Dear R People:

I have a barplot created from a table. What is the best way to set up the barplot such that it shows probabilities rather than totals, please? I've tried plot also, but it shows horizontal bars rather than vertical bars.

Thanks,
Erin

--
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com
[R] probability on a barplot
Please ignore the previous email. I figured it out.

--
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com
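The poster did not say which solution she found. One common way to show proportions instead of counts, given here only as a sketch with made-up data, is to pass the table through prop.table() before calling barplot().

```r
x <- c("a", "b", "b", "c", "c", "c")   # made-up categorical data

counts <- table(x)            # totals per category
probs  <- prop.table(counts)  # each count divided by the grand total

pdf(NULL)                     # null device so the sketch runs anywhere
barplot(probs, ylab = "Probability")
invisible(dev.off())
```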
[R] dataframe to list conversion
Hi all,

I have been experimenting with writing my own matrix column sum function. I want it to return a list.

csum <- function(m) {
  a = data.frame(m)
  s = lapply(a, sum)
  return(s)
}

I wish to use the same code up until the return(s) that I have listed above. The problem is that s, I believe, is a data frame. It looks like this:

$X1
[1] 148

$X2
[1] 156

$X3
[1] 164

$X4
[1] 172

$X5
[1] 180

I would like a vector with these values (148, 156, etc.). How may I do this?

tia!

--
View this message in context: http://www.nabble.com/dataframe-to-list-conversion-tp24682262p24682262.html
Sent from the R help mailing list archive at Nabble.com.
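The object returned by lapply() is a list (one element per column), and unlist() flattens it into a named vector. A minimal sketch with a made-up 4 x 5 matrix follows; the posted sums (148, 156, ...) came from data that is not shown, so they are not reproduced here.

```r
m <- matrix(1:20, nrow = 4)   # made-up input: 4 rows, 5 columns

csum <- function(m) {
  a <- data.frame(m)          # columns become X1, X2, ...
  s <- lapply(a, sum)         # a list with one sum per column
  unlist(s)                   # flatten the list to a named vector
}

sums <- csum(m)
builtin <- colSums(m)         # built-in equivalent, for comparison
```

Using sapply(a, sum) in place of lapply() would return the vector directly, and colSums() does the whole job in one call.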
[R] How to deal with this random variable?
Hello everybody, I have a data frame with 100 measures of quality for 3 variables: A, B and C. These quality variables are measured at different times along the production process. My data come from 5 experiments (5 replicates with 20 measures per replicate). I also have a final measure (Z), but just one measure for each unit, that is, for the 20 units that are measured in each replicate. My objective is to study the relationship between the 3 quality parameters and the final measure, that is:

    lm(Z ~ A+B+C, data=mydata)

I have found significant differences between replicates for each quality parameter (A, B and C), and I would like to include the replicate effect as a random effect:

    lme(Z ~ A+B+C, data=mydata, random=~1|replica)

And here is my problem. I know that there are significant differences between replicates, but since the final measure, Z, is the same for each replicate, I do not know how to deal with it. Can you help me? How could I take into account the variability due to the replicate when I want to study the effects of variables A, B and C on the final result of a production process? Thank you in advance.
-- Manuel Ramón Fernández, Group of Reproductive Biology (GBR), University of Castilla-La Mancha (Spain), mra...@jccm.es
[R] Superscripts and rounding
I am new to the world of R/programming, so this may be a really easy question. I thank you in advance for your patience and help. I would like the characters km^2 to be displayed in the plot subtitle as "km squared", with the 2 as a superscript. I would also like the longitude and latitude values from the data set to be rounded to four decimal places. Thank you.

    plot(
      decade[['date']], decade[['value']],
      type = 'l', col = 'lightsteelblue4', ylab = 'Discharge [cms]',
      main = sprintf('%s [%s]', stn[['metadata']][['name']], stn[['metadata']][['id']]),
      # km^2 <- expression
      sub = sprintf('Seasonal station with natural streamflow - Lat: %s Lon: %s Gross Area %s km^2 - Effective Area %s km^2',
                    stn[['metadata']][['latitude']], stn[['metadata']][['longitude']],
                    stn[['metadata']][['grossarea']], stn[['metadata']][['effectivearea']]),
      cex.sub = 1, font.sub = 3, col.sub = "black"
    )
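No answer to this question appears in this part of the archive. A possible approach, sketched under assumptions: sprintf() with "%.4f" handles the four decimal places, and bquote() builds a plotmath label in which km^2 is drawn with a superscript. The list `md` below is invented to stand in for stn[['metadata']].

```r
# Hypothetical metadata standing in for stn[['metadata']] from the post
md <- list(latitude = 49.123456, longitude = -123.987654,
           grossarea = 1520, effectivearea = 1340)

# "%.4f" formats a number with exactly four decimal places
lat <- sprintf("%.4f", md$latitude)
lon <- sprintf("%.4f", md$longitude)

# bquote() builds a plotmath call; .(x) splices in the value of x,
# and km^2 is rendered with the 2 as a superscript
sub_label <- bquote("Lat:" ~ .(lat) ~ "Lon:" ~ .(lon) ~
                    "- Gross Area" ~ .(md$grossarea) ~ km^2 ~
                    "- Effective Area" ~ .(md$effectivearea) ~ km^2)

pdf(NULL)  # null device so the sketch runs without a display
plot(1:10, type = "l", sub = sub_label)
invisible(dev.off())
```

A plain sprintf() string cannot carry a superscript, which is why the subtitle is built as an expression rather than as text.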
Re: [R] dataframe to list conversion
Have a look at ?unlist; you can also use sapply() in this case instead of lapply(). Best, Dimitris
-- Dimitris Rizopoulos, Assistant Professor, Department of Biostatistics, Erasmus University Medical Center, Rotterdam, the Netherlands
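A small sketch of both suggestions. The matrix `m` is an assumption, chosen only because its column sums reproduce the numbers in the question. Note that lapply() returns a plain list (not a data frame), which is why the printout shows $X1, $X2, and so on.

```r
# Hypothetical 8 x 5 matrix whose column sums are 148, 156, 164, 172, 180
m <- matrix(1:40, nrow = 8, byrow = TRUE)

# Original approach plus unlist(): flatten the list into a named vector
csum <- function(m) {
  a <- data.frame(m)
  s <- lapply(a, sum)   # list with one element per column
  unlist(s)
}

# sapply() simplifies the result to a vector directly
csum_sapply <- function(m) sapply(data.frame(m), sum)

csum(m)
csum_sapply(m)
```

Both versions keep the column names X1 to X5 on the result; wrap the call in unname() if a bare vector is wanted.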
Re: [R] dataframe to list conversion
Hi voidobscura, Try either

    csum2 <- function(m) {
      a = data.frame(m)
      s = lapply(a, sum)
      do.call(c, s)
    }

or

    colSums(m)

See ?do.call and ?colSums for more details. HTH, Jorge
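To see that the two suggestions agree, here is a quick check on an invented matrix (`m` is an assumption, not the poster's data). colSums() works on the matrix directly, so the conversion to a data frame is not needed at all.

```r
# Hypothetical input matrix
m <- matrix(1:40, nrow = 8, byrow = TRUE)

# do.call(c, s) concatenates the list elements into one vector
s <- lapply(data.frame(m), sum)
via_do_call <- do.call(c, s)

# colSums() operates on the matrix itself
via_colsums <- colSums(m)

# Same values; only the names and the storage type (integer vs double) differ
all.equal(as.numeric(via_do_call), as.numeric(via_colsums))
```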
Re: [R] How to deal with this random variable?
This sounds way too complicated for this forum, which is designed to provide help to users on the use of the R language, not remote statistical consulting. While you may receive replies, I would argue that you would do better to find a local statistical expert with whom to work, not least because they should probably have a deep understanding of how your experiment was conducted, data gathered, measurements made, etc. to be able to give you worthwhile advice. Long-distance consulting based on incomplete understanding is very risky. Caveat emptor!
-- Bert Gunter, Genentech Nonclinical Biostatistics
[R] skip plot/blank plot on purpose (multi-plot question)
Hi, Say that I've got a function that has the following code in it:

    X11(width=10, height=10)
    layout(rbind(c(1,1,1,2,2,2), c(3,4,5,6,7,8), c(9,10,11,12,13,14)), heights=c(3,1,1))
    layout.show(14)

Sometimes when I call this function it will turn out, by design, that one or more of the data sets that I use to create the plots in positions 3-14 are empty. As there is a day-of-the-week relationship between 3-8 and 9-14, and supposing that 12 is an empty set, how can I skip 12, leave it blank or make it blank, and then make the next data set plot in position 13?
1) Is there some generic way to call plot and have it plot, but plot nothing, so that I don't see anything at all in position 12? This could be a blank plot function I call when I notice the data set is empty.
2) Is there some generic way to specify the position number I want the next plot to use, so that I'd not plot 12 but would specify 13?
Thanks, Mark
Re: [R] how to correlate nominal variables?
Benoit Vaillant made me aware of an indexing mistake in the computation of Cramer's V. The col.sum indexes rows instead of columns. This is a correction of the code:

    cramers.v=function(x){
      x=as.data.frame(x)
      chisq=0
      row.sum=rowSums(table(x))
      col.sum=colSums(table(x))
      for(k in 1:dim(table(x))[1]){
        for(l in 1:dim(table(x))[2]){
          chisq=chisq+((table(x)[k,l]-(row.sum[k]*col.sum[l])/(dim(x)[1]))^2)/((row.sum[k]*col.sum[l])/(dim(x)[1]))
        }
      }
      ## return the statistic; a function whose last expression is a for loop returns NULL
      sqrt(chisq/(dim(x)[1]*(min(dim(table(x)))-1)))
    }

Daniel Malter wrote:
> You can copy the code below to your R-code editor. For Yule's Q, the data is expected in two vectors. For Cramer's phi, the data is expected in separate columns of a matrix or dataframe.
>
>     ##Run this code
>     yule.Q=function(x,y){(table(x,y)[1,1]*table(x,y)[2,2]-table(x,y)[1,2]*table(x,y)[2,1])/(table(x,y)[1,1]*table(x,y)[2,2]+table(x,y)[1,2]*table(x,y)[2,1])}
>     ##create test data
>     vector.one=rbinom(100,1,0.4)
>     vector.two=rbinom(100,1,0.8)
>     table(vector.one,vector.two)
>     ##compute yule's Q
>     yule.Q(vector.one,vector.two) ##just put your two vector names there
>
>     ##Cramer's V
>     ##Run this code
>     cramers.v=function(x){
>       x=as.data.frame(x)
>       chisq=0
>       row.sum=NULL
>       col.sum=NULL
>       for(i in 1:dim(table(x))[1]) row.sum[i]=sum(table(x)[i,])
>       for(j in 1:dim(table(x))[2]) col.sum[j]=sum(table(x)[j,])
>       for(k in 1:dim(table(x))[1]){
>         for(l in 1:dim(table(x))[2]){
>           chisq=chisq+((table(x)[k,l]-(row.sum[k]*col.sum[l])/(dim(x)[1]))^2)/((row.sum[k]*col.sum[l])/(dim(x)[1]))
>           cramers.v=sqrt(chisq/(dim(x)[1]*(min(dim(table(x)))-1)))
>         }
>       }
>     }
>     ##create test data
>     toanalyze=cbind(rbinom(100,2,0.4),rbinom(100,1,0.6))
>     toanalyze2=cbind(rep(c(0,1),each=50),rep(c(0,1),each=50))
>     ##compute cramer's v for the test data
>     v1=cramers.v(toanalyze) ## just put your dataframe or matrix name
>     v2=cramers.v(toanalyze2)
>     v1 ##cramer's v
>     v2 ##cramer's v
>
> Timo Stolz wrote:
>> Dear R-Users, I need functions to calculate Yule's Y or Cramér's index, in order to correlate variables that are nominally scaled? Am I wrong? Are such functions existing? Sincerely, Timo
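As a cross-check on the loop-based function, the same statistic can be computed without explicit loops. This is only a sketch, not code from the thread: the name `cramers_v` and the two test matrices are made up here, and it uses base R's table(), outer(), rowSums() and colSums(). Because a function whose last expression is a for loop returns NULL in R, the sketch ends with the value itself.

```r
# Vectorised Cramer's V for two categorical variables in the columns of x
cramers_v <- function(x) {
  tab <- table(as.data.frame(x))   # two-way contingency table
  n <- sum(tab)
  # expected counts under independence: row total * column total / n
  expected <- outer(rowSums(tab), colSums(tab)) / n
  chisq <- sum((tab - expected)^2 / expected)
  sqrt(chisq / (n * (min(dim(tab)) - 1)))
}

# Perfectly associated columns: V should be 1
perfect <- cbind(rep(c(0, 1), each = 50), rep(c(0, 1), each = 50))
# Perfectly balanced, unrelated columns: V should be 0
unrelated <- cbind(rep(c(0, 1), each = 50), rep(c(0, 1), times = 50))

cramers_v(perfect)
cramers_v(unrelated)
```

Building the table once and working on whole matrices also avoids recomputing table(x) inside the double loop.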
[R] Forumla format?
Hi, Quick question. I'm working on training an SVM. I have a data frame with about 50 columns. I want to train on 46 of them. Is there a way to say "all except columns 22, 23, 25 and 31"? It would be nice not to have to write +c1 +c2 +c3 +c4, etc. for all 48 columns. Thanks! -N
Re: [R] Searching for specific values in a matrix
I am able to return the first column, but anything else returns this:

    0 rows (or 0-length row.names)

Any idea?

On Tue, Jul 21, 2009 at 12:49 PM, Steve Lianoglou mailinglist.honey...@gmail.com wrote:
> On Jul 21, 2009, at 3:27 PM, Mehdi Khan wrote:
>> I understand your explanation about the test for even numbers. However I am still a bit confused as to how to go about finding a particular value. Here is an example data set
>>
>>     col #  attr1  attr2  attr 3        LON       LAT
>>     17209      D     NA      NA  -122.9409  38.27645
>>     17210     BC     NA      NA  -122.9581  38.36304
>>     17211      B     NA      NA  -123.6851  41.67121
>>     17212     BC     NA      NA  -123.0724  38.93073
>>     17213      C     NA      NA  -123.7240  41.84403
>>     17214     NA    464      NA  -122.9430  38.30988
>>     17215      C     NA      NA  -123.4442  40.65369
>>     17216     BC     NA      NA  -122.9389  38.31551
>>     17217      C     NA      NA  -123.0747  38.97998
>>     17218      C     NA      NA  -123.6580  41.59610
>>     17219      C     NA      NA  -123.4513  40.70992
>>     17220      C     NA      NA  -123.0901  39.06473
>>     17221     BC     NA      NA  -123.0653  38.94845
>>     17222     BC     NA      NA  -122.9464  38.36808
>>     17223     NA    464      NA  -123.0143  38.70205
>>     17224     NA     NA       5  -122.8609  37.94137
>>     17225     NA     NA       5  -122.8628  37.95057
>>     17226     NA     NA       7  -122.8646  37.95978
>
> For future reference, perhaps paste this in a way that's easy for us to paste into a running R session so we can use it, like so:
>
>     df <- data.frame(
>       coln=c(17209, 17210, 17211, 17212, 17213, 17214, 17215, 17216, 17217, 17218, 17219, 17220, 17221, 17222, 17223, 17224, 17225, 17226),
>       attr1=c("D","BC","B","BC","C",NA,"C","BC","C","C","C","C","BC","BC",NA,NA,NA,NA),
>       attr2=c(NA,NA,NA,NA,NA,464,NA,NA,NA,NA,NA,NA,NA,NA,464,NA,NA,NA),
>       attr3=c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,5,5,7),
>       LON=c(-122.9409,-122.9581,-123.6851,-123.0724,-123.7240,-122.9430,-123.4442,-122.9389,-123.0747,-123.6580,-123.4513,-123.0901,-123.0653,-122.9464,-123.0143,-122.8609,-122.8628,-122.8646),
>       LAT=c(38.27645,38.36304,41.67121,38.93073,41.84403,38.30988,40.65369,38.31551,38.97998,41.59610,40.70992,39.06473,38.94845,38.36808,38.70205,37.94137,37.95057,37.95978))
>
>> If I wanted to find the row with Lat = 37.95978, how would i do that?
>
> Using an indexing vector:
>
>     R> lats <- df$LAT == 37.95978
>     # or with the %~% from before:
>     # lats <- df$LAT %~% 37.95978
>     R> df[lats,]
>         coln attr1 attr2 attr3       LON      LAT
>     18 17226  <NA>    NA     7 -122.8646 37.95978
>
> Using the subset function:
>
>     R> subset(df, LAT == 37.95978)
>         coln attr1 attr2 attr3       LON      LAT
>     18 17226  <NA>    NA     7 -122.8646 37.95978
>
>> How would I find the rows with BC?
>
>     R> subset(df, attr1 == 'BC')
>         coln attr1 attr2 attr3       LON      LAT
>     2  17210    BC    NA    NA -122.9581 38.36304
>     4  17212    BC    NA    NA -123.0724 38.93073
>     8  17216    BC    NA    NA -122.9389 38.31551
>     13 17221    BC    NA    NA -123.0653 38.94845
>     14 17222    BC    NA    NA -122.9464 38.36808
>
> If you try with an indexing vector the NA's will trip you up:
>
>     R> df[df$attr1 == 'BC',]
>           coln attr1 attr2 attr3       LON      LAT
>     2    17210    BC    NA    NA -122.9581 38.36304
>     4    17212    BC    NA    NA -123.0724 38.93073
>     NA      NA  <NA>    NA    NA        NA       NA
>     8    17216    BC    NA    NA -122.9389 38.31551
>     13   17221    BC    NA    NA -123.0653 38.94845
>     14   17222    BC    NA    NA -122.9464 38.36808
>     NA.1    NA  <NA>    NA    NA        NA       NA
>     NA.2    NA  <NA>    NA    NA        NA       NA
>     NA.3    NA  <NA>    NA    NA        NA       NA
>     NA.4    NA  <NA>    NA    NA        NA       NA
>
> So you could do something like:
>
>     R> df[df$attr1 == 'BC' & !is.na(df$attr1),]
>         coln attr1 attr2 attr3       LON      LAT
>     2  17210    BC    NA    NA -122.9581 38.36304
>     4  17212    BC    NA    NA -123.0724 38.93073
>     8  17216    BC    NA    NA -122.9389 38.31551
>     13 17221    BC    NA    NA -123.0653 38.94845
>     14 17222    BC    NA    NA -122.9464 38.36808
>
> HTH, -steve
> -- Steve Lianoglou, Graduate Student: Physiology, Biophysics and Systems Biology, Weill Medical College of Cornell University. Contact Info: http://cbio.mskcc.org/~lianos/contact
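Two details in the quoted advice are easy to trip over: NA values in a logical index, and exact equality on doubles. The sketch below illustrates both on a cut-down, invented version of the data. The helper `approx_equal` and its tolerance are assumptions, standing in for the %~% operator mentioned above, whose definition is not shown in this part of the thread.

```r
# Hypothetical six-row excerpt of the example data
df <- data.frame(
  coln  = 17209:17214,
  attr1 = c("D", "BC", "B", "BC", "C", NA),
  LAT   = c(38.27645, 38.36304, 41.67121, 38.93073, 41.84403, 38.30988),
  stringsAsFactors = FALSE
)

# which() keeps only positions where the test is TRUE, so NA rows are dropped
bc_rows <- df[which(df$attr1 == "BC"), ]

# %in% never returns NA, which gives the same rows
bc_rows2 <- df[df$attr1 %in% "BC", ]

# Tolerance-based comparison for doubles (the tolerance is an arbitrary choice)
approx_equal <- function(x, target, tol = 1e-6) abs(x - target) < tol
lat_row <- df[approx_equal(df$LAT, 38.30988), ]
```

If a test such as `df$LON == -122.9409` returns zero rows, a stored value that differs from the printed one in a later decimal place is one possible cause, and a tolerance-based test is one way to check for that.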
Re: [R] Forumla format?
Hi,
On Jul 27, 2009, at 3:01 PM, Noah Silverman wrote:
> Hi, Quick question. I'm working on training an SVM. I have a dataframe with about 50 columns. I want to train on 46 of them. Is there a way to say "All except columns 22,23,25 and 31"?

Assume your dataframe is called my.data:

    my.data[,-c(22,23,25,31)]

returns the data.frame w/o columns 22, 23, 25 and 31.
-steve

> It would be nice to not have to do +c1 +c2 +c3 +c4, etc for all 48 columns.

Yes, it is nice, isn't it? :-)
-steve
-- Steve Lianoglou, Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University. Contact Info: http://cbio.mskcc.org/~lianos/contact
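Negative column indexing can be combined with the `.` shorthand in a formula, which stands for "every other column in data". A sketch under assumptions: the data frame is invented, and lm() stands in for the SVM fitting function so that the example needs only base R.

```r
# Hypothetical data: a response `label` plus four candidate predictors
set.seed(1)
my.data <- data.frame(label = rnorm(20), c1 = rnorm(20), c2 = rnorm(20),
                      c3 = rnorm(20), c4 = rnorm(20))

excluded <- c(3, 5)              # positions of c2 and c4 (must not include label)
train <- my.data[, -excluded]    # drop the unwanted columns

# `.` expands to all remaining columns except the response
fit <- lm(label ~ ., data = train)
names(coef(fit))
```

Formula interfaces that follow the usual model.frame conventions should expand `.` the same way, but that is worth confirming for the specific SVM function in use.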
Re: [R] skip plot/blank plot on purpose (multi-plot question)
Well, all of this can be done quite nicely with lattice graphics: ?xyplot (see, e.g., the skip argument).

> 1) Is there some generic way to call plot and have it plot, but it plots nothing so I don't see anything at all in position 12? This could be a blank plot function I call when I notice the data set is empty.

But if you do not wish to learn lattice, please at least read the docs on standard graphics: ?plot (the type argument) and ?plot.default (the axes argument).
-- Bert Gunter, Genentech, Inc.

> 2) Is there some generic way to specify the position number I want the next plot to use so that I'd not plot 12 but would specify 13?
> Thanks, Mark
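The two base-graphics hints (type and axes) can be combined into an empty placeholder plot that still consumes its slot in the layout. A sketch with invented data; the list `panels`, the reduced margins, and the null pdf device used in place of X11() are all assumptions made so that the example runs anywhere.

```r
pdf(NULL, width = 10, height = 10)  # null device standing in for X11()
layout(rbind(c(1,1,1,2,2,2), c(3,4,5,6,7,8), c(9,10,11,12,13,14)),
       heights = c(3, 1, 1))
par(mar = c(2, 2, 1, 1))            # small margins so every panel fits

# Hypothetical data: one entry per panel, with panel 12 empty
panels <- lapply(1:14, function(i) if (i == 12) numeric(0) else seq_len(i + 1))

drawn <- logical(14)
for (i in seq_along(panels)) {
  if (length(panels[[i]]) == 0) {
    # type = "n" draws no points, axes = FALSE draws no box or axes:
    # the panel is used up but stays blank
    plot(0, type = "n", axes = FALSE, xlab = "", ylab = "")
  } else {
    plot(panels[[i]], type = "l")
    drawn[i] <- TRUE
  }
}
invisible(dev.off())
```

Because the blank call advances the layout, the next data set lands in position 13 without any explicit positioning.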
[R] Working with tables with missing levels
Hello, I'm trying to write a function to calculate the relative entropy between two distributions. The data I have is in table format, for example:

    t1 <- prop.table(table(c(0,0,2,4,4)))
    t2 <- prop.table(table(c(0,2,2,2,3)))
    t1
      0   2   4
    0.4 0.2 0.4
    t2
      0   2   3
    0.2 0.6 0.2

The relative entropy is given by

    H[P||Q] = sum(p * log2(p/q))

with the conventions that 0*log2(0/q) = 0 and p*log2(p/0) = Inf. I'm not sure what the best way to achieve that is. Is there a way to test whether a table has a value for a given level, so that I can detect that, for example, t1 is missing levels 1 and 3 and t2 is missing levels 1 and 4 (is "level" the correct terminology here?)? Simply trying to access t1[[1]], for example, gives a "subscript out of bounds" error. Another option would be to expand the tables, so that, for example, t1 becomes

      0   1   2   3   4
    0.4 0.0 0.2 0.0 0.4

Is there a way to do that? Thanks, Andre
[R] Cross-validating two matrices
Hello, I am trying to help a colleague with an R problem (see below) for which I can only generate a very inelegant solution. Any advice would be welcome. Thanks, Brian

If you have two matrices, say a species by trait matrix (Matrix 1 below) and a plot by species matrix (Matrix 2 below), is there a straightforward way to prune one matrix so that the species list matches those in a second matrix? Ideally, I need to prune both matrices so that they include only the species found in both. In this case, the two matrices would therefore include only sp1, sp4, sp5. Thank you so much for any suggestions! E.g.:

    Matrix 1:
        Trait1
    sp1    1.0
    sp2    1.2
    sp4    3.1
    sp5    4.0
    sp7    4.5

    Matrix 2:
          sp1 sp3 sp4 sp5 sp6
    plot1   1   2   5   1   0
    plot2   3   0   1   2   5
    plot3   1   1   2   3   1
    plot4   0   1   2   1   0

-- Brian C. McCarthy, Ph.D., Professor of Forest Ecology, Dept. of Environmental Plant Biology, 317 Porter Hall, Ohio University, Athens, OH 45701-2979 USA. E: mccar...@ohio.edu W: http://www.plantbio.ohiou.edu/index.php/directory/faculty_page/brian_mccarthy/
Re: [R] Working with tables with missing levels
Hi Andre, Just about expanding the table: the way you could do this is by using factors, for example:

    t1 <- prop.table(table(factor(c(0,0,2,4,4))))
    t2 <- prop.table(table(factor(c(0,2,2,2,3))))

The rest is for more knowledgeable people than me to say...
-- Tal Galili
Re: [R] Searching for specific values in a matrix
On Jul 27, 2009, at 2:54 PM, Mehdi Khan wrote:
> i am able to return the first column, but anything else returns this:
>     0 rows (or 0-length row.names)
> any idea?

I'm not sure what you're doing. The result you're getting happens when no rows pass the logical test that you are using to index the rows of your data.frame. Can you show the code that you are using (based on the example data you gave) that is giving you the 0 rows result?
-steve
-- Steve Lianoglou, Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University. Contact Info: http://cbio.mskcc.org/~lianos/contact
Re: [R] Cross-validating two matrices
Hi,
On Jul 27, 2009, at 3:29 PM, Brian McCarthy wrote:
> If you have two matrices, say a species by trait matrix (Matrix 1 below) and a plot by species matrix (Matrix 2 below), is there a straightforward way to prune one matrix so that the species list matches those in a second matrix? Ideally, I need to prune both matrices so that they include only the species found in both.

    keep <- intersect(rownames(matrix1), colnames(matrix2))
    m1 <- matrix1[keep,]
    m2 <- matrix2[,keep]

-steve
-- Steve Lianoglou, Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University. Contact Info: http://cbio.mskcc.org/~lianos/contact
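Here is the intersect() approach run on the example matrices from the question, typed in by hand. One addition that is not in the reply: `drop = FALSE`, which keeps a one-column matrix such as Matrix 1 from collapsing to a plain vector when rows are selected.

```r
# The two example matrices from the question
matrix1 <- matrix(c(1.0, 1.2, 3.1, 4.0, 4.5), ncol = 1,
                  dimnames = list(c("sp1", "sp2", "sp4", "sp5", "sp7"), "Trait1"))
matrix2 <- matrix(c(1, 2, 5, 1, 0,
                    3, 0, 1, 2, 5,
                    1, 1, 2, 3, 1,
                    0, 1, 2, 1, 0),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(paste0("plot", 1:4),
                                  c("sp1", "sp3", "sp4", "sp5", "sp6")))

# Species present in both: rows of matrix1 and columns of matrix2
keep <- intersect(rownames(matrix1), colnames(matrix2))

m1 <- matrix1[keep, , drop = FALSE]   # still a 3 x 1 matrix
m2 <- matrix2[, keep, drop = FALSE]   # 4 x 3 matrix
```

Indexing by name also puts the species in the same order in both results, which matters if the matrices are later multiplied or compared element by element.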
Re: [R] Working with tables with missing levels
Try this:

    t1 <- prop.table(table(factor(c(0,0,2,4,4), levels = 0:4)))
    t2 <- prop.table(table(factor(c(0,2,2,2,3), levels = 0:4)))

-- Henrique Dallazuanna, Curitiba-Paraná-Brasil, 25° 25' 40" S 49° 16' 22" O
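With both tables expanded to the same levels, the relative entropy from the original question can be written in a few lines. This is a sketch under the stated conventions (0*log2(0/q) = 0 and p*log2(p/0) = Inf); the function name `rel_entropy` is made up here.

```r
lv <- 0:4
t1 <- prop.table(table(factor(c(0, 0, 2, 4, 4), levels = lv)))
t2 <- prop.table(table(factor(c(0, 2, 2, 2, 3), levels = lv)))

# H[P||Q] = sum(p * log2(p / q)) over levels shared by both tables
rel_entropy <- function(p, q) {
  p <- as.numeric(p)
  q <- as.numeric(q)
  # terms with p == 0 contribute 0 by convention;
  # p > 0 with q == 0 gives p * log2(Inf) = Inf on its own
  terms <- ifelse(p == 0, 0, p * log2(p / q))
  sum(terms)
}

rel_entropy(t1, t2)   # level 4 has p > 0 and q == 0
rel_entropy(t1, t1)   # a distribution compared with itself
```

The explicit `levels` argument is what makes the two tables line up position by position; without it each table would contain only the values it actually observed.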
Re: [R] Forumla format?
Hi, I'm not sure that would work for the formula format of an SVM function. The idea is normally

    svm(label ~ c1 + c2 + c3, data=mydata)

It doesn't work to say

    svm(label ~ -c(22,23,24), data=mydata)
Re: [R] skip plot/blank plot on purpose (multi-plot question)
On Mon, Jul 27, 2009 at 12:21 PM, Bert Gunter gunter.ber...@gene.com wrote:
> Well, all of this can be done quite nicely with lattice graphics: ?xyplot (See, e.g. the skip argument)
> But if you do not wish to learn lattice, please at least read the docs on standard graphics: ?plot (the type argument) ?plot.default (the axes argument)

Thank you. It was the ?plot.default axes argument that I was looking for. I knew type="n". Cheers, Mark
Re: [R] Forumla format?
Hi,

On Jul 27, 2009, at 3:47 PM, Noah Silverman wrote:
> Hi, I'm not sure that would work for the formula format of an SVM function. The idea is normally svm(label ~ c1 + c2 + c3, data=mydata). It doesn't work to say svm(label ~ -c(22,23,24), data=mydata)

You're quite right. Sorry, I misunderstood the question ... I'm actually not sure if/how you could do that, as I don't use the formula formulation too much. Is it possible to build up your formula as a string, and then convert it to a formula w/ as.formula? i.e.

    f <- as.formula(sprintf('label ~ %s', paste(colnames(my.data)[-c(22,23,24)], collapse=' + ')))

Then use `f` in your call to svm? Maybe?

-steve

> On 7/27/09 12:17 PM, Steve Lianoglou wrote:
>> Hi,
>>
>> On Jul 27, 2009, at 3:01 PM, Noah Silverman wrote:
>>> Hi, Quick question. I'm working on training an SVM. I have a dataframe with about 50 columns. I want to train on 46 of them. Is there a way to say "All except columns 22,23,25 and 31"?
>>
>> Assume your dataframe is called my.data:
>>
>>     my.data[,-c(22,23,25,31)]
>>
>> Returns the data.frame w/o columns 22,23,25 and 31. -steve
>>
>>> It would be nice to not have to do +c1 +c2 +c3 +c4, etc for all 48 columns.
>>
>> Yes, it is nice, isn't it? :-) -steve

-- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact
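The string-building idea can be sketched in base R as follows. The small my.data frame and the drop.cols positions are made up for illustration (the thread's real data has about 50 columns), and no SVM is fitted here. One assumption worth flagging: the response column is removed from the right-hand side with setdiff(), which the one-liner in the thread does not do.

```r
# Hypothetical stand-in for my.data: a label plus four predictors.
my.data <- data.frame(label = factor(c("a", "b", "a", "b")),
                      c1 = 1:4, c2 = 4:1, c3 = c(2, 2, 3, 3), c4 = c(0, 1, 0, 1))
drop.cols <- c(3, 4)   # column positions to leave out (22, 23, 25, 31 in the thread)

# Predictor names: everything except the dropped columns and the response itself.
keep <- setdiff(colnames(my.data)[-drop.cols], "label")

# Two equivalent ways to build the formula.
f  <- as.formula(sprintf("label ~ %s", paste(keep, collapse = " + ")))
f2 <- reformulate(keep, response = "label")
```

Either formula can then be passed as the first argument of a modelling function that takes a formula and a data argument. Many formula interfaces also accept label ~ . together with data = my.data[, -drop.cols], which avoids building a string at all.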
Re: [R] Working with tables with missing levels
On Mon, 2009-07-27 at 16:34 -0300, Henrique Dallazuanna wrote:
> Try this:
>
>     t1 <- prop.table(table(factor(c(0,0,2,4,4), levels = 0:4)))
>     t2 <- prop.table(table(factor(c(0,2,2,2,3), levels = 0:4)))

Is there a way to do this given an already existing table? The problem is that I actually build the distributions as I read data from files, something like

    distr <- NULL
    for (file in files) {
        x <- as.matrix(read.table(file))
        t <- c(distr, table(x))
        distr <- tapply(t, names(t), sum)
    }
    distr <- prop.table(distr)

So I only know the maximum level after the distributions are created.

Thanks, Andre
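One way to add the missing levels after the fact is to copy the existing frequencies into a zero-filled vector indexed by name. This is a sketch under the assumption that the levels are the integers from 0 up to the largest observed value; fill_levels is a hypothetical helper, not something from the thread.

```r
# Hypothetical helper: expand a named frequency vector to a full set of levels.
fill_levels <- function(distr, levels) {
  out <- setNames(numeric(length(levels)), levels)   # all zeros, named by level
  out[names(distr)] <- as.numeric(distr)             # copy the observed frequencies
  out
}

distr <- prop.table(table(c(0, 0, 2, 4, 4)))         # levels 1 and 3 are missing
all.levels <- 0:max(as.integer(names(distr)))
filled <- fill_levels(distr, all.levels)
```

Two distributions filled to the same set of levels have the same length and order, so they can be compared element by element.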
Re: [R] Searching for specific values in a matrix
On Jul 27, 2009, at 3:50 PM, Mehdi Khan wrote:
> the problem is, it works with the example data i gave. however, it does NOT work with the data set i have, which is 600,000 rows. the class is still a data frame.

So the problem must be in your data, or what you think is in your data. Somehow you're constructing a boolean query that returns false for every row. As long as you're not getting any memory errors, the size of your data doesn't change the mechanics of how this would work. I suspect you're not getting 0 rows for every possible query you can come up with, right? Look at the first 10 lines of your dataset and try to select some rows from your entire data.frame by using values you can see in the first 10 rows you've just looked at. I'm expecting this would work, in which case I'm not sure how much more help I can provide.

-steve

> On Mon, Jul 27, 2009 at 12:15 PM, Steve Lianoglou mailinglist.honey...@gmail.com wrote:
>> On Jul 27, 2009, at 2:54 PM, Mehdi Khan wrote:
>>> i am able to return the first column, but anything else returns this: 0 rows (or 0-length row.names) any idea?
>>
>> I'm not sure what you're doing. The result you're getting happens when no rows pass the logical test that you are using to index the rows of your data.frame. Can you show the code that you are using (based on the example data you gave) that is giving you the 0 rows result? -steve
>>
>>> On Tue, Jul 21, 2009 at 12:49 PM, Steve Lianoglou mailinglist.honey...@gmail.com wrote:
>>>> On Jul 21, 2009, at 3:27 PM, Mehdi Khan wrote:
>>>>> I understand your explanation about the test for even numbers. However I am still a bit confused as to how to go about finding a particular value.
Re: [R] probability on a barplot
Kindly post the solution to the problem, so that it will benefit others. An example would be great.

Cheers../Murli

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Erin Hodgess Sent: Monday, July 27, 2009 1:33 PM To: R help Subject: [R] probability on a barplot

> Please ignore the previous email. I figured it out.
>
> -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com
Re: [R] computing the radius of an arc
Alex Brenning, the developer of the RSAGA package, told me that (and I quote) "the RSAGA package (which uses functions from the free geographical information system [GIS] SAGA GIS) has a curvature function that is designed to calculate the curvature of surfaces, in particular raster (i.e. gridded) digital elevation models. I am not aware of a function in SAGA GIS or other GIS that would calculate curvatures along a line, especially not in 3D."

I shall try to develop it, and if I am successful I shall make it available.

Cheers../Murli

-Original Message- From: Greg Snow [mailto:greg.s...@imail.org] Sent: Friday, July 24, 2009 12:41 PM To: Nair, Murlidharan T; Hans W Borchers; r-h...@stat.math.ethz.ch; Bert Gunter; 'Gabor Grothendieck' Subject: RE: [R] computing the radius of an arc

> There is a function rsaga.local.morphometry in the RSAGA package that says it computes curvature (among other things). It looks like that function was designed for a different type of data than yours is, but it may work, or if not, then you may be able to adapt some of the code to work with your data.
>
> Another idea: since the curvature is a function of the radius of the circle similar to the curve (or possibly sphere), you could find the least squares estimate of the circle/sphere that best fits the data (or a weighted subset along the lines of loess) then take the curvature from the radius of that circle/sphere.
>
> Hope this helps,
>
> -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111
>
> -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Nair, Murlidharan T Sent: Friday, July 24, 2009 8:53 AM To: Hans W Borchers; r-h...@stat.math.ethz.ch; Bert Gunter; 'Gabor Grothendieck' Subject: Re: [R] computing the radius of an arc
>
>> Thanks for all the suggestions. Indeed I am interested in computing the curvature (http://en.wikipedia.org/wiki/Curvature) of a curve that is given as a discrete set of points. I have the curve in 3D but I can certainly write it in 2D as well. Code is given at the bottom. I will try the suggestions you have given, but if anyone has done this before, I would appreciate your help.
>>
>> Cheers../Murli [snip]
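As a starting point for curvature along a discrete curve, one simple option is the circle through each run of three consecutive points (often called Menger curvature): with side lengths a, b, c and triangle area A, the curvature is 4A/(abc), the reciprocal of the circumscribed radius. This is a different and cruder technique than the least-squares circle fit suggested above, it is sensitive to noise, and menger_curvature is a hypothetical name. It works for points in 2D or 3D.

```r
# Hypothetical helper: curvature at each interior point of a polyline.
# P is an n x d matrix (d = 2 or 3) of points ordered along the curve, n >= 3.
menger_curvature <- function(P) {
  n <- nrow(P)
  p1 <- P[1:(n - 2), , drop = FALSE]
  p2 <- P[2:(n - 1), , drop = FALSE]
  p3 <- P[3:n, , drop = FALSE]
  d12 <- sqrt(rowSums((p2 - p1)^2))
  d23 <- sqrt(rowSums((p3 - p2)^2))
  d13 <- sqrt(rowSums((p3 - p1)^2))
  s <- (d12 + d23 + d13) / 2
  # Heron's formula for the triangle area; pmax guards against tiny negatives
  area <- sqrt(pmax(s * (s - d12) * (s - d23) * (s - d13), 0))
  4 * area / (d12 * d23 * d13)
}

# Check on a half circle of radius 2 embedded in 3D: curvature should be 1/2.
theta <- seq(0, pi, length.out = 50)
half.circle <- cbind(2 * cos(theta), 2 * sin(theta), 0)
kappa <- menger_curvature(half.circle)
```

The radius of the arc at each interior point is then 1 / kappa. For noisy data, smoothing the points first or fitting a circle over a wider window is likely to behave better.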
Re: [R] create dataset permanently in package (i.e. default or our own package)
Albert EINstEIN wrote:
> Hi, actually while opening R console and R commander we see some packages like car and datasets. in this packages we have default datasets are available. example: women and prestige like that. now i created a sales dataset importing from excel, xml or text file. now i want to store that dataset permanently in any one of the package like i mentioned above (car or datasets). now i closed my R session. after some time i opened R console and R commander. Now I will not create again sales dataset. While clicking any one of package that sales dataset should be found. if possible please give me the code it will be very helpful for us. Thanks in advance.

The steps you need to follow are:

1. Open a new R session.
2. Create your 'sales' dataset and any other datasets you want to preserve. Make sure these are the only objects in your workspace, i.e. they are the only names that come up when you use ls().
3. Create a new package using package.skeleton('myPackage')

This will create a new folder called myPackage that contains a folder called data. Inside data will be a .rda file for your sales dataset and any other datasets you had in your environment at the time package.skeleton was run.

Now you need to install your package. This can be done by:

    system('R CMD INSTALL myPackage')

Or from the command line:

    R CMD INSTALL myPackage

Note that if you are using Windows, you will need to install Duncan Murdoch's Rtools package located at: http://www.murdoch-sutherland.com/Rtools/ If the installer asks anything about modifying your PATH, allow it to do so.

Once the package has been installed, you can load your dataset using:

    library(myPackage)
    data(sales)

If you want to add additional data sets, save them to individual .rda files using:

    save('myFirstNewDataset', file='myFirstNewDataset.rda')
    save('mySecondNewDataset', file='mySecondNewDataset.rda')

Then move the .rda files to the data folder inside the myPackage folder and re-run R CMD INSTALL.

Hope that helps!

-Charlie

Charlie Sharpsteen Undergraduate Environmental Resources Engineering Humboldt State University
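The .rda files in the steps above are ordinary save() output, so the round trip can be tried without building a package. A minimal sketch with a made-up sales data frame, written to a temporary directory; it does not run package.skeleton or R CMD INSTALL.

```r
# Hypothetical data set standing in for the imported sales data.
sales <- data.frame(region = c("north", "south"), units = c(120L, 95L))

# Save it to an .rda file, as would sit in a package's data folder.
rda.path <- file.path(tempdir(), "sales.rda")
save(sales, file = rda.path)

# Simulate a fresh session: remove the object, then restore it from the file.
rm(sales)
loaded <- load(rda.path)   # returns the names of the restored objects
```

After load(), the object sales exists again under its original name. data(sales) in an installed package does essentially the same thing for files in the package's data folder.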
Re: [R] how to correlate nominal variables?
Hi Timo,

> I need functions to calculate Yule's Y or Cramér's Index... Are such functions existing?

Also look at assocstats() in package vcd.

Regards, Mark.

Timo Stolz wrote:
> Dear R-Users, I need functions to calculate Yule's Y or Cramér's Index, in order to correlate variables that are nominally scaled? Am I wrong? Are such functions existing? Sincerely, Timo
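If installing vcd is not an option, Cramér's V can also be computed from the chi-squared statistic in base R as V = sqrt(chi2 / (n * (min(rows, cols) - 1))). A minimal sketch; cramers_v is a hypothetical helper name, and it uses the uncorrected statistic (correct = FALSE).

```r
# Hypothetical helper: Cramer's V for two nominal variables.
cramers_v <- function(x, y) {
  tab <- table(x, y)
  # Uncorrected Pearson chi-squared; warnings about small expected counts are muted
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  n <- sum(tab)
  as.numeric(sqrt(chi2 / (n * (min(dim(tab)) - 1))))
}

# Perfect association: y is a relabelling of x.
x <- rep(c("a", "b", "c"), each = 10)
v.perfect <- cramers_v(x, x)

# No association: every cell of the 2 x 2 table has the same count.
v.none <- cramers_v(rep(c("a", "b"), each = 10), rep(c("u", "v"), times = 10))
```

V ranges from 0 (no association) to 1 (perfect association) and, unlike a correlation coefficient, carries no sign.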
[R] Reordering the columns of my dataframe
Hi R-helpers,

I have written this line of code:

    data <- cbind(data[,1],data[,2:6],data[,18],data[,7:17])

to reorder the columns of my dataframe, but I'm losing the column names of my 1st and 18th columns (they are now named data[,1] and data[,18] respectively). Can I use cbind to do this (without losing my column names) or is there another way?

Many thanks, Mark Na
Re: [R] Reordering the columns of my dataframe
On 28/07/2009, at 9:45 AM, Mark Na wrote:
> Hi R-helpers, I have written this line of code:
>
>     data <- cbind(data[,1],data[,2:6],data[,18],data[,7:17])
>
> to reorder the columns of my dataframe, but I'm losing the column names of my 1st and 18th columns (they are now named data[,1] and data[,18] respectively). Can I use cbind to do this (without losing my column names) or is there another way?

Just do

    data <- data[,c(1:6,18,7:17)]

cheers, Rolf Turner
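A small sketch of why the names go missing and of both fixes, using a made-up five-column data frame. Selecting a single column with dat[, 1] returns a bare vector, so cbind has no column name to carry over; indexing with a vector of positions, or adding drop = FALSE, keeps the data frame structure.

```r
# Hypothetical data frame with five named columns.
dat <- data.frame(a = 1:2, b = 3:4, c = 5:6, d = 7:8, e = 9:10)

# Reorder by indexing with the desired column positions.
by.index <- dat[, c(1, 5, 2:4)]

# The cbind route also works if single columns are kept as data frames.
by.cbind <- cbind(dat[, 1, drop = FALSE], dat[, 5, drop = FALSE], dat[, 2:4])
```

Indexing by name, for example dat[, c("a", "e", "b", "c", "d")], gives the same result and survives later changes to the column positions.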
Re: [R] Searching for specific values in a matrix
> no luck, it's okay, i will figure it out! i might isolate and recombine all the columns, maybe that will work. thanks for the help!

No, wait .. no luck in being able to select out rows from your data.frame using values you see somewhere in the top 10 rows? Can you just paste in some key lines from your session so we can see? For instance, let's assume your data is in my.data, I'd like to see the results for:

    # Replace the column values (1:5) with other columns
    # you want to use for selection
    R> my.data[1:10,1:5]

    # Now show me your query and its result that returns
    # a 0-row data.frame, for example using a value
    # that appears in that column from the previous query
    R> my.data[my.data[,1] == 'something',]
    0 rows (or 0-length row.names)

There should be a simple answer to what's going wrong here.

-steve

-- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact
[R] Split rownames into factors
Hi Guys,

I was wondering how you would go about solving the following problem: I have a list where the grouping information is in the row names.

    Rowname  [,1]
    X1Jan08   324
    X1Jun08    65
    X1Dec08   543
    X2Jan08    23
    X2Jun08    54
    X2Dec08  8765
    X3Jan08   213
    X3Jun08    43
    X3Dec08    65

How can I create the following dataframe:

         Value Date   Group
    [1,]   324 Jan 08 X1
    [2,]    65 Jun 08 X1
    [3,]   543 Dec 08 X1
    etc.

Thanks for your help!

James
Re: [R] Searching for specific values in a matrix
Ahh ..

On Jul 27, 2009, at 6:01 PM, Mehdi Khan wrote:
> Even when choosing a value from the first few rows, it doesn't work. okay here it goes:
>
>     rearranged[1:10, 1:5]
>                x        y band1 VSCAT.001 soiltype
>     1  -124.3949 40.42468    NA        NA       CD
>     2  -124.3463 40.27358    NA        NA       CD
>     3  -124.3357 40.25226    NA        NA       CD
>     4  -124.3663 40.40241    NA        NA       CD
>     5  -124.3674 40.49810    NA        NA       CD
>     6  -124.3083 40.24744    NA       464       NA
>     7  -124.3017 40.31295    NA        NA        D
>     8  -124.3375 40.47557    NA       464       NA
>     9  -124.2511 40.11697     1        NA       NA
>     10 -124.2532 40.12640     1        NA       NA
>
>     query <- rearranged$y == 40.42468
>     rearranged[query,]
>     [1] x y band1 VSCAT.001 soiltype
>     0 rows (or 0-length row.names)

This isn't working because the number you see printed for y (40.42468) isn't precisely the number that is stored. As I mentioned before, you should use an almost.equals type of search for this scenario. My %~% function isn't working in your session because that is a function I've defined myself. You can of course use it, you just have to define it in your workspace. Paste these lines into your workspace (or save them to a file and source that file into your workspace).

    ## === almost.equal functions
    almost.equal <- function(x, y, tolerance=.Machine$double.eps^0.5) {
      abs(x - y) < tolerance
    }
    "%~%" <- function(x, y) almost.equal(x, y)
    ## === end paste ==

Now you can use %~% once that's in. Let's use the almost.equal function now because I don't know if the default tolerance here is too strict (I suspect showing the value for rearranged$y[1] will show you more significant digits than you're seeing in the table(?))

    query <- almost.equal(rearranged$y, 40.42468, tolerance=0.0001)
    rearranged[query,]

> This will get you something.
>
>     query <- rearranged$VSCAT.001 == 464
>
> except it's a huge table (I guess I have to get rid of all rows with NA).

Yes, I believe I mentioned earlier that you have to axe the NA matches manually:

    query <- rearranged$VSCAT.001 == 464 & !is.na(rearranged$VSCAT.001)
    rearranged[query,]

Will get you what you want.

> I tried using the %~% but R doesn't recognize it. So maybe it has to do with the rounding errors?

Rounding errors won't happen with integer comparisons (and it looks like the VSCAT.001 column holds integers, no?).

-steve

-- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact
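To see the effect described above in isolation, here is a small sketch with made-up coordinates that print as 40.42468 at 7 significant digits but are stored with more decimals. The y values are hypothetical; the helper mirrors the almost.equal idea from this message.

```r
# Tolerance-based comparison, element by element.
almost.equal <- function(x, y, tolerance = .Machine$double.eps^0.5) {
  abs(x - y) < tolerance
}
"%~%" <- function(x, y) almost.equal(x, y)

# Hypothetical latitudes; the first one prints as 40.42468 with default digits.
y <- c(40.424683, 40.273581, 40.252264)

exact <- y == 40.42468                                  # compares the stored values
near  <- almost.equal(y, 40.42468, tolerance = 1e-4)    # compares within a tolerance
```

Here exact is FALSE for every element while near picks out the first one. Printing with print(y, digits = 15) is a quick way to check what is actually stored before choosing a tolerance.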
Re: [R] Searching for specific values in a matrix
Nothing wrong with rolling your own, but see ?all.equal for R's built-in almost.equal version.

Bert Gunter Genentech Nonclinical Biostatistics
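A short sketch of how all.equal behaves, which matters when using it for row selection: it compares two whole objects and returns either TRUE or a character description of the difference, so it is normally wrapped in isTRUE(), and it does not return one logical per element.

```r
x <- 0.1 + 0.2

exact.match  <- (x == 0.3)                   # exact comparison of doubles
approx.match <- isTRUE(all.equal(x, 0.3))    # comparison within a small tolerance

# On vectors, all.equal gives a single verdict for the whole object.
whole.vector <- isTRUE(all.equal(c(1, 2), c(1, 2.5)))

# For an element-wise test (e.g. to index rows), compare differences directly.
v <- c(0.3, x, 0.30001)
elementwise <- abs(v - 0.3) < 1e-8
```

So all.equal suits a yes/no check on two values or objects, while a vectorised abs(x - y) < tolerance test is what produces a logical index for a data.frame.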
[R] Splitting matrix into several small matrices
Dear R users...

I need to split this matrix (or dataframe), for example,

    z <- matrix(c(13,1,1,1,1,12,0,0,0,0,8,1,0,1,1,8,0,1,0,0, 10,1,1,1,1,3,0,1,0,0,3,1,0,1,1,6,1,1,1,1),8,5,byrow = T)
    z
         [,1] [,2] [,3] [,4] [,5]
    [1,]   13    1    1    1    1
    [2,]   12    0    0    0    0
    [3,]    8    1    0    1    1
    [4,]    9    0    1    0    0
    [5,]   10    1    1    1    1
    [6,]    3    0    1    0    0
    [7,]    3    1    0    1    1
    [8,]    6    1    1    1    1

(actually, the z matrix is big, about 1000*15) into 4 matrices in this way:

    #- 1st matrix --
    13 1 1 1 1
    10 1 1 1 1
     6 1 1 1 1
    #- 2nd matrix --
    12 0 0 0 0
    #- 3rd matrix --
     8 1 0 1 1
     3 1 0 1 1
    #- 4th matrix --
     9 0 1 0 0
     3 0 1 0 0

Any comments will be greatly appreciated.

Kathryn Lord
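One possible approach, sketched under the assumption that rows belong together when columns 2 to 5 are identical: build a key string from those columns and split the row indices by it. The values are copied from the matrix() call in the post; key and groups are names chosen for illustration.

```r
z <- matrix(c(13,1,1,1,1, 12,0,0,0,0, 8,1,0,1,1, 8,0,1,0,0,
              10,1,1,1,1, 3,0,1,0,0, 3,1,0,1,1, 6,1,1,1,1),
            nrow = 8, ncol = 5, byrow = TRUE)

# One key per row, built from every column except the first, e.g. "1011".
key <- apply(z[, -1, drop = FALSE], 1, paste, collapse = "")

# Split the row numbers by key, then pull out the matching sub-matrices.
groups <- lapply(split(seq_len(nrow(z)), key),
                 function(i) z[i, , drop = FALSE])
```

groups is a list with one matrix per distinct pattern, named by the key, so groups[["1111"]] holds the rows whose last four columns are all 1. The same code should work unchanged for a 1000 x 15 matrix; if some values have more than one digit, a separator such as collapse = "_" keeps the keys unambiguous.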
[R] Draw plot.table axis on right hand side
With an ordinary plot, to customise the axis it is possible to suppress drawing the axis and then call Axis. I have been trying to change the location of the y-axis on a plot.table plot to the right hand side, but cannot even work out how to suppress drawing the labels. Here is a toy example of the sort of plot I am working with. Any suggestions as to how to have the axis on the right hand side, not the left hand side, would be appreciated.

    set.seed(2)
    data <- data.frame(x=floor(runif(80)*5)+1, y=floor(runif(80)*5)+1)
    data$a <- c("Cow", "Dog", "Fish", "Mouse", "Frog")[data$x]
    data$b <- c("Banana", "Apple", "Pear", "Orange", "Melon")[data$y]
    plot(table(data$a, data$b), col=rainbow(5), las=1, main="")

Regards, Sean.

-- Sean Carmody The Stubborn Mule http://www.stubbornmule.net http://twitter.com/seancarmody
[R] Double Truncation Fit??
Hey guys,

Do you all know of a function that provides fitting for double-sided truncation? truncreg accounts for one-sided truncation, but not two, or at least I don't know how to make it do so. Our outlier values are -115 on the left side and -55 on the right.

Help appreciated, Vivek
Re: [R] Split rownames into factors
Hi,

I'm not an R expert, but I thought I'd give your question a shot anyway. First, it looks like you're starting with a matrix, rather than a list. Let's hope I guessed that right:

    m = matrix(c(324, 65, 543, 23, 54, 8765, 213, 43, 65))
    rownames(m) = c('X1Jan08', 'X1Jun08', 'X1Dec08', 'X2Jan08', 'X2Jun08', 'X2Dec08', 'X3Jan08', 'X3Jun08', 'X3Dec08')
    m
            [,1]
    X1Jan08  324
    X1Jun08   65
    X1Dec08  543
    X2Jan08   23
    X2Jun08   54
    X2Dec08 8765
    X3Jan08  213
    X3Jun08   43
    X3Dec08   65

You can pull the individual values out of the compound row names using regular expressions like so:

    gsub('([A-Z]\\d+)([A-Za-z]+)(\\d+)', '\\2 \\3', rownames(m), perl=T)

With that, we can make a data.frame:

    df = data.frame(Value=m[,1],
                    Date=gsub('([A-Z]\\d+)([A-Za-z]+)(\\d+)', '\\2 \\3', rownames(m), perl=T),
                    Group=gsub('([A-Z]\\d+)([A-Za-z]+)(\\d+)', '\\1', rownames(m), perl=T))

The old compound row names hold over from the matrix, but we can cure that easily enough:

    rownames(df) = NULL
    df
      Value   Date Group
    1   324 Jan 08    X1
    2    65 Jun 08    X1
    3   543 Dec 08    X1
    4    23 Jan 08    X2
    5    54 Jun 08    X2
    6  8765 Dec 08    X2
    7   213 Jan 08    X3
    8    43 Jun 08    X3
    9    65 Dec 08    X3

Both Date and Group will be coerced to factors, which is probably what you want with Group and maybe not with Date.

If I'm wrong and you really have a list, it's not that different. First, get a vector of values:

    data.list = list(X1Jan08=324, X1Jun08=65, X1Dec08=543, X2Jan08=23, X2Jun08=54, X2Dec08=8765, X3Jan08=213, X3Jun08=43, X3Dec08=65)
    values = as.vector(data.list, mode="integer")

The rest is very similar to what's above. I hope this helps,

-chris

On Mon, Jul 27, 2009 at 3:10 PM, jimdare jamesdar...@gmail.com wrote:
> Hi Guys, I was wondering how you would go about solving the following problem: I have a list where the grouping information is in the row names.
>
>     Rowname  [,1]
>     X1Jan08   324
>     X1Jun08    65
>     X1Dec08   543
>     X2Jan08    23
>     X2Jun08    54
>     X2Dec08  8765
>     X3Jan08   213
>     X3Jun08    43
>     X3Dec08    65
>
> How can I create the following dataframe:
>
>          Value Date   Group
>     [1,]   324 Jan 08 X1
>     [2,]    65 Jun 08 X1
>     [3,]   543 Dec 08 X1
>     etc.
>
> Thanks for your help! James
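A variant of the same idea that makes the column types explicit, since whether character columns become factors depends on the R version (the default for stringsAsFactors changed to FALSE in R 4.0.0). This sketch uses only base regular expressions; the pattern and the object names pat, rn, and out are illustrative choices.

```r
m <- matrix(c(324, 65, 543, 23, 54, 8765, 213, 43, 65),
            dimnames = list(c("X1Jan08", "X1Jun08", "X1Dec08",
                              "X2Jan08", "X2Jun08", "X2Dec08",
                              "X3Jan08", "X3Jun08", "X3Dec08"), NULL))

rn  <- rownames(m)
pat <- "^([A-Z][0-9]+)([A-Za-z]+)([0-9]+)$"   # group, month, two-digit year

out <- data.frame(Value = unname(m[, 1]),
                  Date  = sub(pat, "\\2 \\3", rn),        # kept as character
                  Group = factor(sub(pat, "\\1", rn)),    # explicitly a factor
                  stringsAsFactors = FALSE)
```

Date stays a character column here; turning it into a real Date (for example with as.Date on a string such as "01 Jan 08" and format "%d %b %y") depends on the locale's month abbreviations, so that step is left out.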
[R] frequent sequences
Hello,

I'm having a few issues mining frequent sequences. I've read the documentation and played around with arules and arulesSequences with little success. For example, if I have a vector,

    a = t(t(c(1,2,3,0,1,2,3,5,6,7)));

I'd like to be able to mine the association rules {1,2} --> {3}, {2} --> {3}, {1} --> {2}, etc. Can anyone point me in the right direction here? From what I've read, arules doesn't seem to take a single vector and extract sequences from the vector to mine from within itself. Any help would be appreciated!

Thanks, John
Re: [R] Non-Linear Regression with two Predictors
Hi and thank you for your reply,

in my new regression formula the parameter delta is inserted:

    fit <- nls(dataset$V2 ~ ((alpha / (1 + exp(beta - gamma * dataset$V1))) + (dataset$V6 * delta)), data=dataset, start=startparam)

The idea is that dataset$V6 is a dummy variable that represents German reunification. I expect the delta in the regression to be about 1,000,000 because of the reunification. The logistic function thus has a jump at this point. But I would like to get the exact parameter value for delta (the jump) as well as the parameters of the logistic growth function (alpha to gamma). The partial derivative with respect to delta would be like a step function: it is 0 until 1990 and 1 thereafter. Any idea? Thank you!

Regards

Moshe Olshansky wrote:
> Hi, I believe that since delta does not appear in the function you are optimizing, its partial derivative with respect to delta is always 0 and so the gradient is singular. Why do you need delta at all?
>
> --- On Mon, 27/7/09, Berlinerfee berliner...@yahoo.de wrote:
>> From: Berlinerfee berliner...@yahoo.de Subject: [R] Non-Linear Regression with two Predictors To: r-h...@stat.math.ethz.ch Received: Monday, 27 July, 2009, 2:52 AM
>>
>> Hello there, I am using nls for the first time, for a non-linear regression with a logistic growth function:
>>
>>     startparam <- c(alpha=3e+07,beta=4000,gamma=2)
>>     fit <- nls(dataset$V2 ~ ((alpha / (1 + exp(beta - gamma * dataset$V1)))), data=dataset, start=startparam)
>>
>> Everything works fine and I get good results. Now I would like to improve the results using my dummy variable (dataset$V6), which is 0 for the first half of the series and 1 afterwards. This is my new nls:
>>
>>     startparam <- c(alpha=3e+07,beta=4000,gamma=2,delta=100)
>>     fit <- nls(dataset$V2 ~ ((alpha / (1 + exp(beta - gamma * dataset$V1))) + (dataset$V6 * dataset$V1 * delta)), data=dataset, start=startparam)
>>
>> I get a singular gradient matrix. Can anyone give me the right nls function for this problem?
>>
>> Regards
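A minimal sketch of a logistic growth curve with a level shift, fitted with nls on simulated data. Everything here is an assumption made for illustration: the data are synthetic, time is a small centred index rather than calendar years (which keeps exp(beta - gamma * t) well scaled), and the starting values are chosen close to the simulated truth. It is not a diagnosis of the singular gradient in the original data.

```r
set.seed(1)
t    <- 0:39
step <- as.numeric(t >= 25)                    # dummy: 0 before the break, 1 after
y    <- 100 / (1 + exp(4 - 0.25 * t)) + 15 * step + rnorm(length(t), sd = 0.3)
dat  <- data.frame(t = t, step = step, y = y)  # hypothetical stand-in for dataset

# Columns are referenced by name through the data argument.
fit <- nls(y ~ alpha / (1 + exp(beta - gamma * t)) + delta * step,
           data  = dat,
           start = list(alpha = 95, beta = 3.8, gamma = 0.24, delta = 12))

delta.hat <- coef(fit)[["delta"]]              # estimated size of the jump
```

With real data, rescaling the time variable (for example years since the first observation) and taking delta's starting value from the visible size of the jump are two things that may help nls converge.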
Re: [R] Searching for specific values in a matrix
the problem is, it works with the example data i gave. however, it does NOT work with the data set i have, which is 600,000 rows. the class is still a data frame.

On Mon, Jul 27, 2009 at 12:15 PM, Steve Lianoglou mailinglist.honey...@gmail.com wrote:

On Jul 27, 2009, at 2:54 PM, Mehdi Khan wrote:

i am able to return the first column, but anything else returns this:

  0 rows (or 0-length row.names)

any idea?

I'm not sure what you're doing. The result you're getting happens when no rows pass the logical test that you are using to index the rows of your data.frame. Can you show the code that you are using (based on the example data you gave) that is giving you the 0 rows result?

-steve

On Tue, Jul 21, 2009 at 12:49 PM, Steve Lianoglou mailinglist.honey...@gmail.com wrote:

On Jul 21, 2009, at 3:27 PM, Mehdi Khan wrote:

I understand your explanation about the test for even numbers. However I am still a bit confused as to how to go about finding a particular value. Here is an example data set

  col #  attr1  attr2  attr3  LON        LAT
  17209  D      NA     NA     -122.9409  38.27645
  17210  BC     NA     NA     -122.9581  38.36304
  17211  B      NA     NA     -123.6851  41.67121
  17212  BC     NA     NA     -123.0724  38.93073
  17213  C      NA     NA     -123.7240  41.84403
  17214  NA     464    NA     -122.9430  38.30988
  17215  C      NA     NA     -123.4442  40.65369
  17216  BC     NA     NA     -122.9389  38.31551
  17217  C      NA     NA     -123.0747  38.97998
  17218  C      NA     NA     -123.6580  41.59610
  17219  C      NA     NA     -123.4513  40.70992
  17220  C      NA     NA     -123.0901  39.06473
  17221  BC     NA     NA     -123.0653  38.94845
  17222  BC     NA     NA     -122.9464  38.36808
  17223  NA     464    NA     -123.0143  38.70205
  17224  NA     NA     5      -122.8609  37.94137
  17225  NA     NA     5      -122.8628  37.95057
  17226  NA     NA     7      -122.8646  37.95978

For future reference, perhaps paste this in a way that's easy for us to paste into a running R session so we can use it, like so:

  df <- data.frame(
    coln=c(17209, 17210, 17211, 17212, 17213, 17214, 17215, 17216, 17217,
           17218, 17219, 17220, 17221, 17222, 17223, 17224, 17225, 17226),
    attr1=c("D","BC","B","BC","C",NA,"C","BC","C","C","C","C","BC","BC",NA,NA,NA,NA),
    attr2=c(NA,NA,NA,NA,NA,464,NA,NA,NA,NA,NA,NA,NA,NA,464,NA,NA,NA),
    attr3=c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,5,5,7),
    LON=c(-122.9409,-122.9581,-123.6851,-123.0724,-123.7240,-122.9430,
          -123.4442,-122.9389,-123.0747,-123.6580,-123.4513,-123.0901,
          -123.0653,-122.9464,-123.0143,-122.8609,-122.8628,-122.8646),
    LAT=c(38.27645,38.36304,41.67121,38.93073,41.84403,38.30988,40.65369,
          38.31551,38.97998,41.59610,40.70992,39.06473,38.94845,38.36808,
          38.70205,37.94137,37.95057,37.95978))

If I wanted to find the row with Lat = 37.95978, how would i do that?

Using an indexing vector:

  R> lats <- df$LAT == 37.95978
  # or with the %~% from before:
  # lats <- df$LAT %~% 37.95978
  R> df[lats,]
      coln attr1 attr2 attr3       LON      LAT
  18 17226    NA    NA     7 -122.8646 37.95978

Using the subset function:

  R> subset(df, LAT == 37.95978)
      coln attr1 attr2 attr3       LON      LAT
  18 17226    NA    NA     7 -122.8646 37.95978

How would I find the rows with BC?

  R> subset(df, attr1 == 'BC')
      coln attr1 attr2 attr3       LON      LAT
  2  17210    BC    NA    NA -122.9581 38.36304
  4  17212    BC    NA    NA -123.0724 38.93073
  8  17216    BC    NA    NA -122.9389 38.31551
  13 17221    BC    NA    NA -123.0653 38.94845
  14 17222    BC    NA    NA -122.9464 38.36808

If you try with an indexing vector the NA's will trip you up:

  R> df[df$attr1 == 'BC',]
        coln attr1 attr2 attr3       LON      LAT
  2    17210    BC    NA    NA -122.9581 38.36304
  4    17212    BC    NA    NA -123.0724 38.93073
  NA      NA    NA    NA    NA        NA       NA
  8    17216    BC    NA    NA -122.9389 38.31551
  13   17221    BC    NA    NA -123.0653 38.94845
  14   17222    BC    NA    NA -122.9464 38.36808
  NA.1    NA    NA    NA    NA        NA       NA
  NA.2    NA    NA    NA    NA        NA       NA
  NA.3    NA    NA    NA    NA        NA       NA
  NA.4    NA    NA    NA    NA        NA       NA

So you could do something like:

  R> df[df$attr1 == 'BC' & !is.na(df$attr1),]
      coln attr1 attr2 attr3       LON      LAT
  2  17210    BC    NA    NA -122.9581 38.36304
  4  17212    BC    NA    NA -123.0724 38.93073
  8  17216    BC    NA    NA -122.9389 38.31551
  13 17221    BC    NA    NA -123.0653 38.94845
  14 17222    BC    NA    NA -122.9464 38.36808

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

--
Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center
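The NA behaviour described in this message can be reproduced on a small data frame. This sketch contrasts plain logical indexing, which returns one all-NA row for each NA in the test, with two NA-safe alternatives: `which()`, which keeps only the positions where the test is TRUE, and `%in%`, which never returns NA. The data frame `toy` and its columns are made up for illustration and are not part of the thread.

```r
# Toy data (hypothetical): two matches, one non-match, two missing values.
toy <- data.frame(
  id    = 1:5,
  attr1 = c("BC", NA, "C", "BC", NA),
  stringsAsFactors = FALSE
)

# Plain logical indexing keeps one all-NA row per NA in the comparison.
naive <- toy[toy$attr1 == "BC", ]

# which() keeps only the positions where the test is TRUE.
by_which <- toy[which(toy$attr1 == "BC"), ]

# %in% returns FALSE (never NA) for missing values.
by_in <- toy[toy$attr1 %in% "BC", ]

print(nrow(naive))   # 4: two matches plus two NA rows
print(by_which$id)   # 1 4
print(by_in$id)      # 1 4
```

Both alternatives give the same rows as combining the test with `!is.na()`, and `subset()` drops the NA cases in the same way.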
Re: [R] Searching for specific values in a matrix
no luck, it's okay, i will figure it out! i might isolate and recombine all the columns, maybe that will work. thanks for the help!

On Mon, Jul 27, 2009 at 1:00 PM, Steve Lianoglou mailinglist.honey...@gmail.com wrote:

On Jul 27, 2009, at 3:50 PM, Mehdi Khan wrote:

the problem is, it works with the example data i gave. however, it does NOT work with the data set i have, which is 600,000 rows. the class is still a data frame.

So the problem must be in your data, or what you think is in your data. Somehow you're constructing a boolean query that returns false for every row. As long as you're not getting any memory errors, the size of your data doesn't change the mechanics of how this would work.

I suspect you're not getting 0 rows for every possible query you can come up with, right? Look at the first 10 lines of your dataset and try to select some rows from your entire data.frame by using values you can see in the first 10 rows you've just looked at. I'm expecting this would work, in which case I'm not sure how much more help I can provide.

-steve

On Mon, Jul 27, 2009 at 12:15 PM, Steve Lianoglou mailinglist.honey...@gmail.com wrote:

On Jul 27, 2009, at 2:54 PM, Mehdi Khan wrote:

i am able to return the first column, but anything else returns this:

  0 rows (or 0-length row.names)

any idea?

I'm not sure what you're doing. The result you're getting happens when no rows pass the logical test that you are using to index the rows of your data.frame. Can you show the code that you are using (based on the example data you gave) that is giving you the 0 rows result?

-steve

On Tue, Jul 21, 2009 at 12:49 PM, Steve Lianoglou mailinglist.honey...@gmail.com wrote:

On Jul 21, 2009, at 3:27 PM, Mehdi Khan wrote:

I understand your explanation about the test for even numbers. However I am still a bit confused as to how to go about finding a particular value.
Re: [R] Searching for specific values in a matrix
it worked! thank you so much!!

On Mon, Jul 27, 2009 at 3:16 PM, Steve Lianoglou mailinglist.honey...@gmail.com wrote:

Ahh ..

On Jul 27, 2009, at 6:01 PM, Mehdi Khan wrote:

Even when choosing a value from the first few rows, it doesn't work. okay here it goes:

  rearranged[1:10, 1:5]
             x        y band1 VSCAT.001 soiltype
  1  -124.3949 40.42468    NA        NA       CD
  2  -124.3463 40.27358    NA        NA       CD
  3  -124.3357 40.25226    NA        NA       CD
  4  -124.3663 40.40241    NA        NA       CD
  5  -124.3674 40.49810    NA        NA       CD
  6  -124.3083 40.24744    NA       464       NA
  7  -124.3017 40.31295    NA        NA        D
  8  -124.3375 40.47557    NA       464       NA
  9  -124.2511 40.11697     1        NA       NA
  10 -124.2532 40.12640     1        NA       NA

  query <- rearranged$y == 40.42468
  rearranged[query,]
  [1] x         y         band1     VSCAT.001 soiltype
  0 rows (or 0-length row.names)

This isn't working because the number you see printed for y (40.42468) isn't precisely the value that is stored. As I mentioned before, you should use an almost.equals type of search for this scenario.

My %~% function isn't working in your session because that is a function I've defined myself. You can of course use it, you just have to define it in your workspace. Paste these lines into your workspace (or save them to a file and source that file into your workspace).

  ## === almost.equal functions
  almost.equal <- function(x, y, tolerance=.Machine$double.eps^0.5) {
    abs(x - y) < tolerance
  }
  "%~%" <- function(x, y) almost.equal(x, y)
  ## === end paste ==

Now you can use %~% once that's in.

Let's use the almost.equal function now because I don't know if the default tolerance here is too strict (I suspect showing the value for rearranged$y[1] will show you more significant digits than you're seeing in the table(?))

  query <- almost.equal(rearranged$y, 40.42468, tolerance=0.0001)
  rearranged[query,]

This will get you something.

  query <- rearranged$VSCAT.001 == 464

except it's a huge table (I guess I have to get rid of all rows with NA).

Yes, I believe I mentioned earlier that you have to axe the NA matches manually:

  query <- rearranged$VSCAT.001 == 464 & !is.na(rearranged$VSCAT.001)
  rearranged[query,]

Will get you what you want.

I tried using the %~% but R doesn't recognize it. So maybe it has to do with the rounding errors?

Rounding errors won't happen with integer comparisons (and it looks like the VSCAT.001 column is integers, no?).

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
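To make the printing issue above concrete, here is a small sketch of how a stored value can carry more digits than the default print shows, so that an exact `==` against the printed number fails while a tolerance-based comparison succeeds. The value `40.4246812` is invented for illustration; the real digits in the poster's data are not known.

```r
# A value stored with more digits than the default print shows (hypothetical).
y <- 40.4246812
print(y)               # default print uses 7 significant digits
print(y, digits = 12)  # reveals the extra digits

exact <- y == 40.42468                # exact comparison against the printed value
near  <- abs(y - 40.42468) < 1e-4     # comparison within an absolute tolerance
viaAE <- isTRUE(all.equal(y, 40.42468, tolerance = 1e-6))  # relative tolerance

print(c(exact = exact, near = near, all.equal = viaAE))
```

`all.equal()` is wrapped in `isTRUE()` because it returns a character description, not FALSE, when the values differ.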
[R] Making a sub data.frame
Dear all,

I have a data.frame like this:

  ID    VAR1
  11    blaaal
  121   blalda
  121   adada
  234   baada
  231   ddaaa
  231   baada
  ...   ...

and I have another vector of IDs, say c(121,234,231). How could I collect all the observations whose ID is one of c(121,234,231)?

Thanks
All the Best,
Desper
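One possible approach, offered as a sketch and not taken from any reply in the thread: `%in%` tests each ID for membership in the vector, and the resulting logical vector selects the rows. The names `dat`, `ids` and `picked` are hypothetical, and the data simply mirror the rows shown in the question.

```r
# Rebuild the rows shown in the question (names dat/ids/picked are illustrative).
dat <- data.frame(
  ID   = c(11, 121, 121, 234, 231, 231),
  VAR1 = c("blaaal", "blalda", "adada", "baada", "ddaaa", "baada"),
  stringsAsFactors = FALSE
)
ids <- c(121, 234, 231)

# Keep every row whose ID appears anywhere in ids.
picked <- dat[dat$ID %in% ids, ]
print(picked)
```

This assumes the question means an exact match on ID. Rows come back in their original order, not in the order of `ids`.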
Re: [R] Searching for specific values in a matrix
Even when choosing a value from the first few rows, it doesn't work. okay here it goes:

  rearranged[1:10, 1:5]
             x        y band1 VSCAT.001 soiltype
  1  -124.3949 40.42468    NA        NA       CD
  2  -124.3463 40.27358    NA        NA       CD
  3  -124.3357 40.25226    NA        NA       CD
  4  -124.3663 40.40241    NA        NA       CD
  5  -124.3674 40.49810    NA        NA       CD
  6  -124.3083 40.24744    NA       464       NA
  7  -124.3017 40.31295    NA        NA        D
  8  -124.3375 40.47557    NA       464       NA
  9  -124.2511 40.11697     1        NA       NA
  10 -124.2532 40.12640     1        NA       NA

  query <- rearranged$y == 40.42468
  rearranged[query,]
  [1] x         y         band1     VSCAT.001 soiltype
  0 rows (or 0-length row.names)

hmm it seems to be working for the whole number...

  query <- rearranged$VSCAT.001 == 464

except it's a huge table (I guess I have to get rid of all rows with NA). I tried using the %~% but R doesn't recognize it. So maybe it has to do with the rounding errors?

On Mon, Jul 27, 2009 at 2:56 PM, Steve Lianoglou mailinglist.honey...@gmail.com wrote:

no luck, it's okay, i will figure it out! i might isolate and recombine all the columns, maybe that will work. thanks for the help!

No, wait .. no luck in being able to select out rows from your data.frame using values you see somewhere in the top 10 rows? Can you just paste in some key lines in your session so we can see? For instance, let's assume your data is in my.data, I'd like to see the results for:

  # Replace the column values (1:5) with other columns
  # you want to use for selection
  R> my.data[1:10,1:5]

  # Now show me your query and its result that returns
  # a 0-row data.frame, for example using a value
  # that appears in that column from the previous query
  R> my.data[my.data[,1] == 'something',]
  0 rows (or 0-length row.names)

There should be a simple answer to what's going wrong here.

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact