Re: [R] R Console Output
Thank you for your response. I would like to produce a Flash animation with code such as the following:

saveSWF({ code }, img.name = "file", swf.name = "file2.swf",
        single.opts = "'utf8': false", autoplay = FALSE, interval = 0.1,
        imgdir = "directory", htmlfile = "random.html",
        ani.height = 500, ani.width = 500,
        title = "groups", description = c("group1", "group2"))

Usually there is output in the console showing where the Flash animation has been produced, such as:

Flash has been created at: C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ\file2.swf

When I run my code, I get the output below. Yet the Flash file is not produced, and there is no error message. I am still able to enter commands. Thanks in advance for your help.

Executing: C:\Program Files (x86)\SWFTools\png2swf.exe C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file1.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file2.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file3.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file4.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file5.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file6.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file7.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file8.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file9.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00010.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00011.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00012.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00013.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00014.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00015.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00016.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00017.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00018.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00019.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00020.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00021.png
C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00022.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00023.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00024.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00025.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00026.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00027.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00028.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00029.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00030.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00031.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00032.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00033.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00034.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00035.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00036.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00037.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00038.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00039.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00040.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00041.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00042.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00043.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00044.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00045.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00046.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00047.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00048.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00049.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00050.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00051.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00052.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00053.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00054.png 
C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00055.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00056.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00057.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00058.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00059.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00060.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00061.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00062.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00063.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00064.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00065.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00066.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00067.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00068.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00069.png C:\Users\CHERYL\AppData\Local\Temp\RtmpqayKkJ/file00070.png
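One way to narrow this down (a debugging sketch, not from the thread: the executable and temp-directory paths are taken from the console output above, and the `-o` output option is the documented SWFTools usage for png2swf — verify against your installation) is to run png2swf from R yourself and check whether the SWF file actually appears:

```r
# Hypothetical debugging sketch: run png2swf directly and inspect the result.
# Paths are the ones shown in the console output; adjust as needed.
exe  <- "C:/Program Files (x86)/SWFTools/png2swf.exe"
tmp  <- "C:/Users/CHERYL/AppData/Local/Temp/RtmpqayKkJ"
pngs <- list.files(tmp, pattern = "\\.png$", full.names = TRUE)
out  <- file.path(tmp, "file2.swf")

# system2() quotes each argument separately, which sidesteps problems
# with the space in "Program Files (x86)"; capturing stdout/stderr keeps
# any diagnostics png2swf prints.
msgs <- system2(exe, c("-o", out, pngs), stdout = TRUE, stderr = TRUE)
print(msgs)
file.exists(out)  # did the SWF actually get written?
```

If the file appears when run this way but not via saveSWF(), the problem is likely in how the animation package locates or invokes the converter rather than in SWFTools itself.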
Re: [R] matrix
Hi Izhak, If the positions of the elements to be replaced follow the pattern below:

seq(1, length(t), by = 7)
#[1]  1  8 15
t[seq(1, length(t), by = 7)] <- c(50, 90, 100)

A.K.

On Monday, June 30, 2014 4:19 PM, Adams, Jean jvad...@usgs.gov wrote:

t[1, 1] <- 50
t[3, 2] <- 90
t[5, 3] <- 100

Jean

On Mon, Jun 30, 2014 at 10:27 AM, IZHAK shabsogh ishaqb...@yahoo.com wrote:

Kindly guide me on how I can delete and replace an element of the matrix t below: for example, delete the first element in column one and replace it with 50, the third element in column 2 with 90, and the fifth element in column 3 with 100.

t1 <- c(1, 2, 3, 4, 5)
t2 <- c(6, 7, 8, 9, 10)
t3 <- c(11, 12, 13, 14, 15)
t <- cbind(t1, t2, t3)

thanks

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
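The two suggestions in this thread can be checked against each other in a self-contained sketch (using the 5x3 matrix from the question; the linear indices 1, 8, 15 correspond to t[1,1], t[3,2], t[5,3] by column-major order):

```r
t1 <- c(1, 2, 3, 4, 5)
t2 <- c(6, 7, 8, 9, 10)
t3 <- c(11, 12, 13, 14, 15)
t  <- cbind(t1, t2, t3)

# Approach 1: explicit row/column subscripts (Jean's answer)
ta <- t
ta[1, 1] <- 50
ta[3, 2] <- 90
ta[5, 3] <- 100

# Approach 2: linear (column-major) indices 1, 8, 15 (A.K.'s answer)
tb <- t
tb[seq(1, length(tb), by = 7)] <- c(50, 90, 100)

identical(ta, tb)  # TRUE
```

The second form only works because the positions happen to fall on an every-7th-element pattern in column-major order; the subscript form is the general solution.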
Re: [R] From long to wide format
Hi Jorge, I was able to reproduce the error. The link below provides a way to adjust the stack. I didn't test it. http://stackoverflow.com/questions/14719349/error-c-stack-usage-is-too-close-to-the-limit Also check this link: http://stackoverflow.com/questions/13245019/how-to-change-the-stack-size-using-ulimit-or-per-process-on-mac-os-x-for-a-c-or

A.K.

On Monday, June 30, 2014 10:08 PM, Jorge I Velez jorgeivanve...@gmail.com wrote:

Hi Arun, Thank you very much for your suggestion. While running some tests, I came across the following:

# sample data
n <- 2000
p <- 1000
x2 <- data.frame(variable = rep(paste0('x', 1:p), each = n),
                 id = rep(paste0('p', 1:p), n),
                 outcome = sample(0:2, n*p, TRUE),
                 rate = runif(n*p, 0.5, 1))
str(x2)

library(dplyr)
library(tidyr)

# Arun's suggestion
system.time({
  wide1 <- x2 %>% select(-rate) %>%
    mutate(variable = factor(variable, levels = unique(variable)),
           id = factor(id, levels = unique(id))) %>%
    spread(variable, outcome)
  colnames(wide1)[-1] <- paste("outcome", colnames(wide1)[-1], sep = ".")
})
# Error: C stack usage 18920219 is too close to the limit
# Timing stopped at: 13.833 0.251 14.085

Do you happen to know what can be done to avoid this? Thank you. Best, Jorge.-

On Mon, Jun 30, 2014 at 6:51 PM, arun smartpink...@yahoo.com wrote:

Hi Jorge, You may try:

library(dplyr)
library(tidyr)
# Looks like this is faster than the other methods.
system.time({
  wide1 <- x2 %>% select(-rate) %>%
    mutate(variable = factor(variable, levels = unique(variable)),
           id = factor(id, levels = unique(id))) %>%
    spread(variable, outcome)
  colnames(wide1)[-1] <- paste("outcome", colnames(wide1)[-1], sep = ".")
})
#   user  system elapsed
#  0.006   0.000   0.006

system.time(wide <- reshape(x2[, -4], v.names = "outcome", idvar = "id",
                            timevar = "variable", direction = "wide"))
#   user  system elapsed
#  0.169   0.000   0.169

system.time({
  sel <- unique(x2$variable)
  id <- unique(x2$id)
  X <- matrix(NA, ncol = length(sel) + 1, nrow = length(id))
  X[, 1] <- id
  colnames(X) <- c('id', sel)
  r <- mclapply(seq_along(sel), function(i){
    out <- x2[x2$variable == sel[i], ][, 3]
  }, mc.cores = 4)
  X[, -1] <- do.call(rbind, r)
  X
})
#   user  system elapsed
#  0.125   0.011   0.074

wide2 <- wide1
wide2$id <- as.character(wide2$id)
wide$id <- as.character(wide$id)
all.equal(wide, wide2, check.attributes = FALSE)
#[1] TRUE

A.K.

On Sunday, June 29, 2014 11:48 PM, Jorge I Velez jorgeivanve...@gmail.com wrote:

Dear R-help, I am working with some data stored as filename.txt.gz in my working directory. After reading the data in using read.table(), I can see that each of them has four columns (variable, id, outcome, and rate) and the following structure:

# sample data
x2 <- data.frame(variable = rep(paste0('x', 1:100), each = 100),
                 id = rep(paste0('p', 1:100), 100),
                 outcome = sample(0:2, 10^4, TRUE),
                 rate = runif(10^4, 0.5, 1))
str(x2)

Each variable, i.e., x1, x2, ..., x100, is repeated as many times as the number of unique IDs (100 in this example). What I would like to do is transform the data above into a wide format. I can do this by using

# reshape
wide <- reshape(x2[, -4], v.names = "outcome", idvar = "id",
                timevar = "variable", direction = "wide")
str(wide)

# or a hack with mclapply:
require(parallel)
sel <- as.character(unique(x2$variable))
id <- as.character(unique(x2$id))
X <- matrix(NA, ncol = length(sel) + 1, nrow = length(id))
X[, 1] <- id
colnames(X) <- c('id', sel)
r <- mclapply(seq_along(sel), function(i){
  out <- x2[x2$variable == sel[i], ][, 3]
}, mc.cores = 4)
X[, -1] <- do.call(rbind, r)
X

However, I was wondering if it is possible to come up with another solution, hopefully faster than these.

Unfortunately, either one of these takes a very long time to process, especially when the number of variables is very large (> 250,000) and the number of IDs is ~2000. I would very much appreciate your suggestions. At the end of this message is my sessionInfo(). Thank you very much in advance. Best regards, Jorge Velez.-

> sessionInfo()
R version 3.0.2 Patched (2013-12-11 r64449)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] graphics  grDevices utils     datasets  parallel  compiler  stats
[8] methods   base

other attached packages:
[1] knitr_1.6.3            ggplot2_1.0.0          slidifyLibraries_0.3.1
[4] slidify_0.3.52

loaded via a namespace (and not attached):
 [1] colorspace_1.2-4 digest_0.6.4     evaluate_0.5.5   formatR_0.10
 [5] grid_3.0.2       gtable_0.1.2     markdown_0.7.1   MASS_7.3-33
 [9] munsell_0.4.2    plyr_1.8.1       proto_0.3-10     Rcpp_0.11.2
[13] reshape2_1.4     scales_0.2.4     stringr_0.6.2    tools_3.0.2
[17] whisker_0.4      yaml_2.1.13
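A further option not raised in the thread (an assumption that adding a dependency is acceptable) is data.table's dcast(), which performs the same long-to-wide reshape and is generally fast on large inputs. A minimal sketch on the small sample data from the question:

```r
library(data.table)

set.seed(1)
x2 <- data.frame(variable = rep(paste0('x', 1:100), each = 100),
                 id       = rep(paste0('p', 1:100), 100),
                 outcome  = sample(0:2, 10^4, TRUE),
                 rate     = runif(10^4, 0.5, 1))

# id ~ variable: one row per id, one column per variable level.
wide3 <- data.table::dcast(as.data.table(x2), id ~ variable,
                           value.var = "outcome")
dim(wide3)  # 100 rows, 101 columns (100 variables plus the id column)
```

Note that dcast() orders the new columns alphabetically (x1, x10, x100, ...), so reorder them afterwards if the original variable order matters.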
[R] New editions of R Books from Chapman & Hall/CRC
Take advantage of a 20% discount on new editions of the most recent R books from Chapman & Hall/CRC! June and July will be very busy for us, with new editions of some of our most popular R books being published. We are pleased to offer a 20% discount on these titles through our website. To take advantage of this offer, simply visit www.crcpress.com, choose your titles, and insert code AJL01 in the Promotion Code field at checkout. Standard Shipping is always FREE on all orders from CRCPress.com! *** Linear Models with R, Second Edition Julian Faraway ISBN: 978-1-4398-8733-2 Publication Date: July 17, 2014 Number of pages: 286 New to the Second Edition * Reorganized material on interpreting linear models, which distinguishes the main applications of prediction and explanation and introduces elementary notions of causality * Additional topics, including QR decomposition, splines, additive models, Lasso, multiple imputation, and false discovery rates * Extensive use of the ggplot2 graphics package in addition to base graphics List Price: $89.95 / £57.99 For more details and to order: http://www.crcpress.com/product/isbn/9781439887332 *** Using R for Introductory Statistics, Second Edition John Verzani ISBN: 978-1-4665-9073-1 Publication Date: June 26, 2014 Number of pages: 518 New to the Second Edition: * Increased emphasis on more idiomatic R provides a grounding in the functionality of base R * Discussions of the use of RStudio helps new R users avoid as many pitfalls as possible * Use of the knitr package makes code easier to read and therefore easier to reason about * Additional information on computer-intensive approaches motivates the traditional approach * Updated examples and data make the information current and topical List Price: $59.95 / £38.99 For more details and to order: http://www.crcpress.com/product/isbn/9781466590731 *** A Handbook of Statistical Analyses using R, Third Edition Torsten Hothorn and Brian S.
Everitt ISBN: 978-1-4822-0458-2 Publication Date: June 25, 2014 Number of pages: 448 New to the Third Edition: * Three new chapters on quantile regression, missing values, and Bayesian inference * Extra material in the logistic regression chapter that describes a regression model for ordered categorical response variables * Additional exercises * More detailed explanations of R code * New section in each chapter summarizing the results of the analyses * Updated version of the HSAUR package (HSAUR3), which includes some slides that can be used in introductory statistics courses List Price: $64.95 / £39.99 For more details and to order: http://www.crcpress.com/product/isbn/9781482204582 *** SAS and R: Data Management, Statistical Analysis, and Graphics, Second Edition Ken Kleinman and Nicholas J. Horton ISBN: 978-1-4665-8449-5 Publication Date: July 9, 2014 Number of pages: 468 New to the Second Edition: This edition now covers RStudio, a powerful and easy-to-use interface for R. It incorporates a number of additional topics, including using application program interfaces (APIs), accessing data through database management systems, using reproducible analysis tools, and statistical analysis with Markov chain Monte Carlo (MCMC) methods and finite mixture models. It also includes extended examples of simulations and many new examples. List Price: $79.95 / £49.99 For more details and to order: http://www.crcpress.com/product/isbn/9781466584495 *** Introduction to Scientific Programming and Simulation Using R, Second Edition Owen Jones, Robert Maillardet, Andrew Robinson ISBN: 978-1-4665-6999-7 Publication Date: June 12, 2014 Number of pages: 600 New to the Second Edition: In a new chapter on systems of ordinary differential equations (ODEs), the authors cover the Euler, midpoint, and fourth-order Runge-Kutta (RK4) schemes for solving systems of first-order ODEs.
They compare the numerical efficiency of the different schemes experimentally and show how to improve the RK4 scheme by using an adaptive step size. A new chapter also focuses on both discrete- and continuous-time Markov chains. It describes transition and rate matrices, classification of states, limiting behaviour, Kolmogorov forward and backward equations, finite absorbing chains, and expected hitting times. It also presents methods for simulating discrete- and continuous-time chains as well as techniques for defining the state space, including lumping states and supplementary variables. List Price: $79.95 / £49.99 For more details and to order: http://www.crcpress.com/product/isbn/9781466569997 *** Linear Mixed Models: A Practical Guide Using Statistical Software, Second Edition Brady T. West, Kathleen B. Welch, Andrzej T. Galecki ISBN: 978-1-4665-6099-4 Publication Date: July 24, 2014 Number of pages: 440 New to the Second Edition * A new chapter on models with crossed random effects that uses a case study to illustrate software procedures capable of
[R] What are the other Options for hiddenActFunc in the RSNNS r package?
I am trying to figure out the options for hiddenActFunc... any help would be great! -- View this message in context: http://r.789695.n4.nabble.com/What-are-the-other-Options-for-hiddenActFunc-in-the-RSNNS-r-package-tp4693313.html Sent from the R help mailing list archive at Nabble.com.
[R] 1-dimensional point process
Hi, As a new user, is it possible to look at clustering/dispersion processes of a 1D point process (i.e. points along a transect)? My limited understanding is that spatstat is for 2-3D point patterns. Thanks -- View this message in context: http://r.789695.n4.nabble.com/1-dinemsional-point-process-tp4693315.html Sent from the R help mailing list archive at Nabble.com.
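A minimal base-R sketch of one way to test clustering along a transect (not from the thread; the point positions below are hypothetical): compare the observed mean nearest-neighbour distance to its distribution under complete spatial randomness, i.e. uniform points on the same interval. A small p-value means spacings are tighter than random, indicating clustering.

```r
set.seed(42)
x <- c(0.10, 0.12, 0.15, 0.50, 0.52, 0.55, 0.90, 0.93)  # hypothetical transect positions
L <- 1                                                  # transect length

# Mean nearest-neighbour distance of a set of 1-D positions.
nn_mean <- function(p) {
  mean(sapply(seq_along(p), function(i) min(abs(p[i] - p[-i]))))
}

obs <- nn_mean(x)
sim <- replicate(999, nn_mean(runif(length(x), 0, L)))

# Monte Carlo p-value for clustering (observed spacing unusually small).
p_value <- (sum(sim <= obs) + 1) / (length(sim) + 1)
```

Edge effects at the transect ends are ignored here; for a rigorous analysis a dedicated point-process framework would still be preferable.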
Re: [R] Change database in SQL Server using RODBC
Thanks for everyone's help. I followed the instructions given on a variety of web pages in order to set up the connection. The problem is trying to use the first connection for a second database, and doing so from within R. It seems to me that an easy workaround is to simply set up another connection and use the second database as the default. In Windows 7 the basic strategy is the following: Control Panel > Administrative Tools > Data Sources (ODBC). In the ODBC Data Source Administrator pop-up, select SQL 2012 and then click on Add. Since I do not have to work with a large number of databases, I consider this to be a satisfactory workaround.

On 6/30/2014 8:17 AM, Frede Aakmann Tøgersen wrote:

Hi

I can see that you do have trouble understanding how all this works using the RODBC package. Peter wasn't really being helpful to you. This is something that is quite difficult to help with when not sitting beside you. Do you not have some local help from e.g. the IT department?

However, for a start, please let me know how you managed to get

con = odbcConnect("SQLServer2012")

to work. It seems that some DSN was set up. From there we can probably find a solution.

Br. Frede

Sent from Samsung mobile

-------- Original message --------
From: Ira Sharenow
Date: 30/06/2014 16.42 (GMT+01:00)
To: Peter Crowther, R list
Subject: Re: [R] Change database in SQL Server using RODBC

Thanks for everyone's feedback.

library(RODBC)
con = odbcConnect("SQLServer2012")
orders1 = sqlFetch(con, "dbo.orders")
odbcClose(con)

This allowed me to close the connection properly. Thanks. However, I still cannot figure out how to connect to the second database and table.

library(RODBC)
con2 = odbcConnect("[sportsDB].dbo.sports")
Warning messages:
1: In odbcDriverConnect("DSN=[sportsDB].dbo.sports") :
  [RODBC] ERROR: state IM002, code 0, message [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified
2: In odbcDriverConnect("DSN=[sportsDB].dbo.sports") : ODBC connection failed

con2 = odbcConnect("[sportsDB].[dbo].sports")
Warning messages:
1: In odbcDriverConnect("DSN=[sportsDB].[dbo].sports") :
  [RODBC] ERROR: state IM002, code 0, message [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified
2: In odbcDriverConnect("DSN=[sportsDB].[dbo].sports") : ODBC connection failed

con2 = odbcConnect("[sportsDB].[dbo].[sports]")
Warning messages:
1: In odbcDriverConnect("DSN=[sportsDB].[dbo].[sports]") :
  [RODBC] ERROR: state IM002, code 0, message [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified
2: In odbcDriverConnect("DSN=[sportsDB].[dbo].[sports]") : ODBC connection failed

con3 = odbcConnect("SQLServer2012")
orders3 = sqlFetch(con3, "sportsDB.dbo.sports")
Error in odbcTableExists(channel, sqtable) :
  'sportsDB.dbo.sports': table not found on channel

On 6/30/2014 1:34 AM, Peter Crowther wrote:

On 30 June 2014 02:44, Ira Sharenow irasharenow...@yahoo.com wrote:

I wish to query tables that are NOT in the default SQL Server 2012 database. Now for the problem. I also want to read in the table dbo.sports. That table is in the database sportsDB. I did not see any way to do so from within R.

Can you not use sportsDB.dbo.sports to reference the table? In general, the table reference syntax is [ [ [ serverName '.' ] databaseName '.' ] [ schema ] '.' ] tableName, where the names need only be surrounded by [...] if they are not valid SQL Server identifiers. Many people may suggest you reference [sportsDB].[dbo].[sports]; this is unnecessary verbiage.
Cheers, - Peter
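An alternative to maintaining one DSN per database (a sketch, not from the thread; the driver name, server, and authentication settings are placeholders to adapt to your setup) is to skip the DSN entirely and pass a full connection string to RODBC's odbcDriverConnect(), naming the database directly:

```r
library(RODBC)

# Hypothetical connection details -- replace driver/server names as needed.
con2 <- odbcDriverConnect(paste0(
  "driver={SQL Server Native Client 11.0};",
  "server=localhost;",
  "database=sportsDB;",       # sportsDB becomes the default for this channel
  "trusted_connection=yes;"
))

# With sportsDB as the default, the simple table name works:
sports <- sqlFetch(con2, "dbo.sports")

# Alternatively, keep the original DSN connection and fully qualify the
# table inside a query, following Peter's syntax:
# sqlQuery(con, "SELECT * FROM sportsDB.dbo.sports")

odbcClose(con2)
```

sqlFetch() builds the query itself and is picky about qualified names (hence the "table not found on channel" error above), whereas sqlQuery() passes the SQL through verbatim, so the server resolves the three-part name.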
[R] Using external SQLite installation for RSQLite in Windows?
Hello All, I'm trying to figure out how to link RSQLite to an external sqlite3 installation compiled for a 64-bit platform. I see from the CRAN installation instructions (see: http://cran.r-project.org/web/packages/RSQLite/INSTALL) that on a Unix machine there's a way to set the configuration to access an outside installation, and in theory the installation says it looks for existing sqlite installations before downloading one of its own, so I assume it's possible. I've tried the changes recommended in the installation PDF, including: - compiling and installing sqlite3 (a full folder located at C:/sqlite3 with lib, include, and an .exe), then removing and reinstalling RSQLite. - removing and installing RSQLite with INSTALL_opts=--with-sqlite-dir=C:/sqlite3 (since apparently configure-args aren't for Windows?) - removing and installing RSQLite after setting system vars PKG_LIBS=-Lc:/sqlite3/lib -lsqlite and PKG_CPPFLAGS=-Ic:/sqlite3/include - removing and installing RSQLite after setting system var LD_LIBRARY_PATH=C:/sqlite3/lib In all cases, when I create and then query a new database (using dbGetInfo()), the version remains 3.7.17 -- the default RSQLite installation -- never 3.8.5 -- the version I installed. (The reason, for those interested, is that 32-bit Windows was built with memory addresses stored in variables that couldn't address more than 2gb of space, limiting individual processes to ~1900mb. The 32-bit build of sqlite3 still uses the variable types that can only cover 2gb of RAM, so even on a 64-bit machine, 32-bit sqlite3 can't allocate more than 2gb of RAM, which has a huge effect on performance. So I'm trying to connect to a 64-bit build.) Thanks! Nick
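For checking which SQLite library RSQLite is actually linked against, one option (a sketch; it asks the engine itself rather than relying on dbGetInfo()) is to run SQLite's built-in version function:

```r
library(DBI)
library(RSQLite)

con <- dbConnect(SQLite(), ":memory:")

# sqlite_version() reports the version of the SQLite library that RSQLite
# was compiled/linked against -- e.g. the bundled 3.7.17 described above.
dbGetQuery(con, "SELECT sqlite_version() AS version")

dbDisconnect(con)
```

If this still reports the bundled version after a reinstall, the package was compiled against its internal copy and the external flags never took effect.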
Re: [R] How to plot individual pdf files for each wrapped plot with ggplot2?
Thanks a lot for your reply Trevor! I've been working with the code but I cannot make it work. I have 2 main problems: 1. From running the loop I get pdf files with no pages generated. 2. I don't know how to write the code to get the 8 sex/day combinations.

library(ggplot2)
library(reshape2)
sex1 <- as.character(unique(tips$sex))
day1 <- as.character(unique(tips$day))
for (i in 1:length(sex1)){
  for (j in 1:length(day1)){
    pdf(sprintf("C:/Users/bgonzale/Desktop/example_%s.pdf", i, j))
    data1 <- subset(tips, sex==sex1[i] & day==day1[j])
    ggplot(data1, aes(x=total_bill, y=tip/total_bill)) + geom_point(shape=1)
    dev.off()
}}

Thanks very much!

On 30/06/2014 19:35, Trevor Davies wrote:

I think the easiest, most straightforward way would be to just throw it into a loop and subset the data on each loop around (untested code below, but I'm sure you get the gist). ~Trevor

sex1 <- unique(tips$sex)
day1 <- unique(tips$day)
for (i in 1:length(sex1)){
  for (j in 1:length(day1)){
    pdf(paste('example_', sex1[i], day1[j], '.pdf', sep=''))
    data1 <- subset(tips, sex==sex1[i] & day==day1[j])
    sp <- ggplot(data1, aes(x=total_bill, y=tip/total_bill)) + geom_point(shape=1)
    plot(sp)
    dev.off()
  }
}

On Mon, Jun 30, 2014 at 10:23 AM, Bea GD aguitatie...@hotmail.com wrote:

Hi, I'm working with the tips data from the reshape package.

library(reshape2)

I'm saving my plots as pdf and I was wondering whether it was possible to print a different pdf for each 'wrapped' plot. Using the code below as an example, I'd like to get 8 independent pdf files for each sex ~ day combination.

sp <- ggplot(tips, aes(x=total_bill, y=tip/total_bill)) + geom_point(shape=1) + facet_grid(sex ~ day)
plot(sp)

Thanks a lot for your help!
Re: [R] How to plot individual pdf files for each wrapped plot with ggplot2?
Just solved the first problem! I had to generate a plot object and then plot it. Now it's saved into the pdf. Only the second issue remains: *2. I don't know how to write the code to get the 8 sex/day combinations.* Thanks!

On 01/07/2014 12:59, Bea GD wrote:

Thanks a lot for your reply Trevor! I've been working with the code but I cannot make it work. I have 2 main problems: 1. From running the loop I get pdf files with no pages generated. *2. I don't know how to write the code to get the 8 sex/day combinations.*

library(ggplot2)
library(reshape2)
sex1 <- as.character(unique(tips$sex))
day1 <- as.character(unique(tips$day))
for (i in 1:length(sex1)){
  for (j in 1:length(day1)){
    pdf(sprintf("C:/Users/bgonzale/Desktop/example_%s.pdf", i, j))
    data1 <- subset(tips, sex==sex1[i] & day==day1[j])
    p <- ggplot(data1, aes(x=total_bill, y=tip/total_bill)) + geom_point(shape=1)
    plot(p)
    dev.off()
}}

Thanks very much!

On 30/06/2014 19:35, Trevor Davies wrote:

I think the easiest, most straightforward way would be to just throw it into a loop and subset the data on each loop around (untested code below, but I'm sure you get the gist). ~Trevor

sex1 <- unique(tips$sex)
day1 <- unique(tips$day)
for (i in 1:length(sex1)){
  for (j in 1:length(day1)){
    pdf(paste('example_', sex1[i], day1[j], '.pdf', sep=''))
    data1 <- subset(tips, sex==sex1[i] & day==day1[j])
    sp <- ggplot(data1, aes(x=total_bill, y=tip/total_bill)) + geom_point(shape=1)
    plot(sp)
    dev.off()
  }
}

On Mon, Jun 30, 2014 at 10:23 AM, Bea GD aguitatie...@hotmail.com wrote:

Hi, I'm working with the tips data from the reshape package.

library(reshape2)

I'm saving my plots as pdf and I was wondering whether it was possible to print a different pdf for each 'wrapped' plot. Using the code below as an example, I'd like to get 8 independent pdf files for each sex ~ day combination.

sp <- ggplot(tips, aes(x=total_bill, y=tip/total_bill)) + geom_point(shape=1) + facet_grid(sex ~ day)
plot(sp)

Thanks a lot for your help!
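A version of the loop that addresses the remaining issue (a sketch: the sex and day level names themselves go into the file name via two %s slots, so all 8 combinations get distinct files; the plot is print()ed so each PDF gets a page; the output directory is left as the working directory):

```r
library(ggplot2)
library(reshape2)  # provides the tips data

sex1 <- as.character(unique(tips$sex))
day1 <- as.character(unique(tips$day))

for (i in seq_along(sex1)) {
  for (j in seq_along(day1)) {
    # Two %s slots filled with the level names, e.g. "example_Female_Sun.pdf"
    pdf(sprintf("example_%s_%s.pdf", sex1[i], day1[j]))
    data1 <- subset(tips, sex == sex1[i] & day == day1[j])
    p <- ggplot(data1, aes(x = total_bill, y = tip / total_bill)) +
      geom_point(shape = 1)
    print(p)  # a ggplot must be explicitly printed for pdf() to get a page
    dev.off()
  }
}
```

With 2 sex levels and 4 day levels in the tips data, this writes the 8 separate PDFs asked for.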
Re: [R] Dead link in the help page of as.Date()
On 23.06.2014 21:41, Christofer Bogaso wrote:

Hi, I was reading the help page for the as.Date() function for some reason, and noticed a Matlab link: http://www.mathworks.com/help/techdoc/matlab_prog/bspgcx2-1.html

Thanks, updated now.

Best,
Uwe Ligges

It looks like this link is dead. So maybe it would be better to put a correct link or remove it altogether. Thanks and regards,
Re: [R] c() with POSIXlt objects and their timezone is lost
On 23.06.2014 23:52, Marc Girondot wrote:

When two POSIXlt objects are combined with c(), they lose their tzone attribute, even if they are the same. I don't know if it is a feature, but I don't like it! Marc

es <- strptime("2010-02-03 10:20:30", format="%Y-%m-%d %H:%M:%S", tz="UTC")
es
[1] "2010-02-03 10:20:30 UTC"
attributes(es)
$names
[1] "sec"   "min"   "hour"  "mday"  "mon"   "year"  "wday"  "yday"  "isdst"
$class
[1] "POSIXlt" "POSIXt"
$tzone
[1] "UTC"
c(es, es)
[1] "2010-02-03 11:20:30 CET" "2010-02-03 11:20:30 CET"
attributes(c(es, es))
$names
 [1] "sec"    "min"    "hour"   "mday"   "mon"    "year"   "wday"   "yday"   "isdst"  "zone"   "gmtoff"
$class
[1] "POSIXlt" "POSIXt"
$tzone
[1] "CET"  "CEST"

From ?c: "c is sometimes used for its side effect of removing attributes [...]" and from ?c.POSIXlt: "Using c on POSIXlt objects converts them to the current time zone [...]"

Best, Uwe Ligges
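One workaround (a sketch, not from the thread) is to concatenate in POSIXct form and reattach the time zone explicitly afterwards; the underlying POSIXct value is seconds since the epoch and therefore time-zone independent, so only the display attribute needs restoring:

```r
es <- strptime("2010-02-03 10:20:30", format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# c() resets the tzone, but the numeric values are unaffected,
# so the zone can simply be set again on the combined vector.
both <- c(as.POSIXct(es), as.POSIXct(es))
attr(both, "tzone") <- "UTC"

format(both)  # both elements display in UTC again
```

The same trick does not work for POSIXlt directly, because its fields (hour, mday, ...) are stored broken down in a specific zone; convert to POSIXct first, as above.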
[R] (PLM- package) Residual-Plotting and missing Values
Dear R-Community, I tried plotting the residuals of an FE model estimated via plm, and noticed that there are no residuals in the plot for the last two countries. I guess this happens because for some countries values are missing, and R gives me the following for fixed.reg1.new$resid[1]:

        5
-0.4051985

because the first 4 elements are missing. So there are residuals different from zero for the last two countries, but because of NAs there is a shift: the residuals are not padded to the correct length. I've read in https://stat.ethz.ch/pipermail/r-help/2008-June/166312.html and the manual that na.action=na.exclude is useful in the lm case to avoid this ("when na.exclude is used the residuals and predictions are padded to the correct length by inserting NAs for cases omitted by na.exclude"), and tried it for my plm regression, but it does not work. Perhaps you have an idea how to get the residuals into the correct length? Or another way to deal with it? To make the way of proceeding easier to explain, a reproducible example could be:

# add NAs for firm 6
data(Grunfeld, package = "plm")
Grunfeld$inv2 <- ifelse(Grunfeld$firm == 6, NA, Grunfeld$inv)
data <- pdata.frame(Grunfeld, index = c("firm", "year"))
fixed.reg1.1 <- plm(value ~ inv2 + capital, data = data, na.action = na.exclude,
                    index = c("firm", "year"), model = "within")
# resid(fixed.reg1.1) # no values for firm 6; no residuals displayed from 101-120
fixed.reg1.1$resid[105]
#      125
# 9.371963
require(lattice)
xyplot(resid(fixed.reg1.1) ~ firm, data = data)
# As you can see, because of the NAs for firm 6 there is a shift (the residuals
# are not padded to the correct length), and looking at the plot suggests there
# are no residuals for firm 10, which is not true.

Thanks in advance for your help!
Have a nice day Katie
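One way to get full-length residuals (a sketch building on the example above; it relies on resid() of a plm fit carrying the row names of the cases actually used, which matches the pdata.frame's "firm-year" row names — an assumption worth verifying on your own data):

```r
library(plm)
library(lattice)

data(Grunfeld, package = "plm")
Grunfeld$inv2 <- ifelse(Grunfeld$firm == 6, NA, Grunfeld$inv)
pdat <- pdata.frame(Grunfeld, index = c("firm", "year"))

fit <- plm(value ~ inv2 + capital, data = pdat, model = "within")

# Pad the residuals to the full length of the data by matching row names:
# rows dropped because of NAs stay NA instead of shifting everything up.
res_full <- setNames(rep(NA_real_, nrow(pdat)), row.names(pdat))
res_full[names(resid(fit))] <- as.numeric(resid(fit))

# Now each residual lines up with its own firm: firm 6 is all NA,
# and firms 7-10 keep their residuals in the right place.
xyplot(res_full ~ pdat$firm)
```

This reproduces by hand what na.exclude does for lm(), since plm's own residuals() only returns the used cases.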
[R] plot in generalized additive model (GAM)
I performed the following GAM with the mgcv package: gam(mortality ~ (PM10) + (Tmax) + (umidity), data = data, family = quasipoisson). How can I obtain a plot of the log-relative risk of mortality vs. PM10? Thanks, Agostino
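A hedged sketch of one way to get such a plot (assumption: PM10 enters the model as a smooth term s(PM10), which the formula in the post does not show). For a quasipoisson fit, the partial effect drawn by plot.gam() is on the linear-predictor (log) scale, so it can be read as the log-relative risk up to an additive constant.

```r
library(mgcv)
# Variable names (mortality, PM10, Tmax) are taken from the post; the data
# frame 'data' is assumed to exist.
fit <- gam(mortality ~ s(PM10) + s(Tmax), data = data,
           family = quasipoisson)
# The y-axis of plot.gam() is the centred smooth on the log link scale,
# i.e. the log-relative risk up to a constant.
plot(fit, select = 1, xlab = "PM10",
     ylab = "Log-relative risk of mortality")
```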
[R] Socket Connection in R
I am trying to create a socket connection in R:

socket <- make.socket("localhost", 2099, TRUE, TRUE)
msg2 <- 'function=subscribe|item=MI.EQCON.1|schema=last_price;ask;bid'
write.socket(socket, msg2)
read.socket(socket, 252, FALSE)

When I run the read.socket line, I get this error: Error in read.socket(socket, 252, FALSE) : embedded nul in string: 'þþ-\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\001\0\004\0CTCL\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0' I am unable to solve this problem. Please advise how to get rid of this issue. Regards,
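A hedged sketch, not from the thread: the embedded-nul error suggests the server replies with binary data, which the text-oriented read.socket() cannot represent as an R string. Under that assumption, one alternative is a binary socketConnection read with readBin():

```r
# Same host/port and message as in the post; binary mode is an assumption.
con <- socketConnection("localhost", 2099, open = "r+b")
writeLines("function=subscribe|item=MI.EQCON.1|schema=last_price;ask;bid", con)
reply <- readBin(con, what = "raw", n = 252)   # raw bytes, nuls allowed
close(con)
# Crude decoding sketch: drop nul bytes before converting to character.
rawToChar(reply[reply != as.raw(0)])
```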
[R] Order Book details in R Interactive Brokers Package
Hello All, I am working on a project for automated trade execution using R and the Interactive Brokers package. I have successfully implemented and tested the connection of R with the Interactive Brokers API; implementing and placing orders works fine too. The only problem is that while executing an order I want to check whether there are any pending orders in the order book. I searched a lot but didn't find anything to retrieve the order book details. Can anyone provide the logic to retrieve the order book details using the R Interactive Brokers package? Any help regarding this would be much appreciated. Thank you.
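A heavily hedged sketch (assumptions: the package meant is IBrokers, and whether it exposes an open-orders request varies by version; the function name reqOpenOrders below is a guess to be checked against your installed version, e.g. with ls("package:IBrokers")):

```r
library(IBrokers)
tws <- twsConnect()   # connect to a running TWS/Gateway
# Guard the call at run time, since the availability of an open-orders
# request in IBrokers is an assumption, not confirmed by the thread.
if (exists("reqOpenOrders", envir = asNamespace("IBrokers"))) {
  IBrokers::reqOpenOrders(tws)
} else {
  message("This IBrokers version has no reqOpenOrders(); see ?twsCALLBACK")
}
twsDisconnect(tws)
```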
[R] combining data from multiple read.delim() invocations.
Is there a better way to do the following? I have data in a number of tab-delimited files. I am using read.delim() to read them, in a loop. I am invoking my code on Linux Fedora 20, from the BASH command line, using Rscript. The code I'm using looks like:

arguments <- commandArgs(trailingOnly = TRUE)
# initialize the capped_data data.frame
capped_data <- data.frame(lpar = NULL, started = Sys.time(), ended = Sys.time(),
                          stringsAsFactors = FALSE)
# and empty it.
capped_data <- capped_data[-1, ]
#
# Read in the data from the files listed
for (file in arguments) {
  data <- read.delim(file, header = FALSE,
                     col.names = c("lpar", "started", "ended"),
                     as.is = TRUE, na.strings = '\\N',
                     colClasses = c("character", "POSIXct", "POSIXct"))
  capped_data <- rbind(capped_data, data)
}
#

I.e. is there an easier way than doing a read.delim/rbind in a loop? -- There is nothing more pleasant than traveling and meeting new people! Genghis Khan. Maranatha! John McKown
[R] logistic regression for data with repeated measures
Hi, It seems that I'm quite lost in this wide and powerful R universe, so I permit myself to ask your help with issues I'm struggling with. Thank you. I would like to know if the answer's accuracy (correct = 1; incorrect = 0) varies depending on 2 categorical variables, the group (A and B) and the condition (a, b and c), knowing that I've got n subjects and 12 trials per condition for each subject (i.e. 12 repetitions). To do that, I'm focusing on logistic regression analysis. I've had no problem with this kind of analysis until now (logistic regression with numeric predictor variables and/or a categorical predictor with 2 levels only) but, in this new context, I think I have to focus more specifically on logistic regression including *nested (or random?) factors* in a *repeated measures design* (because of the variables "Subject" and "Trial"), with a categorical predictor variable with *more than 2 levels* (the variable "Condition"), and I have never done such a thing yet.

mydata:
mydata$Subject: Factor w/38 levels: i01, i02, i03, i04, ...
mydata$Group: Factor w/2 levels: A, B
mydata$Condition: Factor w/3 levels: a, b, c
mydata$Trial: Factor w/12 levels: t01, t02, ... t12
mydata$Accuracy: Factor w/2 levels: 0, 1

Subject Group Trial Condition Accuracy
i01     A     t01   a         0
i01     A     t02   a         1
...
i01     A     t12   a         1
i01     A     t01   b         1
i01     A     t02   b         1
...
i01     A     t12   b         0
i01     A     t01   c         0
i01     A     t02   c         1
...
i01     A     t12   c         1
i02     B     t01   a         1
...

First, I'm wondering if I have to calculate a % of accuracy for each subject and each condition, and thus "remove" the variable "Trial" but "lose" data (power?) at the same time... or take this variable into account in the analysis, and in that case, how? Second, I don't know which function to choose (lmer, glm, glmer...)? Third, I'm not sure I am proceeding correctly to specify in this analysis that the responses all come from the same subject: within-subject design = ...+(1|Subject), as I can do for a repeated measures ANOVA to analyze the effect of my different variables on a numeric one such as response time:

test = aov(Int ~ Group*Condition + Error(Subject/(Group*Condition)), data = mydata)

and here again, how can I add the variable Trial if I don't work on an average reaction time for each subject in the different conditions? Below, examples of models I can write with glmer():

fit.1 = glmer(Accuracy ~ Group*Condition + (1|Subject), data = mydata, family = binomial)
fit.2 = glmer(Accuracy ~ Group*Condition + (1|Subject) - 1, data = mydata, family = binomial)  ("without intercept")
fit.3 = glmer(Accuracy ~ Group*Condition + (1|Subject) + (1|Trial) ... ??

I believed the analysis I have to conduct would be within the range of my qualifications; then I realized it could be more complicated than that, of course (e.g. GLMMs). I can hear "do it as we usually do" (= repeated measures ANOVA on a percentage of correct answers for each subject??), as if there were only one way to follow, but I think there are many, and which one is relevant for my data is what I want to find. Hope you can put me on the track. Best, Suzon
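A hedged sketch of the third model the post gestures at (fit.3), completed with lme4 syntax; whether Trial should be a random effect is a modelling judgement that the thread does not settle:

```r
library(lme4)
# Trial-level binary accuracy, crossed random intercepts for Subject and
# Trial; variable names are the ones described in the post.
fit.3 <- glmer(Accuracy ~ Group * Condition + (1 | Subject) + (1 | Trial),
               data = mydata, family = binomial)
summary(fit.3)
```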
Re: [R] logistic regression for data with repeated measures
http://stats.stackexchange.com/questions/62225/conditional-logistic-regression-vs-glmm-in-r might be a good start. Ersatzistician and Chutzpahthologist. I can answer any question. "I don't know" is an answer. "I don't know yet" is a better answer. On Tue, Jul 1, 2014 at 10:24 AM, Suzon Sepp suzon.s...@gmail.com wrote: [original message quoted above; snipped]
Re: [R] combining data from multiple read.delim() invocations.
There is a better way. First we need some data. This creates three files in your home directory, each with five rows:

write.table(data.frame(rep("A", 5), Sys.time(), Sys.time()), "A.tab", sep = "\t", row.names = FALSE, col.names = FALSE)
write.table(data.frame(rep("B", 5), Sys.time(), Sys.time()), "B.tab", sep = "\t", row.names = FALSE, col.names = FALSE)
write.table(data.frame(rep("C", 5), Sys.time(), Sys.time()), "C.tab", sep = "\t", row.names = FALSE, col.names = FALSE)

Now to read and combine them into a single data.frame:

fls <- c("A.tab", "B.tab", "C.tab")
df.list <- lapply(fls, read.delim, header = FALSE,
                  col.names = c("lpar", "started", "ended"),
                  as.is = TRUE, na.strings = '\\N',
                  colClasses = c("character", "POSIXct", "POSIXct"))
df.all <- do.call(rbind, df.list)
str(df.all)
'data.frame': 15 obs. of 3 variables:
 $ lpar   : chr "A" "A" "A" "A" ...
 $ started: POSIXct, format: "2014-07-01 11:25:05" "2014-07-01 11:25:05" ...
 $ ended  : POSIXct, format: "2014-07-01 11:25:05" "2014-07-01 11:25:05" ...

David L Carlson, Department of Anthropology, Texas A&M University, College Station, TX 77840-4352

-----Original Message-----
From: John McKown
Subject: [R] combining data from multiple read.delim() invocations.
[original message quoted above; snipped]
Re: [R] combining data from multiple read.delim() invocations.
Maybe, David, but this isn't really it. Your code basically reproduces the explicit for() loop with lapply(). There might be some advantage in rbind-ing the list over incrementally adding rows to the data frame, but I would be surprised if it made much of a difference either way. Of course, someone with actual data might prove me wrong... Cheers, Bert. Bert Gunter, Genentech Nonclinical Biostatistics, (650) 467-7374. "Data is not information. Information is not knowledge. And knowledge is certainly not wisdom." Clifford Stoll. On Tue, Jul 1, 2014 at 9:31 AM, David L Carlson dcarl...@tamu.edu wrote: [answer quoted above; snipped]
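As a side note not raised in the thread (a hedged suggestion): for many or large files, rbindlist() from the data.table package usually beats do.call(rbind, ...), because it preallocates the result instead of copying it repeatedly.

```r
# Sketch; file names reuse David's example files and are assumptions here.
library(data.table)
fls <- c("A.tab", "B.tab", "C.tab")
df.all <- rbindlist(lapply(fls, fread, header = FALSE,
                           col.names = c("lpar", "started", "ended")))
```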
Re: [R] Stringr / Regular Expressions advice
Sarah, Yes, I modified the code that you provided and it worked quite well. Here is the revised code:

accel_data <- data
# pattern to be identified
v.to.match <- c(438, 454, 459)
# rerun the line below any time the v.to.match criteria change, so the match is updated
v.matches <- apply(accel_data, 1, function(x) all(x == v.to.match))
which(v.matches)
[1] 405
sum(v.matches)
[1] 1

Again, here is the dataset:

dput(head(accel_data, 20))
structure(list(x_reading = c(455L, 451L, 458L, 463L, 462L, 460L, 448L, 449L, 450L, 451L, 445L, 440L, 439L, 445L, 448L, 447L, 440L, 439L, 440L, 434L), y_reading = c(502L, 503L, 502L, 502L, 495L, 505L, 480L, 483L, 489L, 488L, 489L, 456L, 497L, 476L, 470L, 474L, 469L, 482L, 484L, 477L), z_reading = c(454L, 454L, 452L, 452L, 446L, 459L, 456L, 451L, 451L, 455L, 438L, 462L, 437L, 455L, 470L, 455L, 460L, 463L, 458L, 458L)), .Names = c("x_reading", "y_reading", "z_reading"), row.names = c(NA, 20L), class = "data.frame")

My next goal is to extend the range for each column. For instance: v.to.match <- c(438:445, 454:460, 459:470). Your thoughts? Many thanks, Vincent

On Fri, Jun 27, 2014 at 5:51 AM, Sarah Goslee sarah.gos...@gmail.com wrote: Hi, It's a good idea to copy back to the list, not just to me, to keep the discussion all in one place. On Thursday, June 26, 2014, VINCENT DEAN BOYCE vincentdeanbo...@gmail.com wrote: Sarah, Great feedback and direction.
Here is the data I am working with*:

dput(head(data_log, 20))
[data structure shown above; snipped]

*however, I am unsure why the letter L has been appended to each numerical string. -- It denotes values stored as integers, and is nothing you need to worry about. In any event, as you can see there are three columns of data named x_reading, y_reading and z_reading. I would like to detect patterns among them. For instance, let's say the pattern I wish to detect is 455, 502, 454 across the three columns respectively. As you can see in the data, this is found in the first row. This particular string reoccurs numerous times within the dataset; how many times the string 455, 502, 454 appears is what I wish to quantify. Your thoughts? -- Did you try the code I provided? It does what I think you're looking for. Sarah. Many thanks, Vincent.

On Thu, Jun 26, 2014 at 4:46 PM, Sarah Goslee sarah.gos...@gmail.com wrote: Hi, On Thu, Jun 26, 2014 at 12:17 PM, VINCENT DEAN BOYCE vincentdeanbo...@gmail.com wrote: Hello, Using R, I've loaded a .csv file comprised of several hundred rows and 3 columns of data. The data within maps the output of a triaxial accelerometer, a sensor which measures an object's acceleration along the x, y and z axes. The data for each respective column sequentially oscillates, and ranges numerically from 100 to 500. -- If your data are numeric, why are you using stringr? It would be easier to provide you with an answer if we knew what your data looked like:

dput(head(yourdata, 20))

and paste that into your non-HTML email. -- I want to create a function that parses the data and detects patterns across the three columns. For instance, I would like to detect instances when the values for the x, y and z columns equal 150, 200, 300 respectively. Additionally, when a match is detected, I would like to know how many times the pattern appears. -- That's easy enough:

fakedata <- data.frame(matrix(c(
  100, 100, 200,
  150, 200, 300,
  100, 350, 100,
  400, 200, 300,
  200, 500, 200,
  150, 200, 300,
  150, 200, 300), ncol = 3, byrow = TRUE))

v.to.match <- c(150, 200, 300)
v.matches <- apply(fakedata, 1, function(x) all(x == v.to.match))
# which rows match
which(v.matches)
# how many rows match
sum(v.matches)

I have been successful using str_detect to provide a Boolean; however, it seems to only work on a single value, i.e. 400, not a range of values, i.e. 400-450. See below: -- This is where I get confused, and where we need sample data. Are your data numeric, as you state above, or some other format? If your data are character, and like "400 - 450", you can still match them with the code I suggested above.

# this works
vals <- str_detect(string = data_log$x_reading, pattern = "400")
# this also works, but doesn't detect the particular range, rather the existence of the numbers
vals <- str_detect(string = data_log$x_reading, pattern = "[400-450]")

Are you trying to match any numeric value in the range 400-450? Again, actual data. Also, it
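Regarding the range question raised in this thread, a hedged sketch (an adaptation of Sarah's apply() approach, not code from the thread): compare each column against its own numeric interval instead of testing exact equality, which avoids regular expressions entirely.

```r
# Per-column lower and upper bounds (the x/y/z ranges from Vincent's post:
# 438:445, 454:460, 459:470).
lo <- c(438, 454, 459)
hi <- c(445, 460, 470)
# TRUE where all three readings fall inside their respective intervals.
v.matches <- apply(accel_data, 1, function(x) all(x >= lo & x <= hi))
which(v.matches)  # rows in range
sum(v.matches)    # how many rows match
```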
Re: [R] combining data from multiple read.delim() invocations.
I agree it is not necessarily faster, but the code is more compact since we don't have to initialize the variable or explicitly refer to the index. For big data it has the disadvantage of storing the data twice. For speed, this is faster and does not store the data twice, but is system dependent. For Windows:

shell("copy ?.tab Combined.tab")
df.all <- read.delim("Combined.tab", header = FALSE,
                     col.names = c("lpar", "started", "ended"),
                     as.is = TRUE, na.strings = '\\N',
                     colClasses = c("character", "POSIXct", "POSIXct"))

David C

-----Original Message-----
From: Bert Gunter
Subject: Re: [R] combining data from multiple read.delim() invocations.
[earlier messages quoted above; snipped]
[R] x axis labelling
Hi, I am new to R and am trying to create a graph with Time (24hr) along the x axis. Rather than start at 01.00, I want to start at 14.00. I tried to use the axis(side=1, at=c( )) function but it continues to put them in numeric order. Is there another way I can add labels to the x axis? Thank you. Michael
Re: [R] 1-dinemsional point process
It's unclear why density estimates are not being mentioned. Also suggest you search:

install.packages("sos")
require(sos)
findFn("scan statistic")

On Jun 30, 2014, at 7:35 PM, Doobs wrote: Hi, As a new user, is it possible to look at clustering/dispersion processes of a 1D point process (i.e. points along a transect)? My limited understanding is that spatstat is for 2-3D point patterns. Thanks. -- David Winsemius, Alameda, CA, USA
[R] R help
To whom it may concern: I installed R 3.1 and I get this: In normalizePath(path.expand(path), winslash, mustWork) : path[1]=\\network\users\aweeks\My Documents/R/win-library/3.1: Access is denied. Is there any way to change this path? I have looked it up on the internet but cannot seem to find the right option. If you could help me out, that would be fantastic. Thanks in advance and have a wonderful day! -- Sincerely, Andre Rei Weeks, M.P.H Biostatistics (2014), B.S. Biology, +1-(850)-443-6592
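A hedged suggestion (not from the thread): the user library location is controlled by the R_LIBS_USER environment variable, so pointing it at a writable local folder avoids the inaccessible network share. The local path below is only an illustrative assumption.

```r
# Check where R would install user packages:
.libPaths()
# One fix is to set R_LIBS_USER in a .Renviron file (or in the Windows
# environment variables) to a writable local directory, for example:
#   R_LIBS_USER=C:/Users/aweeks/R/win-library/3.1
# then restart R and confirm with .libPaths() again.
```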
[R] Using CSS package to extract text from html
This being my first post, I'm sure I'll do something discordant with convention, so forgive me in advance. Basically, I am trying to extract text from an html file using the CSS package in R. However, I am unable to do so because it seems that the text itself is not identified with any class, and thus targeting it via the CSS function `cssApply` is difficult. I'll provide some detailed information so that you may be able to spot something I've missed. Let's say I want to extract the latitude/longitude info from the following html: http://va.water.usgs.gov/duration_plots/htm_7/dp02059500.htm Here's what the initial portion of my code would look like:

install.packages("CSS")
library(CSS)
doc <- "http://va.water.usgs.gov/duration_plots/htm_7/dp02059500.htm"
doc <- htmlParse(doc)

Now, considering that the text I want to extract is under the following XPath (copied from the Chrome DevTools): /html/body/table[1]/tbody/tr/td/table/tbody/tr[2]/td[2]/font/text()[1] Would the next move be to call the text from that path? If you need to see for yourself how the site's html is configured, follow the link and use your browser's inspect-element tool. Any help would be appreciated. Thanks.
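A hedged sketch of one way forward (using the XML package, which the CSS package builds on; the exact XPath below is an assumption based on the path quoted in the post): note that Chrome inserts tbody nodes that are usually absent from the HTML actually served, so they generally have to be dropped from a copied XPath before it will match.

```r
library(XML)
doc <- htmlParse("http://va.water.usgs.gov/duration_plots/htm_7/dp02059500.htm")
# Same path as in the post, minus the browser-inserted /tbody/ steps.
latlon <- xpathSApply(doc,
  "/html/body/table[1]/tr/td/table/tr[2]/td[2]/font/text()[1]",
  xmlValue)
```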
Re: [R] x axis labelling
Did you add xaxt = "n" in the plot function? Try the following:

plot(x, y, xaxt = "n")
axis(1, at = c(14, 20), labels = c("14h", "20h"))

2014-07-01 12:41 GMT-05:00 Michael Millar michael88mil...@hotmail.co.uk: [original question quoted above; snipped]
Re: [R] Generating Patient Data
On Jun 25, 2014, at 1:49 PM, David Winsemius wrote: On Jun 24, 2014, at 11:18 PM, Abhinaba Roy wrote: Hi David, I was thinking something like this: ID Disease 1 A 2 B 3 A 1C 2D 5A 4B 3D 2A .... How can this be done? do.call(rbind, lapply( 1:20, function(pt) { data.frame( patient=pt, disease= sample( c('A','B','C','D','E','F'), pmin(2+rpois(1, 2), 6)) )}) ) If you were doing this repeatedly I suppose you might get time efficiency by the rpois vector as a single item of the same length as your PatientID's -- David. On Wed, Jun 25, 2014 at 11:34 AM, David Winsemius dwinsem...@comcast.net wrote: On Jun 24, 2014, at 10:14 PM, Abhinaba Roy wrote: Dear R helpers, I want to generate data for say 1000 patients (i.e., 1000 unique IDs) having suffered from various diseases in the past (say diseases A,B,C,D,E,F). The only condition imposed is that each patient should've suffered from *atleast* two diseases. So my data frame will have two columns 'ID' and 'Disease'. I want to do a basket analysis with this data, where ID will be the identifier and we will establish rules based on the 'Disease' column. How can I generate this type of data in R? Perhaps something along these lines for 20 cases: data.frame(patient=1:20, disease = sapply(pmin(2+rpois(20, 2), 6), function(n) paste0( sample( c('A','B','C','D','E','F'), n), collapse=+ ) ) + ) patient disease 11 F+D 22 F+A+D+E 33 F+D+C+E 44 B+D+C+A 55 D+A+F+C 66 E+A+D 77 E+F+B+C+A+D 88 A+B+C+D+E 99 B+E+C+F 10 10 C+A 11 11 B+A+D+E+C+F 12 12 B+C 13 13 A+D+B+E 14 14 D+C+E+F+B+A 15 15 C+F+D+E+A 16 16 A+C+B 17 17 C+D+B+E 18 18 A+B 19 19 C+B+D+E+F 20 20 D+C+F -- Regards Abhinaba Roy [[alternative HTML version deleted]] You should read the Posting Guide and learn to post in HTML. PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- David Winsemius Alameda, CA, USA -- Regards Abhinaba Roy Statistician Radix Analytics Pvt. 
Ltd., Ahmedabad

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
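For readers following along, here is a cleaned-up, self-contained version of the per-patient approach David sketches above. The seed and the 20-patient count are illustration choices, not part of the original thread's requirements:

```r
set.seed(42)  # only so this example is reproducible

# One row per (patient, disease) pair; each patient draws between 2 and 6
# distinct diseases, satisfying the "at least two diseases" constraint.
patients <- do.call(rbind, lapply(1:20, function(pt) {
  n <- pmin(2 + rpois(1, 2), 6)  # 2 <= n <= 6
  data.frame(ID      = pt,
             Disease = sample(c("A", "B", "C", "D", "E", "F"), n))
}))

head(patients)
table(table(patients$ID))  # every patient should appear at least twice
```

Because sample() here draws without replacement, no patient is assigned the same disease twice, which is what a basket analysis of (ID, Disease) pairs would want.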
[R] Fwd: combining data from multiple read.delim() invocations.
On Tue, Jul 1, 2014 at 11:31 AM, David L Carlson dcarl...@tamu.edu wrote:

There is a better way. First we need some data. This creates three files in your home directory, each with five rows:

write.table(data.frame(rep("A", 5), Sys.time(), Sys.time()), "A.tab", sep="\t", row.names=FALSE, col.names=FALSE)
write.table(data.frame(rep("B", 5), Sys.time(), Sys.time()), "B.tab", sep="\t", row.names=FALSE, col.names=FALSE)
write.table(data.frame(rep("C", 5), Sys.time(), Sys.time()), "C.tab", sep="\t", row.names=FALSE, col.names=FALSE)

Now to read and combine them into a single data.frame:

fls <- c("A.tab", "B.tab", "C.tab")
df.list <- lapply(fls, read.delim, header=FALSE,
                  col.names=c("lpar", "started", "ended"), as.is=TRUE,
                  na.strings='\\N',
                  colClasses=c("character", "POSIXct", "POSIXct"))
df.all <- do.call(rbind, df.list)
str(df.all)
'data.frame': 15 obs. of 3 variables:
 $ lpar   : chr "A" "A" "A" "A" ...
 $ started: POSIXct, format: "2014-07-01 11:25:05" "2014-07-01 11:25:05" ...
 $ ended  : POSIXct, format: "2014-07-01 11:25:05" "2014-07-01 11:25:05" ...

- David L Carlson

I do like that better than my version, mainly because it is fewer statements. I'm rather new to R, and the *apply series of functions is bleeding edge for me. I hadn't seen do.call before either. I'm still reading, but the way I learn best is to try projects as I am learning, so I get ahead of myself.

According to the Linux time command, your method for a single input file, resulting in 144 output elements in the data.frame, took:

real 0m0.525s
user 0m0.441s
sys  0m0.063s

Mine:

real 0m0.523s
user 0m0.446s
sys  0m0.060s

Basically, a wash. For a stress test, I took in all 136 of my files in a single execution. Output was 22,823 elements in the data.frame. Yours:

real 3m32.651s
user 3m26.837s
sys  0m2.292s

Mine:

real 3m24.603s
user 3m20.225s
sys  0m0.969s

Still a wash. Of course, since I run this only once a week, on a Sunday, the time is not too important. I actually think that your solution is a bit more readable than mine, so long as I document what is going on.
===
I had considered combining all the files together by using the R pipe command to run the UNIX cat command, something like:

command <- paste("cat", arguments, collapse=" ")
read.delim(pipe(command), ...

but I was trying to be pure R, since I am a Linux bigot surrounded by Windows weenies <grin/>.
===

Hook 'em horns!

-- There is nothing more pleasant than traveling and meeting new people! Genghis Khan

Maranatha!
John McKown
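A compact, runnable sketch of the lapply()/do.call(rbind, ...) pattern discussed above. File and column names follow David Carlson's example, but the files are written to a temporary directory (an assumption made here so the sketch runs anywhere without touching the home directory):

```r
tmp <- tempdir()
fls <- file.path(tmp, c("A.tab", "B.tab", "C.tab"))

# Create three small tab-delimited files to read back in
for (f in fls) {
  lbl <- sub("\\.tab$", "", basename(f))
  write.table(data.frame(rep(lbl, 5), Sys.time(), Sys.time()),
              f, sep = "\t", row.names = FALSE, col.names = FALSE)
}

# Read each file into a data.frame, then stack all of them row-wise
df.list <- lapply(fls, read.delim, header = FALSE,
                  col.names = c("lpar", "started", "ended"), as.is = TRUE)
df.all  <- do.call(rbind, df.list)

str(df.all)        # 5 rows per file, 15 rows total
nrow(df.all) == 15 # TRUE
```

The key idiom is that lapply() returns a list of data frames and do.call(rbind, ...) passes the whole list to rbind() as separate arguments, avoiding an explicit loop that grows the result one file at a time.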
[R] an incredibly trivial question about nls
Hello R People: I'm having a forest/trees location problem with the output of nls. If I save the output to an object and print the object, it shows, amongst other things, the residual sum of squares. I would like to extract that value. However, when I look at names() or str() of the object, I can't find the residual sum of squares. Any help would be much appreciated.

thanks, Erin

-- Erin Hodgess, Associate Professor, Department of Mathematics and Statistics, University of Houston - Downtown, erinm.hodg...@gmail.com
Re: [R] Fwd: combining data from multiple read.delim() invocations.
On Tue, Jul 1, 2014 at 12:03 PM, John McKown john.archie.mck...@gmail.com wrote:

On Tue, Jul 1, 2014 at 11:31 AM, David L Carlson dcarl...@tamu.edu wrote: [code and example quoted in full earlier in this thread; snipped]

I do like that better than my version, mainly because it is fewer statements. I'm rather new to R, and the *apply series of functions is bleeding edge for me. I hadn't seen do.call before either. I'm still reading, but the way I learn best is to try projects as I am learning, so I get ahead of myself.

If you have not already done so, please read An Introduction to R or an online tutorial of your choice before posting further. I do not consider it proper to post queries concerning basics that you can easily learn about yourself. I DO consider it proper to post queries about such topics if you have made the effort but are still confused. That is what this list is for. You can decide -- and chastise me if you like -- into which category you fit.
Cheers, Bert
Re: [R] an incredibly trivial question about nls
1. Why? What do you think it tells you? (The number of parameters in a NONlinear model is probably not what you think it is.)

2. ?deviance

3. You've been posting all this time and still didn't try stats:::print.nls?? -- which is where you would find the answer.

Cheers, Bert

Bert Gunter, Genentech Nonclinical Biostatistics, (650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom." -- Clifford Stoll

On Tue, Jul 1, 2014 at 1:27 PM, Erin Hodgess erinm.hodg...@gmail.com wrote: [original question quoted; snipped]
Re: [R] an incredibly trivial question about nls
On Tue, Jul 1, 2014 at 1:27 PM, Erin Hodgess erinm.hodg...@gmail.com wrote: [original question quoted; snipped]

I think you want to look at summary(object), which contains (see help(summary.nls)):

sigma: the square root of the estimated variance of the random error; sigma^2 = 1/(n-p) * Sum(R[i]^2), where R[i] is the i-th weighted residual.

In other words, you probably want summary(object)$sigma^2 * (n-p), perhaps the square root of it, or maybe just the sigma.

HTH, Peter
Re: [R] an incredibly trivial question about nls
In direct contrast to what Bert says, I think this is a very reasonable (and non-trivial) question. The problem results from gurus structuring the functions that they write in such a way that they are totally opaque to anyone but the ultra-cognoscenti. What is gained by not having things set up in a straightforward manner that is accessible to normal human beings is mysterious to me.

If you do look at stats:::print.nls (and you have to start with that "stats:::"; things just *have* to be hidden away so that normal human beings can't see them!) you are likely to be no more enlightened than you were previously, until you engage in a good long struggle. It turns out that, in order to print the residual sum of squares, print.nls() calls the function x$m$deviance (where x is the object returned by nls()). This function simply returns the object "dev" which is stored in its environment. Could one get more convoluted and obscure if one tried?

So, to get the residual sum of squares you could do:

rss <- x$m$deviance()

or

rss <- get("dev", envir=environment(x$m$deviance))

The actual residuals are hidden away as "resid" in the environment of the function x$m$resid, so you could also get the residual sum of squares via:

rss <- sum(get("resid", envir=environment(x$m$resid))^2)

or

rss <- sum(x$m$resid()^2)

or

rss <- sum(resid(x)^2)

the last of which applies the (hidden) nls method for the residuals() function. Happily, they all seem to give the same answer. :-)

On 02/07/14 08:40, Bert Gunter wrote:

1. Why? What do you think it tells you?

That's *her* business.

(The number of parameters in a NONlinear model is probably not what you think it is).

2. ?deviance

Not at all useful.

3. You've been posting all this time and still didn't try stats:::print.nls?? -- which is where you would find the answer.

Chastising people for failing to see the invisible is not helpful. And even when they manage to see the invisible, the result is still very obscure.
cheers, Rolf
Re: [R] an incredibly trivial question about nls
Thank you to all. I had actually found the summary and trotted that out; I just had not gotten back to the list. Thanks again! Sincerely, Erin

On Tue, Jul 1, 2014 at 5:46 PM, Bert Gunter gunter.ber...@gene.com wrote:

Beauty -- or obscurity -- is in the eyes of the beholder. But I leave your objections to stand without public response. If I can't stand the heat ... etc. However, I will say that my comment about the value of looking at the RSS was meant to be helpful, because in my own consulting I have seen many who believe that it is something that it is not. Deviance is the more useful statistical measure of model fit.

Cheers, Bert

Bert Gunter, Genentech Nonclinical Biostatistics, (650) 467-7374

On Tue, Jul 1, 2014 at 2:29 PM, Rolf Turner r.tur...@auckland.ac.nz wrote: [earlier message quoted in full; snipped]
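Pulling the thread's suggestions together, here is a small runnable illustration. The exponential-decay model and the simulated data are invented purely for demonstration; the extraction idioms are the ones discussed above:

```r
# Toy data for a nonlinear fit
set.seed(1)
x <- 1:20
y <- 5 * exp(-0.3 * x) + rnorm(20, sd = 0.05)
fit <- nls(y ~ a * exp(b * x), start = list(a = 4, b = -0.2))

# Equivalent ways to get the residual sum of squares
rss1 <- deviance(fit)       # for nls with default weights, deviance is the RSS
rss2 <- sum(resid(fit)^2)   # via the residuals() method
rss3 <- fit$m$deviance()    # reaching into the fitted model's environment

# Via summary(), as Peter suggests: sigma^2 * (n - p)
s <- summary(fit)
n <- length(y)
p <- length(coef(fit))
rss4 <- s$sigma^2 * (n - p)

all.equal(rss1, rss2)  # TRUE, up to numerical tolerance
all.equal(rss1, rss4)  # TRUE as well
```

In practice, deviance(fit) is the simplest documented route, and it sidesteps the environment spelunking that Rolf describes.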
Re: [R] x axis labelling
On Tue, 1 Jul 2014 06:41:52 PM, Michael Millar wrote:

Hi, I am new to R and am trying to create a graph with Time (24hr) along the x axis. Rather than start at 01.00, I wanted to start at 14.00. I tried to use the axis(side=1, at=c( )) function but it continues to put them in numeric order. Is there another way I can add labels to the x axis?

Hi Michael, Perhaps this will get you out of trouble.

mmdat <- data.frame(time=paste(c(14:23, 0:13), "00", sep=":"),
                    wind_speed=sample(0:30, 24))
plot(mmdat$wind_speed, type="b", xaxt="n", xlab="Time")
axis(1, at=1:24, labels=mmdat$time)

If you want to get more tick labels on the time axis, look at staxlab (plotrix).

Jim
[R] Using RCMD INSTALL under Spanish version of windows.
Using RCMD INSTALL with R version 3.1.0 under a Spanish Windows 7 gives the following error message:

rcmd INSTALL MyPackages
Mensajes de aviso perdidos
In normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="c:/ARCHIV~1/R/R-31~1.0/library": Acceso denegado
Mensajes de aviso perdidos
  package "methods" in options("defaultPackages") was not found
Durante la inicialización - Mensajes de aviso perdidos
1: package 'datasets' in options("defaultPackages") was not found
2: package 'utils' in options("defaultPackages") was not found
3: package 'grDevices' in options("defaultPackages") was not found
4: package 'graphics' in options("defaultPackages") was not found
5: package 'stats' in options("defaultPackages") was not found
6: package 'methods' in options("defaultPackages") was not found
Error en normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="c:/ARCHIV~1/R/R-31~1.0/library/tools": Acceso denegado
Ejecución interrumpida

(The Spanish messages are "Lost warning messages", "During initialization", "Access denied", and "Execution interrupted".)

The user running this command has all permissions to modify the directory C:\Program Files\R\R-3.1.0\library, as is also clear from the fact that installing a package with install.packages() works. Any suggestions are welcome. Jan.

-- Jan Graffelman, Dpt. of Statistics and Operations Research, Universitat Politècnica de Catalunya, Av. Diagonal 647, 6th floor, 08028 Barcelona, Spain; email: jan.graffel...@upc.edu; web: http://www-eio.upc.es/~jan; tel: +34-93-4011739; fax: +34-93-4016575
[R] Data visualization: overlay columns of train/test/validation datasets
Hello, Given two different datasets (having the same number and type of columns, but different observations, as commonly encountered in data-mining as train/test/validation datasets), is it possible to overlay plots (histograms) and compare the different attributes from the separate datasets, in order to check how similar the different datasets are? Is there a package available for such plotting together of similar columns from different datasets? Thanks, SJ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Data visualization: overlay columns of train/test/validation datasets
On Jul 1, 2014, at 3:46 PM, Supriya Jain wrote:

Hello, Given two different datasets (having the same number and type of columns, but different observations, as commonly encountered in data mining as train/test/validation datasets), is it possible to overlay plots (histograms) and compare the different attributes from the separate datasets, in order to check how similar the different datasets are? Is there a package available for such plotting together of similar columns from different datasets?

Possible. Assuming you just want frequency histograms (or ones using counts, for that matter), it can be done in any of the three major plotting paradigms supported in R. No extra packages are needed if using just base graphics.

Thanks, SJ

[[alternative HTML version deleted]]

Oh, you must have missed the parts of the Posting Guide where plain text was requested. See below.

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html

And you missed that section, as well.

and provide commented, minimal, self-contained, reproducible code.

-- David Winsemius, Alameda, CA, USA
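As an illustration of the point that base graphics suffice, here is one common way to overlay two histograms with shared break points and semi-transparent colours. The two simulated vectors are stand-ins for columns drawn from train and test datasets:

```r
set.seed(7)
train <- rnorm(500, mean = 0)    # stand-in for a column of the training set
test  <- rnorm(200, mean = 0.3)  # same column from the test set

# A common set of breaks so the two histograms are directly comparable
brks <- seq(min(train, test), max(train, test), length.out = 30)

hist(train, breaks = brks, freq = FALSE,
     col = rgb(0, 0, 1, 0.4), main = "Train vs. test", xlab = "value")
hist(test, breaks = brks, freq = FALSE,
     col = rgb(1, 0, 0, 0.4), add = TRUE)
legend("topright", legend = c("train", "test"),
       fill = c(rgb(0, 0, 1, 0.4), rgb(1, 0, 0, 0.4)))
```

Plotting densities (freq = FALSE) rather than counts keeps the comparison fair when the two datasets have different numbers of rows.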
Re: [R] plot in generalized additive model (GAM)
On Jul 1, 2014, at 12:02 AM, adc wrote:

I performed the following GAM with the MGCV package:

I think it's actually spelled in all lower case.

gam(mortality ~ (PM10) + (Tmax) + (umidity), data = data, family = quasipoisson)

How can I obtain a plot of the log relative risk of mortality vs. PM10? thanks

Shouldn't we need to know more details about the experimental setup to answer that question? And what sort of comparisons are you requesting? And about what parts of ?mgcv::plot.gam do you need further explanation in order to answer the question?

agostino [remainder snipped]

-- David Winsemius, Alameda, CA, USA
Re: [R] x axis labelling
Hi Michael,

Dates and times are always a problem, as they are irregular, not 1, 2, 3, ..., 100. If you want fancier formatting of the x axis, try this. First convert your time to a date-time class:

# Use a dummy date for the date-time, as it is easier
mmdat$time <- seq(strptime("20140702 14", "%Y%m%d %H"), by = "hours", length = 24)

# only gives a numerical sequence on the x axis
plot(mmdat$wind_speed, type="b", xlab="Time")

However:

library(lattice)
?xyplot

# by starting at 15:00 hours, get the sequence and use date formatting
xyplot(wind_speed ~ time, data = mmdat, type = "b", xlab = "Time",
       scales = list(x = list(at = seq(mmdat[2,1], by = "3 hours", length = 8),
                              labels = format(seq(mmdat[2,1], by = "3 hours", length = 8), "%H:%M"))))

Duncan

Duncan Mackay, Department of Agronomy and Soil Science, University of New England, Armidale NSW 2351; email (home): mac...@northnet.com.au

-----Original Message-----
From: Michael Millar
Sent: Wednesday, 2 July 2014 03:42
To: r-help@R-project.org
Subject: [R] x axis labelling

[original question quoted; snipped]
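For completeness, the base-graphics analogue of the lattice approach above, using axis.POSIXct() to place and format the tick labels. The wind-speed values are random stand-ins, and the starting date is the dummy date used in the thread:

```r
# Build 24 hourly observations starting at 14:00 on a dummy date
mmdat <- data.frame(
  time       = as.POSIXct(strptime("20140702 14", "%Y%m%d %H")) +
               (0:23) * 3600,
  wind_speed = sample(0:30, 24)
)

# Suppress the default axis, then label every 3 hours in HH:MM format
plot(mmdat$time, mmdat$wind_speed, type = "b", xaxt = "n",
     xlab = "Time", ylab = "Wind speed")
ticks <- seq(mmdat$time[1], by = "3 hours", length.out = 8)
axis.POSIXct(1, at = ticks, format = "%H:%M")
```

Because the x values are genuine date-times rather than the row indices 1:24, the axis starts at 14:00 and wraps past midnight naturally, which is exactly the ordering problem Michael ran into.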
Re: [R] Stringr / Regular Expressions advice
#or
res <- mapply(`%in%`, accel_data, v.to.match)
res1 <- sapply(seq_len(ncol(accel_data)), function(i)
    accel_data[i] <= tail(v.to.match[[i]], 1) & accel_data[i] >= v.to.match[[i]][1])
all.equal(res, res1, check.attributes=FALSE)
#[1] TRUE

A.K.

On Tuesday, July 1, 2014 10:56 PM, arun smartpink...@yahoo.com wrote:

Hi Vincent, You could try:

v.to.match <- list(438:445, 454:460, 459:470)
sapply(seq_len(ncol(accel_data)), function(i)
    accel_data[i] <= tail(v.to.match[[i]], 1) & accel_data[i] >= v.to.match[[i]][1])

#or use ?cut or ?findInterval

A.K.

On Tuesday, July 1, 2014 2:23 PM, VINCENT DEAN BOYCE vincentdeanbo...@gmail.com wrote:

Sarah, Yes, I modified the code that you provided and it worked quite well. Here is the revised code:

accel_data <- data

# pattern to be identified
v.to.match <- c(438, 454, 459)

# call the function below any time the v.to.match criteria change, to ensure the match is updated
v.matches <- apply(accel_data, 1, function(x) all(x == v.to.match))
which(v.matches)
[1] 405
sum(v.matches)
[1] 1

Again, here is the dataset:

dput(head(accel_data, 20))
[dput output as posted previously; snipped]

My next goal is to extend the range for each column. For instance:

v.to.match <- c(438:445, 454:460, 459:470)

Your thoughts? Many thanks, Vincent

On Fri, Jun 27, 2014 at 5:51 AM, Sarah Goslee sarah.gos...@gmail.com wrote:

Hi, It's a good idea to copy back to the list, not just to me, to keep the discussion all in one place.
On Thursday, June 26, 2014, VINCENT DEAN BOYCE vincentdeanbo...@gmail.com wrote:

Sarah, Great feedback and direction. Here is the data I am working with*:

dput(head(data_log, 20))
structure(list(x_reading = c(455L, 451L, 458L, 463L, 462L, 460L, 448L, 449L, 450L, 451L, 445L, 440L, 439L, 445L, 448L, 447L, 440L, 439L, 440L, 434L), y_reading = c(502L, 503L, 502L, 502L, 495L, 505L, 480L, 483L, 489L, 488L, 489L, 456L, 497L, 476L, 470L, 474L, 469L, 482L, 484L, 477L), z_reading = c(454L, 454L, 452L, 452L, 446L, 459L, 456L, 451L, 451L, 455L, 438L, 462L, 437L, 455L, 470L, 455L, 460L, 463L, 458L, 458L)), .Names = c("x_reading", "y_reading", "z_reading"), row.names = c(NA, 20L), class = "data.frame")

*However, I am unsure why the letter L has been appended to each number.

It denotes values stored as integers, and is nothing you need to worry about.

In any event, as you can see there are three columns of data named x_reading, y_reading and z_reading. I would like to detect patterns among them. For instance, let's say the pattern I wish to detect is 455, 502, 454 across the three columns respectively. As you can see in the data, this is found in the first row. This particular combination recurs numerous times within the dataset; what I wish to quantify is how many times the pattern 455, 502, 454 appears. Your thoughts?

Did you try the code I provided? It does what I think you're looking for.

Sarah

Many thanks, Vincent

On Thu, Jun 26, 2014 at 4:46 PM, Sarah Goslee sarah.gos...@gmail.com wrote:

Hi,

On Thu, Jun 26, 2014 at 12:17 PM, VINCENT DEAN BOYCE vincentdeanbo...@gmail.com wrote:

Hello, Using R, I've loaded a .csv file comprised of several hundred rows and 3 columns of data. The data within maps the output of a triaxial accelerometer, a sensor which measures an object's acceleration along the x, y and z axes. The data in each column oscillates, and ranges numerically from 100 to 500.

If your data are numeric, why are you using stringr?
It would be easier to provide you with an answer if we knew what your data looked like. Use

dput(head(yourdata, 20))

and paste that into your non-HTML email.

I want to create a function that parses the data and detects patterns across the three columns. For instance, I would like to detect instances when the values for the x, y and z columns equal 150, 200, 300 respectively. Additionally, when a match is detected, I would like to know how many times the pattern appears.

That's easy enough:

fakedata <- data.frame(matrix(c(
  100, 100, 200,
  150, 200, 300,
  100, 350, 100,
  400, 200, 300,
  200, 500, 200,
  150, 200, 300,
  150, 200, 300), ncol=3, byrow=TRUE))

v.to.match <- c(150, 200, 300)
v.matches <- apply(fakedata, 1, function(x) all(x == v.to.match))
# which rows match
which(v.matches)
# how many rows match
sum(v.matches)

I have been successful using str_detect to provide a Boolean; however, it seems to only work on a single vector, i.e., 400, not a range
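A runnable sketch of the range-matching step Vincent asks about, using the mapply() over columns idiom suggested in the thread. The small data frame below is a stand-in for his accelerometer log, with a few rows chosen so that some fall inside all three ranges:

```r
accel <- data.frame(x_reading = c(440, 455, 438, 463),
                    y_reading = c(456, 502, 458, 502),
                    z_reading = c(460, 454, 465, 452))

# One acceptable range per column, as a list so lengths can differ
v.to.match <- list(438:445, 454:460, 459:470)

# TRUE where each value falls inside its own column's range;
# mapply pairs column 1 with range 1, column 2 with range 2, etc.
in.range <- mapply(`%in%`, accel, v.to.match)

# Rows where all three columns are simultaneously inside their ranges
hits <- which(rowSums(in.range) == ncol(accel))
hits                  # rows 1 and 3 match here
length(hits)          # how many rows match
```

For non-integer readings, `%in%` against an integer sequence would fail, so the explicit `x >= lo & x <= hi` comparison (or findInterval()) from the thread is the safer general form.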
Re: [R-es] [Madrid R Users Group]: Next meeting on 1 July... (agenda available)...
Is it recommended to have any particular packages installed beforehand?

Here:

- Venue: Facultad de Ciencias - UNED. C/ Senda del Rey, 9. http://portal.uned.es/portal/page?_pageid=93,688166_dad=portal_schema=PORTAL
- How to get there: http://portal.uned.es/portal/page?_pageid=93,688166_dad=portal_schema=PORTAL
- Time: 6:30pm - 8:30pm, right?

On 29 June 2014 at 1:44, Carlos Ortega c...@qualityexcellence.es wrote:

Hello, the next meeting of the Madrid R Users Group will be on Tuesday, 1 July. The planned agenda is as follows:

- Presentations:
  - Carlos Ortega (http://www.qualityexcellence.es/): PISA - Likert scales (part two)
- pildoRas (lightning talks):
  - Pedro Concejero (http://www.linkedin.com/in/pedroconcejero): Slidify from RStudio
  - Pedro Concejero: Generating Word and PDF files with the new RStudio version
  - Gregorio Serrano (http://www.grserrano.es/): Processing PDF documents with R (http://www.grserrano.es/wp/2014/06/extrayendo-informacion-de-archivos-pdf/)
  - Carlos Ortega: A new R package that I have liked

More details at: http://r-es.org/GILMadrid

-- Regards, Carlos Ortega, www.qualityexcellence.es

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es