Re: [R] Sending Email with Attachment
On Mon, 10 Jun 2013, Bhupendrasinh Thakre <vickytha...@gmail.com> writes:

> Thanks Rex for the help. So it seems that I might have to use Python or
> Perl to perform the action.

On Windows, you may want to look at Blat (http://www.blat.net/). You can
easily use it from R scripts via 'system'.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of rex
Sent: Sunday, June 09, 2013 10:27 PM
To: r-help@r-project.org
Subject: Re: [R] Sending Email with Attachment

Bhupendrasinh Thakre <vickytha...@gmail.com> [2013-06-09 20:03]:

>     library(sendmailR)
>     from <- "a...@outlook.com"
>     to <- "e...@gmail.com"
>     subject <- "Run at"
>     mailControl <- list(smtpServer = "blu-m.hotmail.com")
>     attachment <- "type_1.pdf"
>     attachmentName <- "target_score.pdf"
>     attachmentObject <- mime_part(x = attachment, name = attachmentName)
>     body <- "Email Body"
>     bodywithAttachement <- list(body, attachmentObject)
>     sendmail(from = from, to = to, subject = subject,
>              msg = bodywithAttachement, control = mailControl)
>
> However it gives me the following error:
>
>     Error in socketConnection(host = server, port = port, blocking = TRUE) :
>       cannot open the connection
>     In addition: Warning message:
>     In socketConnection(host = server, port = port, blocking = TRUE) :
>       blu-m.hotmail.com:25 cannot be opened

It's an unsurprising result, since telnet doesn't connect either:

    telnet blu-m.hotmail.com 25
    Trying 65.55.121.94...
    [...]

--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
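The immediate blocker above is the SMTP server: blu-m.hotmail.com does not accept plain connections on port 25. A minimal sketch of the Blat-via-system() suggestion for Windows, assuming blat is installed and on the PATH (the addresses, server name, and file names here are placeholders, not values from the thread):

```r
# hypothetical example: body.txt, the addresses, and the server are placeholders
cmd <- paste("blat body.txt",
             "-to someone@example.com",
             "-server smtp.example.com",
             "-subject \"Run report\"",
             "-attach target_score.pdf")
system(cmd)   # hands the message off to Blat, which speaks SMTP itself
```

This sidesteps sendmailR entirely, so it only helps if some SMTP relay is actually reachable from the machine.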
[R] reshaping a data frame
Hi Guys,

I am trying to cast a data frame but not aggregate the rows for the same
variable. Here is a contrived example.

**input**

    temp_df <- data.frame(names = c('foo', 'foo', 'foo'),
                          variable = c('w', 'w', 'w'),
                          value = c(34, 65, 12))
    temp_df
      names variable value
    1   foo        w    34
    2   foo        w    65
    3   foo        w    12

**Want this**

    names   w
    foo    34
    foo    65
    foo    12

**getting this**

    cast(temp_df)
    Aggregation requires fun.aggregate: length used as default
      names w
    1   foo 3

In the real dataset the categorical column 'variable' will have many more
categories.

Thanks!
-Abhi
Re: [R] cannot load pbdMPI package after compilation
On 10/06/2013 03:17, Pascal Oettli wrote:
> Hello,
>
> I am not sure whether it helps you, but I was able to install it.
>
> OpenSUSE 12.3
> R version 3.0.1 Patched (2013-06-09 r62918)
> pbdMPI version 0.1-6
> gcc version 4.7.2
> OpenMPI version 1.6.3
>
> I didn't try with the most recent version of ompi (1.6.4).

But the system used to accept that version of pbdMPI for CRAN used it, with gcc.

The issue here is likely to be using the Intel compiler with OpenMPI. This is a programming matter, really off-topic for R-help (see the posting guide). The first port of call for help is the package maintainer; then, if that does not help, the R-devel list. But very few R users have access to an Intel compiler, let alone one as recent as that, and you will be expected to use a debugger for yourself (see 'Writing R Extensions').

> Regards,
> Pascal
>
> On 07/06/13 21:42, Antoine Migeon wrote:
>> Hello,
>>
>> I am trying to install pbdMPI. Compilation is successful, but loading fails
>> with a segfault. Can anyone help me?
>>
>> R version 3.0.0
>> pbdMPI version 0.1-6
>> Intel compiler version 13.1.1
>> OpenMPI version 1.6.4-1
>> CPU Intel x86_64
>>
>>     # R CMD INSTALL pbdMPI_0.1-6.tar.gz
>>     ..
>>     checking for gcc... icc -std=gnu99
>>     checking whether the C compiler works... yes
>>     checking for C compiler default output file name... a.out
>>     checking for suffix of executables...
>>     checking whether we are cross compiling... no
>>     checking for suffix of object files... o
>>     checking whether we are using the GNU C compiler... yes
>>     checking whether icc -std=gnu99 accepts -g... yes
>>     checking for icc -std=gnu99 option to accept ISO C89... none needed
>>     checking for mpirun... mpirun
>>     checking for mpiexec... mpiexec
>>     checking for orterun... orterun
>>     checking for sed... /bin/sed
>>     checking for mpicc... mpicc
>>     checking for ompi_info... ompi_info
>>     checking for mpich2version... F
>>     found sed, mpicc, and ompi_info ...
>>     TMP_INC_DIRS = /opt/openmpi/1.6.4-1/intel-13.1.1/include
>>     checking /opt/openmpi/1.6.4-1/intel-13.1.1/include ...
>>     found /opt/openmpi/1.6.4-1/intel-13.1.1/include/mpi.h ...
>>     TMP_LIB_DIRS = /opt/openmpi/1.6.4-1/intel-13.1.1/lib64
>>     checking /opt/openmpi/1.6.4-1/intel-13.1.1/lib64 ...
>>     found /opt/openmpi/1.6.4-1/intel-13.1.1/lib64/libmpi.so ...
>>     found mpi.h and libmpi.so ...
>>     TMP_INC = /opt/openmpi/1.6.4-1/intel-13.1.1/include
>>     TMP_LIB = /opt/openmpi/1.6.4-1/intel-13.1.1/lib64
>>     checking for openpty in -lutil... yes
>>     checking for main in -lpthread... yes
>>     *** Results of pbdMPI package configure *****
>>     TMP_INC = /opt/openmpi/1.6.4-1/intel-13.1.1/include
>>     TMP_LIB = /opt/openmpi/1.6.4-1/intel-13.1.1/lib64
>>     MPI_ROOT =
>>     MPITYPE = OPENMPI
>>     MPI_INCLUDE_PATH = /opt/openmpi/1.6.4-1/intel-13.1.1/include
>>     MPI_LIBPATH = /opt/openmpi/1.6.4-1/intel-13.1.1/lib64
>>     MPI_LIBS = -lutil -lpthread
>>     MPI_DEFS = -DMPI2
>>     MPI_INCL2 =
>>     PKG_CPPFLAGS = -I/opt/openmpi/1.6.4-1/intel-13.1.1/include -DMPI2 -DOPENMPI
>>     PKG_LIBS = -L/opt/openmpi/1.6.4-1/intel-13.1.1/lib64 -lmpi -lutil -lpthread
>>     *********************************************
>>     ..
>>     icc -std=gnu99 -I/usr/local/R/3.0.0/intel13/lib64/R/include -DNDEBUG
>>       -I/opt/openmpi/1.6.4-1/intel-13.1.1/include -DMPI2 -DOPENMPI
>>       -O3 -fp-model precise -pc 64 -axAVX -fpic -O3 -fp-model precise -pc 64 -axAVX
>>       -c comm_errors.c -o comm_errors.o
>>     icc -std=gnu99 -I/usr/local/R/3.0.0/intel13/lib64/R/include -DNDEBUG
>>       -I/opt/openmpi/1.6.4-1/intel-13.1.1/include -DMPI2 -DOPENMPI
>>       -O3 -fp-model precise -pc 64 -axAVX -fpic -O3 -fp-model precise -pc 64 -axAVX
>>       -c comm_sort_double.c -o comm_sort_double.o
>>     ..
>>     ** testing if installed package can be loaded
>>     sh: line 1:  2905 Segmentation fault
>>       '/usr/local/R/3.0.0/intel13/lib64/R/bin/R' --no-save --slave 2>&1 < /tmp/RtmpGkncGK/file1e541c57190
>>     ERROR: loading failed
>>
>>     *** caught segfault ***
>>     address (nil), cause 'unknown'
>>
>>     Traceback:
>>      1: .Call("spmd_initialize", PACKAGE = "pbdMPI")
>>      2: fun(libname, pkgname)
>>      3: doTryCatch(return(expr), name, parentenv, handler)
>>      4: tryCatchOne(expr, names, parentenv, handlers[[1L]])
>>      5: tryCatchList(expr, classes, parentenv, handlers)
>>      6: tryCatch(fun(libname, pkgname), error = identity)
>>      7: runHook(".onLoad", env, package.lib, package)
>>      8: loadNamespace(package, c(which.lib.loc, lib.loc))
>>      9: doTryCatch(return(expr), name, parentenv, handler)
>>     10: tryCatchOne(expr, names, parentenv, handlers[[1L]])
>>     11: tryCatchList(expr, classes, parentenv, handlers)
>>     12: tryCatch(expr, error = function(e) {    call <- conditionCall(e)
>>           if (!is.null(call)) {        if (identical(call[[1L]], quote(doTryCatch)))
>>           call <- sys.call(-4L)        dcall <- deparse(call)[1L]
>>           prefix <- paste("Error in", dcall, ": ")        LONG <- 75L
>>           msg <- conditionMessage(e)        sm <- strsplit(msg, "\n")[[1L]]
>>           w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w")
>>           if (is.na(w))  w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L], type = "b")
>>           if (w > LONG)  prefix <- paste0(prefix, "\n  ")    }
>>           else prefix <- "Error : "    msg <- paste0(prefix,
[R] problems with setClass or/and setMethod
Hello,

I am working my way through "A (not so) Short Introduction to S4".

I created a class

    setClass(Class = "Trajectories",
             representation = representation(times = "numeric", traj = "matrix"))

and tried to build a method using

    setMethod(
      f = "plot",
      signature = "Trajectories",
      definition = function(X, y, ...) {
        matplot(x@times, t(x@traj), xaxt = "n", type = "l",
                ylab = "", xlab = "", pch = 1)
        axis(1, at = x@times)
      }
    )

R responds with an error message:

    Creating a generic function for 'plot' from package 'graphics' in the global environment
    Error in conformMethod(signature, mnames, fnames, f, fdef, definition) :
      in method for 'plot' with signature "x=Trajectories": formal arguments
      (x = "Trajectories", y = "Trajectories", ... = "Trajectories") omitted
      in the method definition cannot be in the signature

Did anything change in the transition to R-3.0? Is there any other, more
recent introduction to S4 classes that you would recommend?

Thank you for your help.
Andreas
Re: [R] Identifying breakpoints/inflection points?
You can try this:

    library(inflection)  # you have to install package inflection first
    a <- findiplist(cbind(year), cbind(piproute), 1)
    a

The answer:

         [,1] [,2]   [,3]
    [1,]    5   35 1986.0
    [2,]    5   30 1983.5

shows that the total inflection point is between 1983 and 1986, if we treat
the data as first concave and then convex, as can be seen from a simple graph.

--
View this message in context: http://r.789695.n4.nabble.com/Identifying-breakpoints-inflection-points-tp2065886p4669117.html
Sent from the R help mailing list archive at Nabble.com.
[R] Create a package with package.skeleton
Hi,

I am trying to build a package with the package.skeleton function. I already
have the function quadprod2.R in the current folder. After running

    library(frontier)
    source("quadprod2.R")
    package.skeleton(name = "sfa_ext")

I get

    Creating directories ...
    Creating DESCRIPTION ...
    Creating NAMESPACE ...
    Creating Read-and-delete-me ...
    Saving functions and data ...
    Making help files ...
    Done.
    Further steps are described in './sfa_ext/Read-and-delete-me'.

Opening the Read-and-delete-me file in Notepad, I find:

    * Edit the help file skeletons in 'man', possibly combining help files
      for multiple functions.
    * Edit the exports in 'NAMESPACE', and add necessary imports.
    * Put any C/C++/Fortran code in 'src'.
    * If you have compiled code, add a useDynLib() directive to 'NAMESPACE'.
    * Run R CMD build to build the package tarball.
    * Run R CMD check to check the package tarball.
    Read "Writing R Extensions" for more information.

Then it seems that I need to edit some documentation. Since I build the
package primarily for myself, I want to spend as little time as possible
editing the documentation. I have done almost nothing in man and NAMESPACE.
(Is that ok?) What should I do next? How can I run the build and check
commands?

Thanks
Miao
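For the last two steps, build and check are run from the operating system's command line (not the R console), in the directory that contains the sfa_ext folder. A sketch, assuming the Version field in your DESCRIPTION is 1.0 (the tarball name follows whatever version you actually set):

```shell
R CMD build sfa_ext               # creates sfa_ext_1.0.tar.gz
R CMD check sfa_ext_1.0.tar.gz    # runs the package checks
R CMD INSTALL sfa_ext_1.0.tar.gz  # install, so library(sfa_ext) works
```

For a purely personal package, R CMD check will warn about empty documentation, but INSTALL will still work; on Windows you would run these in a command prompt with R's bin directory on the PATH.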
Re: [R] problems with setClass or/and setMethod
On Jun 9, 2013, at 11:37 PM, andreas betz wrote:

> Hello,
>
> I am working my way through "A (not so) Short Introduction to S4".
>
> I created a class
>
>     setClass(Class = "Trajectories",
>              representation = representation(times = "numeric", traj = "matrix"))
>
> and tried to build a method using
>
>     setMethod(
>       f = "plot",
>       signature = "Trajectories",
>       definition = function(X, y, ...) {
>         matplot(x@times, t(x@traj), xaxt = "n", type = "l",
>                 ylab = "", xlab = "", pch = 1)
>         axis(1, at = x@times)
>       }
>     )
>
> R responds with an error message:
>
>     Creating a generic function for 'plot' from package 'graphics' in the global environment
>     Error in conformMethod(signature, mnames, fnames, f, fdef, definition) :
>       in method for 'plot' with signature "x=Trajectories": formal arguments
>       (x = "Trajectories", y = "Trajectories", ... = "Trajectories") omitted
>       in the method definition cannot be in the signature
>
> Did anything change in the transition to R-3.0?

I doubt it worked in earlier versions. There is a misprint of X where there
should be an x. I'm unable to explain why the y is alongside the x in the
argument list, since the 'definition' function does nothing with it.

> Is there any other, more recent introduction to S4 classes that you would
> recommend?
>
> Thank you for your help.
> Andreas

David Winsemius
Alameda, CA, USA
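Putting David's observation to work, a corrected version (a sketch that keeps the original class definition and simply makes the method's formal arguments match the plot generic's (x, y, ...)) would be:

```r
setClass(Class = "Trajectories",
         representation = representation(times = "numeric", traj = "matrix"))

setMethod(
  f = "plot",
  signature = "Trajectories",
  definition = function(x, y, ...) {
    # the first argument must be named 'x' (lowercase) to match
    # the formal arguments of the plot generic
    matplot(x@times, t(x@traj), xaxt = "n", type = "l",
            ylab = "", xlab = "", pch = 1)
    axis(1, at = x@times)
  }
)
```

With the capital X, conformMethod() sees that the generic's formals x, y, and ... are all missing from the method definition, which is exactly what the error message complains about.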
Re: [R] Error Object Not Found
Hello,

Please quote context.

The message you get means that package foreign is installed on your
computer; you still need to load it in the R session:

    library(foreign)

Hope this helps,

Rui Barradas

On 09-06-2013 23:07, Court wrote:

> Hi, I think that they are loaded. Here is the response that I get:
>
>     package 'foreign' successfully unpacked and MD5 sums checked
Re: [R] How to expand.grid with string elements (the half!)
Your question makes no sense at all. The grid expansion has 9 rows. In case
you hadn't noticed, 9 is an odd number (i.e. not divisible by 2). There are
no halves.

Do not expect the list to read your mind. Instead, ask a meaningful question.

cheers,

Rolf Turner

On 10/06/13 17:25, Gundala Viswanath wrote:

> I have the following result of expand.grid:
>
>     d <- expand.grid(c("x", "y", "z"), c("x", "y", "z"))
>
> What I want is to create a combination of strings, but only half of all
> the combinations:
>
>       Var1 Var2
>     1    x    x
>     2    y    x
>     3    y    y
>     4    z    y
>     5    x    z
>     6    z    z
>
> What's the way to do it?
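If "the half" means keeping one representative of each unordered pair (which is what the 6-row wish list looks like), a sketch is to filter the 9-row grid so each pair appears once:

```r
v <- c("x", "y", "z")
d <- expand.grid(Var1 = v, Var2 = v)

# keep rows where the first factor level does not exceed the second:
# one row per unordered pair, 6 of the original 9 rows
d[as.integer(d$Var1) <= as.integer(d$Var2), ]
```

The same idea works for any number of levels; with n levels it keeps n * (n + 1) / 2 rows out of n^2.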
Re: [R] Not sure this is something R could do but it feels like it should be.
On 06/09/2013 11:14 PM, Calum Polwart wrote:

> ...
> What we are trying to do is determine the most appropriate number to make
> the capsules. (Our dosing is more complex, but let's stick to something
> simple.) I can safely assure you that virtually no-one actually needs 250
> or 500 mg as a dose of amoxicillin... that's just a dose to get them into a
> therapeutic window, and I'm 99% certain 250 and 500 are used because they
> are round numbers. If 337.5 more reliably got everyone into the window
> without kicking anyone out of it, that'd be a better dose to use!
>
> So... what I'm looking to do is model the 'theoretical dose required'
> (which we know) and the dose delivered, using several starting points, to
> get the 'best fit'. We know they need to be within 7% of each other, but if
> one starting point can get 85% of doses within 5%, we think that might be
> better than one that only gets 50% within 5%.

Okay, I think I see what you are attempting now. You are stuck with fairly
large dosage increments (say powers of two) and you want to have a base
value that will be appropriate for the greatest number of patients. So, your
range of doses can be generated with:

    d * 2 ^ (0:m)

where d is some constant and m+1 is the number of doses you want to
generate. For your amoxicillin, d = 250 and m = 1, so you get 250 and 500 mg.

Given this relationship (or any other one you can define), you want to set
your base dose so that it is close to the mode of the patient distribution.
This means that the greatest number of patients will be suitably dosed with
your base dose. I would probably try to solve this by brute force, setting
the base dose at the mode and then moving it up and down until the dose was
appropriate for the largest number of patients. However, there are a lot of
people on this list who would be more familiar with this sort of problem,
and there may be a more elegant solution.

Jim
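Jim's brute-force idea might be sketched like this (entirely illustrative: the required doses are simulated from an arbitrary lognormal, while the d * 2^(0:m) band structure and the 7% tolerance come from the thread):

```r
set.seed(1)
# simulated 'theoretical doses required' -- placeholder for the real patient data
required <- rlnorm(1000, meanlog = log(400), sdlog = 0.3)

# fraction of patients whose nearest available dose d * 2^k
# falls within tol of the dose they actually need
coverage <- function(d, m = 3, tol = 0.07) {
  bands <- d * 2 ^ (0:m)
  nearest <- vapply(required,
                    function(r) bands[which.min(abs(bands - r))],
                    numeric(1))
  mean(abs(nearest - required) / required <= tol)
}

# brute-force search over candidate base doses
candidates <- seq(100, 400, by = 2.5)
best <- candidates[which.max(vapply(candidates, coverage, numeric(1)))]
best
```

Swapping tol = 0.05 in gives the "85% within 5%" comparison from the question, and coverage() could just as easily return the whole vector of relative errors if a finer criterion is wanted.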
[R] modify and append new rows to a data.frame using ddply
Hi,

I have a data.frame that contains a variable act which records the duration
(in seconds) of two states (wet/dry) for several individuals (identified by
Ring) over a period of time. Since I want to work with daytime (i.e. from
sunrise till sunset) and night time (i.e. from sunset till the next
sunrise), I have to split act from time[i] till sunset and from sunset until
time[i+1], and from time[k] till sunrise and from sunrise until time[k+1].
Here is an example with time and act separated by a comma:

    [i]   01-01-2000 20:55:00 , 360
    [i+1] 01-01-2000 21:01:00 , 30   # let's say that sunset is at 01-01-2000 21:00:00
    [i+2] 01-01-2000 21:01:30 , 30
    ...

My goal is to get:

    [i]   01-01-2000 20:55:00 , 300  # act is modified
    [i+1] 01-01-2000 21:00:00 , 60   # new row with time = sunset
    [i+2] 01-01-2000 21:01:00 , 30   # previously row i+1
    [i+3] 01-01-2000 21:01:30 , 30   # previously row i+2
    ...

I attach a dput with a selection of my data.frame. Here is a piece of
existing code that I am trying to adapt just for the daytime/night time
change:

    require(plyr)
    xandaynight <- ddply(xan, .(Ring), function(df1) {
      # index of day/night changes
      ind <- c(FALSE, diff(df$dif) == 1)
      add <- df1[ind, ]
      add$timepos <- add$dusk
      # rearrangement
      df1 <- rbind(df1, add)
      df1 <- df1[order(df1$timepos), ]
      # recalculation of act
      df1$act2 <- c(diff(as.numeric(df1$timepos)), NA)
      df1
    })

This code produces an error message:

    Error in diff(df$dif): error in evaluating the argument 'x' in selecting
    a method for function 'diff': Error in df$dif: object of type 'closure'
    is not a subset

Thank you for your help,
Santi
[R] Estimation of covariance matrices and mixing parameter by a bivariate normal-lognormal model
Dear all,

I have to create a model which is a mixture of a normal and log-normal
distribution. To create it, I need to estimate the 2 covariance matrices and
the mixing parameter (7 parameters in total) by maximizing the
log-likelihood function. This maximization has to be performed by the nlm
routine. As I use relative data, the means are known and equal to 1.

I've already tried to do it in 1 dimension (with 1 set of relative data) and
it works well. However, when I introduce the 2nd set of relative data I get
illogical results for the correlation and a lot of warning messages.

To estimate the parameters I defined first the log-likelihood function with
the 2 commands dmvnorm and dlnorm.rplus. Then I assign starting values of
the parameters and finally I use the nlm routine to estimate the parameters
(see script below).

    # Importing and reading the grid files. Output are 2048x2048 matrices
    P <- read.ascii.grid("d:/Documents/JOINT_FREQUENCY/grid_E727_P-3000.asc",
                         return.header = FALSE)
    V <- read.ascii.grid("d:/Documents/JOINT_FREQUENCY/grid_E727_V-3000.asc",
                         return.header = FALSE)
    p <- c(P)             # transform matrix into a vector
    v <- c(V)
    p <- p[!is.na(p)]     # removing NA values
    v <- v[!is.na(v)]
    p_rel <- p / mean(p)  # transforming the data to relative values
    v_rel <- v / mean(v)
    PV <- cbind(p_rel, v_rel)  # create a matrix of vectors

    L <- function(par, p_rel, v_rel) {
      return(-sum(log(
        (1 - par[7]) * dmvnorm(PV, mean = c(1, 1),
            sigma = matrix(c(par[1], par[1]*par[2]*par[3],
                             par[1]*par[2]*par[3], par[2]),
                           nrow = 2, ncol = 2)) +
        par[7] * dlnorm.rplus(PV, meanlog = c(1, 1),
            varlog = matrix(c(par[4], par[4]*par[5]*par[6],
                              par[4]*par[5]*par[6], par[5]),
                            nrow = 2, ncol = 2))
      )))
    }

    par.start <- c(0.74, 0.66, 0.40, 1.4, 1.2, 0.4, 0.5)

    # log-likelihood estimation
    result <- nlm(L, par.start, v_rel = v_rel, p_rel = p_rel, hessian = TRUE,
                  iterlim = 200, check.analyticals = TRUE)

    There were 50 or more warnings (use warnings() to see the first 50)
    1: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
       NaNs produced
    2: In sqrt(2 * pi * det(varlog)) : NaNs produced
    3: In nlm(L, par.start, v_rel = v_rel, p_rel = p_rel, hessian = TRUE, ... :
       NA/Inf replaced by maximum positive value
    4: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
       NaNs produced
    5: In sqrt(2 * pi * det(varlog)) : NaNs produced
    6: In nlm(L, par.start, v_rel = v_rel, p_rel = p_rel, hessian = TRUE, ... :
       NA/Inf replaced by maximum positive value

    par.hat <- result$estimate
    cat("sigN_p =", par[1], "\n", "sigN_v =", par[2], "\n", "rhoN =", par[3], "\n",
        "sigLN_p =", par[4], "\n", "sigLN_v =", par[5], "\n", "rhoLN =", par[6], "\n",
        "mixing parameter =", par[7], "\n")

    sigN_p = 0.2919377
    sigN_v = 0.4445056
    rhoN = 1.737904
    sigLN_p = 2.911735
    sigLN_v = 2.539405
    rhoLN = 0.3580525
    mixing parameter = 0.8112917

Does someone know what is wrong in my model, or how I should find these
parameters in 2 dimensions?

Thank you very much for taking the time to look at my questions.

Regards,
Gladys Hertzog
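Two things stand out in the likelihood above: nlm() searches an unconstrained space, so nothing stops it from visiting negative variances or a correlation outside (-1, 1) (hence the NaN warnings and the illogical rhoN = 1.74); and if par[1] and par[2] are variances, the off-diagonal covariance term should be rho * sqrt(var1 * var2), not var1 * var2 * rho. A sketch of a reparameterized version (not the poster's code; dmvnorm is assumed to come from mvtnorm and dlnorm.rplus from compositions):

```r
library(mvtnorm)       # dmvnorm
library(compositions)  # dlnorm.rplus (assumed source of this function)

# theta is unconstrained; the transforms keep every parameter in its valid range:
# exp() for variances, tanh() for correlations, plogis() for the mixing weight
negll <- function(theta, PV) {
  v1 <- exp(theta[1]); v2 <- exp(theta[2]); rhoN <- tanh(theta[3])
  w1 <- exp(theta[4]); w2 <- exp(theta[5]); rhoL <- tanh(theta[6])
  mix <- plogis(theta[7])
  SigN <- matrix(c(v1, rhoN * sqrt(v1 * v2),
                   rhoN * sqrt(v1 * v2), v2), 2, 2)
  SigL <- matrix(c(w1, rhoL * sqrt(w1 * w2),
                   rhoL * sqrt(w1 * w2), w2), 2, 2)
  -sum(log((1 - mix) * dmvnorm(PV, mean = c(1, 1), sigma = SigN) +
               mix   * dlnorm.rplus(PV, meanlog = c(1, 1), varlog = SigL)))
}

# result <- nlm(negll, theta.start, PV = PV, hessian = TRUE, iterlim = 200)
# back-transform the estimates afterwards, e.g. tanh(result$estimate[3]) for rhoN
```

With this setup every point nlm visits yields valid covariance matrices, so the NaN warnings disappear and the reported correlations are guaranteed to lie in (-1, 1).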
Re: [R] modify and append new rows to a data.frame using ddply
On 10-06-2013, at 11:49, Santiago Guallar <sgual...@yahoo.com> wrote:

> Hi,
>
> I have a data.frame that contains a variable act which records the duration
> (in seconds) of two states (wet/dry) for several individuals (identified by
> Ring) over a period of time. [...]
>
> Here is a piece of existing code that I am trying to adapt just for the
> daytime/night time change:
>
>     require(plyr)
>     xandaynight <- ddply(xan, .(Ring), function(df1) {
>       # index of day/night changes
>       ind <- c(FALSE, diff(df$dif) == 1)
>       ...
>     })
>
> This code produces an error message:
>
>     Error in diff(df$dif): error in evaluating the argument 'x' in selecting
>     a method for function 'diff': Error in df$dif: object of type 'closure'
>     is not a subset

Shouldn't the line

    ind <- c(FALSE, diff(df$dif) == 1)

read

    ind <- c(FALSE, diff(df1$dif) == 1)

(Inside the function the data frame is called df1; a bare df picks up the
F-distribution density function from stats, which is a closure, hence the
error message.)

Berend
Re: [R] cannot load pbdMPI package after compilation
Thank you, I will try to contact the developer.

Antoine Migeon
Université de Bourgogne
Centre de Calcul et Messagerie
Direction des Systèmes d'Information
tel : 03 80 39 52 70
Site du CCUB : http://www.u-bourgogne.fr/dsi-ccub

On 10/06/2013 08:19, Prof Brian Ripley wrote:
> On 10/06/2013 03:17, Pascal Oettli wrote:
>> Hello,
>>
>> I am not sure whether it helps you, but I was able to install it.
>> [...]
>
> The issue here is likely to be using the Intel compiler with OpenMPI. This
> is a programming matter, really off-topic for R-help (see the posting
> guide). The first port of call for help is the package maintainer; then, if
> that does not help, the R-devel list. But very few R users have access to
> an Intel compiler, let alone one as recent as that, and you will be
> expected to use a debugger for yourself (see 'Writing R Extensions').
>
> [...]
Re: [R] agnes() in package cluster on R 2.14.1 and R 3.0.1
>>>>> Hugo Varet <vareth...@gmail.com>
>>>>>     on Sun, 9 Jun 2013 11:43:32 +0200 writes:

  > Dear R users,
  >
  > I discovered something strange using the function agnes() of the cluster
  > package on R 3.0.1 and on R 2.14.1. Indeed, the clusterings obtained are
  > different whereas I ran exactly the same code.

hard to believe... but ..

  > I quickly looked at the source code of the function and I discovered
  > that there was an important change: agnes() in R 2.14.1 used FORTRAN
  > code whereas agnes() in R 3.0.1 uses C code.

well, it has done so for quite a bit longer, e.g., also in R 2.15.0

  > Here is one of the contingency tables between R 2.14.1 and R 3.0.1:
  >
  >                            classe.agnTani.3.0.1
  >     classe.agnTani.2.14.1    1    2    3
  >                         1   74    0  229
  >                         2    0  235    0
  >                         3  120    0   15

  > So, I was wondering if it was normal that the C and FORTRAN codes give
  > different results?

It's not normal, and I'm pretty sure I have had many many examples which
gave identical results.

Can you provide a reproducible example, please? If the example is too large
[for dput()], please send me the *.rda file produced from

    save(your_data, file = "the file I need")

*and* the exact call to agnes() for your data.

Thank you in advance!

Martin Maechler, the one you could have e-mailed directly, using
maintainer("cluster") ...

  > Best regards,
  > Hugo Varet

  > [[alternative HTML version deleted]]

      ^^ try to avoid, please ^^

  > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  > and provide commented, minimal, self-contained, reproducible code.

yes indeed, please.
Re: [R] reshaping a data frame
Abhi,

In the example you give, you don't really need to reshape the data ... just
rename the column value to w. Here's a different example with more than one
category ...

    tempdf <- expand.grid(names = c("foo", "bar"), variable = letters[1:3])
    tempdf$value <- rnorm(dim(tempdf)[1])
    tempdf
    library(reshape)
    cast(tempdf)

But that may not be what you want. If not, please give an example with more
than one category showing us what you have and what you want.

Jean

On Mon, Jun 10, 2013 at 1:15 AM, Abhishek Pratap <abhishek@gmail.com> wrote:

> Hi Guys
>
> I am trying to cast a data frame but not aggregate the rows for the same
> variable. [...]
Re: [R] reshaping a data frame
Unless I completely misunderstand what you are doing, you don't need to aggregate; just drop the one column and rename things:

newtemp <- temp_df[, c(1, 3)]
names(newtemp) <- c("names", "w")
newtemp

John Kane Kingston ON Canada

-Original Message- From: abhishek@gmail.com Sent: Sun, 9 Jun 2013 23:15:48 -0700 To: r-help@r-project.org Subject: [R] reshaping a data frame

> Hi Guys, I am trying to cast a data frame but not aggregate the rows for the same variable. Here is a contrived example.
>
> **input**
> temp_df <- data.frame(names=c('foo','foo','foo'), variable=c('w','w','w'), value=c(34,65,12))
> temp_df
>   names variable value
> 1   foo        w    34
> 2   foo        w    65
> 3   foo        w    12
>
> **Want this**
>   names  w
>   foo   34
>   foo   65
>   foo   12
>
> **getting this**
> cast(temp_df)
> Aggregation requires fun.aggregate: length used as default
>   names w
> 1   foo 3
>
> In the real dataset the categorical column 'variable' will have many more categories. Thanks! -Abhi
Re: [R] All against all correlation matrix with GGPLOT Facet
No image. The R-help list tends to strip out a lot of files; a pdf or txt usually gets through. In any case, if I understand what you want, this may do it:

library(ggplot2)
dat1 <- data.frame(
  v = rnorm(13),
  w = rnorm(13),
  x = rnorm(13),
  y = rnorm(13),
  z = rnorm(13))
plotmatrix(dat1)

John Kane Kingston ON Canada

-Original Message- From: gunda...@gmail.com Sent: Mon, 10 Jun 2013 12:26:44 +0900 To: r-h...@stat.math.ethz.ch Subject: [R] All against all correlation matrix with GGPLOT Facet

> I have the following data:
> v <- rnorm(13)
> w <- rnorm(13)
> x <- rnorm(13)
> y <- rnorm(13)
> z <- rnorm(13)
>
> Using a GGPLOT facet, what I want to do is to create a 5*5 matrix, where each cell plots the correlation between each pair of the above data, e.g. v-v, v-w, v-x, ..., z-z. What's the way to do it? Attached is the image.
> G.V.
Re: [R] reshaping a data frame
Hi,

If your dataset is similar to the one below:

set.seed(24)
temp1_df <- data.frame(names=rep(c('foo','foo1'), each=6),
                       variable=rep(c('w','x'), times=6),
                       value=sample(25:40, 12, replace=TRUE),
                       stringsAsFactors=FALSE)
library(reshape2)
res <- dcast(within(temp1_df, {Seq1 <- ave(value, names, variable, FUN=seq_along)}),
             names + Seq1 ~ variable, value.var="value")[,-2]
res
#  names  w  x
#1   foo 29 28
#2   foo 36 33
#3   foo 35 39
#4  foo1 29 37
#5  foo1 37 29
#6  foo1 34 30

A.K.

----- Original Message ----- From: Abhishek Pratap abhishek@gmail.com To: r-help@r-project.org Sent: Monday, June 10, 2013 2:15 AM Subject: [R] reshaping a data frame

> Hi Guys, I am trying to cast a data frame but not aggregate the rows for the same variable. Here is a contrived example.
>
> **input**
> temp_df <- data.frame(names=c('foo','foo','foo'), variable=c('w','w','w'), value=c(34,65,12))
> temp_df
>   names variable value
> 1   foo        w    34
> 2   foo        w    65
> 3   foo        w    12
>
> **Want this**
>   names  w
>   foo   34
>   foo   65
>   foo   12
>
> **getting this**
> cast(temp_df)
> Aggregation requires fun.aggregate: length used as default
>   names w
> 1   foo 3
>
> In the real dataset the categorical column 'variable' will have many more categories. Thanks! -Abhi
Re: [R] please check this
Hi,

Try this:

which(duplicated(res10Percent))
# [1] 117 125 157 189 213 235 267 275 278 293 301 327 331 335 339 367 369 371 379
#[20] 413 415 417 441 459 461 477 479 505
# most of the duplicates are dummy==1
res10PercentSub1 <- subset(res10Percent[which(duplicated(res10Percent)),], dummy==1)
res10PercentSub0 <- subset(res10Percent[which(duplicated(res10Percent)),], dummy==0)
indx1 <- as.numeric(row.names(res10PercentSub1))
indx11 <- sort(c(indx1, indx1+1))
indx0 <- as.numeric(row.names(res10PercentSub0))
indx00 <- sort(c(indx0, indx0-1))
indx10 <- sort(c(indx11, indx00))
nrow(res10Percent[-indx10,])
#[1] 452
res10PercentNew <- res10Percent[-indx10,]
nrow(subset(res10PercentNew, dummy==1))
#[1] 226
nrow(subset(res10PercentNew, dummy==0))
#[1] 226
nrow(unique(res10PercentNew))
#[1] 452

A.K.

----- Original Message ----- From: Cecilia Carmo cecilia.ca...@ua.pt To: arun smartpink...@yahoo.com Sent: Monday, June 10, 2013 10:19 AM Subject: RE: please check this

> But I don't want it like this. Once a firm is paired with another, these two firms should not be paired again. Could you solve this? Thanks, Cecília

From: arun [smartpink...@yahoo.com] Sent: Monday, 10 June 2013 15:12 To: Cecilia Carmo Subject: Re: please check this

I did look into that. If you look at the nrow() in each category, it will be different. It means that the duplicates are not pairwise, but in the whole `result`. The explanation is again the multiple matches. So, here we selected the one with dummy==0 that most closely matches the dimension of one with dummy==1. Suppose the value of dimension with dummy==1 is `2554` and it got matched with a dummy==0 value of `2580`. Now consider another case with dimension `2570` and dummy==1 (which also falls within the same split group). Then it also got matched with `2580` with dummy==0. I guess it was based on the way in which it was tested.
From: Cecilia Carmo cecilia.ca...@ua.pt To: arun smartpink...@yahoo.com Sent: Monday, June 10, 2013 10:02 AM Subject: please check this

When I do:

res10Percent <- fun1(final3New, 0.1, 200)
dim(res10Percent)
#[1] 508 5
nrow(subset(res10Percent, dummy==0))
#[1] 254
nrow(subset(res10Percent, dummy==1))
#[1] 254
testingDuplicates <- unique(res10Percent)
nrow(testingDuplicates)
#[1] 480   # this should be 508; if not, there are duplicated rows, or not?

Thanks Cecilia
Re: [R] recode: how to avoid nested ifelse
Thanks, guys.

On Sat, Jun 8, 2013 at 2:17 PM, Neal Fultz nfu...@gmail.com wrote:

> rowSums and Reduce will have the same problems with the bad data you alluded to earlier, e.g. cg = 1, hs = 0. But that's something to check for with crosstabs anyway.

This wrong-data thing is a distraction here. I guess I'd have to craft 2 solutions, depending on what the researcher says. (We can't assume es = 0, or es = NA with cg = 1, is bad data. There are some people who finish college without doing elementary school (wasn't Albert Einstein one of those?) or high school. I once went to an eye doctor who didn't finish high school, but nonetheless was admitted to optometry school.)

I did not know about the Reduce function before this. If we enforce the ordering and clean up the data in the way you imagine, it would work. I think pmax is the most teachable and dependably not-getting-wrongable approach if the data is not wrong.

> Side note: you should check out the microbenchmark pkg, it's quite handy.

Perhaps the working example of microbenchmark is the best thing in this thread! I understand the idea behind it, but it seems like I can never get it to work right. It helps to see how you do it.
R> require(microbenchmark)
R> microbenchmark(
+   f1(cg,hs,es),
+   f2(cg,hs,es),
+   f3(cg,hs,es),
+   f4(cg,hs,es)
+ )
Unit: microseconds
           expr       min         lq     median         uq       max neval
 f1(cg, hs, es) 23029.848 25279.9660 27024.9640 29996.6810 55444.112   100
 f2(cg, hs, es)   730.665   755.5750   811.7445   934.3320  6179.798   100
 f3(cg, hs, es)    85.029   101.6785   129.8605   196.2835  2820.187   100
 f4(cg, hs, es)   762.232   804.4850   843.7170  1079.0800 24869.548   100

On Fri, Jun 07, 2013 at 08:03:26PM -0700, Joshua Wiley wrote:

> I still argue for na.rm=FALSE, but that is cute; also substantially faster:
>
> f1 <- function(x1, x2, x3) do.call(paste0, list(x1, x2, x3))
> f2 <- function(x1, x2, x3) pmax(3*x1, 2*x2, x3, 0, na.rm=FALSE)
> f3 <- function(x1, x2, x3) Reduce(`+`, list(x1, x2, x3))
> f4 <- function(x1, x2, x3) rowSums(cbind(x1, x2, x3))
>
> es <- rep(c(0, 0, 1, 0, 1, 0, 1, 1, NA, NA), 1000)
> hs <- rep(c(0, 0, 1, 0, 1, 0, 1, 0, 1, NA), 1000)
> cg <- rep(c(0, 0, 0, 0, 1, 0, 1, 0, NA, NA), 1000)
>
> system.time(replicate(1000, f1(cg, hs, es)))
>   user  system elapsed
>  22.73    0.03   22.76
> system.time(replicate(1000, f2(cg, hs, es)))
>   user  system elapsed
>   0.92    0.04    0.95
> system.time(replicate(1000, f3(cg, hs, es)))
>   user  system elapsed
>   0.19    0.02    0.20
> system.time(replicate(1000, f4(cg, hs, es)))
>   user  system elapsed
>   0.95    0.03    0.98
>
> R version 3.0.0 (2013-04-03) Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> On Fri, Jun 7, 2013 at 7:25 PM, Neal Fultz nfu...@gmail.com wrote:
>> I would do this to get the highest non-missing level:
>> x <- pmax(3*cg, 2*hs, es, 0, na.rm=TRUE)
>> rock chalk... -nfultz
>>
>> On Fri, Jun 07, 2013 at 06:24:50PM -0700, Joshua Wiley wrote:
>>> Hi Paul, Unless you have truly offended the data-generating oracle*, the pattern NA, 1, NA should be a data entry error --- graduating HS implies graduating ES, no?
> I would argue fringe cases like that should be corrected in the data, not through coding workarounds. Then you can just do:
>
> x <- do.call(paste0, list(es, hs, cg))
> table(factor(x, levels = c("000", "100", "110", "111"),
>              labels = c("none", "es", "hs", "cg")))
> none   es   hs   cg
>    4    1    1    2
>
> Cheers, Josh
>
> *Drawn from comments by Judea Pearl in one lively session.
>
> On Fri, Jun 7, 2013 at 6:13 PM, Paul Johnson pauljoh...@gmail.com wrote:
>> In our Summer Stats Institute, I was asked a question that amounts to reversing the effect of the contrasts function (reconstruct an ordinal predictor from a set of binary columns). The best I could think of was to link together several ifelse functions, and I wouldn't want to do this if the example became any more complicated. I'm unable to remember a less error-prone method :). But I expect you might. Here's my working example code:
>>
>> ## Paul Johnson pauljohn at ku.edu
>> ## 2013-06-07
>> ## We need to create an ordinal factor from these indicators
>> ## completed elementary school
>> es <- c(0, 0, 1, 0, 1, 0, 1, 1)
>> ## completed high school
>> hs <- c(0, 0, 1, 0, 1, 0, 1, 0)
>> ## completed college
>> cg <- c(0, 0, 0, 0, 1, 0, 1, 0)
>> ed <- ifelse(cg == 1, 3, ifelse(hs == 1, 2, ifelse(es == 1, 1, 0)))
>> edf <- factor(ed, levels = 0:3, labels = c("none", "es", "hs", "cg"))
>> data.frame(es, hs, cg, ed, edf)
>> ## Looks OK, but what if there are missings?
>> es <- c(0, 0, 1,
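Collecting the thread's pmax idea into one runnable piece (indicator data exactly as posted; whether na.rm should be TRUE or FALSE is the judgment call debated above):

```r
es <- c(0, 0, 1, 0, 1, 0, 1, 1)  # completed elementary school
hs <- c(0, 0, 1, 0, 1, 0, 1, 0)  # completed high school
cg <- c(0, 0, 0, 0, 1, 0, 1, 0)  # completed college

# Highest completed level wins: cg -> 3, hs -> 2, es -> 1, none -> 0.
ed  <- pmax(3 * cg, 2 * hs, es, 0, na.rm = TRUE)
edf <- factor(ed, levels = 0:3, labels = c("none", "es", "hs", "cg"))
table(edf)
# none   es   hs   cg
#    4    1    1    2
```

This matches the counts in Josh's paste0/table version, while staying robust if an hs indicator is mistakenly 0 for a college graduate.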
Re: [R] How to expand.grid with string elements (the half!)
If you can explain why those particular six combinations out of the complete set of nine, then perhaps someone can tell you how.

-Don

-- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062

On 6/9/13 10:25 PM, Gundala Viswanath gunda...@gmail.com wrote:

> I have the following result of expand.grid:
>
> d <- expand.grid(c("x","y","z"), c("x","y","z"))
>
> What I want is to create a combination of strings, but only half of all the combinations:
>
>   Var1 Var2
> 1    x    x
> 2    y    x
> 3    y    y
> 4    z    y
> 5    x    z
> 6    z    z
>
> What's the way to do it? G.V.
[R] Substituting the values on the y-axis
Hello, I plotted a graph in R showing how salinity (in ‰, y-axis) changes with time (in years, x-axis). However, right from the beginning on the Excel spreadsheet the values for salinity appeared as, for example, 35000‰ instead of 35‰, which I guessed must have been a typing error on the website from which I extracted the data (NOAA). Thus, I now would like to substitute these values with the corresponding smaller values, as follows: 25000, 35000 -> 25, 35 and so on. Is there any way I can change this in R, or do I have to modify these numbers before inputting the data into R (for example in Excel)? If so, can anybody tell me how to do either of these? Many thanks! Emanuela
[R] twoby2 (Odds Ratio) for variables with 3 or more levels
Dear all,

I am using the Epi package to calculate odds ratios in my bivariate analysis. How can I use *twoby2* with variables that have 3 or more levels? For example, I have a 4-level variable (Age):

m = matrix(c(290, 100, 232, 201, 136, 99, 182, 240), nrow=4, ncol=2)
twoby2(m)

R gives me only "Comparing : Row 1 vs. Row 2", while I would like to have the reference value in Row 1 and compare Row 2, Row 3 and Row 4 with it.

Thanks for your help!
[R] Rcmdr seit heute nicht mehr ladbar
Just when you need it, it's dead. The following error message has greeted me since today when starting R Commander on the Mac:

library(Rcmdr)
Lade nötiges Paket: car
Lade nötiges Paket: MASS
Lade nötiges Paket: nnet
Error : .onAttach in attachNamespace() für 'Rcmdr' fehlgeschlagen, Details:
  Aufruf: structure(.External(.C_dotTclObjv, objv), class = "tclObj")
  Fehler: [tcl] invalid command name "image".
Zusätzlich: Warnmeldung:
In fun(libname, pkgname) : couldn't connect to display /tmp/launch-K8nELf/org.macosforge.xquartz:0
Fehler: Laden von Paket oder Namensraum für 'Rcmdr' fehlgeschlagen

(The German messages say: loading required packages car, MASS, nnet; .onAttach in attachNamespace() failed for 'Rcmdr'; package or namespace load for 'Rcmdr' failed.)

I'm pretty annoyed. The following attempts were unsuccessful:
- reinstalled R
- reinstalled X11
- deleted all the related folders
- reinstalled the packages

Nothing. Rcmdr will not load any more.

-- Best regards, Yours, Bastian Wimmer M.A. Research Associate at the Chair of Educational Psychology University of Erlangen-Nuremberg Dutzendteichstraße 24 90478 Nuremberg Germany Phone: +49 (0) 9171 83924 84 Fax: +49 (0) 3222 64968 14 Email: bastian.wim...@fau.de Web: http://j.mp/Umkf4U (Chair of Educational Psychology)
Re: [R] Substituting the values on the y-axis
Sounds like you have made no effort to learn R, e.g. by reading the "Introduction to R" tutorial packaged with R or another online tutorial (there are many). Don't you think you need to do some homework first?

-- Bert

On Mon, Jun 10, 2013 at 7:26 AM, diddle1...@fastwebnet.it wrote:

> Hello, I plotted a graph in R showing how salinity (in ‰, y-axis) changes with time (in years, x-axis). However, right from the beginning on the Excel spreadsheet the values for salinity appeared as, for example, 35000‰ instead of 35‰, which I guessed must have been a typing error on the website from which I extracted the data (NOAA). Thus, I now would like to substitute these values with the corresponding smaller values, as follows: 25000, 35000 -> 25, 35 and so on. Is there any way I can change this in R, or do I have to modify these numbers before inputting the data into R (for example in Excel)? If so, can anybody tell me how to do either of these? Many thanks! Emanuela

-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] Substituting the values on the y-axis
Just calculate a new sequence if those values are in an orderly sequence; see ?seq

v <- seq(25, 200, by = 10)

or perhaps the values are actually text; see ?substr

x <- substr(v, 1, 2)

John Kane Kingston ON Canada

-Original Message- From: diddle1...@fastwebnet.it Sent: Mon, 10 Jun 2013 16:26:54 +0200 (CEST) To: r-help@r-project.org Subject: [R] Substituting the values on the y-axis

> Hello, I plotted a graph in R showing how salinity (in ‰, y-axis) changes with time (in years, x-axis). However, right from the beginning on the Excel spreadsheet the values for salinity appeared as, for example, 35000‰ instead of 35‰, which I guessed must have been a typing error on the website from which I extracted the data (NOAA). Thus, I now would like to substitute these values with the corresponding smaller values, as follows: 25000, 35000 -> 25, 35 and so on. Is there any way I can change this in R, or do I have to modify these numbers before inputting the data into R (for example in Excel)? If so, can anybody tell me how to do either of these? Many thanks! Emanuela
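Since the inflated readings appear to be exactly 1000 times too large, straight division is another option. A sketch only; `salinity` is a hypothetical vector name standing in for the column read from the spreadsheet:

```r
salinity <- c(25000, 35000, 31000, 34)   # values as read in (34 already correct)

# Divide only the implausibly large readings by 1000; plausible ones pass through.
salinity <- ifelse(salinity > 1000, salinity / 1000, salinity)
salinity
# 25 35 31 34
```

The `> 1000` threshold is an assumption about where valid ‰ values end; adjust it to the range your data can actually take.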
Re: [R] Substituting the values on the y-axis
I did look into tutorials but I could not find exactly what I am looking for. I just started using R, so I am still a beginner. If you know where I can find it, can you please redirect me to it?
Re: [R] Rcmdr seit heute nicht mehr ladbar
Dear Bastian,

I'm afraid that I don't read German, but (as near as I can tell) since you say that you're using the most recent version of R and have X11 installed, you should have the software you need. Just in case, you might check the Rcmdr installation notes for Mac users at http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html. Apparently, R is having difficulty connecting to X11. I'm copying this response to Rob Goedman, who has often been able to help with Rcmdr issues under Mac OS X.

Best, John

--- John Fox Senator McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Bastian Wimmer Sent: Monday, June 10, 2013 5:27 AM To: r-help@r-project.org Subject: [R] Rcmdr seit heute nicht mehr ladbar

> library(Rcmdr)
> Lade nötiges Paket: car
> Lade nötiges Paket: MASS
> Lade nötiges Paket: nnet
> Error : .onAttach in attachNamespace() für 'Rcmdr' fehlgeschlagen, Details:
>   Aufruf: structure(.External(.C_dotTclObjv, objv), class = "tclObj")
>   Fehler: [tcl] invalid command name "image".
> Zusätzlich: Warnmeldung:
> In fun(libname, pkgname) : couldn't connect to display /tmp/launch-K8nELf/org.macosforge.xquartz:0
> Fehler: Laden von Paket oder Namensraum für 'Rcmdr' fehlgeschlagen
>
> -- Beste Grüße, Yours, Bastian Wimmer M.A.
Research Associate at the Chair of Educational Psychology University of Erlangen-Nuremberg Dutzendteichstraße 24 90478 Nuremberg Germany Phone: +49 (0) 9171 83924 84 Fax: +49 (0) 3222 64968 14 Email: bastian.wim...@fau.de Web: http://j.mp/Umkf4U (Chair of Educational Psychology)
[R] twoby2 (Odds Ratio) for variables with 3 or more levels
Dear all,

I am using the Epi package to calculate odds ratios in my bivariate analysis. How can I use *twoby2* with variables that have 3 or more levels? For example, I have a 4-level variable (Age):

m = matrix(c(290, 100, 232, 201, 136, 99, 182, 240), nrow=4, ncol=2)
library(Epi)
twoby2(m)

R gives me only "Comparing : Row 1 vs. Row 2", while I would like to have the reference value in Row 1 and compare Row 2, Row 3 and Row 4 with it.

Thanks for your help!
Re: [R] Substituting the values on the y-axis
Hi Emanuela,

Welcome to R. It can be hard finding even relatively simple things when you are just starting. You might want to have a look at http://www.unt.edu/rss/class/Jon/R_SC/ or http://www.burns-stat.com/documents/tutorials/impatient-r/ if you have not already seen them. Patrick Burns's site http://www.introductoryr.co.uk/R_Resources_for_Beginners.html has some useful links.

If you are a refugee from SAS or SPSS, this paper by Bob Muenchen is very useful: www.et.bs.ehu.es/~etptupaf/pub/R/RforSASSPSSusers.pdf

Some tricks for asking a good question on the R-help list are here: https://github.com/hadley/devtools/wiki/Reproducibility or http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

In most cases it is very useful to provide some data; see ?dput in the last two links. A small bit of sample data in your original post would definitely have helped.

Many or most R-help readers do not use Nabble and really hate to have to go there to see the context of a message. You should always leave the important parts of earlier messages to let the R-help reader see what the problems and other suggested solutions may be.

John Kane Kingston ON Canada

-Original Message- From: diddle1...@fastwebnet.it Sent: Mon, 10 Jun 2013 09:08:59 -0700 (PDT) To: r-help@r-project.org Subject: Re: [R] Substituting the values on the y-axis

> I did look into tutorials but I could not find exactly what I am looking for. I just started using R, so I am still a beginner. If you know where I can find it, can you please redirect me to it?
Re: [R] How to expand.grid with string elements (the half!)
Perhaps the OP wants the unique combinations of V1 and V2, as in

R> d <- expand.grid(V1=c("x","y","z"), V2=c("x","y","z"))
R> d[ as.numeric(d$V1) <= as.numeric(d$V2), ]
  V1 V2
1  x  x
4  x  y
5  y  y
7  x  z
8  y  z
9  z  z

or

R> V <- letters[24:26]
R> rbind(t(combn(V, m=2)), cbind(V, V))
     V   V
[1,] "x" "y"
[2,] "x" "z"
[3,] "y" "z"
[4,] "x" "x"
[5,] "y" "y"
[6,] "z" "z"

Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Rolf Turner Sent: Monday, June 10, 2013 2:20 AM To: Gundala Viswanath Cc: r-h...@stat.math.ethz.ch Subject: Re: [R] How to expand.grid with string elements (the half!)

Your question makes no sense at all. The grid expansion has 9 rows. In case you hadn't noticed, 9 is an odd number (i.e. not divisible by 2). There are no halves. Do not expect the list to read your mind. Instead, ask a meaningful question.

cheers, Rolf Turner

On 10/06/13 17:25, Gundala Viswanath wrote:

> I have the following result of expand.grid:
>
> d <- expand.grid(c("x","y","z"), c("x","y","z"))
>
> What I want is to create a combination of strings, but only half of all the combinations:
>
>   Var1 Var2
> 1    x    x
> 2    y    x
> 3    y    y
> 4    z    y
> 5    x    z
> 6    z    z
>
> What's the way to do it?
[R] Selecting divergent colors
Hi,

I was trying to make a density plot with 13 samples. To distinguish each sample, it would be good if each color were as different as possible from the other colors. I could use the built-in function, but that does not do more than 8 colors and then goes back to recycling the cols. If I use a palette, then it is really difficult to distinguish between the colors. So, is there a way that I can select a large number of colors (i.e. perhaps 20) that are as different from each other as possible? Here is my example code using the palette:

mat <- matrix(sample(1:1000, 1000, replace=TRUE), nrow=20, ncol=20)
snames <- paste('Sample_', 1:ncol(mat), sep='')
colnames(mat) <- snames
mycols <- palette(rainbow(ncol(mat)))
for(k in 1:ncol(mat)){
  plot(density(mat[,k]), col=mycols[k], xlab='', ylab='', axes=FALSE, main='')
  par(new=TRUE)
}
legend(x='topright', legend=snames, fill=mycols)

thanks!
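One common approach is to space hues evenly around the HCL color wheel at fixed chroma and luminance, so the colors differ in hue but not in perceived brightness. This is a sketch, not from the thread; the chroma/luminance values (100/65) are my assumptions, and note that palette() returns the *previous* palette, so assigning rainbow() directly is safer than capturing palette()'s return value:

```r
# n evenly spaced hues in HCL space -> n reasonably distinct colors.
distinct_cols <- function(n) {
  hues <- seq(15, 375, length.out = n + 1)[-(n + 1)]  # drop the wrap-around hue
  grDevices::hcl(h = hues, c = 100, l = 65)
}

mycols <- distinct_cols(13)
length(mycols)   # one hex color string per sample
```

For 20 or more clearly separable colors, varying luminance as well as hue (or a package such as RColorBrewer) may be needed, since hue alone stops being distinguishable at some point.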
Re: [R] read.csv timing
here are some small benchmarks on an i7-2600k with an SSD:

input file: 104,126 rows with 76 columns, all numeric.

linux> time bzcat bzfile.csv.bz2 > /dev/null            -- 1.8 seconds
R> d <- read.csv(pipe(bzfile))                          -- 6.3 seconds
R> d <- read.csv(pipe(bzfile), colClasses="numeric")    -- 4.2 seconds

R more than doubles the time it takes to load the file in order to convert it into an R data structure. if the colClasses are not specified, it takes another 50% longer.

some more experiments: save in R format (gzip format) --- this increases the file size from 15MB to 20MB. how fast is the filesystem?

linux> time gzcat file.Rdata > /dev/null                -- 0.4 seconds

the linux file system and CPU can decompress the 15MB .bz2 file in 1.8 seconds and the 20MB .gz file in 0.4 seconds. this is surprising. let's make sure that this is due to the .gz format. indeed:

linux> bunzip2 bzfile.csv.bz2 ; gzip bzfile.csv
linux> time gzcat bzfile.csv.gz > /dev/null             -- 0.4 seconds

reading .gz files is much faster on my linux system than reading .bz2 files. this surprises me. I would have thought my CPU is so fast at decompressing even bzip2 that the cost is almost zero, so I thought the disk space was the primary determinant of speed, and bzip2 should have been faster. well, ok, maybe slower, but not by a factor of 4. now I am thinking that maybe I should use .gz files to store my data. but the advantages are surprisingly not as great:

R> d <- read.csv(pipe(gzfile))                          -- 5.7 seconds
R> d <- read.csv(pipe(gzfile), colClasses="numeric")    -- 2.6 seconds
R> d <- read.csv(gzfile(gzfile), colClasses="numeric")  -- 4.5 seconds (surprisingly slower)

(the first and second versions are not using R's gzfile(), but literally "gzcat .. |" in a pipe here.)

conclusion: a .gz file can be read from file into memory about four times faster than a .bz2 file by the linux file system (outside R). the conversion from strings in memory into R doubles takes about as much time as the .bz2 file-system decompression read.
bzip2 is a more efficient storage method than .gz, but its decompression is considerably slower (the fact that there is less to read from disk does not make up for the CPU decompression overhead). saving the data in native R format essentially has no decompression penalty and comes close to native fast reading of .gz data; chances are this is because it has .gz support baked in. gzfile() does not help with read.csv, however.

/iaw

Ivo Welch (ivo.we...@gmail.com)

On Mon, Jun 10, 2013 at 10:09 AM, ivo welch ivo.we...@gmail.com wrote:

>> Surely you know the types of the columns? If you specify it in advance, read.table and relatives will be much faster. Duncan Murdoch
>
> thx, duncan. yes, I do know the types of columns, but I did not realize how much faster these functions become. on my SSD-based system, the speedup is about a factor of 2. that is, read.csv on a bzip2 file that takes 10 seconds without colClasses takes 5 seconds with colClasses. I don't know how to benchmark intermittent memory usage, but my guess is that with colClasses, it requires less memory, too.
>
> in fact, my naive and incorrect assumption had been that read.csv would just read the file into a dynamic string array and then convert each string, and this would not take much longer than if it converted as it went along. so, I had thought more memory use but not more time. wrong. I would add to the man (.Rd) page the sentence "Specifying colClasses can speed up read.csv" where it describes the option.
>
> once I figure out how to bake C into R, I may try to write a fast filter function for myself, and share it for others wanting to use it. regards, /iaw
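To make the colClasses pattern concrete, here is a self-contained sketch (the 76-column shape mirrors the benchmark above; file names are temporary placeholders, not from the post):

```r
# Write a small all-numeric CSV so the example is self-contained.
tmp <- tempfile(fileext = ".csv")
write.csv(matrix(rnorm(76 * 100), ncol = 76), tmp, row.names = FALSE)

# Declaring every column "numeric" up front lets read.csv skip type sniffing,
# which is where much of the extra time goes when colClasses is omitted.
d <- read.csv(tmp, colClasses = rep("numeric", 76))

# Once parsed, native serialization avoids re-parsing on later loads entirely.
rds <- tempfile(fileext = ".rds")
saveRDS(d, rds)
d2 <- readRDS(rds)
stopifnot(identical(d, d2))
```

A single "numeric" is recycled across all columns too, so `colClasses = "numeric"` is equivalent here; the explicit rep() just makes the 76-column assumption visible.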
Re: [R] woby2 (Odds Ratio) for variables with 3 or more levels
On Jun 10, 2013, at 9:27 AM, Vlatka Matkovic Puljic wrote: Dear all, I am using the Epi package to calculate odds ratios in my bivariate analysis. How can I use twoby2 on variables that have 3 or more levels? -- I hope, looking at that again, you will see how odd it sounds to be requesting advice about how to use a program for 2 x 2 tables on data that doesn't meet those requirements. If you want to stay within the Epi package world, you can probably use the 'mh' function, since it says it can handle multi-way tables (or you can learn to use 'glm' in the regular stats package to do either logistic regression or Poisson regression). For example: I have a 4-level var (Age)

m <- matrix(c(290, 100, 232, 201, 136, 99, 182, 240), nrow=4, ncol=2)
library(Epi)
twoby2(m)

R gives me only "Comparing : Row 1 vs. Row 2", while I would like to have the reference value in Row 1, and compare Row 2, Row 3 and Row 4 with it. -- That is the default set of contrasts for 'glm' (and probably for 'mh', although it's not clear from the help page). (Epi does have its own mailing list.) -- David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
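[Editor's sketch of David's glm suggestion, hedged: it assumes the two columns of m are case and control counts, with Row 1 of the 4-level Age factor as the reference level.]

```r
m <- matrix(c(290, 100, 232, 201, 136, 99, 182, 240), nrow = 4, ncol = 2)
age <- factor(1:4)                      # Row 1 becomes the reference level
# a two-column matrix response (successes, failures) is accepted by glm()
fit <- glm(m ~ age, family = binomial)
exp(coef(fit))[-1]                      # odds ratios: rows 2-4 each vs. row 1
exp(confint.default(fit))[-1, ]         # Wald confidence intervals for the ORs
```

This gives the three "Row k vs. Row 1" odds ratios in one model, which is what the poster asked twoby2 for.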
[R] Where Query in SQL
Hey all, I am trying to use a "where ... in" clause in an SQL query in R. here is my code:

sql.select <- paste("select PERSON_NAME from UNITS where UNIT_ID in ('", cathree, "')", sep="")

where cathree is 1 variable with 16 observations, as follows:

UNIT_ID 1 205 2 209 3 213 4 217 5 228 6 232 7 236 8 240 9 245 10 249 11 253 12 257 13 268 14 272 15 276 16 280

but when I run this code, 0 rows are selected even though there exist 3 rows which satisfy the above query. Thanks Sneha __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] woby2 (Odds Ratio) for variables with 3 or more levels
You may want to consider a cumulative logit model, which effectively bifurcates an ordinal variable by utilizing the odds of being in a given level or below (depending on your coding). On Mon, Jun 10, 2013 at 12:27 PM, Vlatka Matkovic Puljic vlatk...@gmail.com wrote: Dear all, I am using the Epi package to calculate odds ratios in my bivariate analysis. How can I use twoby2 on variables that have 3 or more levels? For example: I have a 4-level var (Age)

m <- matrix(c(290, 100, 232, 201, 136, 99, 182, 240), nrow=4, ncol=2)
library(Epi)
twoby2(m)

R gives me only "Comparing : Row 1 vs. Row 2", while I would like to have the reference value in Row 1, and compare Row 2, Row 3 and Row 4 with it. Thanks for your help! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- James C. Whanger
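[Editor's sketch of the cumulative-logit idea using MASS::polr; hedged: it assumes the 4 x 2 matrix holds counts of a binary outcome by ordinal age group, and the case/control labels are made up for illustration.]

```r
library(MASS)
m <- matrix(c(290, 100, 232, 201, 136, 99, 182, 240), nrow = 4, ncol = 2)
dat <- data.frame(age   = ordered(rep(1:4, times = 2)),  # ordinal response
                  y     = factor(rep(c("case", "control"), each = 4)),
                  count = c(m))
# proportional-odds (cumulative logit) model, weighted by cell counts
fit <- polr(age ~ y, weights = count, data = dat, Hess = TRUE)
summary(fit)
exp(coef(fit))   # one common odds ratio across the age cut-points
```

The proportional-odds assumption (a single OR across all cut-points) should of course be checked before relying on this.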
Re: [R] Where Query in SQL
Do this: cat(sql.select, '\n') and then decide whether the query is what it should be according to standard SQL syntax. (If it is not, then fix it.) -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062

On 6/10/13 11:47 AM, Sneha Bishnoi sneha.bish...@gmail.com wrote: Hey all, I am trying to use a "where ... in" clause in an SQL query in R. here is my code:

sql.select <- paste("select PERSON_NAME from UNITS where UNIT_ID in ('", cathree, "')", sep="")

where cathree is 1 variable with 16 observations, as follows:

UNIT_ID 1 205 2 209 3 213 4 217 5 228 6 232 7 236 8 240 9 245 10 249 11 253 12 257 13 268 14 272 15 276 16 280

but when I run this code, 0 rows are selected even though there exist 3 rows which satisfy the above query. Thanks Sneha __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
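[Editor's note, following Don's advice: printing the query would show that paste() over a 16-element vector produces 16 separate query strings rather than one IN-list. A hedged sketch of one common fix, reusing the poster's table and column names:]

```r
cathree <- c(205, 209, 213, 217, 228, 232, 236, 240,
             245, 249, 253, 257, 268, 272, 276, 280)
# collapse the values into a single comma-separated list first
sql.select <- paste0("select PERSON_NAME from UNITS where UNIT_ID in (",
                     paste(cathree, collapse = ", "), ")")
cat(sql.select, "\n")   # inspect the query before sending it to the database
```

If cathree is a one-column data frame rather than a vector (as the posted output suggests), use cathree$UNIT_ID or cathree[[1]] in the inner paste(). Numeric IDs also need no single quotes around them.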
Re: [R] Selecting divergent colors
It will be hard to come up with 20 clearly distinguishable colors. Check out the website http://colorbrewer2.org/ and the R package RColorBrewer. It does not have a 20-color palette, but it does have some 8- to 12-color palettes that are very nice.

library(RColorBrewer)
display.brewer.all(n=NULL, type="all", select=NULL, exact.n=TRUE)

You could use these colors in combination with line type to build up to 72 unique combinations. For example ...

nuniq <- ncol(mat)
mycols <- rep(brewer.pal(12, "Set3"), length=nuniq)
myltys <- rep(1:6, rep(12, 6))[1:nuniq]
for(k in 1:nuniq){
  plot(density(mat[,k]), col=mycols[k], xlab='', ylab='', axes=FALSE, main='', lwd=3, lty=myltys[k])
  par(new=TRUE)
}
legend('topright', legend=snames, col=mycols, lty=myltys, lwd=3)

Jean

On Mon, Jun 10, 2013 at 12:33 PM, Brian Smith bsmith030...@gmail.com wrote: Hi, I was trying to make a density plot with 13 samples. To distinguish each sample, it would be good if each color is as different as possible from the other colors. I could use the built-in function, but that does not do more than 8 colors and then goes back to recycling the colors. If I use a palette, then it is really difficult to distinguish between the colors. So, is there a way that I can select a large number of colors (i.e. perhaps 20) that are as different from each other as possible? Here is my example code using the palette:

mat <- matrix(sample(1:1000,1000,replace=TRUE),nrow=20,ncol=20)
snames <- paste('Sample_',1:ncol(mat),sep='')
colnames(mat) <- snames
mycols <- palette(rainbow(ncol(mat)))
for(k in 1:ncol(mat)){
  plot(density(mat[,k]),col=mycols[k],xlab='',ylab='',axes=FALSE,main='')
  par(new=TRUE)
}
legend(x='topright',legend=snames,fill=mycols)

thanks! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] reshaping a data frame
Thanks everyone for your quick reply. I think my contrived example hid the complexity I wanted to show by using only one variable. @Arun: I think your example is exactly what I was looking for. Very cool trick with 'ave' and 'seq_along'... it just didn't occur to me. Best, -Abhi

On Mon, Jun 10, 2013 at 7:13 AM, arun smartpink...@yahoo.com wrote: Hi, If your dataset is similar to the one below:

set.seed(24)
temp1_df <- data.frame(names=rep(c('foo','foo1'),each=6), variable=rep(c('w','x'),times=6), value=sample(25:40,12,replace=TRUE), stringsAsFactors=FALSE)
library(reshape2)
res <- dcast(within(temp1_df, {Seq1 <- ave(value, names, variable, FUN=seq_along)}), names+Seq1~variable, value.var="value")[,-2]
res
#  names  w  x
#1   foo 29 28
#2   foo 36 33
#3   foo 35 39
#4  foo1 29 37
#5  foo1 37 29
#6  foo1 34 30

A.K.

- Original Message - From: Abhishek Pratap abhishek@gmail.com To: r-help@r-project.org Sent: Monday, June 10, 2013 2:15 AM Subject: [R] reshaping a data frame

Hi Guys, I am trying to cast a data frame but not aggregate the rows for the same variable. here is a contrived example.

**input**
temp_df <- data.frame(names=c('foo','foo','foo'), variable=c('w','w','w'), value=c(34,65,12))
temp_df
#  names variable value
#1   foo        w    34
#2   foo        w    65
#3   foo        w    12

**Want this**
names  w
foo   34
foo   65
foo   12

**getting this**
cast(temp_df)
# Aggregation requires fun.aggregate: length used as default
#  names w
#1   foo 3

In the real dataset the categorical column 'variable' will have many more categories. Thanks! -Abhi __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
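[Editor's sketch showing Arun's trick in isolation: applied to the one-variable example from the original post, it produces exactly the non-aggregated layout that was asked for.]

```r
library(reshape2)
temp_df <- data.frame(names = c('foo','foo','foo'),
                      variable = c('w','w','w'),
                      value = c(34, 65, 12))
# number the duplicates within each names/variable group...
temp_df$Seq1 <- ave(temp_df$value, temp_df$names, temp_df$variable,
                    FUN = seq_along)
# ...so dcast no longer needs to aggregate; then drop the helper column
dcast(temp_df, names + Seq1 ~ variable, value.var = "value")[, -2]
#   names  w
# 1   foo 34
# 2   foo 65
# 3   foo 12
```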
[R] parameters estimation of a normal-lognormal multivariate model
Dear all, I have to create a model which is a mixture of a normal and a log-normal distribution. To create it, I need to estimate the 2 covariance matrices and the mixing parameter (7 parameters in total) by maximizing the log-likelihood function. This maximization has to be performed by the nlm routine. As I use relative data, the means are known and equal to 1. I've already tried to do it in 1 dimension (with 1 set of relative data) and it works well. However, when I introduce the 2nd set of relative data I get illogical results for the correlation and a lot of warning messages (25 in all). To estimate the parameters I defined first the log-likelihood function with the 2 commands dmvnorm and dlnorm.rplus. Then I assign starting values of the parameters, and finally I use the nlm routine to estimate the parameters (see script below).

# Importing and reading the grid files. Outputs are 2048x2048 matrices
P <- read.ascii.grid("d:/Documents/JOINT_FREQUENCY/grid_E727_P-3000.asc", return.header=FALSE)
V <- read.ascii.grid("d:/Documents/JOINT_FREQUENCY/grid_E727_V-3000.asc", return.header=FALSE)
p <- c(P)  # transform matrix into a vector
v <- c(V)
p <- p[!is.na(p)]  # removing NA values
v <- v[!is.na(v)]
p_rel <- p/mean(p)  # transforming the data to relative values
v_rel <- v/mean(v)
PV <- cbind(p_rel, v_rel)  # create a matrix of vectors

L <- function(par, p_rel, v_rel) {
  return(-sum(log(
    (1-par[7])*dmvnorm(PV, mean=c(1,1),
      sigma=matrix(c(par[1]^2, par[1]*par[2]*par[3], par[1]*par[2]*par[3], par[2]^2), nrow=2, ncol=2)) +
    par[7]*dlnorm.rplus(PV, meanlog=c(1,1),
      varlog=matrix(c(par[4]^2, par[4]*par[5]*par[6], par[4]*par[5]*par[6], par[5]^2), nrow=2, ncol=2))
  )))
}

par.start <- c(0.74, 0.66, 0.40, 1.4, 1.2, 0.4, 0.5)
# log-likelihood estimators
result <- nlm(L, par.start, v_rel=v_rel, p_rel=p_rel, hessian=TRUE, iterlim=200, check.analyticals=TRUE)

Warning messages:
1: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) : NaNs produced
2: In sqrt(2 * pi * det(varlog)) : NaNs produced
3: In nlm(L, par.start, p_rel = p_rel, v_rel = v_rel, hessian = TRUE) : NA/Inf replaced by maximum positive value
4: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) : NaNs produced
... and so on, up to 25.

par.hat <- result$estimate
cat("sigN_p =", par.hat[1], "\n", "sigN_v =", par.hat[2], "\n", "rhoN =", par.hat[3], "\n", "sigLN_p =", par.hat[4], "\n", "sigLN_v =", par.hat[5], "\n", "rhoLN =", par.hat[6], "\n", "mixing parameter =", par.hat[7], "\n")

sigN_p = 0.5403361
sigN_v = 0.6667375
rhoN = 0.6260181
sigLN_p = 1.705626
sigLN_v = 1.592832
rhoLN = 0.9735974
mixing parameter = 0.8113369

Does someone know what is wrong in my model, or how I should proceed to find these parameters in 2 dimensions? Thank you very much for taking time to look at my questions. Regards, Gladys Hertzog, Master student in environmental engineering, ETH Zurich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
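[Editor's note: the NaN warnings typically arise because nlm proposes parameter values for which a covariance matrix is not positive definite (e.g. |rho| > 1) or the mixing weight leaves (0, 1). A hedged sketch of one common remedy, reparameterizing the poster's likelihood so every proposal is valid; dmvnorm and dlnorm.rplus are the functions from the poster's own code:]

```r
# sketch: optimize unconstrained theta; enforce sigma > 0 via exp()
# and |rho| < 1 via tanh(), mixing weight in (0, 1) via plogis()
L2 <- function(theta, PV) {
  s1 <- exp(theta[1]); s2 <- exp(theta[2]); r1 <- tanh(theta[3])
  s3 <- exp(theta[4]); s4 <- exp(theta[5]); r2 <- tanh(theta[6])
  w  <- plogis(theta[7])
  S1 <- matrix(c(s1^2, s1*s2*r1, s1*s2*r1, s2^2), 2, 2)
  S2 <- matrix(c(s3^2, s3*s4*r2, s3*s4*r2, s4^2), 2, 2)
  -sum(log((1 - w) * dmvnorm(PV, mean = c(1, 1), sigma = S1) +
            w      * dlnorm.rplus(PV, meanlog = c(1, 1), varlog = S2)))
}
# back-transform result$estimate with exp/tanh/plogis to report the fit
```

The estimated rhoLN of 0.97 in the posted output is a further hint that the optimizer was pressing against the |rho| = 1 boundary.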
Re: [R] Selecting divergent colors
Hi, On Jun 10, 2013, at 3:46 PM, Adams, Jean wrote: It will be hard to come up with 20 clearly distinguishable colors. Check out the website http://colorbrewer2.org/ and the R package RColorBrewer. It does not have a 20-color palette, but it does have some 8- to 12-color palettes that are very nice. library(RColorBrewer) display.brewer.all(n=NULL, type=all, select=NULL, exact.n=TRUE) It sounds like Brian is looking for categorical coloring rather than divergent coloring. The Glasbey LUT works really well in image processing for just such purposes. It would be easy to use that within R for your lines. http://www.bioss.ac.uk/people/chris/colorpaper.pdf You might be able to snag the color table out of this collection of Java plugins for ImageJ software. http://www.dentistry.bham.ac.uk/landinig/software/morphology.zip Within that archive is a text file called glasbey.lut which is a simple text file of RGB color values. Cheers, Ben You could use these colors in combination with line type to build up to 72 unique combinations. For example ... nuniq - ncol(mat) mycols - rep(brewer.pal(12, Set3), length=nuniq) myltys - rep(1:6, rep(12, 6))[1:nuniq] for(k in 1:nuniq){ plot(density(mat[,k]), col=mycols[k], xlab='', ylab='', axes=F, main=F, lwd=3, lty=myltys[k]) par(new=TRUE) } legend('topright', legend=snames, col=mycols, lty=myltys, lwd=3) Jean On Mon, Jun 10, 2013 at 12:33 PM, Brian Smith bsmith030...@gmail.comwrote: Hi, I was trying to make a density plot with 13 samples. To distinguish each sample, it would be good if each color is as different as possible from the other colors. I could use the built in function, but that does not do more than 8 colors and then goes back to recycling the cols. If I use a palette, then it is really difficult to distinguish between the colors. So, is there a way that I can select a large number of colors (i.e. perhaps 20) that are as different from each other as possible? 
Here is my example code using the palette: ** mat - matrix(sample(1:1000,1000,replace=T),nrow=20,ncol=20) snames - paste('Sample_',1:ncol(mat),sep='') colnames(mat) - snames mycols - palette(rainbow(ncol(mat))) for(k in 1:ncol(mat)){ plot(density(mat[,k]),col=mycols[k],xlab='',ylab='',axes=F,main=F) par(new=T) } legend(x='topright',legend=snames,fill=mycols) thanks! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Ben Tupper Bigelow Laboratory for Ocean Sciences 60 Bigelow Drive, P.O. Box 380 East Boothbay, Maine 04544 http://www.bigelow.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
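[Editor's sketch: if you grab the glasbey.lut file Ben mentions, it can be turned into an R palette in a few lines. The three-column "R G B" text layout is an assumption based on his description of the file.]

```r
# assumes glasbey.lut is a plain-text file with one "R G B" triple per line
lut <- read.table("glasbey.lut", col.names = c("r", "g", "b"))
glasbey <- rgb(lut$r, lut$g, lut$b, maxColorValue = 255)
mycols <- glasbey[seq_len(20)]   # 20 maximally distinct categorical colors
```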
[R] Fwd: Problem with ODBC connection
Any response please? Was my question not clear to the list? Please let me know. Thanks and regards,

-- Forwarded message -- From: Christofer Bogaso bogaso.christo...@gmail.com Date: Sat, Jun 8, 2013 at 9:39 PM Subject: Re: Problem with ODBC connection To: r-help r-help@r-project.org

Hello All, My previous post remains unanswered, probably because the attachment was not working properly, so I am re-posting it. My problem is in reading an Excel-2003 file through an ODBC connection using the RODBC package. Let's say I have this Excel file: http://www.2shared.com/document/HS3JeFyW/MyFile.html I saved it in my F: drive and tried reading the contents using an RODBC connection:

library(RODBC)
MyData <- sqlFetch(odbcConnectExcel("f:/MyFile.xls"), )
head(MyData, 30)

However it looks like the second column (with header 's') is not read properly. Can somebody here explain this bizarre thing? Did I do something wrong in reading that? I would really appreciate it if someone could point out anything that might have gone wrong. Thanks and regards,

On Fri, Jun 7, 2013 at 4:46 PM, Christofer Bogaso bogaso.christo...@gmail.com wrote: Hello again, I am having a problem with an ODBC connection using the RODBC package. I am basically trying to read the attached Excel-2003 file using the RODBC package. Here is my code:

head(sqlFetch(odbcConnectExcel("d:/1.xls"), ), 30); odbcCloseAll()

Criteria s d fd ffd1 f1fd2f2 fd3 f3 F12 F13 F14 F15 F16 F17 F18 F19 F20 1 a NA NA NA NA 0. 0.27755576 -0.00040332321NA NA NA NA NA NA NA NA NA NA NA NA 2 s NA 0 NA NA 0. 0. 0.000NA NA NA NA NA NA NA NA NA NA NA NA 3 d NA 0 NA NA 0.01734723 0.06938894 0.2775558 5.00 NA NA NA NA NA NA NA NA NA NA NA 4 f NA NA NA NA NA NA NA -4.25 NA NA NA NA NA NA NA NA NA NA NA 5 f NA 0 NA NA 0. 0. 
0.000 -1.53 NA NA NA NA NA NA NA NA NA NA NA 6 f NA NA NA NA NA NA 0.000 0.00 NA NA NA NA NA NA NA NA NA NA NA 7 f NA NA NA NA NA NA 0.000NA NA NA NA NA NA NA NA NA NA NA NA 8 f NA 0 NA NA NA NA NANA NA NA NA NA NA NA NA NA NA NA NA 9 f NA 0 NA NA NA NA NANA NA NA NA NA NA NA NA NA NA NA NA 10f NA NA NA NA NA NA NANA NA NA NA NA NA NA NA NA NA NA NA 11f NA NA NA NA NA NA NANA NA NA NA NA NA NA NA NA NA NA NA 12f NA NA NA NA NA NA NANA NA NA NA NA NA NA NA NA NA NA NA 13f NA NA NA NA NA NA NANA NA NA NA NA NA NA NA NA NA NA NA Here you see the data in second column could not read at all. Can somebody point me if I did something wrong? Thanks and regards, [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Apply a PCA to other datasets
I have run a PCA on one data set. I need the standard deviation of the first two bands for my analysis. I now want to apply the same PCA rotation I used in the first one to all my other data sets. Is there any way to do this in r? Thanks. -- View this message in context: http://r.789695.n4.nabble.com/Apply-a-PCA-to-other-datasets-tp4669182.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
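[Editor's sketch: base R can do exactly this when the PCA comes from prcomp() -- predict() applies the stored rotation (and the original centering/scaling) to new data. The data-set names here are made up for illustration.]

```r
set.seed(1)
train <- matrix(rnorm(200), ncol = 4)   # data set the PCA is fitted on
other <- matrix(rnorm(100), ncol = 4)   # a second data set to project
pca <- prcomp(train, center = TRUE, scale. = TRUE)
pca$sdev[1:2]                           # sd of the first two components
scores <- predict(pca, newdata = other) # same rotation applied to new data
head(scores[, 1:2])
```

The new data must have the same columns (in the same order) as the data the PCA was fitted on.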
[R] Speed up or alternative to 'For' loop
I have a For loop that is quite slow and am wondering if there is a faster option:

df <- data.frame(TreeID=rep(1:500, each=20), Age=rep(seq(1,20,1), 500))
df$Height <- exp(-0.1 + 0.2*df$Age)
df$HeightGrowth <- NA  # initialize with NA
for (i in 2:nrow(df)) {
  if (df$TreeID[i] == df$TreeID[i-1]) {
    df$HeightGrowth[i] <- df$Height[i] - df$Height[i-1]
  }
}

Trevor Walker Email: trevordaviswal...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
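[Editor's sketch, not from the original thread: the loop can be replaced entirely. ave() applies a function within each TreeID group and diff() computes successive height differences, so the whole computation collapses to one vectorized line.]

```r
df <- data.frame(TreeID = rep(1:500, each = 20), Age = rep(1:20, 500))
df$Height <- exp(-0.1 + 0.2 * df$Age)
# growth = height difference from the previous record of the same tree;
# the first record of each tree gets NA, matching the loop's behaviour
df$HeightGrowth <- ave(df$Height, df$TreeID, FUN = function(h) c(NA, diff(h)))
```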
[R] Combining CSV data
Hello R community, I am trying to combine two CSV files that look like this: File A Row_ID_CR, Data1,Data2,Data3 1, aa, bb, cc 2, dd, ee, ff File B Row_ID_N, Src_Row_ID, DataN1 1a, 1, This is comment 1 2a, 1, This is comment 2 3a, 2, This is comment 1 4a, 1, This is comment 3 And the output I am looking for is, comparing the values of Row_ID_CR and Src_Row_ID Output ROW_ID_CR,Data1,Data2,Data3,DataComment1, DataComment2, DataComment3 1, aa, bb, cc,This is comment1,This is comment2, This is comment 3 2, dd, ee, ff, This is comment1 I am a novice R user, I am able to replicate a left join but I need a bit more in the final result. Thanks!! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How sum all possible combinations of rows, given 4 matrices
It works, Arun. Thanks! (FYI, a couple of the matrices I am dealing with have 1000+ rows, so I had to do it on a supercomputer at work. For the curious, I am trying to find all possible scores in a model of language mixing described in: Title: Structured Variation in Codeswitching: Towards an Empirically Based Typology of Bilingual Speech Patterns. Authors: Deuchar, Margaret; Muysken, Pieter; Wang, Sung-Lan. Publication Date: 2007. Journal Name: International Journal of Bilingual Education and Bilingualism.) Bruno Estigarribia, Assistant Professor of Spanish, Department of Romance Languages and Literatures; Research Assistant Professor of Psychology, Cognitive Science Program; Affiliate Faculty, Global Studies. Dey Hall, Room 332, CB# 3170, University of North Carolina at Chapel Hill. estig...@email.unc.edu 917-348-8162

On 5/27/13 1:54 PM, arun smartpink...@yahoo.com wrote: Hi, Not sure if this is what you expected:

set.seed(24)
mat1 <- matrix(sample(1:20, 3*4, replace=TRUE), ncol=3)
set.seed(28)
mat2 <- matrix(sample(1:25, 3*6, replace=TRUE), ncol=3)
set.seed(30)
mat3 <- matrix(sample(1:35, 3*8, replace=TRUE), ncol=3)
set.seed(35)
mat4 <- matrix(sample(1:40, 3*10, replace=TRUE), ncol=3)
dat1 <- expand.grid(seq(dim(mat1)[1]), seq(dim(mat2)[1]), seq(dim(mat3)[1]), seq(dim(mat4)[1]))
vec1 <- paste0("mat", 1:4)
matNew <- do.call(cbind, lapply(seq_len(ncol(dat1)), function(i) get(vec1[i])[dat1[,i],]))
colnames(matNew) <- (seq(12)-1) %% 3 + 1
datNew <- data.frame(matNew)
res <- sapply(split(colnames(datNew), gsub("\\..*", "", colnames(datNew))), function(x) rowSums(datNew[,x]))
dim(res)
#[1] 1920    3
head(res)
#     X1 X2 X3
#[1,] 46 63 70
#[2,] 45 68 59
#[3,] 55 55 66
#[4,] 51 65 61
#[5,] 48 84 75
#[6,] 47 89 64

A.K.

- Original Message - From: Estigarribia, Bruno estig...@email.unc.edu To: r-help@R-project.org Sent: Monday, May 27, 2013 11:24 AM Subject: [R] How sum all possible combinations of rows, given 4 matrices

Hello all, I have 4 matrices with 3 columns each (different number of rows though). 
I want to find a function that returns all possible 3-place vectors corresponding to the sum by columns of picking one row from matrix 1, one from matrix 2, one from matrix 3, and one from matrix 4. So basically, all possible ways of picking one row from each matrix and then sum their columns to obtain a 3-place vector. Is there a way to use expand.grid and reduce to obtain this result? Or am I on the wrong track? Thank you, Bruno PS:I believe I have given all relevant info. I apologize in advance if my question is ill-posed or ambiguous. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] please check this
Hi, Try this: res10Percent- fun1(final3New,0.1,200) res10PercentSub1-subset(res10Percent[duplicated(res10Percent)|duplicated(res10Percent,fromLast=TRUE),],dummy==1) indx1-as.numeric(row.names(res10PercentSub1)) res10PercentSub2-res10PercentSub1[order(res10PercentSub1$dimension),] indx11-as.numeric(row.names(res10PercentSub2)) names(indx11)-(seq_along(indx11)-1)%/%2+1 res10PercentSub3-res10Percent[c(indx11,indx11+1),] res10PercentSub3$id- names(c(indx11,indx11+1)) res10PercentSub4-do.call(rbind,lapply(split(res10PercentSub3,res10PercentSub3$id),function(x) {x1-x[-1,];x2-x1[which.max(abs(x1$dimension[1]-x1$dimension[-1]))+1,];x3-x[x$dummy==1,][which.min(abs(as.numeric(row.names(x[x$dummy==1,]))-as.numeric(row.names(x2,];rbind(x3,x2)})) res10PercentSub0-subset(res10Percent[duplicated(res10Percent)|duplicated(res10Percent,fromLast=TRUE),],dummy==0) indx0-as.numeric(row.names(res10PercentSub0)) res10PercentSub20-res10PercentSub0[order(res10PercentSub0$dimension),] indx00-as.numeric(row.names(res10PercentSub20)) names(indx00)-(seq_along(indx00)-1)%/%2+1 res10PercentSub30- res10Percent[c(indx00-1,indx00),] res10PercentSub30$id- names(c(indx00-1,indx00)) res10PercentSub40- do.call(rbind,lapply(split(res10PercentSub30,res10PercentSub30$id),function(x){x1-subset(x,dummy==1); x2-subset(x,dummy==0);x3-x1[which.max(abs(x1$dimension-unique(x2$dimension))),];x4-x2[which.min(abs(as.numeric(row.names(x3))-as.numeric(row.names(x2,];rbind(x3,x4)})) row.names(res10PercentSub40)-gsub(.*\\.,,row.names(res10PercentSub40)) indxNew- sort(as.numeric(c(row.names(res10PercentSub5),row.names(res10PercentSub40 res10PercentFinal-res10Percent[-indxNew,] dim(res10PercentFinal) #[1] 454 5 nrow(subset(res10PercentFinal,dummy==0)) #[1] 227 nrow(subset(res10PercentFinal,dummy==1)) #[1] 227 nrow(unique(res10PercentFinal)) #[1] 454 which(duplicated(res10Percent)|duplicated(res10Percent,fromLast=TRUE)) # [1] 113 117 123 125 153 157 187 189 207 213 223 235 265 267 269 275 276 278 279 #[20] 283 293 301 303 
305 309 317 327 331 335 339 341 343 347 351 367 369 371 379 #[39] 385 399 407 413 415 417 429 437 441 453 459 461 471 473 477 479 501 505 res10Percent[c(113:114,117:118),] # firm year industry dummy dimension #113 500221723 2005 26 1 3147 #114 500601429 2005 26 0 3076 #117 500221723 2005 26 1 3147 #118 502668920 2005 26 0 3249 res10PercentFinal[c(113:114,117:118),] #deleted the duplicated row and the accompanying pair with the maximum difference # firm year industry dummy dimension #113 500221723 2005 26 1 3147 #114 500601429 2005 26 0 3076 #119 500115362 2006 26 1 6239 #120 500060223 2006 26 0 6208 A.K. row.names(res10PercentSub4)-gsub(.*\\.,,row.names(res10PercentSub4)) res10PercentSub5-res10PercentSub4[order(as.numeric(res10PercentSub4$id)),] - Original Message - From: Cecilia Carmo cecilia.ca...@ua.pt To: arun smartpink...@yahoo.com Cc: Sent: Monday, June 10, 2013 1:41 PM Subject: RE: please check this I think it could be better to eliminate that one. If you could do it I appreciate. Cecília De: arun [smartpink...@yahoo.com] Enviado: segunda-feira, 10 de Junho de 2013 18:14 Para: Cecilia Carmo Assunto: Re: please check this If you wanted to eliminate the duplicate rows that have the pair with the maximum difference, it is possible. Just informing you. - Original Message - From: Cecilia Carmo cecilia.ca...@ua.pt To: arun smartpink...@yahoo.com Cc: Sent: Monday, June 10, 2013 10:51 AM Subject: RE: please check this I think it is ok now. 
Thanks Cecília De: arun [smartpink...@yahoo.com] Enviado: segunda-feira, 10 de Junho de 2013 15:39 Para: Cecilia Carmo Cc: R help Assunto: Re: please check this Hi, Try this: which(duplicated(res10Percent)) # [1] 117 125 157 189 213 235 267 275 278 293 301 327 331 335 339 367 369 371 379 #[20] 413 415 417 441 459 461 477 479 505 res10PercentSub1-subset(res10Percent[which(duplicated(res10Percent)),],dummy==1) #most of the duplicated are dummy==1 res10PercentSub0-subset(res10Percent[which(duplicated(res10Percent)),],dummy==0) indx1-as.numeric(row.names(res10PercentSub1)) indx11-sort(c(indx1,indx1+1)) indx0- as.numeric(row.names(res10PercentSub0)) indx00- sort(c(indx0,indx0-1)) indx10- sort(c(indx11,indx00)) nrow(res10Percent[-indx10,]) #[1] 452 res10PercentNew-res10Percent[-indx10,] nrow(subset(res10PercentNew,dummy==1)) #[1] 226 nrow(subset(res10PercentNew,dummy==0)) #[1] 226 nrow(unique(res10PercentNew)) #[1] 452 A.K. - Original Message - From: Cecilia Carmo cecilia.ca...@ua.pt To: arun smartpink...@yahoo.com Cc: Sent: Monday, June 10, 2013 10:19 AM Subject: RE: please check this But I don't want it like this. Once a firm is paired with another, these two firms
Re: [R] Speed up or alternative to 'For' loop
Hello, One way to speed it up is to use a matrix instead of a data.frame. Since data.frames can hold data of all classes, access to their elements is slow. Your data is all numeric, so it can be held in a matrix. The second way below gave me a speed-up by a factor of 50.

system.time({
  for (i in 2:nrow(df)) {
    if (df$TreeID[i] == df$TreeID[i-1]) {
      df$HeightGrowth[i] <- df$Height[i] - df$Height[i-1]
    }
  }
})

system.time({
  df2 <- data.matrix(df)
  for (i in seq_len(nrow(df2))[-1]) {
    if (df2[i, "TreeID"] == df2[i - 1, "TreeID"])
      df2[i, "HeightGrowth"] <- df2[i, "Height"] - df2[i - 1, "Height"]
  }
})

all.equal(df, as.data.frame(df2))  # TRUE

Hope this helps, Rui Barradas

On 10-06-2013 18:28, Trevor Walker wrote: I have a For loop that is quite slow and am wondering if there is a faster option:

df <- data.frame(TreeID=rep(1:500, each=20), Age=rep(seq(1,20,1), 500))
df$Height <- exp(-0.1 + 0.2*df$Age)
df$HeightGrowth <- NA  # initialize with NA
for (i in 2:nrow(df)) {
  if (df$TreeID[i] == df$TreeID[i-1]) {
    df$HeightGrowth[i] <- df$Height[i] - df$Height[i-1]
  }
}

Trevor Walker Email: trevordaviswal...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Combining CSV data
try this:

fileA <- read.csv(text = "Row_ID_CR, Data1,Data2,Data3
1, aa, bb, cc
2, dd, ee, ff", as.is = TRUE)
fileB <- read.csv(text = "Row_ID_N, Src_Row_ID, DataN1
1a, 1, This is comment 1
2a, 1, This is comment 2
3a, 2, This is comment 1
4a, 1, This is comment 3", as.is = TRUE)
# get rid of leading/trailing blanks on comments
fileB$DataN1 <- gsub("^ *| *$", "", fileB$DataN1)
# merge together
result <- merge(fileA, fileB, by.x = "Row_ID_CR", by.y = "Src_Row_ID")
# now partition by Row_ID_CR and aggregate the comments
result2 <- do.call(rbind,
  lapply(split(result, result$Row_ID_CR), function(.grp){
    cbind(.grp[1L, -c(5,6)], comment = paste(.grp$DataN1, collapse = ', '))
  })
)
result2
#  Row_ID_CR Data1 Data2 Data3                                                 comment
#1         1    aa    bb    cc This is comment 1, This is comment 2, This is comment 3
#2         2    dd    ee    ff                                       This is comment 1

On Mon, Jun 10, 2013 at 4:38 PM, Shreya Rawal rawal.shr...@gmail.com wrote: Hello R community, I am trying to combine two CSV files that look like this:

File A
Row_ID_CR, Data1,Data2,Data3
1, aa, bb, cc
2, dd, ee, ff

File B
Row_ID_N, Src_Row_ID, DataN1
1a, 1, This is comment 1
2a, 1, This is comment 2
3a, 2, This is comment 1
4a, 1, This is comment 3

And the output I am looking for, comparing the values of Row_ID_CR and Src_Row_ID, is:

Output
ROW_ID_CR,Data1,Data2,Data3,DataComment1, DataComment2, DataComment3
1, aa, bb, cc, This is comment 1, This is comment 2, This is comment 3
2, dd, ee, ff, This is comment 1

I am a novice R user. I am able to replicate a left join, but I need a bit more in the final result. Thanks!! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. 
[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Speed up or alternative to 'For' loop
How about

for (ir in unique(df$TreeID)) {
  in.ir <- df$TreeID == ir
  df$HeightGrowth[in.ir] <- c(NA, diff(df$Height[in.ir]))
}

Seemed fast enough to me. In R, it is generally good to look for ways to operate on entire vectors or arrays, rather than element by element within them. The diff() function does that in this example. -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062

On 6/10/13 10:28 AM, Trevor Walker trevordaviswal...@gmail.com wrote: I have a For loop that is quite slow and am wondering if there is a faster option:

df <- data.frame(TreeID=rep(1:500, each=20), Age=rep(seq(1,20,1), 500))
df$Height <- exp(-0.1 + 0.2*df$Age)
df$HeightGrowth <- NA  # initialize with NA
for (i in 2:nrow(df)) {
  if (df$TreeID[i] == df$TreeID[i-1]) {
    df$HeightGrowth[i] <- df$Height[i] - df$Height[i-1]
  }
}

Trevor Walker Email: trevordaviswal...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] please check this
Sorry, I forgot to paste some lines and change the names:

res10Percent <- fun1(final3New, 0.1, 200)
res10PercentSub1 <- subset(res10Percent[duplicated(res10Percent) | duplicated(res10Percent, fromLast=TRUE), ], dummy==1)
indx1 <- as.numeric(row.names(res10PercentSub1))
res10PercentSub2 <- res10PercentSub1[order(res10PercentSub1$dimension), ]
indx11 <- as.numeric(row.names(res10PercentSub2))
names(indx11) <- (seq_along(indx11) - 1) %/% 2 + 1
res10PercentSub3 <- res10Percent[c(indx11, indx11 + 1), ]
res10PercentSub3$id <- names(c(indx11, indx11 + 1))
res10PercentSub4 <- do.call(rbind, lapply(split(res10PercentSub3, res10PercentSub3$id), function(x) {
  x1 <- x[-1, ]
  x2 <- x1[which.max(abs(x1$dimension[1] - x1$dimension[-1])) + 1, ]
  x3 <- x[x$dummy==1, ][which.min(abs(as.numeric(row.names(x[x$dummy==1, ])) - as.numeric(row.names(x2)))), ]
  rbind(x3, x2)
}))
row.names(res10PercentSub4) <- gsub(".*\\.", "", row.names(res10PercentSub4))  # forgot
res10PercentSub0 <- subset(res10Percent[duplicated(res10Percent) | duplicated(res10Percent, fromLast=TRUE), ], dummy==0)
indx0 <- as.numeric(row.names(res10PercentSub0))
res10PercentSub20 <- res10PercentSub0[order(res10PercentSub0$dimension), ]
indx00 <- as.numeric(row.names(res10PercentSub20))
names(indx00) <- (seq_along(indx00) - 1) %/% 2 + 1
res10PercentSub30 <- res10Percent[c(indx00 - 1, indx00), ]
res10PercentSub30$id <- names(c(indx00 - 1, indx00))
res10PercentSub40 <- do.call(rbind, lapply(split(res10PercentSub30, res10PercentSub30$id), function(x) {
  x1 <- subset(x, dummy==1)
  x2 <- subset(x, dummy==0)
  x3 <- x1[which.max(abs(x1$dimension - unique(x2$dimension))), ]
  x4 <- x2[which.min(abs(as.numeric(row.names(x3)) - as.numeric(row.names(x2)))), ]
  rbind(x3, x4)
}))
row.names(res10PercentSub40) <- gsub(".*\\.", "", row.names(res10PercentSub40))
indxNew <- sort(as.numeric(c(row.names(res10PercentSub4), row.names(res10PercentSub40))))
#res10PercentSub4
res10PercentFinal <- res10Percent[-indxNew, ]
dim(res10PercentFinal)
#[1] 454 5
nrow(subset(res10PercentFinal, dummy==0))
#[1] 227
nrow(subset(res10PercentFinal, dummy==1))
#[1] 227
nrow(unique(res10PercentFinal))

A.K.
- Original Message -
From: Cecilia Carmo cecilia.ca...@ua.pt
To: arun smartpink...@yahoo.com
Sent: Monday, June 10, 2013 5:48 PM
Subject: RE: please check this

Error message:
Error in row.names(res10PercentSub5) : object 'res10PercentSub5' not found

From: arun [smartpink...@yahoo.com]
Sent: Monday, 10 June 2013 22:05
To: Cecilia Carmo
Cc: R help
Subject: Re: please check this

Hi,
Try this:

res10Percent <- fun1(final3New, 0.1, 200)
res10PercentSub1 <- subset(res10Percent[duplicated(res10Percent) | duplicated(res10Percent, fromLast=TRUE), ], dummy==1)
indx1 <- as.numeric(row.names(res10PercentSub1))
res10PercentSub2 <- res10PercentSub1[order(res10PercentSub1$dimension), ]
indx11 <- as.numeric(row.names(res10PercentSub2))
names(indx11) <- (seq_along(indx11) - 1) %/% 2 + 1
res10PercentSub3 <- res10Percent[c(indx11, indx11 + 1), ]
res10PercentSub3$id <- names(c(indx11, indx11 + 1))
res10PercentSub4 <- do.call(rbind, lapply(split(res10PercentSub3, res10PercentSub3$id), function(x) {
  x1 <- x[-1, ]
  x2 <- x1[which.max(abs(x1$dimension[1] - x1$dimension[-1])) + 1, ]
  x3 <- x[x$dummy==1, ][which.min(abs(as.numeric(row.names(x[x$dummy==1, ])) - as.numeric(row.names(x2)))), ]
  rbind(x3, x2)
}))
res10PercentSub0 <- subset(res10Percent[duplicated(res10Percent) | duplicated(res10Percent, fromLast=TRUE), ], dummy==0)
indx0 <- as.numeric(row.names(res10PercentSub0))
res10PercentSub20 <- res10PercentSub0[order(res10PercentSub0$dimension), ]
indx00 <- as.numeric(row.names(res10PercentSub20))
names(indx00) <- (seq_along(indx00) - 1) %/% 2 + 1
res10PercentSub30 <- res10Percent[c(indx00 - 1, indx00), ]
res10PercentSub30$id <- names(c(indx00 - 1, indx00))
res10PercentSub40 <- do.call(rbind, lapply(split(res10PercentSub30, res10PercentSub30$id), function(x) {
  x1 <- subset(x, dummy==1)
  x2 <- subset(x, dummy==0)
  x3 <- x1[which.max(abs(x1$dimension - unique(x2$dimension))), ]
  x4 <- x2[which.min(abs(as.numeric(row.names(x3)) - as.numeric(row.names(x2)))), ]
  rbind(x3, x4)
}))
row.names(res10PercentSub40) <- gsub(".*\\.", "", row.names(res10PercentSub40))
indxNew <-
sort(as.numeric(c(row.names(res10PercentSub5), row.names(res10PercentSub40))))
res10PercentFinal <- res10Percent[-indxNew, ]
dim(res10PercentFinal)
#[1] 454 5
nrow(subset(res10PercentFinal, dummy==0))
#[1] 227
nrow(subset(res10PercentFinal, dummy==1))
#[1] 227
nrow(unique(res10PercentFinal))
#[1] 454
which(duplicated(res10Percent) | duplicated(res10Percent, fromLast=TRUE))
# [1] 113 117 123 125 153 157 187 189 207 213 223 235 265 267 269 275 276 278 279
#[20] 283 293 301 303 305 309 317 327 331 335 339 341 343 347 351 367 369 371 379
#[39] 385 399 407 413 415 417 429 437 441 453 459 461 471 473 477 479 501 505
res10Percent[c(113:114, 117:118), ]
#         firm year industry dummy dimension
#113 500221723 2005       26     1      3147
#114 500601429 2005       26     0      3076
#117 500221723 2005       26     1      3147
#118 502668920 2005       26     0      3249
Re: [R] help needed! RMSE
mansor nad nadsim88 at hotmail.com writes:

 i need HELPPP!! how do i calculate the RMSE value for two GEV models? The first GEV is one where the three parameters are constant. The 2nd GEV model is a 4-parameter model where the location parameter is allowed to vary linearly with respect to time while the other parameters are held constant. is there any programming code for this? i really really need help. please reply to me as soon as possible. thanks in advance.

Have you read the posting guide (URL/link at the bottom of every posting at this list)? Can you provide a reproducible example?

It may seem perverse, but urgency ("I need HELP! ... I really really need help ... please reply to me as soon as possible ...") doesn't generally improve your chances of getting help here -- it comes across as shouting. Providing a reproducible example not only makes it easier for people to answer and improves the chances that the answers you get will be the ones you really need; it also demonstrates that you have invested some effort.

You might want to start with this example:

library(fExtremes)
g1 <- gevFit(gevSim())
sqrt(sum(g1@residuals^2))
?gevFit
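A small follow-up sketch building on the starter above (same fExtremes objects; note that RMSE is the root of the *mean* squared residual, so mean() is arguably what you want rather than sum()):

```r
library(fExtremes)

set.seed(1)
g1 <- gevFit(gevSim())      # stationary 3-parameter GEV fit to simulated data
sqrt(mean(g1@residuals^2))  # RMSE of the fit's residuals
```

The same computation applied to the residuals of each of your two fitted models would give the two RMSE values to compare.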
Re: [R] Speed up or alternative to 'For' loop
On Jun 10, 2013, at 10:28 AM, Trevor Walker wrote:

 I have a For loop that is quite slow and am wondering if there is a faster option:

 df <- data.frame(TreeID=rep(1:500, each=20), Age=rep(seq(1,20,1), 500))
 df$Height <- exp(-0.1 + 0.2*df$Age)
 df$HeightGrowth <- NA  # initialize with NA
 for (i in 2:nrow(df)) {
   if (df$TreeID[i] == df$TreeID[i-1]) {
     df$HeightGrowth[i] <- df$Height[i] - df$Height[i-1]
   }
 }

Avoid tests with if(){} else {}. Use vectorized code, possibly with 'ifelse', but in this case you need a function that does calculations within groups. The ave() function with diff() will do it compactly and efficiently:

df <- data.frame(TreeID=rep(1:5, each=4), Age=rep(seq(1,4,1), 5))
df$Height <- exp(-0.1 + 0.2*df$Age)
df$HeightGrowth <- NA  # initialize with NA
df$HeightGrowth <- ave(df$Height, df$TreeID, FUN = function(vec) c(NA, diff(vec)))
df
   TreeID Age   Height HeightGrowth
1       1   1 1.105171           NA
2       1   2 1.349859    0.2446879
3       1   3 1.648721    0.2988625
4       1   4 2.013753    0.3650314
5       2   1 1.105171           NA
6       2   2 1.349859    0.2446879
7       2   3 1.648721    0.2988625
8       2   4 2.013753    0.3650314
9       3   1 1.105171           NA
10      3   2 1.349859    0.2446879
11      3   3 1.648721    0.2988625
12      3   4 2.013753    0.3650314
13      4   1 1.105171           NA
14      4   2 1.349859    0.2446879
15      4   3 1.648721    0.2988625
16      4   4 2.013753    0.3650314
17      5   1 1.105171           NA
18      5   2 1.349859    0.2446879
19      5   3 1.648721    0.2988625
20      5   4 2.013753    0.3650314

(On my machine it was over six times as fast as the if-based code from Arun.)

--
David Winsemius
Alameda, CA, USA
Re: [R] Speed up or alternative to 'For' loop
Sorry, it looks like I was hasty. Absent another dumb mistake, the following should do it. The request was for differences, i.e., the amount of growth from one period to the next, separately for each tree.

for (ir in unique(df$TreeID)) {
  in.ir <- df$TreeID == ir
  df$HeightGrowth[in.ir] <- c(NA, diff(df$Height[in.ir]))
}

And this gives the same result as Rui Barradas' previous response.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 6/10/13 2:51 PM, MacQueen, Don macque...@llnl.gov wrote:

 How about

 for (ir in unique(df$TreeID)) {
   in.ir <- df$TreeID == ir
   df$HeightGrowth[in.ir] <- cumsum(df$Height[in.ir])
 }

 Seemed fast enough to me. In R, it is generally good to look for ways to operate on entire vectors or arrays, rather than element by element within them. The cumsum() function does that in this example.

 -Don
Re: [R] Combining CSV data
Hi,
Try this:

dat1 <- read.table(text="
Row_ID_CR, Data1, Data2, Data3
1, aa, bb, cc
2, dd, ee, ff
", sep=",", header=TRUE, stringsAsFactors=FALSE)

dat2 <- read.table(text="
Row_ID_N, Src_Row_ID, DataN1
1a, 1, This is comment 1
2a, 1, This is comment 2
3a, 2, This is comment 1
4a, 1, This is comment 3
", sep=",", header=TRUE, stringsAsFactors=FALSE)

library(stringr)
dat2$DataN1 <- str_trim(dat2$DataN1)
res <- merge(dat1, dat2, by.x=1, by.y=2)
res1 <- res[,-5]
library(plyr)
res2 <- ddply(res1, .(Row_ID_CR, Data1, Data2, Data3), summarize, DataN1=list(DataN1))
res2
#  Row_ID_CR Data1 Data2 Data3                                                   DataN1
#1         1    aa    bb    cc This is comment 1, This is comment 2, This is comment 3
#2         2    dd    ee    ff                                        This is comment 1

res3 <- data.frame(res2[,-5], t(apply(do.call(rbind, res2[,5]), 1, function(x) {x[duplicated(x)] <- NA; x})))
colnames(res3)[grep("X", colnames(res3))] <- paste0("DataComment", gsub("[[:alpha:]]", "", colnames(res3)[grep("X", colnames(res3))]))
res3
#  Row_ID_CR Data1 Data2 Data3      DataComment1      DataComment2      DataComment3
#1         1    aa    bb    cc This is comment 1 This is comment 2 This is comment 3
#2         2    dd    ee    ff This is comment 1                NA                NA

A.K.

- Original Message -
From: Shreya Rawal rawal.shr...@gmail.com
To: r-help@r-project.org
Sent: Monday, June 10, 2013 4:38 PM
Subject: [R] Combining CSV data

Hello R community,

I am trying to combine two CSV files that look like this:

File A
Row_ID_CR, Data1, Data2, Data3
1, aa, bb, cc
2, dd, ee, ff

File B
Row_ID_N, Src_Row_ID, DataN1
1a, 1, This is comment 1
2a, 1, This is comment 2
3a, 2, This is comment 1
4a, 1, This is comment 3

And the output I am looking for is, comparing the values of Row_ID_CR and Src_Row_ID:

Output
ROW_ID_CR, Data1, Data2, Data3, DataComment1, DataComment2, DataComment3
1, aa, bb, cc, This is comment 1, This is comment 2, This is comment 3
2, dd, ee, ff, This is comment 1

I am a novice R user; I am able to replicate a left join, but I need a bit more in the final result.

Thanks!!
Re: [R] Speed up or alternative to 'For' loop
Well, speaking of hasty...

This will also do it, provided that each tree's initial height is less than the previous tree's final height. In principle, not a safe assumption, but it might be OK depending on where the data came from.

df$delta <- c(NA, diff(df$Height))
df$delta[df$delta < 0] <- NA

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 6/10/13 2:51 PM, MacQueen, Don macque...@llnl.gov wrote:

 How about

 for (ir in unique(df$TreeID)) {
   in.ir <- df$TreeID == ir
   df$HeightGrowth[in.ir] <- cumsum(df$Height[in.ir])
 }

 Seemed fast enough to me. In R, it is generally good to look for ways to operate on entire vectors or arrays, rather than element by element within them. The cumsum() function does that in this example.

 -Don
Re: [R] Speed up or alternative to 'For' loop
Hi,
Some speed comparisons:

df <- data.frame(TreeID=rep(1:6000, each=20), Age=rep(seq(1,20,1), 6000))
df$Height <- exp(-0.1 + 0.2*df$Age)
df1 <- df
df3 <- df
library(data.table)
dt1 <- data.table(df)
df$HeightGrowth <- NA

system.time({  # Rui's 2nd function
  df2 <- data.matrix(df)
  for (i in seq_len(nrow(df2))[-1]) {
    if (df2[i, "TreeID"] == df2[i - 1, "TreeID"])
      df2[i, "HeightGrowth"] <- df2[i, "Height"] - df2[i - 1, "Height"]
  }
})
#   user  system elapsed
#  1.108   0.000   1.109

system.time({for (ir in unique(df$TreeID)) {  # Don's first function
  in.ir <- df$TreeID == ir
  df$HeightGrowth[in.ir] <- c(NA, diff(df$Height[in.ir]))
}})
#   user  system elapsed
#100.004   0.704 100.903

system.time({df3$delta <- c(NA, diff(df3$Height))  ## Don's 2nd function
  df3$delta[df3$delta < 0] <- NA})  # winner
#   user  system elapsed
#  0.016   0.000   0.014

system.time(df1$HeightGrowth <- ave(df1$Height, df1$TreeID, FUN = function(vec) c(NA, diff(vec))))  # David's
#   user  system elapsed
#  0.136   0.000   0.137

system.time(dt1[, HeightGrowth := c(NA, diff(Height)), by = TreeID])
#   user  system elapsed
#  0.076   0.000   0.079

identical(df1, as.data.frame(dt1))
#[1] TRUE
identical(df1, df)
#[1] TRUE
head(df1, 2)
#  TreeID Age   Height HeightGrowth
#1      1   1 1.105171           NA
#2      1   2 1.349859    0.2446879
head(df2, 2)
#     TreeID Age   Height HeightGrowth
#[1,]      1   1 1.105171           NA
#[2,]      1   2 1.349859    0.2446879

A.K.
- Original Message -
From: Trevor Walker trevordaviswal...@gmail.com
To: r-help@r-project.org
Sent: Monday, June 10, 2013 1:28 PM
Subject: [R] Speed up or alternative to 'For' loop

I have a For loop that is quite slow and am wondering if there is a faster option:

df <- data.frame(TreeID=rep(1:500, each=20), Age=rep(seq(1,20,1), 500))
df$Height <- exp(-0.1 + 0.2*df$Age)
df$HeightGrowth <- NA  # initialize with NA
for (i in 2:nrow(df)) {
  if (df$TreeID[i] == df$TreeID[i-1]) {
    df$HeightGrowth[i] <- df$Height[i] - df$Height[i-1]
  }
}

Trevor Walker
Email: trevordaviswal...@gmail.com
Re: [R] Apply a PCA to other datasets
Short answer: Yes. Long answer: Your question does not provide specific information; therefore, I cannot provide a specific answer. On Mon, Jun 10, 2013 at 1:23 PM, edelance delanceye...@gmail.com wrote: I have run a PCA on one data set. I need the standard deviation of the first two bands for my analysis. I now want to apply the same PCA rotation I used in the first one to all my other data sets. Is there any way to do this in r? Thanks. -- View this message in context: http://r.789695.n4.nabble.com/Apply-a-PCA-to-other-datasets-tp4669182.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
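To make the "Yes" concrete, here is a minimal sketch with invented toy data: prcomp() stores the rotation together with the centering/scaling it used, and predict() applies all of them to new data.

```r
set.seed(42)
train  <- as.data.frame(matrix(rnorm(200), ncol = 4))  # data the PCA is fit on
newdat <- as.data.frame(matrix(rnorm(80),  ncol = 4))  # other data set
colnames(newdat) <- colnames(train)        # predict() matches columns by name

pca    <- prcomp(train, center = TRUE, scale. = TRUE)
scores <- predict(pca, newdata = newdat)   # new data rotated into the old PC space
apply(scores[, 1:2], 2, sd)                # SDs of the first two components
```

The key point is to reuse the fitted pca object for every other data set rather than refitting prcomp() on each one.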
Re: [R] Combining CSV data
HI,
I am not sure about your DataN1 column. If there is an identifier to differentiate the comments (in this case 1, 2, 3), then it will be easier to place each one in the correct column. My previous solution is not helpful in situations like these:

dat2 <- read.table(text="
Row_ID_N, Src_Row_ID, DataN1
1a, 1, This is comment 1
2a, 1, This is comment 2
3a, 2, This is comment 2
4a, 1, This is comment 3
", sep=",", header=TRUE, stringsAsFactors=FALSE)

dat3 <- read.table(text="
Row_ID_N, Src_Row_ID, DataN1
1a, 1, This is comment 1
2a, 1, This is comment 2
3a, 2, This is comment 3
4a, 1, This is comment 3
5a, 2, This is comment 2
", sep=",", header=TRUE, stringsAsFactors=FALSE)

library(stringr)
library(plyr)

fun1 <- function(data1, data2){
  data2$DataN1 <- str_trim(data2$DataN1)
  res <- merge(data1, data2, by.x=1, by.y=2)
  res1 <- res[,-5]
  res2 <- ddply(res1, .(Row_ID_CR, Data1, Data2, Data3), summarize, DataN1=list(DataN1))
  Mx1 <- max(sapply(res2[,5], length))
  res3 <- data.frame(res2[,-5], do.call(rbind, lapply(res2[,5], function(x){
    indx <- as.numeric(gsub("[[:alpha:]]", "", x))
    x[match(seq(Mx1), indx)]
  })), stringsAsFactors=FALSE)
  colnames(res3)[grep("X", colnames(res3))] <- paste0("DataComment", gsub("[[:alpha:]]", "", colnames(res3)[grep("X", colnames(res3))]))
  res3
}

fun1(dat1, dat2)
#  Row_ID_CR Data1 Data2 Data3      DataComment1      DataComment2      DataComment3
#1         1    aa    bb    cc This is comment 1 This is comment 2 This is comment 3
#2         2    dd    ee    ff                NA This is comment 2                NA

fun1(dat1, dat3)
#  Row_ID_CR Data1 Data2 Data3      DataComment1      DataComment2      DataComment3
#1         1    aa    bb    cc This is comment 1 This is comment 2 This is comment 3
#2         2    dd    ee    ff                NA This is comment 2 This is comment 3

A.K.
[R] padding specific missing values with NA to allow cbind
Dear list,

I am getting very frustrated with this simple-looking problem.

m1 <- lm(x ~ y, data=mydata)
outliers <- abs(stdres(m1)) > 2
plot(x ~ y, data=mydata)

I would like to plot a simple x,y scatter plot with labels giving custom information displayed for the outliers only, i.e. I would like to define a column mydata$labels for the mydata data frame so that the command

text(mydata$y, mydata$x, labels=mydata$labels)

will label those rows where outliers[i] == TRUE with text but is otherwise blank.

The first problem I have is that, due to some NAs in mydata, length(outliers) < nrow(mydata), and I'm getting in a tangle trying to pad the appropriate rows of outliers.

Thanks,
Rob
[R] Help with R loop for URL download from FRED to create US time series
I am downloading time series data from FRED. I have a working download, but I do not want to write out the download for all 50 states like this:

IDRGSP <- read.table('http://research.stlouisfed.org/fred2/data/IDRGSP.txt', skip=11, header=TRUE)
IDRGSP$DATE <- as.Date(IDRGSP$DATE, '%Y-%m-%d')
IDRGSP$SERIES <- 'IDRGSP'
IDRGSP$DESC <- 'Real Total Gross Domestic Product by State for Idaho, Mil. of, A, NSA, 2012-06-05'

WYRGSP <- read.table('http://research.stlouisfed.org/fred2/data/WYRGSP.txt', skip=11, header=TRUE)
WYRGSP$DATE <- as.Date(WYRGSP$DATE, '%Y-%m-%d')
WYRGSP$SERIES <- 'WYRGSP'
WYRGSP$DESC <- 'Real Total Gross Domestic Product by State for Wyoming, Mil. of, A, NSA, 2012-06-05'

RGSP <- rbind(IDRGSP, WYRGSP)

I want to loop, but I cannot get the paste to work correctly. This is what I am trying; can someone help me figure out the loop so I can build a table for all 50 states?

ab <- c(state.abb)
base <- 'http://research.stlouisfed.org/fred2/data/'
type <- 'RGSP.txt'
tmp <- NULL
for (a in ab) {
  url <- paste(base, a, type, sep='')
  if (is.null(tmp)) tmp <- read.table(url, skip=11, header=TRUE)
  else tmp <- rbind(tmp, read.table(url, skip=11, header=TRUE))
}
tmp

thanks for your help
Re: [R] padding specific missing values with NA to allow cbind
Try adding the argument na.action = na.exclude to your call to lm(). See help(na.exclude) for details.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Rob Forsyth
Sent: Monday, June 10, 2013 2:42 PM
To: r-help@r-project.org
Subject: [R] padding specific missing values with NA to allow cbind

 Dear list,

 I am getting very frustrated with this simple-looking problem.

 m1 <- lm(x ~ y, data=mydata)
 outliers <- abs(stdres(m1)) > 2
 plot(x ~ y, data=mydata)

 I would like to plot a simple x,y scatter plot with labels giving custom information displayed for the outliers only, i.e. I would like to define a column mydata$labels for the mydata data frame so that the command

 text(mydata$y, mydata$x, labels=mydata$labels)

 will label those rows where outliers[i] == TRUE with text but is otherwise blank.

 The first problem I have is that, due to some NAs in mydata, length(outliers) < nrow(mydata), and I'm getting in a tangle trying to pad the appropriate rows of outliers.

 Thanks,
 Rob
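A minimal sketch of that suggestion, with invented toy data (rstandard() is used here as a base-R relative of MASS::stdres): with na.action = na.exclude, residual-based quantities are NA-padded back to the full row count, so they line up with the rows of mydata and can be combined with it safely.

```r
mydata <- data.frame(y = 1:10,
                     x = c(1.2, 2.1, NA, 3.8, 5.3, 5.9, 7.2, NA, 9.0, 9.7))

m1  <- lm(x ~ y, data = mydata, na.action = na.exclude)
res <- rstandard(m1)               # padded with NA at rows 3 and 8

length(res) == nrow(mydata)        # same length, so cbind/assignment works
mydata$labels <- ifelse(!is.na(res) & abs(res) > 2, "outlier", "")

plot(x ~ y, data = mydata)
text(mydata$y, mydata$x, labels = mydata$labels, pos = 3)
```

With the default na.omit instead, res would be two elements short and the assignment to mydata$labels would fail, which is exactly the tangle described in the question.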
Re: [R] Help with R loop for URL download from FRED to create US time series
This should do it for you:

> base <- "http://research.stlouisfed.org/fred2/data/"
> files <- lapply(state.abb, function(.state){
+     cat(.state, "\n")
+     input <- read.table(paste0(base, .state, "RGSP.txt")
+                         , skip = 11
+                         , header = TRUE
+                         , as.is = TRUE
+                         )
+     input$DATE <- as.Date(input$DATE, "%Y-%m-%d")
+     input$SERIES <- paste0(.state, "RGSP")
+     input
+ })
AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD MA MI MN MS MO
MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC SD TN TX UT VT VA WA WV WI WY
> result <- do.call(rbind, files)
> str(result)
'data.frame':   750 obs. of  3 variables:
 $ DATE  : Date, format: "1997-01-01" "1998-01-01" "1999-01-01" "2000-01-01" ...
 $ VALUE : int  122541 126309 130898 132699 133888 137086 140020 146937 150968 153681 ...
 $ SERIES: chr  "ALRGSP" "ALRGSP" "ALRGSP" "ALRGSP" ...
> head(result, 30)
         DATE  VALUE SERIES
1  1997-01-01 122541 ALRGSP
2  1998-01-01 126309 ALRGSP
3  1999-01-01 130898 ALRGSP
4  2000-01-01 132699 ALRGSP
5  2001-01-01 133888 ALRGSP
6  2002-01-01 137086 ALRGSP
7  2003-01-01 140020 ALRGSP
8  2004-01-01 146937 ALRGSP
9  2005-01-01 150968 ALRGSP
10 2006-01-01 153681 ALRGSP
11 2007-01-01 155388 ALRGSP
12 2008-01-01 155870 ALRGSP
13 2009-01-01 148074 ALRGSP
14 2010-01-01 151480 ALRGSP
15 2011-01-01 150330 ALRGSP
16 1997-01-01  37249 AKRGSP
17 1998-01-01  35341 AKRGSP
18 1999-01-01  34967 AKRGSP
19 2000-01-01  34192 AKRGSP
20 2001-01-01  35729 AKRGSP
21 2002-01-01  37111 AKRGSP
22 2003-01-01  36288 AKRGSP
23 2004-01-01  38179 AKRGSP
24 2005-01-01  37774 AKRGSP
25 2006-01-01  39836 AKRGSP
26 2007-01-01  40694 AKRGSP
27 2008-01-01  41039 AKRGSP
28 2009-01-01  44030 AKRGSP
29 2010-01-01  43591 AKRGSP
30 2011-01-01  44702 AKRGSP

On Mon, Jun 10, 2013 at 7:42 PM, arum arumk...@wrdf.org wrote:

 I am downloading time series data from FRED.
--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
Re: [R] Substituting the values on the y-axis
On 06/11/2013 12:26 AM, diddle1...@fastwebnet.it wrote:

 Hello, I plotted a graph in R showing how salinity (in ‰, y-axis) changes with time (in years, x-axis). However, right from the beginning on the Excel spreadsheet the values for salinity appeared as, for example, 35000‰ instead of 35‰, which I guessed must have been a typing error on the website from which I extracted the data (NOAA). Thus, I would now like to substitute these values with the corresponding smaller values, as follows: 25000, 35000 -> 25, 35, and so on. Is there any way I can change this in R, or do I have to modify these numbers before inputting the data into R (for example in Excel)? If so, can anybody tell me how to do either of these?

Hi Emanuela,
I think that the axis.mult function in the plotrix package will do what you want with mult=0.001. Obviously you won't want to display the transformation, so set mult.label="".

Jim
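Alternatively, the values themselves can be rescaled once before plotting, which needs no extra package. A minimal sketch with an invented data frame name (dat) and columns (year, salinity):

```r
dat <- data.frame(year     = 2001:2005,
                  salinity = c(25000, 35000, 34000, 33000, 35500))

dat$salinity <- dat$salinity / 1000   # 35000 -> 35, 25000 -> 25, etc.
plot(salinity ~ year, data = dat, ylab = "Salinity (per mil)")
```

This fixes the numbers in the data rather than only in the axis labels, which matters if the values are also used in later calculations.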
Re: [R] Fwd: Problem with ODBC connection
Given the resounding silence, I would venture to guess that no-one here is interested in troubleshooting ODBC connections to Excel. The problem is most likely in the ODBC driver for Excel (not in R or RODBC), and Excel is NOT a database (so any data format problem is unlikely to be detected).

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
Sent from my phone. Please excuse my brevity.

Christofer Bogaso bogaso.christo...@gmail.com wrote:

 Any response please? Was my question not clear to the list? Please let me know.

 Thanks and regards,

 -- Forwarded message --
 From: Christofer Bogaso bogaso.christo...@gmail.com
 Date: Sat, Jun 8, 2013 at 9:39 PM
 Subject: Re: Problem with ODBC connection
 To: r-help r-help@r-project.org

 Hello All,

 My previous post remains unanswered, probably because the attachment was not working properly. So I am re-posting it.

 My problem is in reading an Excel-2003 file through an ODBC connection using the RODBC package. Let's say I have this Excel file: http://www.2shared.com/document/HS3JeFyW/MyFile.html

 I saved it in my F: drive and tried reading the contents using an RODBC connection:

 library(RODBC)
 MyData <- sqlFetch(odbcConnectExcel("f:/MyFile.xls"), )
 head(MyData, 30)

 However, it looks like the second column (with header 's') is not read properly. Can somebody here explain this bizarre thing? Did I do something wrong in reading it? I would really appreciate it if someone could point out anything that might have gone wrong.

 Thanks and regards,

 On Fri, Jun 7, 2013 at 4:46 PM, Christofer Bogaso bogaso.christo...@gmail.com wrote:

 Hello again, I am having a problem with an ODBC connection using the RODBC package. I am basically trying to read the attached Excel-2003 file using RODBC.
Here is my code:

head(sqlFetch(odbcConnectExcel("d:/1.xls"), ), 30); odbcCloseAll()

[output abridged: a 13-row data frame with columns Criteria, s, d, fd, ffd1, f1, fd2, f2, fd3, f3, and F12-F20; the console formatting did not survive the mail archive, but the second column, 's', is NA in every row even though the spreadsheet contains data there]

Here you can see that the data in the second column could not be read at all. Can somebody point out whether I did something wrong? Thanks and regards,
Re: [R] Fwd: Problem with ODBC connection
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jeff Newmiller
Sent: Monday, June 10, 2013 9:45 PM
To: Christofer Bogaso; r-help
Subject: Re: [R] Fwd: Problem with ODBC connection

[...]

I tried reading your workbook using your code, i.e.

library(RODBC)
MyData <- sqlFetch(odbcConnectExcel('mypath/Myfile.xls'), )
head(MyData, 30)

and got an error message saying that odbcConnectExcel is only usable with 32-bit Windows. I have a 64-bit system, so I can't help you there. But there are many other options in R for reading Excel workbooks. I was able to read your data using the read.xls function from the gdata package. I am not endorsing that package; it just happened to be the first package on my system that I tried. So if you can't read the data one way, try another. You could install and load the sos package and run

findFn('xls')

and you will get all sorts of suggestions.

Hope this is helpful,
Dan

Daniel Nordlund
Bothell, WA USA
Re: [R] Fwd: Problem with ODBC connection
On Tue, 11 Jun 2013 02:19:14 +0545 Christofer Bogaso bogaso.christo...@gmail.com wrote:

Any real answer would be contingent on the reader being provided a reproducible example. Since you don't provide one, there's not much point in an answer. However, to tilt at a windmill: depending on the size and complexity of your data file, it might be easier to simply export the data from Excel as a CSV file and use read.table to bring it into R.

JWDougherty
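The CSV route suggested above bypasses the ODBC driver entirely. A minimal self-contained sketch; the column names and values here are invented for illustration, and in a real workflow the CSV would come from Excel's "Save As" rather than being written from R:

```r
# Simulate the "export from Excel" step by writing a small CSV to a temp file
tmp <- tempfile(fileext = ".csv")
df <- data.frame(Criteria = c("a", "s", "d"), s = c(1.5, 2.0, 3.25))
write.csv(df, tmp, row.names = FALSE)

# read.csv is read.table with header = TRUE and sep = "," preset
MyData <- read.csv(tmp)
head(MyData)
```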
[R] How can we access an element in a structure
Hi, I have a structure that is the result of a function. How can I access the elements in the gradient?

dput(test1)
structure(-1.17782911684913, gradient = structure(c(-0.0571065371783791,
-0.144708170683529), .Dim = 1:2, .Dimnames = list(NULL, c("x1", "x2"))))

test1[[1]]
[1] -1.177829

test1
[1] -1.177829
attr(,"gradient")
              x1         x2
[1,] -0.05710654 -0.1447082

test1["gradient"]
[1] NA

Thanks, Miao
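The gradient here is stored as an attribute of the numeric result, which is why "[" indexing by name returns NA (the vector has no names); attributes are retrieved with attr(). A minimal sketch, rebuilding the object from the dput() output in the question:

```r
# Rebuild the object shown by dput() in the question
test1 <- structure(-1.17782911684913,
                   gradient = structure(c(-0.0571065371783791, -0.144708170683529),
                                        .Dim = 1:2,
                                        .Dimnames = list(NULL, c("x1", "x2"))))

# attr() extracts the attribute as an ordinary 1 x 2 matrix
g <- attr(test1, "gradient")

g[1, "x1"]  # -0.05710654
g[1, "x2"]  # -0.1447082
```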