Re: [R] Partial Derivatives in R
Hi, Paul:

Your example is complicated enough that I don't want to take the time to check it fully. You apply deriv to an exponential divided by a sum of exponentials, and I'm not convinced that your manual "correct way" is actually correct. It looks like you've followed the examples on the deriv help page, so I would be more inclined to trust that result, especially since it matches the answer I got from genD, as follows. In your genD example, x01 and x02 should be x[1] and x[2]:

p1 <- function(x) {
  exp(b00.1+b01.1*x[1]+b02.1*x[2]) /
    (exp(b00.1+b01.1*x[1]+b02.1*x[2]) +
     exp(b00.2+b01.2*x[1]+b02.2*x[2]) +
     exp(b00.3+b01.3*x[1]+b02.3*x[2])) - phat1
}
test <- genD(p1, c(x01, x02))
test$D
           [,1]      [,2]        [,3]       [,4]       [,5]
[1,] -0.2012997 0.1296301 -0.03572875 0.07082898 -0.1261376

The first two components of test$D here match your attr(eval(dp1.dx), "gradient"). The next three are the lower-triangular portion of the matrix of second partials of the function p1, per the genD documentation. The function numericGradient in the maxLik package could also be used for this, I believe, though I won't take the time to test that here.

Hope this helps.
Spencer Graves

Paul Heinrich Dietrich wrote:
Hi Spencer,
Thanks for suggesting the genD function. In attempting it, I have rearranged my function from phat1 ~ ... to ... - 1; it apparently doesn't like the first form :) But when I run it, it tells me the partials are all zero. I'm trying out a simple MNL equation before I expand it to what I'm looking for.
Here is what I tried (and I get different answers from a textbook solution, deriv(), and genD()):

### Variables for an observation
x01 <- rnorm(1,0,1)
x02 <- rnorm(1,0,1)

### Parameters for an observation
b00.1 <- rnorm(1,0,1); b00.2 <- rnorm(1,0,1); b00.3 <- 0
b01.1 <- rnorm(1,0,1); b01.2 <- rnorm(1,0,1); b01.3 <- 0
b02.1 <- rnorm(1,0,1); b02.2 <- rnorm(1,0,1); b02.3 <- 0

### Predicted probabilities for an observation
phat1 <- 0.6
phat2 <- 0.3
phat3 <- 0.1

### "Correct" way to calculate a partial derivative
partial.b01.1 <- phat1 * (b01.1 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b01.2 <- phat2 * (b01.2 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b01.3 <- phat3 * (b01.3 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b01.1; partial.b01.2; partial.b01.3
[1] 0.04288663
[1] -0.1804876
[1] 0.1376010
partial.b02.1 <- phat1 * (b02.1 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b02.2 <- phat2 * (b02.2 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b02.3 <- phat3 * (b02.3 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b02.1; partial.b02.2; partial.b02.3
[1] 0.8633057
[1] 0.3171978
[1] 0.1376010

### Derivatives for MNL
dp1.dx <- deriv(phat1 ~ exp(b00.1+b01.1*x01+b02.1*x02) /
  (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
   exp(b00.3+b01.3*x01+b02.3*x02)), c("x01","x02"))
dp2.dx <- deriv(phat2 ~ exp(b00.2+b01.2*x01+b02.2*x02) /
  (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
   exp(b00.3+b01.3*x01+b02.3*x02)), c("x01","x02"))
dp3.dx <- deriv(phat3 ~ exp(b00.3+b01.3*x01+b02.3*x02) /
  (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
   exp(b00.3+b01.3*x01+b02.3*x02)), c("x01","x02"))
attr(eval(dp1.dx), "gradient")
             x01      x02
[1,] -0.01891354 0.058918
attr(eval(dp2.dx), "gradient")
            x01         x02
[1,] -0.1509395 -0.06258685
attr(eval(dp3.dx), "gradient")
          x01         x02
[1,] 0.169853 0.003668849

library(numDeriv)
dp1.dx <- function(x) {exp(b00.1+b01.1*x01+b02.1*x02) /
  (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
   exp(b00.3+b01.3*x01+b02.3*x02)) -
phat1}
test <- genD(dp1.dx, c(phat1,b00.1,b01.1,b02.1,x01,x02)); test
$D
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
[1,]    0    0    0    0    0    0    0    0    0     0     0     0     0     0
     [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26]
[1,]     0     0     0     0     0     0     0     0     0     0     0     0
     [,27]
[1,]     0

$p
[1] 6

$f0
[1] 0.05185856

$func
function(x) {exp(b00.1+b01.1*x01+b02.1*x02) /
  (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
   exp(b00.3+b01.3*x01+b02.3*x02)) - phat1}

$x
[1]  0.6          1.401890082 -1.304531849  0.062833294 -0.143141379
[6] -0.005995477

$d
[1] 1e-04

$method
[1] "Richardson"

$method.args
$method.args$eps
[1] 1e-04
$method.args$d
[1] 1e-04
$method.args$zero.tol
[1] 1.781029e-05
$method.args$r
[1] 4
$method.args$v
[1] 2

attr(,"class")
[1] "Darray"

spencerg wrote:
Have you considered genD{numDeriv}? If this does not answer your question, I suggest you try the RSiteSearch package. The following will open a list of options in a web browser, sorted by package most often found with your
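For reference, here is a self-contained sketch of the corrected approach. The parameter values are assumptions for illustration only; the key point (per Spencer's reply) is that the function must use its argument x, not the global x01/x02, or every partial comes out zero:

```r
library(numDeriv)  # provides genD

# illustrative fixed parameters (assumed values, for reproducibility)
b00.1 <- 0.5; b01.1 <- -0.3; b02.1 <- 0.2
b00.2 <- 0.1; b01.2 <-  0.4; b02.2 <- -0.1
b00.3 <- 0;   b01.3 <-  0;   b02.3 <- 0

p1 <- function(x) {
  e1 <- exp(b00.1 + b01.1*x[1] + b02.1*x[2])
  e2 <- exp(b00.2 + b01.2*x[1] + b02.2*x[2])
  e3 <- exp(b00.3 + b01.3*x[1] + b02.3*x[2])
  e1 / (e1 + e2 + e3)          # multinomial-logit probability for class 1
}

out <- genD(p1, c(0.1, -0.2))  # derivatives at the point (0.1, -0.2)
out$D[1:2]                     # first two entries: dP1/dx1 and dP1/dx2
```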
Re: [R] Add data count to plot?
Hi Mike.

Most plotting functions take parameters that help you add text to the plot. For example:

par(mfrow = c(2, 2))
plot(x, sub = "text at the bottom", main = "text at the top",
     ylab = "text to the side of the axis", xlab = "text below the axis")
plot(y)    # this one is without the text, but
text(...)  # you can add text inside the plot with the text() command

Make sure to read:
?plot
?text
? the name of the plotting function you use

Cheers,
Tal

On Tue, May 12, 2009 at 12:17 AM, MikSmith m...@hsm.org.uk wrote:
Hi
I'm a relative newbie to R and have a query concerning plotting. I have generated a par(mfrow = c(2, 2)) graphic with 10 rose diagrams using circular. What I want to add to each individual plot is "n = x" for the number of data observations in each dataset. How might I go about doing this?
Thanks
Mike
--
View this message in context: http://www.nabble.com/Add-data-count-to-plot--tp23491681p23491681.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

--
My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs:
http://www.r-statistics.com/
http://www.talgalili.com
http://www.biostatistics.co.il

[[alternative HTML version deleted]]
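To Mike's specific question (annotating each of several panels with its count), a minimal sketch, assuming the per-panel data are held in a list — the list contents and 2x2 layout here are illustrative, not from the original post:

```r
datasets <- list(a = rnorm(30), b = rnorm(50), c = rnorm(20), d = rnorm(40))
par(mfrow = c(2, 2))
for (nm in names(datasets)) {
  plot(datasets[[nm]], main = nm)
  # write the sample size just above the current panel
  mtext(paste("n =", length(datasets[[nm]])), side = 3, line = 0, cex = 0.8)
}
```

text(x, y, paste("n =", ...)) works the same way if you prefer the label inside the plot region rather than in the margin.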
[R] (no subject)
Hi,

I have a question about derivatives of functions. If I have the following function,

f <- function(x) x^3-2*x^2+3*x-5

I need a simple function for the derivative of this with respect to 'x', so that I can then sub in values to the derivative function, and use Newton's method of finding a root for this. Please help.

Regards,
Kon Knafelman
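A minimal sketch of what is being asked for, using deriv() to build the derivative and a hand-rolled Newton iteration; the starting value and tolerance below are arbitrary choices:

```r
f  <- function(x) x^3 - 2*x^2 + 3*x - 5
# deriv() with function.arg = TRUE returns a function whose result
# carries the derivative in its "gradient" attribute
df <- deriv(~ x^3 - 2*x^2 + 3*x - 5, "x", function.arg = TRUE)

x <- 2  # starting guess
for (i in 1:50) {
  v    <- df(x)
  fx   <- as.numeric(v)             # f(x)
  dfx  <- attr(v, "gradient")[1]    # f'(x)
  step <- fx / dfx
  x    <- x - step                  # Newton update
  if (abs(step) < 1e-10) break
}
x      # a root of f
f(x)   # should be essentially zero
```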
[R] decimal troubles ?
Dear all,

I have some trouble with the number of decimals in R (currently R 2.9.0). For instance:

options()$digits
[1] 3

lets me hope that I will get three digits where useful when a number is printed. BUT:

44.25+31.1+50
[1] 125

No way to get the right result, 125.35. Can anybody tell me what happens?

Patrick
[R] lattice histogram for multiple variables : adjusting x axis
Hello all,

I have a large data frame and I want to look at the distribution of each variable very quickly by plotting an individual histogram for each variable. I'd like to do so using lattice. Here is a small example using the iris data set:

histogram(as.formula(paste("~", paste(colnames(iris[,!sapply(iris,is.factor)]), collapse="+"))),
          data=iris[,!sapply(iris,is.factor)])

However, in this graphic the x-axis and the breaks are the same for all 4 variables. I would like them to be adjusted to the data in each panel separately. I have tried using the scales argument to no avail:

histogram(
  as.formula(paste("~", paste(colnames(iris[,!sapply(iris,is.factor)]), collapse="+"))),
  data=iris[,!sapply(iris,is.factor)],
  scales=list(x=list(relation="free"))
)

How can I individualize the x-axis and breaks for each individual panel? Thanks for any help.

David Gouache
ARVALIS - Institut du végétal
Station de La Minière
78280 Guyancourt
Tel: 01.30.12.96.22 / Port: 06.86.08.94.32
Re: [R] decimal troubles ?
Patrick Giraudoux wrote:
I have some trouble with the number of decimals in R (currently R 2.9.0). For instance:
options()$digits
[1] 3
lets me hope that I will get three digits where useful when a number is printed. BUT:
44.25+31.1+50
[1] 125
No way to get the right result, 125.35.

It says "digits", not "decimals":

(44.25+31.1+50)/100
[1] 1.25

Dieter
Re: [R] decimal troubles ?
Tena koe Patrick

If you want more than three digits, change the options:

options(digits=7)
44.25+31.1+50
[1] 125.35

HTH
Peter Alspach

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Patrick Giraudoux
Sent: Tuesday, 12 May 2009 8:08 p.m.
To: r-help@r-project.org
Subject: [R] decimal troubles ?

Dear all,
I have some trouble with the number of decimals in R (currently R 2.9.0). For instance:
options()$digits
[1] 3
lets me hope that I will get three digits where useful when a number is printed. BUT:
44.25+31.1+50
[1] 125
No way to get the right result, 125.35. Can anybody tell me what happens?
Patrick
Re: [R] How to display data content on a only row?
Thom_249 wrote:
I have data like this:
[1] 16.800 6.533 5.067 3.933 2.200 1.667
[7] 1.200 1.067 0.733 0.667
And I want all these data printed on 4 rows instead of 8, and it would be great without the [x].

First look would be cat(). Second, tell us where you got the data from. Matrix? Many vectors?

Dieter
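A minimal sketch of the cat() approach Dieter suggests, assuming the values live in a plain numeric vector (the example values are taken from the question; adjust fill to control the line width):

```r
x <- c(16.800, 6.533, 5.067, 3.933, 2.200, 1.667,
        1.200, 1.067, 0.733, 0.667)
# print without the [i] index labels, wrapping lines at ~40 characters
cat(formatC(x, format = "f", digits = 3), fill = 40)
```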
Re: [R] (no subject)
Kon Knafelman wrote:
If I have the following function,
f <- function(x) x^3-2*x^2+3*x-5
I need a simple function for the derivative of this with respect to 'x', so that I can then sub in values to the derivative function, and use Newton's method of finding a root for this.

?deriv
?uniroot (if this is not homework)

Dieter
[R] how the break is calculated by R?
Hi all:

As to hist, the help file says: "R's default with equi-spaced breaks (also the default) is to plot the counts in the cells defined by breaks." I want to know how the breaks are calculated by R. In other words:

break width = (max - min) / (number of groups)

but how is the number of groups calculated by R?

Thanks!
[R] Bayes newby
I have to set up a simple classification method (mix of factor and numeric variables) that can iteratively:
- use an expert-opinion prior on item X1
- classify X1 based on the estimated model (training set) and expert opinion (previous requirement)
- update the model with the true classification information (and eventually give some feedback on the expert opinion's contribution)

Where can I start from? Is there a library good for this kind of work? Can someone write down some simple example to kick off this project?

Thanks in advance
Re: [R] times family unavailable in postscript device (Ubuntu Linux)
Hi

This happens because of the way PostScript files are generated, with all metadata in the head of the file, including font information. So you need to predeclare any fonts that you are going to use in a PostScript file. In this case, something like ...

postscript(fonts="Times")

... or, if Times is the ONLY font that will be used in the file, then you can just use ...

postscript(family="Times")

... (and then you don't need to specify the family in the call to plot())

Paul

Paul Johnson wrote:
I'm running Ubuntu 9.04. I could use some advice about fonts in postscript devices.

sessionInfo()
R version 2.9.0 (2009-04-17)
i486-pc-linux-gnu
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base

I can use family="Times" with pdf output, but postscript refuses. It says:

plot(rnorm(10), rnorm(10), family="Times")
Error in axis(side = side, at = at, labels = labels, ...)
: family 'Times' not included in PostScript device

This happens even though Times *appears* to be listed as a valid family:

names(postscriptFonts())
 [1] "serif"                "sans"                 "mono"
 [4] "AvantGarde"           "Bookman"              "Courier"
 [7] "Helvetica"            "Helvetica-Narrow"     "NewCenturySchoolbook"
[10] "Palatino"             "Times"                "URWGothic"
[13] "URWBookman"           "NimbusMon"            "NimbusSan"
[16] "URWHelvetica"         "NimbusSanCond"        "CenturySch"
[19] "URWPalladio"          "NimbusRom"            "URWTimes"
[22] "ComputerModern"       "ComputerModernItalic" "Japan1"
[25] "Japan1HeiMin"         "Japan1GothicBBB"      "Japan1Ryumin"
[28] "Korea1"               "Korea1deb"            "CNS1"
[31] "GB1"

example(postscriptFonts)
pstscF> postscriptFonts()

$serif
$family
[1] "Times"
$metrics
[1] "Times-Roman.afm" "Times-Bold.afm" "Times-Italic.afm" "Times-BoldItalic.afm" "Symbol.afm"
$encoding
[1] "default"
attr(,"class")
[1] "Type1Font"

$sans
$family
[1] "Helvetica"
$metrics
[1] "Helvetica.afm" "Helvetica-Bold.afm" "Helvetica-Oblique.afm" "Helvetica-BoldOblique.afm" "Symbol.afm"
$encoding
[1] "default"
attr(,"class")
[1] "Type1Font"

$mono
$family
[1] "Courier"
$metrics
[1] "Courier.afm" "Courier-Bold.afm" "Courier-Oblique.afm" "Courier-BoldOblique.afm" "Symbol.afm"
$encoding
[1] "default"
attr(,"class")
[1] "Type1Font"

$AvantGarde
$family
[1] "AvantGarde"
$metrics
[1] "agw_.afm" "agd_.afm" "agwo.afm" "agdo.afm" "Symbol.afm"
$encoding
[1] "default"
attr(,"class")
[1] "Type1Font"

$Bookman
$family
[1] "Bookman"
$metrics
[1] "bkl_.afm" "bkd_.afm" "bkli.afm" "bkdi.afm" "Symbol.afm"
$encoding
[1] "default"
attr(,"class")
[1] "Type1Font"

$Courier
$family
[1] "Courier"
$metrics
[1] "Courier.afm" "Courier-Bold.afm" "Courier-Oblique.afm" "Courier-BoldOblique.afm" "Symbol.afm"
$encoding
[1] "default"
attr(,"class")
[1] "Type1Font"

$Helvetica
$family
[1] "Helvetica"
$metrics
[1] "Helvetica.afm" "Helvetica-Bold.afm" "Helvetica-Oblique.afm" "Helvetica-BoldOblique.afm" "Symbol.afm"
$encoding
[1] "default"
attr(,"class")
[1] "Type1Font"

$`Helvetica-Narrow`
$family
[1] "Helvetica-Narrow"
$metrics
[1] "hvn_.afm" "hvnb.afm" "hvno.afm" "hvnbo___.afm" "Symbol.afm"
$encoding
[1] "default"
attr(,"class")
[1] "Type1Font"

$NewCenturySchoolbook
$family
[1] "NewCenturySchoolbook"
$metrics
[1] "ncr_.afm" "ncb_.afm" "nci_.afm" "ncbi.afm" "Symbol.afm"
$encoding
[1] "default"
attr(,"class")
[1] "Type1Font"

$Palatino
$family
[1] "Palatino"
$metrics
[1] "por_.afm" "pob_.afm" "poi_.afm" "pobi.afm" "Symbol.afm"
$encoding
[1] "default"
attr(,"class")
[1] "Type1Font"

$Times
$family
[1] "Times"
$metrics
[1] "Times-Roman.afm" "Times-Bold.afm" "Times-Italic.afm" "Times-BoldItalic.afm" "Symbol.afm"
$encoding
[1] "default"
attr(,"class")
[1] "Type1Font"

$URWGothic
$family
[1] "URWGothic"
$metrics
[1] "a010013l.afm" "a010015l.afm" "a010033l.afm" "a010035l.afm" "s05l.afm"
$encoding
[1] "default"
attr(,"class")
[1] "Type1Font"

$URWBookman
$family
[1] "URWBookman"
$metrics
[1] "b018012l.afm" "b018015l.afm" "b018032l.afm" "b018035l.afm" "s05l.afm"
$encoding
[1] "default"
attr(,"class")
[1] "Type1Font"

$NimbusMon
$family
[1] "NimbusMon"
$metrics
[1] "n022003l.afm" "n022004l.afm" "n022023l.afm" "n022024l.afm" "s05l.afm"
$encoding
[1] "default"
attr(,"class")
[1] "Type1Font"

$NimbusSan
$family
[1] "NimbusSan"
$metrics
[1] "n019003l.afm" "n019004l.afm" "n019023l.afm" "n019024l.afm" "s05l.afm"
$encoding
[1] "default"
attr(,"class")
[1] "Type1Font"

$URWHelvetica
$family
[1] "URWHelvetica"
$metrics
[1] "n019003l.afm" "n019004l.afm" "n019023l.afm" "n019024l.afm"
Re: [R] Plotting colors on a world map
So, this would allow me to assign a color to each of the 2592 rectangular grid cells on a map of the world? I guess I am having trouble finding a suitable method of doing this. Also, to be able to distinguish country boundaries from the grid lines (which are both black), could the grid lines be transparent so that all we see is the map of the world in various colors? I was thinking that under 10 degrees would be light blue, 10-30 degrees blue, 31-40 green, 41-50 yellow, 51-60 dark yellow, 61-70 orange, 71-80 red, and 81-90 dark red. Any help is appreciated in getting this task accomplished.

Thanks

Duncan Murdoch-2 wrote:
On 5/11/2009 10:32 AM, dxc13 wrote:
Hi useR's
I have created a simple map of the world using the following code:
m <- map(xlim=c(-180,180), ylim=c(-90,90))
map.axes()
I then create a grid of dimension 36x72 using the code:
map.grid(m, nx=72, ny=36, labels=FALSE, col="black")
This gives 2592 grid cells. In a separate data set of dimension 36x72, I have 2592 temperature values (some missing values are present) ranging from -20 degrees F to 90 degrees F. My question is, how can I create a reasonable color scheme (low temperatures in light blue to higher temperatures in dark red) and plot the temperature colors in their respective grid cells on the map? Take this data matrix as some example data:
T <- sample(-10:90, 2592, replace=TRUE)
mat <- matrix(T, nrow=36, ncol=72)

The colorspace package can make nice color sequences for this sort of thing, and image() can add them to a plot. In colorspace, look at the examples for diverge_hcl.

Duncan Murdoch
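A minimal sketch of the binning idea using base graphics; the breakpoints and colour names below follow the scheme proposed above, and the grid geometry (72 longitude x 36 latitude cells) comes from the thread:

```r
temps  <- matrix(sample(-20:90, 2592, replace = TRUE), nrow = 36, ncol = 72)
breaks <- c(-Inf, 10, 30, 40, 50, 60, 70, 80, 90)
cols   <- c("lightblue", "blue", "green", "yellow", "gold",
            "orange", "red", "darkred")
# t() because image() runs x along the rows of z; the x/y vectors give
# the cell boundaries of the 72 x 36 longitude/latitude grid
image(x = seq(-180, 180, length.out = 73),
      y = seq(-90, 90, length.out = 37),
      z = t(temps), breaks = breaks, col = cols)
# on top of an existing map() plot, call image(..., add = TRUE) instead,
# then redraw the country outlines with map(add = TRUE)
```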
[R] How to display data content on a only row?
Hi

I'm sorry for my very basic question, but it's so basic that I can't find the answer anywhere... I have data like this:

[1] 16.800 6.533 5.067 3.933 2.200 1.667
[7] 1.200 1.067 0.733 0.667
[1] 35.6113946 6.9576953 4.5271667 2.3744674 1.4735768 1.1751393
[7] 1.2649111 0.7988086 0.7037316 0.7237469
[1] 32.933 15.667 8.133 6.533 2.800 2.267
[7] 1.467 0.733 0.733 0.667
[1] 27.4992641 10.2794293 3.4198301 2.3864698 1.4735768 1.2227993
[7] 1.1872337 0.7037316 1.0327956 0.7237469
[1] 234.533 11.333 5.800 2.933 2.000 1.533
[7] 0.933 0.667 0.267 0.533
[1] 425.7797665 9.5966264 4.2627959 2.4043611 2.0701967 1.5522641
[7] 1.0997835 1.0465362 0.5936168 0.7432234
[1] 430.2667 18.2000 4.5333 2.1333 1.0667
[6] 0.4000 0.0667 0.2667 0.0667 0.0667
[1] 450.8085540 17.1514264 4.7938453 2.5597619 2.2824381 0.9102590
[7] 0.2581989 0.5936168 0.2581989 0.2581989

And I want all these data printed on 4 rows instead of 8, and it would be great without the [x]. How can I do that?

Regards
Thomas
[R] extracting residuals from glm object
Hi All,

Can anybody explain why the following three ways of extracting residuals from a glm object give me different outputs?

idv = runif(1000,0,1)
dv = rbinom(1000,1,0.5)
d = data.frame(idv,dv)
fit = glm(dv~idv, data=d, family=binomial)
head(residuals(fit))
        1         2         3         4         5         6
 1.216862 -1.161059 -1.156795  1.204759 -1.141068  1.201437
head(fit$residuals)
        1         2         3         4         5         6
 2.096724 -1.962126 -1.952454  2.066224 -1.917492  2.057981
head(d$dv-fit$fitted.values)
         1          2          3          4          5          6
 0.5230655 -0.4903489 -0.4878241  0.5160253 -0.4784855  0.5140869

Regards
Utkarsh
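For what it's worth, the three extractors return different residual types by design. A sketch illustrating this (the equalities in the comments are properties of glm objects that I believe hold for a binomial fit):

```r
idv <- runif(1000, 0, 1)
dv  <- rbinom(1000, 1, 0.5)
fit <- glm(dv ~ idv, family = binomial)

r1 <- residuals(fit)          # deviance residuals (the default type)
r2 <- fit$residuals           # working residuals from the final IWLS step
r3 <- dv - fit$fitted.values  # response residuals, y minus fitted probability

all.equal(r1, residuals(fit, type = "deviance"))   # TRUE
all.equal(r3, residuals(fit, type = "response"))   # TRUE as well
```

So all three are "residuals", just on different scales; see ?residuals.glm for the available types.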
[R] newtons method
Hi,

Does anyone know how to code Newton's method for finding the roots of polynomial functions? I'm not sure whether I need to do this manually, or just code something with a loop to stop when it gets to the desired result.

Thanks guys!
Re: [R] decimal troubles ?
Dieter Menne wrote:
It says "digits", not "decimals":
(44.25+31.1+50)/100
[1] 1.25
Dieter

(44.25+31.1+50)*10
[1] 1254

Strictly speaking, this should print as 1250 (no flames, please, I can live with it).

Dieter
Re: [R] decimal troubles ?
I have some trouble with the number of decimals in R (currently R 2.9.0). For instance:
options()$digits
[1] 3
lets me hope that I will get three digits where useful when a number is printed. BUT:
44.25+31.1+50
[1] 125
No way to get the right result, 125.35. Can anybody tell me what happens?

The digits option specifies the number of significant figures, not the number of decimal places. (The help documentation on the options page doesn't make this clear at the moment, though it does point you to print.default, which describes it as setting significant digits.) Also note that the true value is being stored, so you can retrieve it with explicit formatting, e.g.

x <- 44.25+31.1+50
x                    # 125
print(x, digits=5)   # 125.35

Regards,
Richie.

Mathematical Sciences Unit
HSL
Re: [R] Problems to run SVM regression with e1071
Hi,

Is the variable st character or a factor? What does str(train$st) show?

str(train$st)
 Factor w/ 208 levels "0","0,000","0,0058643",..: 132 134 41 29 42 151 195 195 196 207 ...

Thank you very much Max, with your help I found my error; now it works.

Marlene.
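The usual fix when a numeric column has been read in as a factor (the comma decimal separators visible in the levels suggest a locale issue; the example values below are illustrative, not from the thread):

```r
st <- factor(c("0,000", "1,25", "3,5"))   # a factor with comma decimal marks
as.numeric(st)                            # WRONG: returns the level codes 1 2 3
# convert to character first, swap the decimal comma for a point, then parse
as.numeric(sub(",", ".", as.character(st), fixed = TRUE))   # 0.00 1.25 3.50
```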
Re: [R] (no subject)
If I have the following function, f <- function(x) x^3-2*x^2+3*x-5, I need a simple function for the derivative of this with respect to 'x', so that I can then sub in values to the derivative function, and use Newton's method of finding a root for this.

You could take a look at ?deriv, but since the example you presented is just a polynomial, you can probably learn the basic rules of calculus (http://en.wikipedia.org/wiki/Derivative#Derivatives_of_elementary_functions) and differentiate it in your head in less time than it took you to write this help question.

Regards,
Richie.

Mathematical Sciences Unit
HSL
Re: [R] lattice histogram for multiple variables : adjusting x axis
I have a large data frame and I want to look at the distribution of each variable very quickly by plotting an individual histogram for each variable. I'd like to do so using lattice. Here is a small example using the iris data set:

histogram(as.formula(paste("~", paste(colnames(iris[,!sapply(iris,is.factor)]), collapse="+"))),
          data=iris[,!sapply(iris,is.factor)])

However, in this graphic the x-axis and the breaks are the same for all 4 variables. I would like them to be adjusted to the data in each panel separately. I have tried using the scales argument to no avail [...]. How can I individualize the x-axis and breaks for each individual panel?

You're on the right track. If you don't specify a breaks argument, then histogram calculates a common axis and breaks across all the panels. Declaring breaks=NULL will force it to calculate them for each panel:

histogram(
  as.formula(paste("~", paste(colnames(iris[,!sapply(iris,is.factor)]), collapse="+"))),
  data=iris[,!sapply(iris,is.factor)],
  scales=list(x=list(relation="free")),
  breaks=NULL
)

You can also specify a hist-style method, e.g. breaks="Sturges", to get what you want.

Regards,
Richie.

Mathematical Sciences Unit
HSL
Re: [R] decimal troubles ?
Shame on me... I confused digits and decimals. Thanks anyway for making me come back to the English basics...

Patrick

Peter Alspach a écrit :
Tena koe Patrick
If you want more than three digits, change the options:
options(digits=7)
44.25+31.1+50
[1] 125.35
HTH
Peter Alspach
Re: [R] how the break is calculated by R?
As to hist, the help file says: "R's default with equi-spaced breaks (also the default) is to plot the counts in the cells defined by breaks." I want to know how the breaks are calculated by R. In other words, break width = (max - min)/(number of groups), but how is the number of groups calculated by R?

If you look closely at the help page for hist, you'll see that in the Usage section it says breaks="Sturges". Then in the Arguments section it says that when breaks is a character string, it is the name of a method to compute the number of cells. Finally, in the Details section, it tells you to look at the nclass.Sturges help page for the details of Sturges' algorithm. Wikipedia has a good description of common algorithms for calculating the number of breaks: http://en.wikipedia.org/wiki/Histogram#Number_of_bins_and_width

Regards,
Richie.

Mathematical Sciences Unit
HSL
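Sturges' rule itself is simple enough to check directly — a short sketch (nclass.Sturges ships with base R's grDevices):

```r
x <- rnorm(1000)
n <- length(x)
# Sturges: number of cells = ceiling(log2(n) + 1)
ceiling(log2(n) + 1)   # 11 for n = 1000
nclass.Sturges(x)      # the same value
# hist() then rounds these into "pretty" break points, so the actual
# number of bars may differ slightly from the Sturges count
length(hist(x, plot = FALSE)$breaks)
```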
Re: [R] newtons method
Kon Knafelman wrote:
Hi, Does anyone know how to code Newton's method for finding the roots of polynomial functions? I'm not sure whether I need to do this manually, or just code something with a loop to stop when it gets to the desired result.
Thanks guys!

See ?optim for optimization methods.

Uwe Ligges
[R] ROCR: auc and logarithm plot
Hi, I am quite new to R and I have two questions regarding ROCR.

1. I have tried to understand how to extract the area-under-curve value by looking at the ROCR documentation and googling. Still I am not sure if I am doing the right thing. Here is my code; is auc1 the AUC value?

    pred1 <- prediction(resp1, label1)
    perf1 <- performance(pred1, "tpr", "fpr")
    plot(perf1, type="l", col=1)
    auc1 <- performance(pred1, "auc")
    auc1 <- auc1@y.values[[2]]

2. I have to compare two models that have very close ROCs. I'd like to have a more distinguishable plot of the ROCs. So is it possible to have a logarithmic FP axis, which might separate them well? Or to zoom in on the part close to the upper-left corner of the ROC plot? Or any other way to make the ROCs more separate?

Thanks and regards! --Tim
[R] Kumaraswamy distribution
Dear R users, does anyone know how to write functions for the Kumaraswamy distribution in R? I cannot find dkumar, pkumar, etc. in R. Please help. Thanks a lot, Debbie
Re: [R] inconsistent results for axis.POSIXct
I can reproduce it. Can you please send a bug report to R-bugs so that this won't get lost? Thank you, Uwe Ligges

Dan Kelley wrote: Some time ago, I posted a note about what I considered to be a bug in axis.POSIXt() for R 2.8.x, relating to whether timezones in the data are obeyed on the axes. A link to that note, and to a quick and helpful response, is at the following URL: http://www.nabble.com/patch-for-axis.POSIXct-%28related-to-timezones%29-td22338700.html#a22338700 Note that R 2.9.0 has been adjusted to help with this. However, there are still problems. Test code is given below:

    # test code (top panel, OK; bottom panel, incorrect times on x axis)
    par(mar=c(2,2,2,1))
    par(mfrow=c(2,1))
    start <- as.POSIXct("2008-05-01 00:00:00", tz="UTC")
    end <- as.POSIXct("2008-05-01 00:10:00", tz="UTC")
    plot(c(start, end), c(0, 1), ylab="")
    mtext(paste(start, "to", end), side=3)
    start <- as.POSIXct("2008-05-01 00:00:00", tz="UTC")
    end <- as.POSIXct("2008-05-01 01:10:00", tz="UTC")
    plot(c(start, end), c(0, 1), ylab="")
    mtext(paste(start, "to", end), side=3)
[R] how often we use the libraries/packages in R
Dear list members, I am trying to clean up my computer (Windows problems, of course), and I wanted to remove some of the libraries I have installed in R that have hardly been used. Is there any way to see how often each library is used by us? There are many I am not sure about... because they can be used by other packages, etc. Thank you very much in advance. Best wishes, Marta
Re: [R] Read many .csv files into an R session automatically, without specifying filenames
Romain Francois wrote: Hi, something like this perhaps:

    files <- dir(pattern = "\\.csv$")
    for (x in files) {
      assign(sub("\\.csv$", "", x), read.csv(x), envir = .GlobalEnv)
    }

or maybe

    csvs <- Map(read.csv, dir(pattern = "\\.csv$"))

possibly with a correction of the list item names

    names(csvs) <- sub("\\.csv$", "", names(csvs))

which is optional, since the content of foo.csv can be accessed as csvs$foo (or as csvs[['foo', exact=FALSE]]) instead of csvs$foo.csv even without such a correction. vQ

Romain

Mark Na wrote: Hi R-helpers, I would like to read into R all the .csv files that are in my working directory, without having to use a read.csv statement for each file. Each .csv would be read into a separate data frame which would acquire the filename of the .csv. As an example:

    Mark <- read.csv("Mark.csv")

...but the code (or command) would do this automatically for every .csv in the working directory without my specifying each file. I'd appreciate any help or ideas you might have. Thanks! Mark Na
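A variant of Romain's loop that keeps everything in one named list, which is often easier to manage than assign()-ing into the global environment; a sketch:

```r
# read every .csv in the working directory into a named list of data frames
files <- dir(pattern = "\\.csv$")
csvs <- lapply(files, read.csv)
names(csvs) <- sub("\\.csv$", "", files)
# the contents of foo.csv are then available as csvs$foo or csvs[["foo"]]
```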
Re: [R] ROCR: auc and logarithm plot
1. I have tried to understand how to extract the area-under-curve value by looking at the ROCR documentation and googling. Still I am not sure if I am doing the right thing. Here is my code; is auc1 the AUC value?

    pred1 <- prediction(resp1, label1)
    perf1 <- performance(pred1, "tpr", "fpr")
    plot(perf1, type="l", col=1)
    auc1 <- performance(pred1, "auc")
    auc1 <- auc1@y.values[[2]]

If you have only one set of predictions and matching class labels, it would be in auc1@y.values[[1]]. If you have multiple sets (as from cross-validation or bootstrapping), the AUCs would be in auc1@y.values[[1]], auc1@y.values[[2]], etc. You can collect all of them, for example, by unlist(auc1@y.values). Btw, you can use str(auc1) to see the structure of objects.

2. I have to compare two models that have very close ROCs. I'd like to have a more distinguishable plot of the ROCs. So is it possible to have a logarithmic FP axis, which might separate them well? Or to zoom in on the part close to the upper-left corner of the ROC plot? Or any other way to make the ROCs more separate?

To zoom in on a specific part:

    plot(perf1, xlim=c(0,0.2), ylim=c(0.7,1))
    plot(perf2, add=TRUE, lty=2, col='red')

If you want logarithmic axes (though I wouldn't personally do this for a ROC plot), you can set up an empty canvas and add ROC curves to it:

    plot(1,1, log='x', xlim=c(0.001,1), ylim=c(0,1), type='n')
    plot(perf, add=TRUE)

You can adjust all components of the performance plots. See ?plot.performance and the examples in this slide deck: http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt Hope that helps, Tobias
Re: [R] working with groups of labels?
Phillip Porter wrote: At 9:25 PM +1000 5/11/09, Jim Lemon wrote: Hi Phillip, I'm not exactly sure how you are positioning the labels. Is it possible to give us an example with some data that will produce a plot showing what you want? It shouldn't be too hard to do. Jim

Data is something along the lines of:

    variable            score
    Real Bad Stuff          0
    Real                   23
    Bad                    14
    Stuff                  17
    Other Crazy Things      0
    Other                  18
    Crazy                  43
    Things                 13

The basic plot looks like this:

    barplot(score, horiz=TRUE, axes=FALSE, las=2, space=1,
      names.arg=c("Real Bad Stuff","Real","Bad","Stuff",
                  "Other Crazy Things","Other","Crazy","Things"))

I've been trying to make it look the way I want by doing this:

    barplot(score, horiz=TRUE, axes=FALSE, las=1, space=1,
      names.arg=c("","Real","Bad","Stuff","","Other","Crazy","Things"), cex.names=.5)
    mtext(c("Real Bad Stuff","Other Crazy Things"), at=c(8,15.6), side=2,
      line=1.5, adj=1, las=1, cex=.5, font=2)

The mtext is annoying to line up with the actual plot. If there were a way to get it to line up with the variables, that would solve my problems, but I've tried at=variable and at=score (putting in "" for all of the variable names I don't want bolded and left-justified) and neither gets them to line up. Thanks, Phillip

Hi Phillip, try this:

    pporter <- read.csv("pporter.csv")
    pporter
                variable score
    1     Real Bad Stuff     0
    2               Real    23
    3                Bad    14
    4              Stuff    17
    5 Other Crazy Things     0
    6              Other    18
    7              Crazy    43
    8             Things    13
    par(mar=c(5,8,4,2))
    barpos <- barplot(pporter$score, horiz=TRUE, axes=FALSE, las=1, space=1,
      names.arg=c("","Real","Bad","Stuff","","Other","Crazy","Things"), cex.names=.5)
    mtext(c("Real Bad Stuff","Other Crazy Things"), at=barpos[c(1,5)], side=2,
      line=5, adj=0, las=1, cex=.5, font=2)

Jim
Re: [R] how often we use the libraries/packages in R
Marta Rufino wrote: Dear list members, I am trying to clean up my computer (Windows problems, of course), and I wanted to remove some libraries that I have installed in R which have hardly been used. Is there any way to see how often each library is used by us?

No, unless you have an internal protocol of file accesses, which is probably not the case. Uwe Ligges

There are many I am not sure about... because they can be used by other packages, etc. Thank you very much in advance. Best wishes, Marta
Re: [R] Plotting colors on a world map
dxc13 wrote: Hi useRs, I have created a simple map of the world using the following code:

    m <- map(xlim=c(-180,180), ylim=c(-90,90))
    map.axes()

I then create a grid of dimension 36x72 using the code:

    map.grid(m, nx=72, ny=36, labels=FALSE, col="black")

This gives 2592 grid cells. In a separate data set of dimension 36x72, I have 2592 temperature values (some missing values are present) ranging from -20 degrees F to 90 degrees F. My question is: how can I create a reasonable color scheme (low temperatures in light blue to higher temperatures in dark red) and plot the temperature colors in their respective grid cells on the map? Take this data matrix as some example data:

    T <- sample(-10:90, 2592, replace=TRUE)
    mat <- matrix(T, nrow=36, ncol=72)

Thanks in advance, dxc13

Howdy dxc13, try this:

    library(plotrix)
    tempcolor <- color.scale(mat, c(0.5,1), c(0.5,0), c(1,0))

Jim
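If plotrix isn't available, base R can build a comparable light-blue-to-dark-red ramp with colorRampPalette; a sketch (the palette endpoints and the quick image() check, rather than an overlay on the map() output, are my own choices):

```r
# a 64-step ramp from light blue (cold) to dark red (hot)
pal <- colorRampPalette(c("lightblue", "darkred"))(64)

T <- sample(-10:90, 2592, replace = TRUE)
mat <- matrix(T, nrow = 36, ncol = 72)

# image() maps matrix values onto the palette; shown here only as a quick
# check of the color scheme, not positioned on the world map itself
image(t(mat), col = pal, axes = FALSE)
```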
[R] What's the best way to tell a function about relevant fields in data frames
Hi list, I have a function that detects saccadic eye movements in a time series of eye positions sampled at a rate of 250 Hz. This function needs three vectors: x-coordinate, y-coordinate, trial-id. This information is usually contained in a data frame that also has some other fields. The names of the fields are not standardized.

    head(eyemovements)
            time     x      y trial
    51 880446504 53.18 375.73     1
    52 880450686 53.20 375.79     1
    53 880454885 53.35 376.14     1
    54 880459060 53.92 376.39     1
    55 880463239 54.14 376.52     1
    56 880467426 54.46 376.74     1

There are now several possibilities for the signature of the function:

1. Passing the columns separately: detect(eyemovements$x, eyemovements$y, eyemovements$trial) or: with(eyemovements, detect(x, y, trial))
2. Passing the data frame plus the names of the fields: detect(eyemovements, "x", "y", "trial")
3. Passing the data frame plus a formula specifying the relevant fields: detect(eyemovements, ~x+y|trial)
4. Passing a formula and getting the data from the environment: with(eyemovements, detect(~x+y|trial))

I saw instances of all those variants (and others) in the wild. Is there a canonical way to tell a function which fields in a data frame are relevant? What other alternatives are possible? What are the pros and cons of the alternatives? Thanks, Titus
Re: [R] What's the best way to tell a function about relevant fields in data frames
On Tue, May 12, 2009 at 12:18:59PM +0200, Titus von der Malsburg wrote: Is there a canonical way to tell a function which fields in a data frame are relevant? What other alternatives are possible? What are the pros and cons of the alternatives?

Why not simply rearrange your data frames to have standardized column names (see the names() function) and write functions that operate on the standardized format? The change need not be destructive; you can first make a copy of the data. If all data frames have the same sequence of variables (time, x, y), you can just use indices to refer to the columns, e.g. 1 corresponds to the time variable.
Re: [R] What's the best way to tell a function about relevant fields in data frames
You could define a generic detect(obj, ...) that dispatches (using S3):

    detect.formula(fo, data)
    detect.data.frame(data)
    detect.default(x, y, trial)

where the first two call the third, thereby modeling it on lm, a common approach, and giving the user a choice of interface.

On Tue, May 12, 2009 at 6:18 AM, Titus von der Malsburg malsb...@gmail.com wrote: Is there a canonical way to tell a function which fields in a data frame are relevant? What other alternatives are possible? What are the pros and cons of the alternatives? Thanks, Titus
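Gabor's suggestion might be skeletonned out like this; a sketch only, where the body of detect.default is a placeholder (not an actual saccade detector), and the formula method uses an additive one-sided formula such as ~ x + y + trial rather than the conditioning form ~x+y|trial, which model.frame() does not handle directly:

```r
# S3 generic: dispatch on the class of the first argument
detect <- function(obj, ...) UseMethod("detect")

# workhorse: operates on three aligned vectors (placeholder body)
detect.default <- function(obj, y, trial, ...) {
  data.frame(x = obj, y = y, trial = trial)
}

# convenience method: pull the standard columns out of a data frame
detect.data.frame <- function(obj, ...) {
  detect.default(obj$x, obj$y, obj$trial, ...)
}

# formula interface, e.g. detect(~ x + y + trial, data = eyemovements)
detect.formula <- function(obj, data, ...) {
  mf <- model.frame(obj, data)
  detect.default(mf[[1]], mf[[2]], mf[[3]], ...)
}
```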
Re: [R] What's the best way to tell a function about relevant fields in data frames
Hi Zeljko, thanks for your suggestion! On Tue, May 12, 2009 at 12:26:48PM +0200, Zeljko Vrba wrote: Why not simply rearrange your data frames to have standardized column names (see the names() function) and write functions that operate on the standardized format?

Actually, that's what I'm currently doing, and if the code were only for my personal use I would stick with this solution. However, I want to publish my stuff as a package and want to make its use as convenient as possible for the users. The drawbacks of the current solution are: the users have to perform the additional processing step, and the users have to know the correct format of the data frame. Titus
Re: [R] What's the best way to tell a function about relevant fields in data frames
On 12/05/2009 6:18 AM, Titus von der Malsburg wrote: There are now several possibilities for the signature of the function:

1. Passing the columns separately: detect(eyemovements$x, eyemovements$y, eyemovements$trial) or: with(eyemovements, detect(x, y, trial))

I'd choose this one, with one modification described below.

2. Passing the data frame plus the names of the fields: detect(eyemovements, "x", "y", "trial")

I think this is too inflexible. What if you want to temporarily change one variable? You don't want to have to create a whole new data frame; it's better to just substitute in another variable.

3. Passing the data frame plus a formula specifying the relevant fields: detect(eyemovements, ~x+y|trial)

4. Passing a formula and getting the data from the environment: with(eyemovements, detect(~x+y|trial))

Rather than 3 or 4, I would use the more common idiom detect(~x+y|trial, data=eyemovements) (and the formula might be x+y~trial). But I think the formula interface is too general for your needs. What would ~x+y+z|trial mean? I'd suggest something like 1, but using the convention plot.default() uses, where you have x and y arguments, and y can be skipped if x is a matrix/dataframe/formula/list. It uses the xy.coords() function to do the extraction. Duncan Murdoch
[R] xyplot: no ticks for a factor scale?
Hi, I would like to have no ticks on a scale that represents a factor. The tick.number argument from scales does not work in such a situation, as the help page as well as this simple (fairly stupid) code show:

    require(lattice)
    fact <- gl(4, 1, labels=LETTERS[1:4])
    y <- c(1, 4, 3, 2)
    xyplot(y ~ fact, scales=list(x=list(tick.number=0)))

I want to leave out the ticks on the x-axis (removing the labels is easy, of course). What can I do about this? Thanks, Marcin
Re: [R] Kumaraswamy distribution
Dear Debbie, On Tue, May 12, 2009 at 5:37 AM, Debbie Zhang debbie0...@hotmail.com wrote: Does anyone know how to write functions for the Kumaraswamy distribution in R?

Check the CRAN Task View on Probability Distributions: http://cran.r-project.org/web/views/Distributions.html and, in particular, check out the package VGAM. HTH, Jay
Re: [R] Kumaraswamy distribution
dkumar = function(x,a,b) a*b*(x^(a-1))*((1-x^a)^(b-1))
pkumar = function(x,a,b) 1-(1-x^a)^b

(I've based this entirely on the Wikipedia entry on the Kumaraswamy distribution, http://en.wikipedia.org/wiki/Kumaraswamy_distribution, so best to check both my replication of the formulas there and the accuracy of the Wikipedia entry itself.)

On Tue, May 12, 2009 at 6:37 AM, Debbie Zhang debbie0...@hotmail.com wrote: Does anyone know how to write functions for the Kumaraswamy distribution in R? I cannot find dkumar, pkumar, etc. in R. Please help. Thanks a lot, Debbie

-- Mike Lawrence, Graduate Student, Department of Psychology, Dalhousie University. Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~
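Following the same Wikipedia formulas, the quantile and random-generation counterparts fall out of inverting the CDF; again a sketch worth double-checking against a reference:

```r
pkumar <- function(x, a, b) 1 - (1 - x^a)^b

# quantile function: invert p = 1 - (1 - q^a)^b for q
qkumar <- function(p, a, b) (1 - (1 - p)^(1/b))^(1/a)

# random variates by inverse-CDF (inversion) sampling
rkumar <- function(n, a, b) qkumar(runif(n), a, b)

# sanity check: pkumar(qkumar(p)) should recover p, e.g.
# pkumar(qkumar(0.3, 2, 5), 2, 5) returns 0.3
```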
[R] From two colors to 01 sequences
Dear All, perhaps what I am asking is impossible, but I am asking it anyway. I have several PDF files with rows of colored rectangles: red rectangles should be read as 0, green rectangles as 1. No other color exists. Is there some way to have R read the colored rectangles into a matrix or data frame, converting the colors of the rectangles to sequences of 0s and 1s? Thanks in advance, Paul
Re: [R] Partial Derivatives in R
You could also use rSymPy to symbolically differentiate it. For example, based on the semi-automatic differentiation example on the rSymPy home page, http://code.google.com/p/rsympy/#Semi-Automatic_Differentiation , just replace the indented lines there with the indented lines here (I've also added # after each such line in case the email messes up the indentation). The remaining lines are identical to that example. Note that x and bx hold symbolic variables, so one must use [[ ... ]] rather than [ ... ] with them:

    library(rSymPy)
    # next file is sourced from devel version of rSymPy
    source("http://rsympy.googlecode.com/svn/trunk/R/Sym.R")
    n <- 2 #
    x <- lapply(paste("x", 1:n, sep = ""), Sym) #
    sympy(paste("var(", shQuote(paste(x, collapse = " ")), ")"))
    [1] "(x1, x2)"
    b <- matrix(1:9, 3) #
    bx <- list(NA, NA, NA) #
    for(i in 1:3) bx[[i]] <- b[i, 1] + b[i, 2] * x[[1]] + b[i, 3] * x[[2]] #
    result <- bx[[1]] / (bx[[1]] + bx[[2]] + bx[[3]]) #
    g <- sapply(x, deriv, expr = result, USE.NAMES = FALSE)
    g2 <- sapply(g, sympy, USE.NAMES = FALSE)
    g3 <- gsub("x([0-9]+)", "x[\\1]", g2)
    gfun0 <- paste("grad <- function(x, n =", n, ") { c(", paste(g3, collapse = ","), ") }")
    gfun <- eval(parse(text = gfun0))
    gfun
    function(x, n = 2 ) { c(
      -15*(1 + 4*x[1] + 7*x[2])/(6 + 15*x[1] + 24*x[2])**2 + 4/(6 + 15*x[1] + 24*x[2]),
      -24*(1 + 4*x[1] + 7*x[2])/(6 + 15*x[1] + 24*x[2])**2 + 7/(6 + 15*x[1] + 24*x[2]) ) }

etc.

On Tue, May 12, 2009 at 2:25 AM, spencerg spencer.gra...@prodsyse.com wrote: Hi, Paul: Your example is so complicated that I don't want to take the time to check it. You apply deriv to an exponential divided by a sum of exponentials, and I'm not convinced that your manual "Correct way" is actually correct. It looks like you've followed the examples in the deriv help page, so I would be more inclined to trust that, especially since it matched the answer I got from genD, as follows.
Re: [R] xyplot: no ticks for a factor scale?
I would like to have no ticks on a scale that represents a factor. The tick.number argument from scales does not work in such a situation. I want to leave out the ticks on the x-axis (removing the labels is easy, of course). What can I do about this?

Replacing tick.number with tck will do the trick. Regards, Richie. Mathematical Sciences Unit HSL
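In other words, something along these lines, where tck = 0 suppresses the tick marks while keeping the factor labels:

```r
require(lattice)

fact <- gl(4, 1, labels = LETTERS[1:4])
y <- c(1, 4, 3, 2)

# tck = 0 sets the tick length to zero on the x-axis only
xyplot(y ~ fact, scales = list(x = list(tck = 0)))
```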
Re: [R] From two colors to 01 sequences
On Tue, May 12, 2009 at 12:20:56PM +0100, Paul Smith wrote: I have several PDF files with rows of colored rectangles: red rectangles should be read as 0, green rectangles as 1. No other color exists. Is there some way to have R read the colored rectangles into a matrix or data frame, converting the colors of the rectangles to sequences of 0s and 1s?

I would not do it with R, but here's the general approach:

1. Convert the PDF to some raster format at a high enough resolution (DPI), without any kind of compression or anti-aliasing.
2. Use some image-manipulation program to replace red/green with black/white.
3. Save the resulting picture in ASCII PBM format.
4. Parse the resulting PBM and find 0-1 and 1-0 transitions, which will give you the rectangle boundaries.
5. You did not specify the kind of rectangles, nor whether rows are of uniform height, so I assume a uniform grid. Otherwise, position and size might also be relevant[1]; the interpretation is completely up to you.

[1] As in, for example, http://educ.queensu.ca/~fmc/october2001/GoldenArt3.gif (imagine that there are only two colors instead of 4 + black lines).
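Step 4 is the one part that naturally comes back to R; a sketch of parsing an ASCII PBM (P1) file into a 0/1 matrix, assuming steps 1-3 have already produced the file and that it uses the simple layout of magic number, then "width height" on one line, then the bits (the function name is my own):

```r
read_pbm <- function(path) {
  lines <- readLines(path)
  lines <- lines[!grepl("^#", lines)]            # drop PBM comment lines
  stopifnot(lines[1] == "P1")                    # ASCII bitmap magic number
  dims <- scan(text = lines[2], quiet = TRUE)    # width, height
  bits <- scan(text = paste(lines[-(1:2)], collapse = " "), quiet = TRUE)
  matrix(bits, nrow = dims[2], ncol = dims[1], byrow = TRUE)
}
# rle() on each row of the result then locates the 0-1 / 1-0 transitions:
# e.g. rle(m[1, ])$lengths gives the widths of the successive rectangles
```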
Re: [R] ROCR: auc and logarithm plot
Thanks Tobias! A new question: if I want to draw an average ROC from cross-validation, how do I make the bar color the same as the line color? Here is my code:

    plot(perf2, avg="threshold", lty=2, col=2, spread.estimate="stddev", barcol=2)

Even though I specify barcol=2, the bars are still black (the default) instead of red (2). --Tim
[R] Power function for ratio of lognormal means: two equally valid results? [SEC=Unclassified]
Hi All This is a general stats problem that I am dealing with using R, so any help is greater appreciated. I have two lognormal distributions with means M1 and M2. If we have: H0: log(M1/M2)=0 H1: log(M1/M2) !=0 equivalent to log(M1/M2)=log(1+P) where P0 or P0. If we calculate the power for a value of P=0.1 or P=-0.1 (i.e. a 10% difference) and say assume SE{log(M1/M2)}=0.05, and confidence level= 100(1-alpha), alpha=0.05, then how is the power function calculated? As far as I can see we can calculate the power in the two ways given below and if there is no assumed direction to difference between M1 and M2 are not both calculations valid? # P=-0.1 qn - qnorm(p=0.05, mean=0, sd=0.05, lower.tail=T) Power.1 - pnorm(q=qn, mean=log(0.9), sd=0.05, lower.tail=T) # P=0.1 qn - qnorm(p=0.95, mean=0, sd=0.05, lower.tail=T) Power.2 - pnorm(q=qn, mean=log(1.1), sd=0.05, lower.tail=F) print(c(Power.1,Power.2)) [1] 0.6780872 0.6030887 So which value should I use? Or would the average of the two values be appropriate to use? Or is there a fault in my logic? After a quick lit search I contacted a colleague who has written two stats text books and taught at University level for many years and he has not come across this problem and suggested averaging the values. This post is to ask if anyone has come across this pretty basic problem and has a suggested approach. thanks Steve Classification: Unclassified ___ Australian Antarctic Division - Commonwealth of Australia IMPORTANT: This transmission is intended for the addressee only. If you are not the intended recipient, you are notified that use or dissemination of this communication is strictly prohibited by Commonwealth law. If you have received this transmission in error, please notify the sender immediately by e-mail or by telephoning +61 3 6232 3209 and DELETE the message. 
Visit our web site at http://www.antarctica.gov.au/
Re: [R] extracting residuals from glm object
On May 12, 2009, at 3:50 AM, utkarshsinghal wrote: Hi All, Can anybody explain why the following three ways of extracting residuals from a glm object are giving me different outputs: idv = runif(1000,0,1) dv = rbinom(1000,1,0.5) d = data.frame(idv,dv) fit = glm(dv~idv, data=d, family=binomial) head(residuals(fit)) 1 2 3 4 5 6 1.216862 -1.161059 -1.156795 1.204759 -1.141068 1.201437 head(fit$residuals) 1 2 3 4 5 6 2.096724 -1.962126 -1.952454 2.066224 -1.917492 2.057981 head(d$dv-fit$fitted.values) 1 2 3 4 5 6 0.5230655 -0.4903489 -0.4878241 0.5160253 -0.4784855 0.5140869 set.seed(1) idv <- runif(1000, 0, 1) d <- data.frame(idv, dv) fit <- glm(dv ~ idv, data = d, family = binomial) head(fit$residuals) 1 2 3 4 5 6 -1.957016 -1.960477 -1.967029 -1.978074 -1.954949 -1.977749 head(residuals(fit, type = "working")) 1 2 3 4 5 6 -1.957016 -1.960477 -1.967029 -1.978074 -1.954949 -1.977749 head(d$dv - fit$fitted.values) 1 2 3 4 5 6 -0.4890179 -0.4899201 -0.4916190 -0.4944577 -0.4884778 -0.4943746 head(residuals(fit, type = "response")) 1 2 3 4 5 6 -0.4890179 -0.4899201 -0.4916190 -0.4944577 -0.4884778 -0.4943746 See ?glm and ?residuals.glm and read the information there regarding the type of residuals stored in the glm model object as opposed to the multiple types of residuals that can be returned by residuals.glm(). See the references in ?residuals.glm for more information, as per the Details section therein. HTH, Marc Schwartz
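To make the distinction concrete, here is a small self-contained sketch (data and variable names are illustrative, not the original poster's) showing that residuals() defaults to deviance residuals, while fit$residuals holds the working residuals and y minus the fitted values gives the response residuals:

```r
set.seed(42)
x <- runif(100)
y <- rbinom(100, 1, 0.5)
fit <- glm(y ~ x, family = binomial)

# residuals.glm() defaults to type = "deviance"
identical(residuals(fit), residuals(fit, type = "deviance"))

# fit$residuals stores the working residuals from the final IWLS iteration
all.equal(fit$residuals, residuals(fit, type = "working"))

# y minus the fitted probabilities gives the response residuals
all.equal(unname(y - fitted(fit)), unname(residuals(fit, type = "response")))
```

All three comparisons return TRUE, which is exactly the pattern seen in the output above.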
Re: [R] ROCR: auc and logarithm plot
To color the error bars in ROCR the same way as the performance curve, you need to add one more argument (plotCI.col='red') to your plot call: plot(perf2, avg="threshold", lty=2, col=2, spread.estimate="stddev", plotCI.col=2) The use of 'plotCI.col' is an example of the general mechanism ROCR uses to propagate arguments to the components of a plot (also explained in ?plot.performance): Optional graphical parameters to adjust different components of the performance plot. Parameters are directed to their target component by prefixing them with the name of the component (component.parameter, e.g. text.cex). The following components are available: xaxis, yaxis, coloraxis, box (around the plotting region), points, text, plotCI (error bars), boxplot. The names of these components are influenced by the R functions that are used to create them. Thus, par(component) can be used to see which parameters are available for a given component (with the exception of the three axes; use par(axis) here). To adjust the canvas or the performance curve(s), the standard plot parameters can be used without any prefix. Good luck, Tobias On Tue, May 12, 2009 at 1:48 PM, Tim timlee...@yahoo.com wrote: Thanks Tobias! A new question: if I want to draw an average ROC from cross-validation, how do I make the bar color the same as the line color? Here is my code: plot(perf2, avg="threshold", lty=2, col=2, spread.estimate="stddev", barcol=2) Even though I specify barcol=2, the color of the bars is still black, the default, instead of red (2). --Tim --- On Tue, 5/12/09, Tobias Sing tobias.s...@gmail.com wrote: From: Tobias Sing tobias.s...@gmail.com Subject: Re: [R] ROCR: auc and logarithm plot To: timlee...@yahoo.com, r-help@r-project.org Date: Tuesday, May 12, 2009, 5:54 AM 1. I have tried to understand how to extract the area-under-curve value by looking at the ROCR documentation and googling. Still I am not sure if I am doing the right thing. Here is my code; is auc1 the auc value?
pred1 <- prediction(resp1, label1) perf1 <- performance(pred1, "tpr", "fpr") plot(perf1, type="l", col=1) auc1 <- performance(pred1, "auc") auc1 <- auc1@y.values[[2]] If you have only one set of predictions and matching class labels, it would be in auc1@y.values[[1]]. If you have multiple sets (as from cross-validation or bootstrapping), the AUCs would be in auc1@y.values[[1]], auc1@y.values[[2]], etc. You can collect all of them, for example, by unlist(p...@y.values). Btw, you can use str(auc1) to see the structure of objects. 2. I have to compare two models that have very close ROCs. I'd like to have a more distinguishable plot of the ROCs. So is it possible to have a logarithmic FP axis which might separate them well? Or zoom in on the part close to the upper-left corner of the ROC plot? Or any other way to make the ROCs more separate? To zoom in to a specific part: plot(perf1, xlim=c(0,0.2), ylim=c(0.7,1)) plot(perf2, add=TRUE, lty=2, col='red') If you want logarithmic axes (though I wouldn't personally do this for a ROC plot), you can set up an empty canvas and add ROC curves to it: plot(1,1, log='x', xlim=c(0.001,1), ylim=c(0,1), type='n') plot(perf, add=TRUE) You can adjust all components of the performance plots. See ?plot.performance and the examples in this slide deck: http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt Hope that helps, Tobias
Re: [R] From two colors to 01 sequences
Depending on the nature of your pdf file, it may be possible to use the grImport package. I've never used it before, but a quick test seems promising: # create a test picture colorStrip <- function(colors, draw = TRUE) { x <- seq(0, 1 - 1/ncol(colors), length = ncol(colors)) y <- rep(0.5, length(colors)) my.grob <- grid.rect(x = unit(x, "npc"), y = unit(y, "npc"), width = unit(1/ncol(colors), "npc"), height = unit(1, "npc"), just = "left", hjust = NULL, vjust = NULL, default.units = "npc", name = NULL, gp = gpar(fill = rgb(colors[1, ], colors[2, ], colors[3, ]), col = rgb(colors[1, ], colors[2, ], colors[3, ])), draw = draw, vp = NULL) my.grob } colors <- rbind(c(1, 0, 1), c(0, 1, 0), c(0, 0, 0)) pdf("testRGB.pdf") colorStrip(colors) dev.off() # import the pdf file into R library(grImport) PostScriptTrace("testRGB.pdf") test <- readLines("testRGB.pdf.xml") testRead <- readPicture("testRGB.pdf.xml") str(testRead) grid.picture(testRead) # somehow I've lost the fill color in the process?! grep("rgb.+", test, value=TRUE) # this should allow you to find the sequence of red and green rectangles HTH, baptiste On 12 May 2009, at 13:38, Zeljko Vrba wrote: I have got several pdf files with rows of colored rectangles: red rectangles should be read as 0; green rectangles as 1. No other color exists. Is there some way to have R read the colored rectangles into a matrix or data frame, converting the color of the rectangles to sequences of 01? _ Baptiste Auguié School of Physics University of Exeter Stocker Road, Exeter, Devon, EX4 4QL, UK Phone: +44 1392 264187 http://newton.ex.ac.uk/research/emag
[R] SAS PROC SORT nodupkey
Hi, I have the following data and I would like to delete duplicate names; it is almost like SAS PROC SORT nodupkey! Is there any function in R that does this? x1 <- rnorm(11,5,1) x2 <- runif(11,0,1) nam <- paste("A", c(1:4,4,5:9,9), sep=".") mydata <- data.frame(x1,x2) rownames(mydata) <- nam Many thanks in advance, Amor
Re: [R] How to display data content on a only row?
Thom_249 wrote: I got them from a matrix on which I use the apply function to compute the mean column by column: print(apply(matSD,2,mean)) matSD = matrix(round(rnorm(20),2),nrow=4) cat(matSD) print(matSD) dput(matSD) # How to send this matrix to r-help newMat = apply(matSD,2,mean) print(newMat) cat(newMat, "\n") dput(newMat) # Good for r-help questions -- View this message in context: http://www.nabble.com/How-to-display-data-content-on-a-only-row--tp23497437p23501655.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] From two colors to 01 sequences
Thanks, Baptiste and Zeljko. I am attaching here an example of the picture of the rectangles. Paul On Tue, May 12, 2009 at 1:23 PM, baptiste auguie ba...@exeter.ac.uk wrote: Depending on the nature of your pdf file, it may be possible to use the grImport package. I've never used it before, but a quick test seems promising: # create a test picture colorStrip <- function(colors, draw = TRUE) { x <- seq(0, 1 - 1/ncol(colors), length = ncol(colors)) y <- rep(0.5, length(colors)) my.grob <- grid.rect(x = unit(x, "npc"), y = unit(y, "npc"), width = unit(1/ncol(colors), "npc"), height = unit(1, "npc"), just = "left", hjust = NULL, vjust = NULL, default.units = "npc", name = NULL, gp = gpar(fill = rgb(colors[1, ], colors[2, ], colors[3, ]), col = rgb(colors[1, ], colors[2, ], colors[3, ])), draw = draw, vp = NULL) my.grob } colors <- rbind(c(1, 0, 1), c(0, 1, 0), c(0, 0, 0)) pdf("testRGB.pdf") colorStrip(colors) dev.off() # import the pdf file into R library(grImport) PostScriptTrace("testRGB.pdf") test <- readLines("testRGB.pdf.xml") testRead <- readPicture("testRGB.pdf.xml") str(testRead) grid.picture(testRead) # somehow I've lost the fill color in the process?! grep("rgb.+", test, value=TRUE) # this should allow you to find the sequence of red and green rectangles HTH, baptiste On 12 May 2009, at 13:38, Zeljko Vrba wrote: I have got several pdf files with rows of colored rectangles: red rectangles should be read as 0; green rectangles as 1. No other color exists. Is there some way to have R read the colored rectangles into a matrix or data frame, converting the color of the rectangles to sequences of 01?
_ Baptiste Auguié School of Physics University of Exeter Stocker Road, Exeter, Devon, EX4 4QL, UK Phone: +44 1392 264187 http://newton.ex.ac.uk/research/emag
[R] Response surface plot
Dear List, I am trying to plot a graph similar to the attached one from the Minitab manual in R. I have a response Y and three components which systematically vary in their proportions. I have found R methods/packages to plot ternary plots (e.g. plotrix) but nothing which can extend them to a response surface in 3-D. Any help appreciated, Tim Carnus attachment: reponse_surface.png
Re: [R] survival curves for time dependent covariates (was consultation)
I'm writing to ask how I can do survival curves using time-dependent covariates. Which packages do I need to install? This is a very difficult problem statistically. That is, there are not many good ideas for what SHOULD be done. Hence, there are no packages. Almost everything you find in an applied paper (e.g. a medical journal) is wrong. Terry Therneau
[R] plotCI line types for line and for bar
Hi, I was wondering how to specify the line type for the line instead of for the bars. Here is my code: plotCI(x=mcra1avg, uiw=stdev1, type="l", col=2, lty=2) This way, the bar lines are dashed (lty=2) and red (col=2), and the line connecting the centers of the bars is also red (col=2) but solid (lty=1). How can I make the line connecting the bar centers have the same lty as the bars? Thanks and regards! -- View this message in context: http://www.nabble.com/plotCI-line-types-for-line-and-for-bar-tp23501900p23501900.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] SAS PROC SORT nodupkey
On 5/12/2009 8:29 AM, amor Gandhi wrote: Hi, I have the following data and I would like to delete duplicate names; it is almost like SAS PROC SORT nodupkey! Is there any function in R that does this? x1 <- rnorm(11,5,1) x2 <- runif(11,0,1) nam <- paste("A", c(1:4,4,5:9,9), sep=".") mydata <- data.frame(x1,x2) rownames(mydata) <- nam Many thanks in advance, Amor x1 <- rnorm(11,5,1) x2 <- runif(11,0,1) nam <- paste("A", c(1:4,4,5:9,9), sep=".") mydata <- data.frame(nam,x1,x2) mydata[!duplicated(mydata$nam),] nam x1 x2 1 A.1 5.418824 0.04867219 2 A.2 5.658403 0.18398337 3 A.3 4.458279 0.50975887 4 A.4 5.465920 0.16425144 6 A.5 3.769153 0.80380230 7 A.6 6.827979 0.13745441 8 A.7 5.353053 0.91418900 9 A.8 5.409866 0.41879708 10 A.9 5.041249 0.38226152 ?duplicated -- Chuck Cleland, Ph.D. NDRI, Inc. (www.ndri.org) 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 512-0171 (M, W, F) fax: (917) 438-0894
Re: [R] newtons method
Finding polynomial roots is not a problem where one wants quick and dirty code. There are a lot of pitfalls, especially if there are repeated roots, and there has been a lot of work on this problem. See http://en.wikipedia.org/wiki/Category:Root-finding_algorithms . And Uwe may not be aware that optim() is contra-recommended for functions of 1 variable, which seems to be the problem here. But there is ?polyroot JN Message: 130 Date: Tue, 12 May 2009 11:12:51 +0200 From: Uwe Ligges lig...@statistik.tu-dortmund.de Subject: Re: [R] newtons method To: Kon Knafelman konk2...@hotmail.com Cc: r-h...@stat.math.ethz.ch Message-ID: 4a093d93.1020...@statistik.tu-dortmund.de Content-Type: text/plain; charset=ISO-8859-1; format=flowed Kon Knafelman wrote: Hi, Does anyone know how to code Newton's method for finding the roots of polynomial functions? I'm not sure whether I need to do this manually, or just code something with a loop that stops when it gets to the desired result. See ?optim for optimization methods. Uwe Ligges
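For completeness, a minimal Newton iteration in R looks like the sketch below. This is a toy illustration only; as noted above, polyroot() is the robust choice for polynomials, and the cubic used here is just an example, not from the original thread:

```r
# Newton's method for a univariate function, given its derivative.
# Stops when the update step is smaller than tol.
newton <- function(f, fprime, x0, tol = 1e-10, maxit = 100) {
  x <- x0
  for (i in seq_len(maxit)) {
    step <- f(x) / fprime(x)
    x <- x - step
    if (abs(step) < tol) return(x)
  }
  warning("Newton iteration did not converge")
  x
}

f  <- function(x) x^3 - 2*x - 5   # example cubic
fp <- function(x) 3*x^2 - 2       # its derivative
newton(f, fp, x0 = 2)             # approx. 2.094551
Re(polyroot(c(-5, -2, 0, 1)))     # compare: all roots via polyroot()
```

Note that polyroot() takes the coefficients in increasing order of powers, and returns all (possibly complex) roots, which is why it is the safer tool when repeated or complex roots may be present.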
Re: [R] Warning trying to plot -log(log(survival)) (John Sorkin)
John, Nothing about the data set you posted should give issues in any version of the package. The message about points <= 0 usually arises when the survival curve drops to zero, so that log(-log(0)) is being trimmed off the graph by the plot command. This is what should happen. As Thomas pointed out to you earlier, log-log survival curves are not that useful and there are now much better methods. I suspect that the data set you posted and the data set giving the warning are not the same. Terry T.
[R] RES: SAS PROC SORT nodupkey
Amor, You can use the duplicated() function: A[!duplicated(A[, "var"]), ] Best regards, Leandro Lins Marino Centro de Avaliação Fundação CESGRANRIO Rua Santa Alexandrina, 1011 - 2º andar Rio de Janeiro, RJ - CEP: 20261-903 tel: (21) 2103-9600 ext. 236, (21) 8777-7907, lean...@cesgranrio.org.br "He who bears the weight of society is precisely he who obtains the smallest advantages." (SMITH, Adam) Before printing, think of your responsibility and commitment to the ENVIRONMENT. -Original message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of amor Gandhi Sent: Tuesday, 12 May 2009 09:30 To: r-h...@stat.math.ethz.ch Subject: [R] SAS PROC SORT nodupkey Hi, I have the following data and I would like to delete duplicate names; it is almost like SAS PROC SORT nodupkey! Is there any function in R that does this? x1 <- rnorm(11,5,1) x2 <- runif(11,0,1) nam <- paste("A", c(1:4,4,5:9,9), sep=".") mydata <- data.frame(x1,x2) rownames(mydata) <- nam Many thanks in advance, Amor
Re: [R] Power function for ratio of lognormal means: two equally valid results?
On 12-May-09 12:00:50, Steve Candy wrote: Hi All This is a general stats problem that I am dealing with using R, so any help is greatly appreciated. I have two lognormal distributions with means M1 and M2. If we have: H0: log(M1/M2)=0 H1: log(M1/M2) != 0, equivalent to log(M1/M2)=log(1+P) where P>0 or P<0. If we calculate the power for a value of P=0.1 or P=-0.1 (i.e. a 10% difference) and, say, assume SE{log(M1/M2)}=0.05 and confidence level = 100(1-alpha), alpha=0.05, then how is the power function calculated? As far as I can see we can calculate the power in the two ways given below, and if there is no assumed direction to the difference between M1 and M2, are not both calculations valid? # P=-0.1 qn <- qnorm(p=0.05, mean=0, sd=0.05, lower.tail=T) Power.1 <- pnorm(q=qn, mean=log(0.9), sd=0.05, lower.tail=T) # P=0.1 qn <- qnorm(p=0.95, mean=0, sd=0.05, lower.tail=T) Power.2 <- pnorm(q=qn, mean=log(1.1), sd=0.05, lower.tail=F) print(c(Power.1,Power.2)) [1] 0.6780872 0.6030887 So which value should I use? Or would the average of the two values be appropriate? Or is there a fault in my logic? After a quick lit search I contacted a colleague who has written two stats textbooks and taught at university level for many years; he has not come across this problem and suggested averaging the values. This post is to ask if anyone has come across this pretty basic problem and has a suggested approach. thanks Steve The power function is the probability of rejecting H0 for a particular instance of H1 (defined by a specific non-null value of a parameter), considered as a function of that parameter. So the power for H1: +10% is not the same as the power for H1: -10%. You then face the problem of what to do about that when you want to consider "What is the power when H1 is +/-10%?". The answer, in general terms, is whatever best suits the context in which the problem arises.
One general solution which makes a certain kind of sense is: adopt the smaller value of the two results (0.6780872 or 0.6030887). Then you know that the power is at least 0.6030887 if H1 (either +10% or -10%) is true. If your objective is to know at least what power can be achieved in the circumstances, then that answers the question. If 0.6030887 is adequate for your needs, then your investigation is in a satisfactory state. On the other hand, if in the context of the problem you are concerned that there may not be enough power, then you may want to know at most what power can be achieved. In that case, you can achieve at most 0.6780872. If that is inadequate for your needs, then you need to re-consider the investigation as a whole, with a view to increasing the power. You might also be considering a situation in which you are prepared to think as though +/-10% may arise at random with probabilities (P+) and (P-), and are interested in the average power as being representative of what you may expect to achieve overall (even though this will never be true for any particular case). In that case, you may reasonably adopt 0.6780872*(P-) + 0.6030887*(P+). Your colleague's suggestion is equivalent to (P-) = (P+) = 0.5. Additionally, in such a context, you may have different utilities for successfully rejecting H0 when H1: -10% is true, versus successfully rejecting H0 when H1: +10% is true. In that case, you would be thinking of computing the expected utility (the preceding case is tantamount to having equal utilities). Summary: apply the results in the way that best fits what you know about the situation you are investigating and best meets your objectives in that situation. There is no universal answer to your question! Hoping this helps, Ted.
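The "adopt the smaller value" suggestion can be written out in R, reusing the numbers from the original post (se = 0.05, alpha = 0.05, P = +/-0.1):

```r
se    <- 0.05   # SE{log(M1/M2)}
alpha <- 0.05

# One-sided rejection boundaries for log(M1/M2) under H0
q.lo <- qnorm(alpha,     mean = 0, sd = se)   # lower boundary
q.hi <- qnorm(1 - alpha, mean = 0, sd = se)   # upper boundary

# Power under each direction of the alternative
pow.minus <- pnorm(q.lo, mean = log(0.9), sd = se)                      # P = -0.1
pow.plus  <- pnorm(q.hi, mean = log(1.1), sd = se, lower.tail = FALSE)  # P = +0.1

min(pow.minus, pow.plus)  # guaranteed power whichever alternative holds: 0.6030887
max(pow.minus, pow.plus)  # best-case power: 0.6780872
```

The minimum reproduces the conservative "at least this much power" figure, and the maximum the best case, matching the two values printed in the original post.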
E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk Fax-to-email: +44 (0)870 094 0861 Date: 12-May-09 Time: 14:19:41 -- XFMail --
Re: [R] newtons method
John C Nash wrote: Finding polynomial roots is not a problem where one wants quick and dirty code. There are a lot of pitfalls, especially if there are repeated roots, and there has been a lot of work on this problem. See http://en.wikipedia.org/wiki/Category:Root-finding_algorithms . And Uwe may not be aware that optim() is contra-recommended for functions of 1 variable, Has anybody told us anything about just 1 variable? uwe which seems to be the problem here. But there is ?polyroot JN Message: 130 Date: Tue, 12 May 2009 11:12:51 +0200 From: Uwe Ligges lig...@statistik.tu-dortmund.de Subject: Re: [R] newtons method To: Kon Knafelman konk2...@hotmail.com Cc: r-h...@stat.math.ethz.ch Message-ID: 4a093d93.1020...@statistik.tu-dortmund.de Content-Type: text/plain; charset=ISO-8859-1; format=flowed Kon Knafelman wrote: Hi, Does anyone know how to code Newton's method for finding the roots of polynomial functions? I'm not sure whether I need to do this manually, or just code something with a loop that stops when it gets to the desired result. See ?optim for optimization methods. Uwe Ligges
[R] R^2 extraction and autocorrelation/heteroskedasticity on TSLS regression
Hi, I'm performing a TSLS linear multiple regression on annual data from 1971 to 1997. After performing the TSLS regression, I tried to extract the R-squared value using output$r.squared and to perform autocorrelation (Durbin-Watson and Breusch-Godfrey) and heteroskedasticity tests (Breusch-Pagan and Goldfeld-Quandt), but I get error messages. More specifically, this is what I type into R and its response: for R^2: output$r.squared NULL for heteroskedasticity tests: bptest(reg1) Error in terms.default(formula) : no terms component and for the autocorrelation test, when I try: durbin.watson(reg1$residuals, max.lag=10) [1] 1.509 2.520 2.247 2.001 1.743 1.092 1.392 1.439 1.468 1.035 this gives me only the Durbin-Watson values and not the probabilities (p-values). When performing these tests on an lm object I have no problem. So my question is how to extract R^2 from a tsls regression (object) and how to perform autocorrelation and heteroskedasticity tests on a tsls regression. I looked at the sem package but found no answer to my questions. Please, is there anyone who can help me? Thank you in advance
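Since R^2 is not stored in a tsls fit, one option is to compute it by hand as 1 - RSS/TSS from the residuals. A sketch, illustrated with lm() as a runnable stand-in (for the actual problem one would replace fit by the poster's reg1 and y by the original response vector; tsls objects from the sem package also carry a $residuals component, as the durbin.watson call above already uses):

```r
set.seed(1)
x <- rnorm(27)                    # 27 annual observations, like 1971-1997
y <- 1 + 2*x + rnorm(27)
fit <- lm(y ~ x)                  # stand-in for the tsls fit

rss <- sum(fit$residuals^2)       # residual sum of squares
tss <- sum((y - mean(y))^2)       # total sum of squares about the mean
r.squared <- 1 - rss / tss
all.equal(r.squared, summary(fit)$r.squared)  # TRUE: same formula lm uses
```

The same two lines applied to a tsls object's residuals give an R^2-style fit measure, though its interpretation for instrumental-variables estimates is debated.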
[R] import HTML tables
Hello, I was wondering if there is a function in R that imports tables directly from an HTML document. I know there are functions (say, getURL() from {RCurl}) that download the entire page source, but here I am referring to something like Google Docs' importHTML() function (if you don't know this function, go check it out, it's very useful). Anyway, if someone knows of something that does this job, I'd be very grateful if you could let me know. Otherwise, here's a suggestion for R developers: please do write something inspired by Google's importHTML() function. Many thanks, Dimitri
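One existing option is readHTMLTable() from the XML package, which parses every table element on a page into a data frame; the URL below is a placeholder, not a real page:

```r
library(XML)  # install.packages("XML") if needed

# Parse all <table> elements of a page into a named list of data frames
tables <- readHTMLTable("http://example.com/some-page.html")

length(tables)     # number of tables found on the page
head(tables[[1]])  # first table as a data frame
```

A specific table can also be selected directly with the which argument, e.g. readHTMLTable(url, which = 2) for the second table on the page.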
[R] neural network not using all observations
I am exploring neural networks (adding non-linearities) to see if I can get more predictive power than the linear regression model I built. I am using the function nnet and following the example of Venables and Ripley in Modern Applied Statistics with S, pages 246 to 249. I have standardized variables (z-scores) such as assets, age and tenure. I have other variables that are binary (0 or 1). In max_acc_ownr_nwrth_n_med, for example, the variable has a value of 1 if the client's net worth is above the median net worth and a value of 0 otherwise. These are derived variables I created and variables that the regression algorithm has found to be predictive. A regression on the same variables shown below gives me an R-square of about 0.12. I am trying to increase the predictive power of this regression model with a neural network, being careful to avoid overfitting. Similar to Venables and Ripley, I used the following code: library(nnet) dim(coreaff.trn.nn) [1] 5088 8 head(coreaff.trn.nn) hh.iast.y WC_Total_Assets all_assets_per_hh age tenure max_acc_ownr_liq_asts_n_med max_acc_ownr_nwrth_n_med max_acc_ownr_ann_incm_n_med 1 3059448 -0.4692186 -0.4173532 -0.06599001 -1.04747935 0 1 0 2 4899746 3.4854334 4.064 -0.06599001 -0.72540200 1 1 1 3 727333 -0.2677357 -0.4177944 -0.30136473 -0.40332465 1 1 1 4 443138 -0.5295170 -0.6999646 -0.1825 -1.04747935 0 0 0 5 484253 -0.6112205 -0.7306664 0.64013414 0.07979137 1 0 0 6 799054 0.6580506 1.1763114 0.24784295 0.07979137 0 1 1 coreaff.nn1 <- nnet(hh.iast.y ~ WC_Total_Assets + all_assets_per_hh + age + tenure + max_acc_ownr_liq_asts_n_med + + max_acc_ownr_nwrth_n_med + max_acc_ownr_ann_incm_n_med, coreaff.trn.nn, size = 2, decay = 1e-3, + linout = T, skip = T, maxit = 1000, Hess = T) # weights: 26 initial value 12893652845419998.00 iter 10 value 6352515847944854.00 final value 6287104424549762.00 converged summary(coreaff.nn1) a 7-2-1 network with 26 weights options were - skip-layer connections linear output units decay=0.001 b-h1 i1-h1 i2-h1
i3-h1 i4-h1 i5-h1 i6-h1 i7-h1 -21604.84 -2675.80 -5001.90 -1240.16 -335.44 -12462.51 -13293.80 -9032.34 b-h2 i1-h2 i2-h2 i3-h2 i4-h2 i5-h2 i6-h2 i7-h2 210841.52 47296.92 58100.43 -13819.10 -9195.80 117088.99 131939.57 106994.47 b-o h1-o h2-o i1-o i2-o i3-o i4-o i5-o i6-o i7-o 1115190.67 894123.33 -417269.57 89621.84 170268.12 44833.63 59585.05 112405.30 437581.05 244201.69 sum((hh.iast.y - predict(coreaff.nn1))^2) Error: object hh.iast.y not found So I try: sum((coreaff.trn.nn$hh.iast.y - predict(coreaff.nn1))^2) Error: dims [product 5053] do not match the length of object [5088] In addition: Warning message: In coreaff.trn.nn$hh.iast.y - predict(coreaff.nn1) : longer object length is not a multiple of shorter object length Doing a little debugging: pred <- predict(coreaff.nn1) dim(pred) [1] 5053 1 dim(coreaff.trn.nn) [1] 5088 8 So the number of records/cases in pred is 5,053 and the number of records in the input dataset is 5,088. So it looks like the neural network is dropping 35 records. Does anyone have any idea why it would do this? It is most probably because those 35 records are bad data, a pretty common occurrence in the real world. Does anyone know how I can identify the dropped records? If I can do this I can get the dimensions of the input dataset down to 5,053 and then: sum((coreaff.trn.nn$hh.iast.y - predict(coreaff.nn1))^2) would work. A summary of my dataset is: summary(coreaff.trn.nn) hh.iast.y WC_Total_Assets all_assets_per_hh age tenure max_acc_ownr_liq_asts_n_med Min. : 0 Min. :-6.970e-01 Min. :-8.918e-01 Min. :-4.617e+00 Min. :-1.209e+00 Min. :0. 1st Qu.: 565520 1st Qu.:-5.387e-01 1st Qu.:-6.147e-01 1st Qu.:-4.583e-01 1st Qu.:-7.254e-01 1st Qu.:0. Median : 834164 Median :-3.160e-01 Median :-3.718e-01 Median : 9.093e-02 Median :-2.423e-01 Median :0.
Mean : 1060244 Mean : 2.948e-13 Mean : 3.204e-12 Mean :-1.884e-11 Mean :-3.302e-12 Mean :0.4951 3rd Qu.: 1207181 3rd Qu.: 1.127e-01 3rd Qu.: 1.891e-01 3rd Qu.: 5.617e-01 3rd Qu.: 5.629e-01 3rd Qu.:1. Max. :45003160 Max. :
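A likely explanation for the 35 dropped records is that rows containing NA (or NaN) are silently removed by the default na.action = na.omit that nnet's model-frame handling inherits; complete.cases() identifies them. A small self-contained sketch (toy data, not the poster's coreaff.trn.nn; the same two lines at the end apply to any data frame):

```r
library(nnet)
set.seed(1)
d <- data.frame(y = rnorm(100), x = rnorm(100))
d$x[c(3, 50)] <- NA      # plant two missing values

fit <- nnet(y ~ x, data = d, size = 2, linout = TRUE, trace = FALSE)
length(predict(fit))     # 98: the two NA rows were silently dropped

which(!complete.cases(d))   # identifies the dropped rows: 3 and 50
clean <- d[complete.cases(d), ]   # subset on which dims will match predict()
```

Subsetting the original data frame with complete.cases() before comparing it to predict() should make the dimensions agree (here, 5,053 rows in the poster's case).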
[R] R Group on Professional Networking Site
Greetings: There is a LinkedIn R Group. There are job postings, news, as well as other discussions. http://www.linkedin.com/groups?about=gid=77616trk=anet_ug_grppro Regards, Ajit -- _ Ajit Gemunu de Silva Oakland CA 94619 skype: ajit_de_silva
Re: [R] Response surface plot
On 5/12/2009 8:43 AM, Tim Carnus wrote: Dear List, I am trying to plot a graph similar to the attached one from the Minitab manual in R. I have a response Y and three components which systematically vary in their proportions. I have found R methods/packages to plot ternary plots (e.g. plotrix) but nothing which can extend them to a response surface in 3-D. Any help appreciated, I'm not aware of anyone who has done this. The way to do the surface in rgl would be to construct a mesh of triangles using tmesh3d, and set the color of each vertex as part of the material argument. It's a little tricky to get the colors right when they vary by vertex, but the code below gives an example. I would construct the mesh by starting with one triangle and calling subdivision3d, but you may want more control over them. For example: library(rgl) # First create a flat triangle and subdivide it triangle <- c(0,0,0,1, 1,0,0,1, 0.5, sqrt(3)/2, 0, 1) mesh <- tmesh3d(triangle, 1:3, homogeneous=TRUE) mesh <- subdivision3d(mesh, 4, deform=FALSE, normalize=TRUE) # Now get the x and y coordinates and compute the surface height x <- with(mesh, vb[1,]) y <- with(mesh, vb[2,]) z <- x^2 + y^2 mesh$vb[3,] <- z # Now assign colors according to the height; remember that the # colors need to be in the order of mesh$it, not vertex order. vcolors <- rainbow(100)[99*z+1] tricolors <- vcolors[mesh$it] mesh$material <- list(color=tricolors) # Now draw the surface, and a rudimentary frame behind it. shade3d(mesh) triangles3d(matrix(triangle, byrow=TRUE, ncol=4), col="white") quads3d(matrix(c(1,0.5,0.5,1, 0,sqrt(3)/2, sqrt(3)/2,0, 0,0,1,1), ncol=3), col="white") bg3d("gray") Duncan Murdoch
[R] R Group on Professional Networking Site
Greetings: There is a LinkedIn R Group, with job postings, news, and other discussions. http://www.linkedin.com/groups?about=gid=77616trk=anet_ug_grppro Regards, Ajit
[R] Trouble with parametric bootstrap
Hi, I'm having trouble understanding how to construct a random number generator for a parametric bootstrap. My aim is to bootstrap a likelihood ratio statistic (under the null) for a linear model. The function at this point is:

boot.test.n01 <- function(data, indices, maxit=20) {
  y1 <- fit1 + se(e2)*rnorm(314)
  mod1 <- glm(y1 ~ X1 - 1, maxit=maxit)
  y2 <- fit2 + se(e2)*rnorm(314)
  mod2 <- glm(y2 ~ 1, maxit=maxit)
  t <- 2*(logLik(mod1) - logLik(mod2))
  t
}

boot.lrtest.n01 <- boot(data=M1, statistic=boot.test.n01, R=3999, maxit=100,
                        sim="parametric", ran.gen=???, mle=???)

fit1 and fit2 are vectors containing fitted values; se() is the standard error of a residual vector, e2, which I'm using as an estimate of the variance. I'm not sure whether I have constructed boot.test.n01 correctly with respect to the bootstrap dependent variables y1 and y2. Furthermore, I'm rather lost when it comes to how to construct the random number generator and what to use as MLEs (both indicated by ???). I would really appreciate any help. Sincerely, /Anders
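For what it's worth, with sim="parametric" the boot package expects ran.gen(data, mle) to return a simulated data set, and the statistic is then called on that simulated data (no index argument). A minimal sketch under the poster's setup; fit1, fit2, X1, M1, and the residual standard deviation sigma.hat are assumed to exist from the original fits:

```r
library(boot)

# Statistic for parametric resampling: takes only the (simulated) data.
# X1 is the design matrix from the original post (assumed in scope).
boot.stat <- function(data) {
  mod1 <- glm(data$y1 ~ X1 - 1)
  mod2 <- glm(data$y2 ~ 1)
  as.numeric(2 * (logLik(mod1) - logLik(mod2)))
}

# ran.gen draws new responses from the estimated null model;
# mle carries the fitted values and residual sd.
ran.gen.n01 <- function(data, mle) {
  data$y1 <- mle$fit1 + mle$sigma * rnorm(nrow(data))
  data$y2 <- mle$fit2 + mle$sigma * rnorm(nrow(data))
  data
}

boot.lrtest <- boot(data = M1, statistic = boot.stat, R = 3999,
                    sim = "parametric", ran.gen = ran.gen.n01,
                    mle = list(fit1 = fit1, fit2 = fit2, sigma = sigma.hat))
```

This is only a sketch of how the ran.gen/mle pair fits together, not a statement about which null model is appropriate here.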
[R] AFT-model with time-dependent covariates
Dear R-community, Dear Prof. Therneau, I would like to fit an AFT model with time-dependent covariates and right-censored data. Searching the mailing list for information on the subject, I found some old posts which said it didn't work back then. My questions: (1) Has this kind of fitting already been implemented in the survival library in R? (2) If not: are there any alternatives/workarounds to get this job done in R anyway? (3) Can you recommend any other (commercial) software to fit this model? Thanks for your great help, I really appreciate it! All the best, Philipp
Re: [R] newtons method
Uwe, John's comment about the difficulties with finding polynomial roots is even more forceful for a system of polynomials. There are likely numerous roots, some possibly real, and some possibly multiple. Homotopy methods are currently the state of the art for finding all the roots, but beware that they are very time-consuming. For locating the real roots, I have found that a relatively simple approach like multiple random starts works fairly well with a root-finder such as dfsane() in the BB package. However, I don't know of any tests to check whether I have found all the roots. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Uwe Ligges Sent: Tuesday, May 12, 2009 9:23 AM To: John C Nash Cc: r-help@r-project.org Subject: Re: [R] newtons method John C Nash wrote: Finding polynomial roots is not a problem where one wants a quick and dirty code. There are a lot of pitfalls, especially if there are roots that are multiples, and there has been a lot of work on this problem. See http://en.wikipedia.org/wiki/Category:Root-finding_algorithms . And Uwe may not be aware that optim() is contra-recommended for functions of 1 variable, Has anybody told us something about just 1 variable? Uwe which seems to be the problem here.
But there is ?polyroot JN Message: 130 Date: Tue, 12 May 2009 11:12:51 +0200 From: Uwe Ligges lig...@statistik.tu-dortmund.de Subject: Re: [R] newtons method To: Kon Knafelman konk2...@hotmail.com Cc: r-h...@stat.math.ethz.ch Message-ID: 4a093d93.1020...@statistik.tu-dortmund.de Content-Type: text/plain; charset=ISO-8859-1; format=flowed Kon Knafelman wrote: Hi, Does anyone know how to code Newton's method for finding the roots of polynomial functions? I'm not sure whether I need to do this manually, or just code something with a loop that stops when it gets to the desired result. See ?optim for optimization methods. Uwe Ligges
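For completeness, here is what a bare-bones Newton iteration for one real root of a single polynomial looks like. This is a sketch of the textbook method, not a substitute for polyroot(), which handles complex and multiple roots far more robustly:

```r
# Newton's method for one real root of a polynomial.
# coefs: coefficients in increasing order of power, c(a0, a1, a2, ...)
newton_poly <- function(coefs, x0, tol = 1e-10, maxit = 100) {
  # Coefficients of the derivative polynomial: a1, 2*a2, 3*a3, ...
  dcoefs <- coefs[-1] * seq_along(coefs[-1])
  # Horner's rule for evaluating a polynomial at x
  horner <- function(cf, x) {
    s <- 0
    for (a in rev(cf)) s <- s * x + a
    s
  }
  x <- x0
  for (i in seq_len(maxit)) {
    fx <- horner(coefs, x)
    if (abs(fx) < tol) break
    x <- x - fx / horner(dcoefs, x)   # Newton update
  }
  x
}

newton_poly(c(-2, 0, 1), 1)   # root of x^2 - 2, near sqrt(2)
```

Multiple roots slow the iteration to linear convergence and a zero derivative breaks it, which is exactly the kind of pitfall John's comment warns about.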
[R] variability
I am analyzing the voice telephone traffic of some operators. In this type of phenomenon it is useful to consider anomalies in the total duration (in minutes) of conversation, but it is also very important to take account of another variable, the average conversation time: the ratio between the total talk time in minutes and the number of calls. Is there a way to identify abnormal traffic based on these variables? Can one calculate an index of variability that represents the variability of both the conversation duration and the average conversation time?
[R] Fw: Hold out procedures in R software
I am a PhD student at the University of Agriculture, Abeokuta, Nigeria, and also a statistician at the National Horticultural Research Institute, Ibadan, Nigeria. I am currently working on discriminant analysis and I need help on how to carry out Lachenbruch's hold-out procedure using R. Kindly help, because I consider you my last option. Thanks, ARIYO Oludare Samuel
[R] Two-way Anova
Hello, I'm trying to do a comparison between a large-scale (10 L) bottle of liquid and a small-scale (0.5 L) bottle. I have 5 different samples from each, and they are measured over the space of 8 days as % viability; the % viability decreases over time. However, not all 10 samples were measured every day. How would I do a two-way ANOVA on this in R? Thanks for any help. Regards, Al
[R] rbind help
Hi, I have a dataset of orientation and elevation in a dataframe which I am plotting with circular. Orientation is on the range 0-360, but I need to reduce this to 0-180, as the values over 180 are not necessarily correct. I can do

striae$orientation[striae$orientation > 180] - 180

to extract these values, but I then want to combine them back into the original dataframe (with the elevation values). I assume rbind is the thing to use and have tried something like:

test <- rbind((striae$orientation[striae$orientation > 180] - 180),
              (striae$orientation[striae$orientation <= 180]))

Not being a regular R user I suspect the syntax is easy when you know it!! Any help much appreciated. Best wishes, mike
Re: [R] How to display data content on a only row?
Totally awesome! Thank you very much. Thom
Re: [R] Partial Derivatives in R
Thanks for showing me how to use genD correctly and that it matches deriv here. Like you say, there must be a problem with the manual way of doing it, and I will look at it closely. Thanks again.

spencerg wrote: Hi, Paul: Your example is so complicated that I don't want to take the time to check it. You apply deriv to an exponential divided by a sum of exponentials, and I'm not convinced that your manual "Correct way" is actually correct. It looks like you've followed the examples in the deriv help page, so I would be more inclined to trust that, especially since it matched the answer I got from genD, as follows. In your genD example, x01 and x02 should be x[1] and x[2]:

p1 <- function(x) {
  exp(b00.1+b01.1*x[1]+b02.1*x[2]) /
    (exp(b00.1+b01.1*x[1]+b02.1*x[2]) +
     exp(b00.2+b01.2*x[1]+b02.2*x[2]) +
     exp(b00.3+b01.3*x[1]+b02.3*x[2])) - phat1
}
test <- genD(p1, c(x01, x02))
test$D
           [,1]      [,2]        [,3]       [,4]       [,5]
[1,] -0.2012997 0.1296301 -0.03572875 0.07082898 -0.1261376

The first two components of test$D here match your attr(eval(dp1.dx), "gradient"). The next three are the lower triangular portion of the matrix of second partials of the function p1, per the genD documentation. The function numericGradient in the maxLik package could also be used for this, I believe. However, I won't take the time here to test that. Hope this helps. Spencer Graves

Paul Heinrich Dietrich wrote: Hi Spencer, Thanks for suggesting the genD function. In attempting it, I have rearranged my function from phat1 ~ ... to ... - 1; it apparently doesn't like the first one :) But when I run it, it tells me the partials are all zero. I'm trying out a simple MNL equation before I expand it to what I'm looking for.
Here is what I tried (and I get different answers from a textbook solution, deriv(), and genD()):

### Variables for an observation
x01 <- rnorm(1,0,1)
x02 <- rnorm(1,0,1)
### Parameters for an observation
b00.1 <- rnorm(1,0,1)
b00.2 <- rnorm(1,0,1)
b00.3 <- 0
b01.1 <- rnorm(1,0,1)
b01.2 <- rnorm(1,0,1)
b01.3 <- 0
b02.1 <- rnorm(1,0,1)
b02.2 <- rnorm(1,0,1)
b02.3 <- 0
### Predicted Probabilities for an observation
phat1 <- 0.6
phat2 <- 0.3
phat3 <- 0.1
### Correct way to calculate a partial derivative
partial.b01.1 <- phat1 * (b01.1 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b01.2 <- phat2 * (b01.2 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b01.3 <- phat3 * (b01.3 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b01.1; partial.b01.2; partial.b01.3
[1] 0.04288663
[1] -0.1804876
[1] 0.1376010
partial.b02.1 <- phat1 * (b02.1 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b02.2 <- phat2 * (b02.2 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b02.3 <- phat3 * (b02.3 - (b01.1*phat1+b01.2*phat2+b01.3*phat3))
partial.b02.1; partial.b02.2; partial.b02.3
[1] 0.8633057
[1] 0.3171978
[1] 0.1376010
### Derivatives for MNL
dp1.dx <- deriv(phat1 ~ exp(b00.1+b01.1*x01+b02.1*x02) /
  (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
   exp(b00.3+b01.3*x01+b02.3*x02)), c("x01","x02"))
dp2.dx <- deriv(phat2 ~ exp(b00.2+b01.2*x01+b02.2*x02) /
  (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
   exp(b00.3+b01.3*x01+b02.3*x02)), c("x01","x02"))
dp3.dx <- deriv(phat3 ~ exp(b00.3+b01.3*x01+b02.3*x02) /
  (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
   exp(b00.3+b01.3*x01+b02.3*x02)), c("x01","x02"))
attr(eval(dp1.dx), "gradient")
             x01      x02
[1,] -0.01891354 0.058918
attr(eval(dp2.dx), "gradient")
            x01         x02
[1,] -0.1509395 -0.06258685
attr(eval(dp3.dx), "gradient")
          x01         x02
[1,] 0.169853 0.003668849
library(numDeriv)
dp1.dx <- function(x) {exp(b00.1+b01.1*x01+b02.1*x02) /
  (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
   exp(b00.3+b01.3*x01+b02.3*x02)) -
phat1}
test <- genD(dp1.dx, c(phat1,b00.1,b01.1,b02.1,x01,x02)); test
$D
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
[1,]    0    0    0    0    0    0    0    0    0     0     0     0     0     0
     [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26]
[1,]     0     0     0     0     0     0     0     0     0     0     0     0
     [,27]
[1,]     0

$p
[1] 6

$f0
[1] 0.05185856

$func
function(x) {exp(b00.1+b01.1*x01+b02.1*x02) /
  (exp(b00.1+b01.1*x01+b02.1*x02)+exp(b00.2+b01.2*x01+b02.2*x02)+
   exp(b00.3+b01.3*x01+b02.3*x02)) - phat1}

$x
[1]  0.6          1.401890082 -1.304531849  0.062833294 -0.143141379
[6] -0.005995477

$d
[1] 1e-04

$method
[1] "Richardson"

$method.args
$method.args$eps
[1] 1e-04

$method.args$d
[1] 1e-04

$method.args$zero.tol
[R] Graphic aspect ratio
Hi, I've been playing with a 3x2 graphics device using the default size as it appears on screen. This has given me tall thin plots which I can resize by dragging the window and increasing the window width. However, I was wondering if I can force R to produce square plots or set the actual aspect ratio. asp seems to affect data units, not on-screen pixel units. Thanks, mike
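In base graphics, the graphical parameter pty = "s" requests a square plotting region for each panel regardless of the device's shape, which sounds like what is being asked for here. A minimal sketch:

```r
# 3 columns x 2 rows of panels, each forced to a square plotting region.
# pty = "s" fixes the shape of the plot region; asp (in plot()) would
# instead fix the data-unit aspect ratio, which is a different thing.
par(mfrow = c(2, 3), pty = "s")
for (i in 1:6) {
  plot(rnorm(10), rnorm(10), main = paste("Panel", i))
}
```

Alternatively, opening the device with equal width and height (e.g. via the width/height arguments of png() or pdf()) controls the overall aspect ratio of the whole figure.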
[R] questions on rpart (tree changes when rearrange the order of covariates?!)
Greetings, I am using rpart for classification with the "class" method. The test data is the Pima Indians diabetes data from package mlbench. I first fitted a classification tree using the original data, and then exchanged the column order of body mass and plasma glucose, which are the strongest/most important variables in the growing phase. The second tree is a little different from the first one, and the misclassification tables differ too. I did not change the data, so why are the results so different? Does anyone know how rpart deals with ties? Here is the code for the two trees:

library(mlbench)
data(PimaIndiansDiabetes2)
mydata <- PimaIndiansDiabetes2
library(rpart)
fit2 <- rpart(diabetes ~ ., data=mydata, method="class")
plot(fit2, uniform=T, main="CART for original data")
text(fit2, use.n=T, cex=0.6)
printcp(fit2)
table(predict(fit2, type="class"), mydata$diabetes)
## misclassification table: rows are fitted class
##      neg pos
##  neg 437  68
##  pos  63 200
# Klimt(fit2, mydata)
pmydata <- data.frame(mydata[, c(1,6,3,4,5,2,7,8,9)])
fit3 <- rpart(diabetes ~ ., data=pmydata, method="class")
plot(fit3, uniform=T, main="CART after exchanging mass & glucose")
text(fit3, use.n=T, cex=0.6)
printcp(fit3)
table(predict(fit3, type="class"), pmydata$diabetes)
## after exchanging the order of body mass and plasma glucose
##      neg pos
##  neg 436  64
##  pos  64 204
# Klimt(fit3, pmydata)

Thanks, -- Yuanyuan Huang
[R] how to take away the same varible when I use merge
Dear all: I am trying to merge two tables by a common variable. However, a few other variables appear in both tables. How can I drop the duplicated columns when I merge the two tables? Thanks! Xin
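One common approach (sketched here with toy tables; the key name "id" and column names are assumptions, not from the post) is to drop the duplicated columns from one table before merging, so merge() does not create z.x/z.y pairs:

```r
# Two toy tables sharing the key "id" plus a duplicated column "z"
x <- data.frame(id = 1:3, z = c(10, 20, 30), a = c("p", "q", "r"))
y <- data.frame(id = 1:3, z = c(10, 20, 30), b = c(TRUE, FALSE, TRUE))

# Keep y's key plus only the columns that x does not already have
keep <- c("id", setdiff(names(y), names(x)))
merged <- merge(x, y[, keep], by = "id")

names(merged)   # "id" "z" "a" "b" -- no duplicated z columns
```

If the shared columns should instead be part of the match, list them all in the by= argument of merge().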
Re: [R] How to display data content on a only row?
Hello Dieter, I got them from a matrix on which I used the apply function to compute the mean column by column: print(apply(matSD, 2, mean)) So it's a kind of vector. I'll check the cat function. Thank you, have a nice day. Thomas
[R] triangle.class pch per factor
Hello, I am using triangle.class in the ade4 package with a factor. I would like to use the graphic parameter pch to set a different kind of point for each group in my factor (lakes). I have tried pch=1:4; I guess it does what it is supposed to do, but not what I want. I can't use colors. Thank you in advance,
[R] Using loops to run functions over a list of variables
Hello, I have a data set with many variables, and often I want to run a given function, like summary() or cor() or lmer() etc., on many combinations of one or more of these variables. For every combination of variables I want to analyze I have been writing out the code by hand, but given that I want to run many different functions over dozens and dozens of variable combinations, it is taking a lot of time and making for very inelegant code. There *has* to be a better way! I have tried looking through numerous message boards but everything I've tried has failed. It seems like loops would solve this problem nicely: (1) create a list of variables of interest; (2) iterate through the list, running a given function on each variable. I have a data matrix which I have creatively called "data". It has variables named "focus" and "productive". If I run the function summary(), for instance, it works fine:

summary(data$focus)
summary(data$productive)

Both of these work. If I try to use a loop like:

factors <- c("data$focus", "data$productive")
for(i in 1:2){
  summary(get(factors[i]))
}

it gives the following errors:

Error in get(factors[i]) : variable "data$focus" was not found
Error in summary(get(factors[i])) : error in evaluating the argument 'object' in selecting a method for function 'summary'

But data$focus *does* exist! I can run summary(data$focus) and it works perfectly. What am I doing wrong? Even if I get this working, is there a better way to do this, especially if I have dozens of variables to analyze? Any ideas would be greatly appreciated!
Re: [R] rbind help
Hi Mike, MikSmith wrote: Hi, I have a dataset of orientation and elevation in a dataframe which I am plotting with circular. Orientation is on the range 0-360, but I need to reduce this to 0-180, as the values over 180 are not necessarily correct. I can do striae$orientation[striae$orientation > 180] - 180 to extract these values, but I then want to combine them back into the original dataframe (with the elevation values). I assume rbind is the thing to use and have tried something like: test <- rbind((striae$orientation[striae$orientation > 180] - 180), (striae$orientation[striae$orientation <= 180]))

I would do the conversion in place:

striae$orientation <- ifelse(striae$orientation > 180,
                             striae$orientation - 180,
                             striae$orientation)

Best, Jim

Not being a regular R user I suspect the syntax is easy when you know it!! Any help much appreciated. Best wishes, mike -- James W. MacDonald, M.S. Biostatistician Douglas Lab University of Michigan Department of Human Genetics 5912 Buhl 1241 E. Catherine St. Ann Arbor MI 48109-5618 734-615-7826
[R] pair matching
Given two numeric vectors of possibly unequal length, I'd like to pair each element of the shorter vector with an element of the longer vector such that the sum of squared differences between the pairs is minimized. Can someone point me to an R function or an algorithm for accomplishing this? All the best, Tom Thomas S. Dye, Ph.D. T. S. Dye & Colleagues, Archaeologists, Inc. Phone: (808) 529-0866 Fax: (808) 529-0884 http://www.tsdye.com
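This is an assignment problem, solvable in general with the Hungarian algorithm (e.g. solve_LSAP in the clue package). For a convex cost like squared difference, though, an optimal matching pairs the two vectors in sorted order, so a simple dynamic program over the sorted vectors suffices. A sketch (function name is mine, not an existing package function):

```r
# Match each element of the shorter vector to a distinct element of the
# longer one, minimizing the sum of squared differences.
# For convex costs an optimal matching is monotone in sorted order,
# so a DP over the sorted vectors finds the optimum.
match_pairs <- function(short, long) {
  s <- sort(short); l <- sort(long)
  m <- length(s); n <- length(l)
  # cost[i+1, j+1]: min cost of matching s[1..i] within l[1..j]
  cost <- matrix(Inf, m + 1, n + 1)
  cost[1, ] <- 0                      # matching nothing costs nothing
  for (i in 1:m) for (j in i:n) {
    cost[i + 1, j + 1] <- min(cost[i + 1, j],                # leave l[j] unused
                              cost[i, j] + (s[i] - l[j])^2)  # pair s[i] with l[j]
  }
  cost[m + 1, n + 1]
}

match_pairs(c(1, 5), c(0.9, 2, 5.2, 9))  # 0.01 + 0.04 = 0.05
```

Recovering the actual pairs (not just the minimal cost) takes a standard backtrace over the cost matrix.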
[R] Multiple plot margins
Hello, I'm plotting 6 graphs using mfrow = c(2, 3). In these plots, only graphs in the first column have titles for the y axis, and only the ones in the last row have titles for the x axis. I'd like all plots to be the same size, and I'm trying to keep them as near each other as possible, but I'm having the following problem. If I make a single call to par(mar = ...) to leave room on the left and bottom for the axis titles, a lot of space is wasted because not all graphs need titles; however, if I make one call to par(mar = ...) per plot, to have finer control of the margins, the first-column and last-row plots will be smaller than the rest, because the titles use up some of their space. I thought that setting large enough values for oma would do what I want, but it doesn't appear to work if mar is too small. To illustrate what I'm trying to do:

l +-+ +-+ +-+
a | | | | | |
b | | | | | |
e | | | | | |
l +-+ +-+ +-+
l +-+ +-+ +-+
a | | | | | |
b | | | | | |
e | | | | | |
l +-+ +-+ +-+
  label label label

where the margins between the plots should be narrow. Should I just plot the graphs without axis titles and then use text() to manually position them? Thanks in advance, Andre
Re: [R] Using loops to run functions over a list of variables
I don't think get(factors[i]) will work. The problem is that get only sees the character string "data$focus" instead of extracting focus from data. In your case, isn't lapply (or sapply) good enough?

sapply(data, summary)

Try ?lapply for details. Jun

On Tue, May 12, 2009 at 10:55 AM, rapton mattc...@gmail.com wrote: [...]
-- Jun Shen PhD PK/PD Scientist BioPharma Services Millipore Corporation 15 Research Park Dr. St Charles, MO 63304 Direct: 636-720-1589
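Restricting the apply call to just the columns of interest works too, and generalizes to functions of several columns at once. A self-contained sketch (the toy data frame here stands in for the poster's "data"):

```r
# Toy stand-in for the poster's data frame (column names from the post)
data <- data.frame(focus = rnorm(20), productive = rnorm(20))

# Character names select columns directly -- no get() needed
vars <- c("focus", "productive")
lapply(data[vars], summary)   # one summary per selected column, named

# For a function of several columns at once:
cor(data[vars])               # 2 x 2 correlation matrix of those columns
```

The key point: indexing a data frame with a character vector of column names, data[vars], replaces the get("data$focus") construction entirely.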
Re: [R] Using loops to run functions over a list of variables
Forgot one thing: make sure your data is a list or data frame.

On Tue, May 12, 2009 at 12:19 PM, Jun Shen jun.shen...@gmail.com wrote: I don't think get(factors[i]) will work. The problem is that get only sees the character string "data$focus" instead of extracting focus from data. In your case, isn't lapply (or sapply) good enough? sapply(data, summary) -- try ?lapply for details. Jun [...]

-- Jun Shen PhD PK/PD Scientist BioPharma Services Millipore Corporation 15 Research Park Dr. St Charles, MO 63304 Direct: 636-720-1589
[R] FW: neural network not using all observations
As a follow-up to my email below: The input data frame to nnet() has dimensions:

dim(coreaff.trn.nn)
[1] 5088    8

And the predictions from the neural network (35 records are dropped - see email below for more details) have dimensions:

pred <- predict(coreaff.nn1)
dim(pred)
[1] 5053    1

So, the following line of R code does not work, as the dimensions are different:

sum((coreaff.trn.nn$hh.iast.y - predict(coreaff.nn1))^2)
Error: dims [product 5053] do not match the length of object [5088]
In addition: Warning message:
In coreaff.trn.nn$hh.iast.y - predict(coreaff.nn1) :
  longer object length is not a multiple of shorter object length

While:

dim(pred)
[1] 5053    1
tail(pred)
          [,1]
5083  664551.9
5084  552170.6
5085  684834.3
5086 1215282.5
5087 1116302.2
5088  658112.1

shows that the last row of pred is 5,088, which corresponds to the number of rows of coreaff.trn.nn, the input data frame to the neural network. I tried using row() to identify the 35 records that were dropped (or not scored). The code I tried was:

coreaff.trn.nn.subset <- coreaff.trn.nn[row(coreaff.trn.nn) == row(pred), ]
Error in row(coreaff.trn.nn) == row(pred) : non-conformable arrays

But I am not doing something right. pred has dimension = 1 and row() requires an object of dimension = 2. So using cbind() I bound a column of sequence numbers to pred to make the dimension = 2, but that did not help. Basically, if I can identify the 5,053 records that the neural network made predictions for, in the data frame of 5,088 records (coreaff.trn.nn) used by the neural network, then I can compare the predictions to the actual values, and compare the predictive power of the neural network to the predictive power of the linear regression model. Any idea how I can extract the 5,053 records that the neural network made predictions for from the data frame (5,088 records) used to train the neural network?
Thanks in advance, Jude From: Ryan, Jude Sent: Tuesday, May 12, 2009 11:11 AM To: 'r-help@r-project.org' Cc: juderya...@yahoo.com Subject: neural network not using all observations I am exploring neural networks (adding non-linearities) to see if I can get more predictive power than a linear regression model I built. I am using the function nnet and following the example of Venables and Ripley, in Modern Applied Statistics with S, on pages 246 to 249. I have standardized variables (z-scores) such as assets, age and tenure. I have other variables that are binary (0 or 1). In max_acc_ownr_nwrth_n_med for example, the variable has a value of 1 if the client's net worth is above the median net worth and a value of 0 otherwise. These are derived variable I created and variables that the regression algorithm has found to be predictive. A regression on the same variables shown below gives me an R-Square of about 0.12. I am trying to increase the predictive power of this regression model with a neural network being careful to avoid overfitting. 
Similar to Venables and Ripley, I used the following code:

library(nnet)
dim(coreaff.trn.nn)
[1] 5088    8
head(coreaff.trn.nn)
  hh.iast.y WC_Total_Assets all_assets_per_hh         age      tenure max_acc_ownr_liq_asts_n_med max_acc_ownr_nwrth_n_med max_acc_ownr_ann_incm_n_med
1   3059448      -0.4692186        -0.4173532 -0.06599001 -1.04747935                           0                        1                           0
2   4899746       3.4854334         4.064     -0.06599001 -0.72540200                           1                        1                           1
3    727333      -0.2677357        -0.4177944 -0.30136473 -0.40332465                           1                        1                           1
4    443138      -0.5295170        -0.6999646 -0.1825     -1.04747935                           0                        0                           0
5    484253      -0.6112205        -0.7306664  0.64013414  0.07979137                           1                        0                           0
6    799054       0.6580506         1.1763114  0.24784295  0.07979137                           0                        1                           1

coreaff.nn1 <- nnet(hh.iast.y ~ WC_Total_Assets + all_assets_per_hh + age + tenure +
                    max_acc_ownr_liq_asts_n_med + max_acc_ownr_nwrth_n_med +
                    max_acc_ownr_ann_incm_n_med,
                    coreaff.trn.nn, size = 2, decay = 1e-3,
                    linout = T, skip = T, maxit = 1000, Hess = T)
# weights:  26
initial  value 12893652845419998.00
iter  10 value 6352515847944854.00
final  value 6287104424549762.00
converged
summary(coreaff.nn1)
a 7-2-1 network with 26 weights
options were - skip-layer connections  linear output units  decay=0.001
    b->h1    i1->h1    i2->h1    i3->h1    i4->h1    i5->h1    i6->h1    i7->h1
-21604.84  -2675.80  -5001.90  -1240.16   -335.44 -12462.51 -13293.80  -9032.34
    b->h2    i1->h2    i2->h2    i3->h2
210841.52  47296.92  58100.43 -13819.10
Re: [R] Looking for a quick way to combine rows in a matrix
jim holtman wrote: Try this:

key <- rownames(a)
key[key == "AT"] <- "TA"
do.call(rbind, by(a, key, colSums))

Something like paste(sort(strsplit(key, split="")[[1]]), collapse="") might be more general.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
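The two suggestions above can be combined into a small self-contained sketch, using a made-up 4 x 2 matrix: sort each row name's characters to get a canonical key (so "AT" and "TA" collapse to the same group), then sum the rows within each key.

```r
## Toy example: rows whose names are anagrams of each other get summed.
a <- matrix(1:8, nrow = 4,
            dimnames = list(c("AT", "TA", "CG", "GC"), c("s1", "s2")))

## Canonical key: characters of the name, sorted and re-joined.
canon <- function(s) paste(sort(strsplit(s, split = "")[[1]]), collapse = "")
key <- sapply(rownames(a), canon)    # "AT" "AT" "CG" "CG"

## jim holtman's pattern:
do.call(rbind, by(a, key, colSums))

## Base R's rowsum() does the same grouping-and-summing in one call:
rowsum(a, key)
```

rowsum() is arguably the more direct tool here, since it is written exactly for summing matrix rows within groups.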
Re: [R] Two-way Anova
Using traditional ANOVA, you'd have to drop either cases or time points with missing data. Using linear mixed effects analysis, you'd be able to use all the data. LME also has the benefit of *not* assuming sphericity, which is good for data like yours (many measures across few cases), where the traditional ANOVA sphericity assumption is unlikely to hold. Your dependent variable, % viability, suggests that there's some more raw representation of the data that may be better to look at. For example, if % viability is derived from, say, the success/failure rate of 10 observations per sample/timepoint, you might want to take a look at the lme4 package (as suggested in a previous post: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q3/001160.html ).

On Tue, May 12, 2009 at 6:03 AM, Alan O'Loughlin olo...@wyeth.com wrote: Hello, I'm trying to do a comparison between a large-scale (10L) bottle of liquid and a small-scale (0.5L) bottle of liquid. I have 5 different samples from each, and they are measured over the space of 8 days as % viability; the % viability decreases over time. However, not all 10 samples got measured every day. How would I do a two-way ANOVA on this in R? Thanks for any help. Regards, Al

-- Mike Lawrence, Graduate Student, Department of Psychology, Dalhousie University. Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~
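A minimal sketch of the mixed-model suggestion above, with hypothetical column names (the poster's actual data layout is not shown): `viability` for the measurement, `scale` for the 10L vs 0.5L factor, `day` for the time point, and `sample` identifying each of the 10 bottles. Missing days are simply absent rows; lmer() fits with whatever observations exist.

```r
## Two-way fixed effects (scale, day, and their interaction) with a
## random intercept per sampled bottle; unbalanced data is fine.
library(lme4)

m <- lmer(viability ~ scale * day + (1 | sample), data = d)
summary(m)
anova(m)   # sequential tests of the fixed effects (lme4 reports no p-values)
```

Treating `day` as numeric models a trend over time; converting it with factor(day) instead would mimic the ANOVA-style analysis with one level per measurement day.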
Re: [R] pair matching
If the matching need not be one-to-one, then you can just compute the matrix of squared differences between elements of the two vectors, then in each row (or column, whichever corresponds to the shorter vector) find the smallest. This should be fairly easy to do. Andy

From: Thomas S. Dye: Given two numeric vectors of possibly unequal length, I'd like to pair each element of the shorter vector with an element of the longer vector such that the sum of squared differences between the pairs is minimized. Can someone point me to an R function or an algorithm for accomplishing this? All the best, Tom

Thomas S. Dye, Ph.D., T. S. Dye & Colleagues, Archaeologists, Inc. Phone: (808) 529-0866 Fax: (808) 529-0884 http://www.tsdye.com
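A sketch of the distance idea above, with made-up vectors: build the matrix of squared differences between every element of the shorter vector and every element of the longer one, then take the row-wise minimum. Note this greedy version is not one-to-one: several short-vector elements may claim the same partner.

```r
## Nearest-partner matching by squared difference.
short <- c(1.2, 5.0, 9.7)
long  <- c(0.9, 2.1, 4.8, 7.3, 10.1, 11.5)

d2  <- outer(short, long, function(x, y) (x - y)^2)  # 3 x 6 distance matrix
idx <- apply(d2, 1, which.min)                       # nearest long element per row
cbind(short, partner = long[idx])
```

For a true one-to-one matching that minimizes the total sum of squared differences, this becomes a linear assignment problem; solve_LSAP() in the clue package implements the Hungarian algorithm for exactly that (rectangular cost matrices are accepted).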
Re: [R] Fwd: From two colors to 01 sequences
On Tue, May 12, 2009 at 2:35 PM, Zeljko Vrba zv...@ifi.uio.no wrote: Aha, so you DON'T have only red and green rectangles in the picture; there is also white in between, and some pale delimiter lines. Nevertheless, both things ease the job slightly, and the approach I described should work. You can even script it by using ImageMagick and ghostscript. If all pictures are of the same size and layout, you could even manually (or programmatically) construct a mask of rectangle midpoints, so you don't have to detect transitions between red/green and white, which will enormously simplify the job. [Note that the distance between rectangle centers is uniform in both directions.]

I thank again both Zeljko and Baptiste. Meanwhile, I found a more or less simple way of solving the problem: 1. edited the pdf file with Inkscape, inserting 0 or 1 over the rectangles; 2. saved the pdf file; 3. opened it with Acrobat Reader, and selected, as text, the 0s and 1s inserted; 4. pasted the 0s and 1s into a text editor. And that was done! Paul
Re: [R] Using loops to run functions over a list of variables
Jun, sapply() does the trick! Thank you Jun! I really appreciate your help. -Matt

On Tue, May 12, 2009 at 1:20 PM, Jun Shen jun.shen...@gmail.com wrote: Forgot one thing, make sure your data is a list or data frame.

On Tue, May 12, 2009 at 12:19 PM, Jun Shen jun.shen...@gmail.com wrote: I don't think get(factors[i]) will work. The problem is that get only sees a character string "data$focus" instead of extracting focus from data. In your case, isn't lapply (or sapply) good enough?

sapply(data, summary)

Try ?lapply for details. Jun

On Tue, May 12, 2009 at 10:55 AM, rapton mattc...@gmail.com wrote: Hello, I have a data set with many variables, and often I want to run a given function, like summary() or cor() or lmer() etc., on many combinations of one or more of these variables. For every combination of variables I want to analyze I have been writing out the code by hand, but given that I want to run many different functions over dozens and dozens of variable combinations, it is taking a lot of time and making for very inelegant code. There *has* to be a better way! I have tried looking through numerous message boards, but everything I've tried has failed. It seems like loops would solve this problem nicely: (1) create a list of variables of interest; (2) iterate through the list, running a given function on each variable. I have a data frame which I have creatively called data. It has variables named focus and productive. If I run the function summary(), for instance, it works fine:

summary(data$focus)
summary(data$productive)

Both of these work. If I try to use a loop like:

factors <- c("data$focus", "data$productive")
for(i in 1:2){
  summary(get(factors[i]))
}

it gives the following errors:

Error in get(factors[i]) : variable "data$focus" was not found
Error in summary(get(factors[i])) : error in evaluating the argument 'object' in selecting a method for function 'summary'

But data$focus *does* exist! I can run summary(data$focus) and it works perfectly.
What am I doing wrong? Even if I get this working, is there a better way to do this, especially if I have dozens of variables to analyze? Any ideas would be greatly appreciated!

-- Jun Shen PhD, PK/PD Scientist, BioPharma Services, Millipore Corporation, 15 Research Park Dr., St Charles, MO 63304, Direct: 636-720-1589
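The sapply()/lapply() pattern from the thread can be sketched self-contained with a toy data frame standing in for the poster's `data`. Selecting columns by name avoids get() entirely: `get("data$focus")` fails because `"data$focus"` is not the name of any variable, just a string that happens to look like R code.

```r
## Toy stand-in for the poster's data frame.
data <- data.frame(focus = rnorm(10), productive = runif(10))

vars <- c("focus", "productive")   # plain column names, not "data$..."
lapply(data[vars], summary)        # one summary() per variable

## The same idea works for any function over any subset of columns:
sapply(data[vars], mean)
```

For functions that need a formula rather than a vector (lmer(), etc.), the analogous trick is building the formula from strings with as.formula(paste(...)) inside the loop.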
[R] selecting points on 3D scatterplots
Hello Everyone, I am new to R and need some help. I have a matrix of x,y,z coordinates that I would like to interactively plot in 3D, and then, using the cursor, select points on the plot and have the coordinates sent to a matrix. I am using the rgl package to plot the data at the moment because it allows me to rotate and zoom. I also tried cloud and scatterplot3d. I am looking for a function like 'locator', which is used to select points on 2D scatterplots. Thanks in advance! Abby
Re: [R] Sweave: Howto write real TeX formula in plot
Thank you, Baptiste and Charlie. I found some examples which look great at http://www.texample.net/tikz/examples/

Perhaps try the pgfSweave package on r-forge? http://r-forge.r-project.org/projects/pgfsweave/ [..] accomplished using the java utility eps2pgf and requires the PGF/TikZ package for LaTeX: http://sourceforge.net/projects/pgf/

I have installed pgf with apt-get install pgf. That worked fine, but installing pgfSweave fails. I did this:

,[ in R interpreter (running as root) ]
| install.packages('pgfSweave',,'http://www.rforge.net/')
|
| Warning in install.packages("pgfSweave", , "http://www.rforge.net/") :
|   argument 'lib' is missing: using '/usr/local/lib/R/site-library'
| trying URL 'http://www.rforge.net/src/contrib/pgfSweave_0.7.1.tar.gz'
| Content type 'application/x-tar' length 1017992 bytes (994 Kb)
| opened URL
| ==
| downloaded 994 Kb
|
| * Installing *source* package 'pgfSweave' ...
| ** R
| ** exec
| ** inst
| ** preparing package for lazy loading
| Loading required package: stashR
| Warning in library(pkg, character.only = TRUE, logical.return = TRUE, lib.loc = lib.loc) :
|   there is no package called 'stashR'
| Error: package 'stashR' could not be loaded
| Execution halted
| ERROR: lazy loading failed for package 'pgfSweave'
| ** Removing '/usr/local/lib/R/site-library/pgfSweave'
|
| The downloaded packages are in
|   /tmp/Rtmp6huu9t/downloaded_packages
| Warning messages:
| 1: In install.packages("pgfSweave", , "http://www.rforge.net/") :
|   dependencies ‘stashR’, ‘filehash’, ‘digest’, ‘cacheSweave’ are not available
| 2: In install.packages("pgfSweave", , "http://www.rforge.net/") :
|   installation of package 'pgfSweave' had non-zero exit status
`

What will I have to do now? And could someone give me an example of how to write a formula in a plot? Like plot(...
title=$\sigma^2 + \int x$) Thank you very much, -- Jonas Stein n...@jonasstein.de
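Two sketches that follow from the transcript above, neither taken from the thread itself. First, the install failed on missing dependencies, so installing those from CRAN before retrying may help (their availability there is an assumption to verify). Second, for a formula inside a plot without any TeX machinery at all, base R's plotmath accepts expressions like the one asked about.

```r
## (1) Install the dependencies the pgfSweave build complained about,
##     then retry the pgfSweave install from rforge.
install.packages(c("stashR", "filehash", "digest", "cacheSweave"))

## (2) plotmath: a TeX-free way to put a formula in a plot title.
##     See ?plotmath for the full syntax.
x <- seq(0, 2, length.out = 100)
plot(x, x^2, type = "l",
     main = expression(sigma^2 + integral(x * dx, 0, 1)),
     ylab = expression(x^2))
```

plotmath renders in the plot device itself, so it works in any Sweave setup; the pgfSweave/TikZ route is only needed when the plot fonts must exactly match the LaTeX document.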
Re: [R] selecting points on 3D scatterplots
On 5/12/2009 3:08 PM, Abby Drake wrote: Hello Everyone, I am new to R and need some help. I have a matrix of x,y,z coordinates that I would like to interactively plot in 3D and then using the cursor select points on the plot and have the coordinates sent to a matrix. I am using the rgl package to plot the data at the moment because it allows me to rotate and zoom. I also tried cloud and scatterplot3D. I am looking for a function like 'locator' which is used to select points on 2D scatterplots.

There's no locator3d, but select3d is somewhat similar. You could write a locator3d function using the rgl.setMouseCallbacks, rgl.user2window and rgl.window2user functions, if you can figure out how to indicate depth with a mouse click. identify3d might be a little easier. Duncan Murdoch
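A minimal sketch of the select3d() approach mentioned above, with made-up data: after the user drags a rectangle in the rgl window, select3d() returns a function that tests which points fall inside the selected region, which is enough to send the chosen coordinates to a matrix.

```r
## Interactive point selection in an rgl scene.
library(rgl)

x <- rnorm(100); y <- rnorm(100); z <- rnorm(100)
plot3d(x, y, z)

f <- select3d()        # drag a selection rectangle in the rgl device
inside <- f(x, y, z)   # logical vector: which points were selected

picked <- cbind(x, y, z)[inside, , drop = FALSE]  # coordinates as a matrix
```

Since the selection is a 2D rectangle projected through the scene, all points behind the front-most ones along the view direction are captured too, which is the depth ambiguity Duncan alludes to.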
[R] How do I extract the scoring equations for neural networks and support vector machines?
Sorry for these multiple postings. I solved the problem using na.omit() to drop records with missing values for the time being. I will worry about imputation, etc. later. I calculated the sum of squared errors for 3 models, linear regression, neural networks, and support vector machines. This is the first run. Without doing any parameter tuning on the SVM or playing around with the number of nodes in the hidden layer of the neural network, I found that the SVM had the lowest sum of squared errors, followed by neural networks, with regression being last. This probably indicates that the data has non-linear patterns. I have a couple of questions. 1) Besides sum of squared errors, are there any other metrics that can be used to compare these 3 models? AIC, BIC, etc, can be used for regressions, but I am not sure whether they can be used for SVM's and neural networks. 2) Is there any easy way to extract the scoring equations for SVM's and neural networks? Using the R objects I can always score new data manually but the model will need to be implemented in a production environment. When the model gets implemented in production (could be the mainframe) I will need equations that can be coded in any language (COBOL or SAS on the mainframe). Also, getting the scoring equations for all 3 models will let me create an ensemble model where the predicted value could be the average of the predictions from the SVM, neural network and linear regression. If the ensemble model has the smallest sum of squared errors this would be the model I would use. I have SAS Enterprise Miner as well and can get a scoring equation for the neural network (I don't have SVM), but the scoring code that SAS EM generates sucks and I would much rather extract a scoring equation from R. I am using nnet() for the neural network. 
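On question 2, for nnet specifically: a single-hidden-layer network with linear output is an explicit closed-form formula, so its weights can be exported and re-coded in COBOL, SAS, or anything else. A sketch with made-up data follows; the weight names ("b->h1", "i1->h1", "h1->o", ...) are as produced by coef() on a nnet object (skip-layer terms like "i1->o" appear too when skip = TRUE), but the ordering and naming should be verified against summary() of your own fit before porting.

```r
## Score a fitted nnet by hand from its weight vector.
library(nnet)

set.seed(1)
d <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
d$y <- d$x1 + 2 * d$x2 + rnorm(50, sd = 0.1)

net <- nnet(y ~ x1 + x2, data = d, size = 2, linout = TRUE, trace = FALSE)
w <- coef(net)   # named weight vector, e.g. "b->h1", "i1->h1", ...

sigm <- function(z) 1 / (1 + exp(-z))        # nnet's hidden activation
score <- function(x1, x2) {
  h1 <- sigm(w["b->h1"] + w["i1->h1"] * x1 + w["i2->h1"] * x2)
  h2 <- sigm(w["b->h2"] + w["i1->h2"] * x1 + w["i2->h2"] * x2)
  unname(w["b->o"] + w["h1->o"] * h1 + w["h2->o"] * h2)
}

## Hand-computed score should agree with predict() on the same row:
c(score(d$x1[1], d$x2[1]), predict(net, d[1, ]))
```

SVMs are less portable: the decision function is a kernel expansion over the support vectors, so exporting it means exporting all support vectors and coefficients (e.g. from an e1071 svm object), not a compact equation.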
Thanks in advance, Jude Ryan

From: Ryan, Jude
Sent: Tuesday, May 12, 2009 1:23 PM
To: 'r-help@r-project.org'
Cc: juderya...@yahoo.com
Subject: FW: neural network not using all observations
[R] Help needed to compile R with shared library.
Hi, I cannot compile R with a shared library. I am using Red Hat Linux on Dell hardware. Here's what I am doing: I set the PICFLAGS in the config.site file:

CPICFLAGS=-fPIC
FPICFLAGS=-fPIC

and issue:

./configure --enable-R-shlib --prefix=$HOME/newR
make

configure finishes with no complaints, the last informative line being:

Options enabled: shared R library, shared BLAS, R profiling, Java

But make is upset:

/usr/bin/ld: CConverters.o: relocation R_X86_64_32S against `a local symbol' can not be used when making a shared object; recompile with -fPIC
CConverters.o: could not read symbols: Bad value
collect2: ld returned 1 exit status

It says: recompile with -fPIC. But I am already doing so, am I not? What am I doing wrong here? Any help would be appreciated. Regards, Tena Sakai tsa...@gallo.ucsf.edu
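One common cause of this symptom (an assumption about this case, not a diagnosis): object files such as CConverters.o left over from an earlier build without -fPIC are being linked into the new shared library, so the PIC flags never get a chance to apply. A build-configuration sketch of the usual remedy:

```shell
# Start from a clean tree so every object is recompiled with the PIC
# flags; on x86_64, configure normally chooses -fPIC itself, so the
# config.site overrides should not even be needed.
make clean          # or unpack a fresh copy of the R sources
./configure --enable-R-shlib --prefix=$HOME/newR
make
```

If a fresh tree still fails, checking that CC/CFLAGS in the environment are not overriding configure's choices would be the next thing to look at.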