Re: [R] grid(Base): How to avoid Figure region too small and/or viewport too large by specifying 'relative' units?
In the meanwhile, I found a more minimal example which shows the problem (just change 'inch' to TRUE to see the difference): require(grid) inch - FALSE # TRUE d - if(inch) 5 else 1 pspc - d*c(0.3, 0.3) # width, height of panels spc - d*c(0.05, 0.05) # width, height of space axlabspc - d*c(0.1, 0.1) # width y label, height x label labspc - d*c(0.05, 0.05) # width label boxes, height label boxes par. - par(no.readonly=TRUE) gl - grid.layout(5, 5, default.units=if(inch) inches else npc, widths=c(axlabspc[1], pspc[1], spc[1], pspc[1], labspc[1]), heights=c(labspc[2], pspc[2], spc[2], pspc[2], axlabspc[2])) grid.show.layout(gl) pushViewport(viewport(layout=gl)) for(i in 1:2) { for(j in 1:2) { pushViewport(viewport(layout.pos.row=2*i, layout.pos.col=2*j, name=foo)) grid.rect() upViewport() } } par(par.) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] mark sections on a time chart
Hi is there a comfortable way to mark periods on time chart (axis.Date)? To do it with arrows(...), seems to be irritating. I want to have something like ---winterspringsummer-- thx Christof __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating a new by variable in a dataframe
HI, Without using ifelse() on the same example dataset. d - data.frame(stringsAsFactors = FALSE, transaction = c(T01, T02, T03, T04, T05, T06, T07, T08, T09, T10),date = c(2012-10-19, 2012-10-19, 2012-10-19, 2012-10-19, 2012-10-22, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23),time = c(08:00, 09:00, 10:00, 11:00, 12:00, 13:00, 14:00, 15:00, 16:00, 17:00)) d$date - as.Date(d$date,format=%Y-%m-%d) d$time-strptime(d$time,format=%H:%M)$hour d$flag-unlist(rbind(lapply(split(d,d$date),function(x) x[3]==max(x[3] d$datetime-as.POSIXct(paste(d$date,d$time, ),format=%Y-%m-%d %H) d1-d[,c(1,5,4)] d1 # transaction datetime flag #1 T01 2012-10-19 08:00:00 FALSE #2 T02 2012-10-19 09:00:00 FALSE #3 T03 2012-10-19 10:00:00 FALSE #4 T04 2012-10-19 11:00:00 TRUE #5 T05 2012-10-22 12:00:00 TRUE #6 T06 2012-10-23 13:00:00 FALSE #7 T07 2012-10-23 14:00:00 FALSE #8 T08 2012-10-23 15:00:00 FALSE #9 T09 2012-10-23 16:00:00 FALSE #10 T10 2012-10-23 17:00:00 TRUE str(d1) #'data.frame': 10 obs. of 3 variables: # $ transaction: chr T01 T02 T03 T04 ... # $ datetime : POSIXct, format: 2012-10-19 08:00:00 2012-10-19 09:00:00 ... # $ flag : logi FALSE FALSE FALSE TRUE TRUE FALSE ... A.K. - Original Message - From: Flavio Barros flaviomargar...@gmail.com To: William Dunlap wdun...@tibco.com Cc: r-help@r-project.org r-help@r-project.org; ramoss ramine.mossad...@finra.org Sent: Friday, October 19, 2012 4:24 PM Subject: Re: [R] Creating a new by variable in a dataframe I think i have a better solution *## Example data.frame* d - data.frame(stringsAsFactors = FALSE, transaction = c(T01, T02, T03, T04, T05, T06, T07, T08, T09, T10),date = c(2012-10-19, 2012-10-19, 2012-10-19, 2012-10-19, 2012-10-22, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23),time = c(08:00, 09:00, 10:00, 11:00, 12:00, 13:00, 14:00, 15:00, 16:00, 17:00)) *## As date tranfomation* d$date - as.Date(d$date) d$time - strptime(d$time, format='%H') library(reshape) *## Create factor to split the data* fdate - factor(format(d$date, '%D')) *## Create a list with logical TRUE when is the last transaction* ex - sapply(split(d, fdate), function(x) ifelse(as.numeric(x[,'time'])==max(as.numeric(x[,'time'])),T,F)) *## Coerce to logical vector* flag - unlist(rbind(ex)) *## With reshape we have the transform function e can add the flag column * d - transform(d, flag = flag) On Fri, Oct 19, 2012 at 3:51 PM, William Dunlap wdun...@tibco.com wrote: Suppose your data frame is d - data.frame( stringsAsFactors = FALSE, transaction = c(T01, T02, T03, T04, T05, T06, T07, T08, T09, T10), date = c(2012-10-19, 2012-10-19, 2012-10-19, 2012-10-19, 2012-10-22, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23), time = c(08:00, 09:00, 10:00, 11:00, 12:00, 13:00, 14:00, 15:00, 16:00, 17:00 )) (Convert the date and time to your favorite classes, it doesn't matter here.) A general way to say if an item is the last of its group is: isLastInGroup - function(...) ave(logical(length(..1)), ..., FUN=function(x)seq_along(x)==length(x)) is_last_of_dayA - with(d, isLastInGroup(date)) If you know your data is sorted by date you could save a little time for large datasets by using isLastInRun - function(x) c(x[-1] != x[-length(x)], TRUE) is_last_of_dayB - isLastInRun(d$date) The above d is sorted by date so you get the same results for both: cbind(d, is_last_of_dayA, is_last_of_dayB) transaction date time is_last_of_dayA is_last_of_dayB 1 T01 2012-10-19 08:00 FALSE FALSE 2 T02 2012-10-19 09:00 FALSE FALSE 3 T03 2012-10-19 10:00 FALSE FALSE 4 T04 2012-10-19 11:00 TRUE TRUE 5 T05 2012-10-22 12:00 TRUE TRUE 6 T06 2012-10-23 13:00 FALSE FALSE 7 T07 2012-10-23 14:00 FALSE FALSE 8 T08 2012-10-23 15:00 FALSE FALSE 9 T09 2012-10-23 16:00 FALSE FALSE 10 T10 2012-10-23 17:00 TRUE TRUE Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of ramoss Sent: Friday, October 19, 2012 10:52 AM To: r-help@r-project.org Subject: [R] Creating a new by variable in a dataframe Hello, I have a dataframe w/ 3 variables of interest: transaction,date(tdate) time(event_tim). How could I create a 4th variable (last_trans) that would flag the last transaction of the day for each day? In SAS I use: proc sort data=all6; by tdate event_tim; run; /*Create last
Re: [R] Question about survdiff in for-loop.
Thank you for your replay and help !! Best Regards, Young. -- View this message in context: http://r.789695.n4.nabble.com/Question-about-survdiff-in-for-loop-tp4646707p4646827.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] xyplot type 'a' with mean symbols
Hi there I almost have the graph I want (except for moving the legend into the graph). Can someone please tell me how to put symbols at the means for this graph? xyplot (anxiety ~ treatment, groups=therapist, data=study2, ylim=c(0,50), auto.key=T, type='a') Thank you so much :) -- View this message in context: http://r.789695.n4.nabble.com/xyplot-type-a-with-mean-symbols-tp4646829.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help with programming a tricky algorithm
Hi All, I'm a little stumped by the following problem. I've got a dataset with the following structure: idxyixiycountry(other variables) 111c1x1 212c1x2 313c1x3 ... .. 37395567c7x3739 37405568c7x3740 where ix and iy are interger-valued indices of the actual x and y coordinates for the gridded data I want to define a border variable that equals 1 if the cell north, east, west, or south of it has a different value of the country variable. So, for the row with idxy = 1, border would equal 1 if there is any idxy with country !=c1 and ix = 2 (or zero) or iy = 2 (or zero). Any thoughts? Thanks! Andrew __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Centering labels on X-axis
On 10/20/2012 01:39 AM, YAddo wrote: Dear all: I am trying to center labels on my plot with not much success. I have tried text(), mtext() but it's not working. I think I am using the wrong function for my task. Any help will be appreciated. My working codes. axis(1, at=c(1,2,3,4,5),font.lab=2,cex.axis=1.5,cex.lab=3,label=c(W0,CWH2,CWHmc,CH2,CHmc) ,text=c(1.5,2.5,3.5,4.5,5.5)) Hi YAddo, Perhaps the most common problem with getting the positions of axis labels right occurs with barplots. The bars in the standard barplot function are not centered on integer values, but the centers are returned from the function: barpos-barplot(...) axis(1,at=barpos,...) Is this the problem you are having? Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Axis Breaks with ggplot2
On 10/20/2012 01:50 AM, Edward Patzelt wrote: R-help - I'm trying to create axis breaks similar to this : http://www.r-bloggers.com/wp-content/uploads/2010/08/bar-chart-natural-axis-split1.png . Hi Edward, The gap.barplot function in the plotrix package does something like this, but it is not ggplot2. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how can I make a legend that applies for all the barplots in one same page?
On 10/20/2012 06:22 AM, Yakamu Yakamu wrote: Dear all, I would like to make 6 barplots in one page but with a legend that applies to all the barplots and would like to put it in the central-bottom of the page. I know only how to make legend for individual barplot, but since all my barplots have the same type and would be better if i just use one legend for all of them, and i cannot seem to find a way to put the legend in the bottom and centered. Please advice... I set the margins (and font type) as follows : par(mfrow=c(3,2),pty=m) # Layout with 3x2 plots par(oma=c(3,4,3,4)) # Outer margins par(mar=c(5,5,2,2)) # Inner margins par(xpd=F) par(family=serif, font=1) # Times New Roman font Hi Yayu, First, leave enough space below your plots, then: # allow the legend to display outside the plots par(xpd=TRUE) # work out where you want the legend in the user coordinates # of the _last_ plot displayed legend(x,y,...) par(xpd=FALSE) Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how can I make a legend that applies for all the barplots in one same page?
On 20/10/2012 10:33, Jim Lemon wrote: On 10/20/2012 06:22 AM, Yakamu Yakamu wrote: Dear all, I would like to make 6 barplots in one page but with a legend that applies to all the barplots and would like to put it in the central-bottom of the page. I know only how to make legend for individual barplot, but since all my barplots have the same type and would be better if i just use one legend for all of them, and i cannot seem to find a way to put the legend in the bottom and centered. Please advice... I set the margins (and font type) as follows : par(mfrow=c(3,2),pty=m) # Layout with 3x2 plots par(oma=c(3,4,3,4)) # Outer margins par(mar=c(5,5,2,2)) # Inner margins par(xpd=F) par(family=serif, font=1) # Times New Roman font Hi Yayu, First, leave enough space below your plots, then: # allow the legend to display outside the plots par(xpd=TRUE) # work out where you want the legend in the user coordinates # of the _last_ plot displayed legend(x,y,...) par(xpd=FALSE) It is clearer and safer to use legend(x,y, ..., xpd = TRUE) See the help page. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Optimization in R similar to MS Excel Solver
I do not know what algorithms the Excel solver function uses. See inline for how to do what you want in R. Forgive me if I have misinterpreted your request. On 19-10-2012, at 16:25, Richard James wrote: Dear Colleagues, I am attempting to develop an optimization routine for a river suspended sediment mixing model. I have Calcium and Magnesium concentrations (%) for sediments from 4 potential source areas (Topsoil, Channel Banks, Roads, Drains) and I want to work out, based on the suspended sediment calcium and magnesium concentrations, what are the optimal contributions from each source area to river suspended sediments. The dataset is as follows: Topsoil Channel Bank Roads Drains Ca(%) 0.030.6 0.2 0.35 Mg(%)0.0073 0.0102 0.0141 0.012 Contribution 0.250.25 0.250.25 and Ca in river(%) 0.33 Mg in river (%) 0.0114 Read data (it would have been a lot more helpful if you had provided the result of dput for the data and the target). basedat - read.table(text=Topsoil Channel-Bank Roads Drains Ca(%) 0.03 0.60.2 0.35 Mg(%) 0.0073 0.0102 0.0141 0.012 Contribution 0.25 0.25 0.250.25, header=TRUE) basedat target - c(Ca in river(%)=0.33,Mg in river (%)=0.0114) # convert to matrix and vector for later use bmat - as.matrix(basedat[1:2,]) pstart - as.numeric(basedat[Contribution,1:3]) I want to optimize the contribution values (currently set at 0.25) by minimizing the sum of squares of the relative errors of the mixing model, calculated as follows: SSRE =( (x1-((a1*c1)+(a2*c2)+(a3*c3)+(a4*c4))/x1)^2) + (x2-((b1*c1)+(b2*c2)+(b3*c3)+(b4*c4))/x2)^2) I do not understand why you are dividing by x1 and x2. It make no sense to me to calculate (xi- (xi_calculated)/xi)^2 Given what your stated purpose then (xi- xi_calculated)^2 or something like (1-x1/xi_calculated)^2 seem more appropriate. Where: x1 = calcium in river; x2 = magnesium in river; a1:a4 = Ca in topsoil, channel banks, roads, drains b1:b4 = Mg in topsoil, channel banks, roads, drains c1:4 = Contribution to be optimized I can generate a solution very quickly using the MS Excel Solver function, however later I want to be able to run a Monte-Carlo simulation on to generate confidence intervals based on variance in the source area concentrations – hence my desire to use R to develop the mixing model. So far I have been using the ‘optim’ function, however I’m confused as to what form the ‘par’ and ‘fn’ arguments should take from my data above. I am also unsure of how to write the model constraints – i.e. total contribution should sum to 1, and all values must be non-negative. The 'par' should be a vector of starting values of the parameters for your objective function. The 'fn' is the function that calculates a scalar given the parameter values. Your 'par' is a vector with all elements between 0 and 1 and with a sum == 1. That can't be done with optim but you can simulate the requirements by letting optim work with a three element vector and defining the fourth value as 1-sum(first three params). The requirement that all params must lie between 0 and 1 can be met by making 'fn' return a large value when the requirement is not met. Some code: SSRE - function(parx) { par- c(parx,1-sum(parx)) if(all(par 0 par 1)) { # parameters meet requirements sum((target - (bmat %*% par))^2) # this is a linear algebra version of your objective without the division by xi # or if you want to divide by target (see above) see below in the benchmark section for comment #sum(((target - (bmat %*% par))/target)^2) } else 1e7 # penalty for parameters not satisfying constraints } SSRE(pstart) z - optim(pstart,SSRE) z c(z$par, 1-sum(z$par)) # final contributions Results: # z - optim(pstart,SSRE) # z # $par # [1] 0.1492113 0.2880078 0.2950191 # # $value # [1] 2.157446e-12 # # $counts # function gradient # 108 NA # # $convergence # [1] 0 # # $message # NULL You can also try this: z - optim(pstart,SSRE,method=BFGS) z c(z$par, 1-sum(z$par)) z - nlminb(pstart,SSRE) z c(z$par, 1-sum(z$par)) If you intend to do run these optimizations many times I wouldn't use optim without specifying the method argument. See this benchmark. library(rbenchmark) benchmark(optim(pstart,SSRE),optim(pstart,SSRE,method=BFGS), nlminb(pstart,SSRE), + replications=1000, columns=c(test,replications,elapsed,relative)) test replications elapsed relative 3 nlminb(pstart,
Re: [R] Help with programming a tricky algorithm
Hello, You should post a data example with ?dput. If your dataset is named MyData, use dput( head(MyData, 30) ) # paste the output of this in a post Anyway, I believe the following function might do what you want. It's untested, though. (Your example dataset is usefull but could be better) is.border - function(idx, DF){ ix - DF$ix %in% DF$ix[idx] + c(-1, 1) iy - DF$iy %in% DF$iy[idx] + c(-1, 1) any(DF$country != DF$country[ix iy]) } sapply(MyData$idxy, fun, MyData) It returns a logical value, so if you want 0/1 use as.integer to do the conversion. Hope this helps, Rui Barradas Em 20-10-2012 08:36, Andrew Crane-Droesch escreveu: Hi All, I'm a little stumped by the following problem. I've got a dataset with the following structure: idxyixiycountry(other variables) 111c1x1 212c1x2 313c1x3 ... .. 37395567c7x3739 37405568c7x3740 where ix and iy are interger-valued indices of the actual x and y coordinates for the gridded data I want to define a border variable that equals 1 if the cell north, east, west, or south of it has a different value of the country variable. So, for the row with idxy = 1, border would equal 1 if there is any idxy with country !=c1 and ix = 2 (or zero) or iy = 2 (or zero). Any thoughts? Thanks! Andrew __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] can't find the error in if function... maybe i'm blind?
Hi everybody, the following alway gives me the error Fehler in if (File$X.Frame.Number[a] + 1 == File$X.Frame.Number[a + 1]) (File$FishNr[a] - File$FishNr[a - : Fehlender Wert, wo TRUE/FALSE nötig ist. Maybe its stupid, but i'm not getting why... Maybe someone can help me. Thanks a lot! for (i in unique(BigFile$TrackAll)) { File - subset(BigFile,BigFile$TrackAll == i) File$FishNr [1] - 1 for ( a in File$X.Frame.Number) {if(File$X.Frame.Number[a]+1== File$X.Frame.Number[a+1]) (File$FishNr [a] - File$FishNr[a-1]) else(if (File$X.Frame.Number[a]+1 != File$X.Frame.Number[a+1]) (File$FishNr [a] - File$FishNr[a-1]+1 )) } } -- View this message in context: http://r.789695.n4.nabble.com/can-t-find-the-error-in-if-function-maybe-i-m-blind-tp4646839.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error: not 'a real'?
Hi List, when supplying a vector of atomic vector classes to read.table, I get: # column classes colClasses=c(character, character,numeric, numeric, numeric, numeric, numeric, numeric, numeric, numeric, numeric, numeric) # Error: Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, (from #2) : scan() expected 'a real', got '894.6' How is '894.6' not 'a real [number]'? Thanks for the ensuing enlightenment or punishment... Best, Brian __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] supply methods to read.table
Hi List, I would like to optimize some data reading as well as clean up some code. The manual tells me to supply methods to colClasses but the manual and the methods documentation aren't helping... Can someone provide me an example please? Best, Brian R version 2.15.1 (2012-06-22) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] C attached base packages: [1] stats graphics grDevices utils datasets base other attached packages: [1] MASS_7.3-18 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error: not 'a real'?
how about supplying the context of the error. Show the lines in the file where the error occurred. Sent from my iPad On Oct 20, 2012, at 7:21, Brian zenli...@gmail.com wrote: Hi List, when supplying a vector of atomic vector classes to read.table, I get: # column classes colClasses=c(character, character,numeric, numeric, numeric, numeric, numeric, numeric, numeric, numeric, numeric, numeric) # Error: Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, (from #2) : scan() expected 'a real', got '894.6' How is '894.6' not 'a real [number]'? Thanks for the ensuing enlightenment or punishment... Best, Brian __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] can't find the error in if function... maybe i'm blind?
Learn how to debug your programs. Start with options(error = recover) this will give you control at the point of the error so you can examine values. Most likely one of the variables in the 'if' expression is an NA. Sent from my iPad On Oct 20, 2012, at 6:27, Janosch janosch1...@web.de wrote: Hi everybody, the following alway gives me the error Fehler in if (File$X.Frame.Number[a] + 1 == File$X.Frame.Number[a + 1]) (File$FishNr[a] - File$FishNr[a - : Fehlender Wert, wo TRUE/FALSE nötig ist. Maybe its stupid, but i'm not getting why... Maybe someone can help me. Thanks a lot! for (i in unique(BigFile$TrackAll)) { File - subset(BigFile,BigFile$TrackAll == i) File$FishNr [1] - 1 for ( a in File$X.Frame.Number) {if(File$X.Frame.Number[a]+1== File$X.Frame.Number[a+1]) (File$FishNr [a] - File$FishNr[a-1]) else(if (File$X.Frame.Number[a]+1 != File$X.Frame.Number[a+1]) (File$FishNr [a] - File$FishNr[a-1]+1 )) } } -- View this message in context: http://r.789695.n4.nabble.com/can-t-find-the-error-in-if-function-maybe-i-m-blind-tp4646839.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] can't find the error in if function... maybe i'm blind?
Le samedi 20 octobre 2012 à 03:27 -0700, Janosch a écrit : Hi everybody, the following alway gives me the error Fehler in if (File$X.Frame.Number[a] + 1 == File$X.Frame.Number[a + 1]) (File$FishNr[a] - File$FishNr[a - : Fehlender Wert, wo TRUE/FALSE nötig ist. Maybe its stupid, but i'm not getting why... Maybe someone can help me. Thanks a lot! You could have told us what this error means in English, but my guess is that File$X.Frame.Number[a] or File$X.Frame.Number[a + 1] is NA, which means that the whole expression evaluates to NA, which is not corrrect for if(). If you want NAs to be considered as FALSE, than put the expression inside a isTRUE() call. If NAs are not expected there in your code, there's a problem elsewhere. Since the loop stops when the error is raised, a still contains the value that produced it and you can easily check where the problem comes from that way. My two cents for (i in unique(BigFile$TrackAll)) { File - subset(BigFile,BigFile$TrackAll == i) File$FishNr [1] - 1 for ( a in File$X.Frame.Number) {if(File$X.Frame.Number[a]+1== File$X.Frame.Number[a+1]) (File$FishNr [a] - File$FishNr[a-1]) else(if (File$X.Frame.Number[a]+1 != File$X.Frame.Number[a+1]) (File$FishNr [a] - File$FishNr[a-1]+1 )) } } -- View this message in context: http://r.789695.n4.nabble.com/can-t-find-the-error-in-if-function-maybe-i-m-blind-tp4646839.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mark sections on a time chart
Hello, The function below requires package plotrix and is far from fully tested, but it might do what you want. library(zoo) arrowsRange - function(from, to, at = 1, labels = NULL, length = 1/8, horizontal = TRUE, border = FALSE, ...){ require(plotrix) f1 - function(){ if(horizontal) arrows(from, at, to, at, length = length, code = 3, ...) else arrows(at, from, at, to, length = length, code = 3, ...) } f2 - function(){ f1() if(horizontal){ x - rowMeans(cbind(from, to)) y - at }else{ x - at y - rowMeans(cbind(from, to)) } boxed.labels(x, y, labels = labels, border = border) } if(is.null(text)) f1() else f2() } # Make up some test data set.seed(9365) z - zoo(cumsum(rnorm(365))) index(z) - Sys.Date() + 1:365 start - as.Date(c(2012-12-21, 2013-03-21, 2013-06-22)) end - as.Date(c(2013-03-20, 2013-06-21, 2013-09-21)) seasons - c(Winter, Spring, Summer) plot(z) abline(v = c(start[1], end)) Hope this helps, Rui Barradas Em 20-10-2012 07:33, Christof Kluß escreveu: Hi is there a comfortable way to mark periods on time chart (axis.Date)? To do it with arrows(...), seems to be irritating. I want to have something like ---winterspringsummer-- thx Christof __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error: not 'a real'?
Hi Jim, On 10/20/12 13:36, Jim Holtman wrote: how about supplying the context of the error. Show the lines in the file where the error occurred. Sent from my iPad On Oct 20, 2012, at 7:21, Brian zenli...@gmail.com wrote: Hi List, when supplying a vector of atomic vector classes to read.table, I get: # column classes colClasses=c(character, character,numeric, numeric, numeric, numeric, numeric, numeric, numeric, numeric, numeric, numeric) # Error: Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, (from #2) : Line #2: (with 'skip=4', so really line 4, but the first line after the 'header', as indicated above) LAE;20100101;2;894.6;3.9;93.8;5.3;3.1;204 scan() expected 'a real', got '894.6' How is '894.6' not 'a real [number]'? Thanks for the ensuing enlightenment or punishment... Best, Brian __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. my full command: read.csv2(file=file, skip=2, na.strings = -, colClasses=c(character, character,numeric, numeric, numeric, numeric, numeric, numeric)) Also, sorry I forgot this: R version 2.15.1 (2012-06-22) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] C attached base packages: [1] stats graphics grDevices utils datasets base other attached packages: [1] MASS_7.3-18 Thanks again. Brian __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error: not 'a real'?
Le samedi 20 octobre 2012 à 14:25 +0200, Brian a écrit : Hi Jim, On 10/20/12 13:36, Jim Holtman wrote: how about supplying the context of the error. Show the lines in the file where the error occurred. Sent from my iPad On Oct 20, 2012, at 7:21, Brian zenli...@gmail.com wrote: Hi List, when supplying a vector of atomic vector classes to read.table, I get: # column classes colClasses=c(character, character,numeric, numeric, numeric, numeric, numeric, numeric, numeric, numeric, numeric, numeric) # Error: Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, (from #2) : Line #2: (with 'skip=4', so really line 4, but the first line after the 'header', as indicated above) LAE;20100101;2;894.6;3.9;93.8;5.3;3.1;204 scan() expected 'a real', got '894.6' How is '894.6' not 'a real [number]'? Thanks for the ensuing enlightenment or punishment... Best, Brian __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. my full command: read.csv2(file=file, skip=2, na.strings = -, colClasses=c(character, character,numeric, numeric, numeric, numeric, numeric, numeric)) Why didn't you tell us that you were using read.csv2() and not read.table() directly? read.csv() passes dec=, to read.table(), which explains why numbers with a dot are not detected as reals. Just pass dec=. to fix the problem Regards Also, sorry I forgot this: R version 2.15.1 (2012-06-22) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] C attached base packages: [1] stats graphics grDevices utils datasets base other attached packages: [1] MASS_7.3-18 Thanks again. Brian __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error: not 'a real'?
Hi Milan, that's right, forgot about that one! Sorry to bother. Brian On 10/20/12 14:37, Milan Bouchet-Valat wrote: Le samedi 20 octobre 2012 à 14:25 +0200, Brian a écrit : Hi Jim, On 10/20/12 13:36, Jim Holtman wrote: how about supplying the context of the error. Show the lines in the file where the error occurred. Sent from my iPad On Oct 20, 2012, at 7:21, Brian zenli...@gmail.com wrote: Hi List, when supplying a vector of atomic vector classes to read.table, I get: # column classes colClasses=c(character, character,numeric, numeric, numeric, numeric, numeric, numeric, numeric, numeric, numeric, numeric) # Error: Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, (from #2) : Line #2: (with 'skip=4', so really line 4, but the first line after the 'header', as indicated above) LAE;20100101;2;894.6;3.9;93.8;5.3;3.1;204 scan() expected 'a real', got '894.6' How is '894.6' not 'a real [number]'? Thanks for the ensuing enlightenment or punishment... Best, Brian __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. my full command: read.csv2(file=file, skip=2, na.strings = -, colClasses=c(character, character,numeric, numeric, numeric, numeric, numeric, numeric)) Why didn't you tell us that you were using read.csv2() and not read.table() directly? read.csv() passes dec=, to read.table(), which explains why numbers with a dot are not detected as reals. Just pass dec=. to fix the problem Regards Also, sorry I forgot this: R version 2.15.1 (2012-06-22) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] C attached base packages: [1] stats graphics grDevices utils datasets base other attached packages: [1] MASS_7.3-18 Thanks again. Brian __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mark sections on a time chart
Hello, Sorry, didn't paste the function call. plot(z) abline(v = c(start[1], end)) arrowsRange(start, end, at = 0, labels = seasons) Rui Barradas Em 20-10-2012 13:19, Rui Barradas escreveu: Hello, The function below requires package plotrix and is far from fully tested, but it might do what you want. library(zoo) arrowsRange - function(from, to, at = 1, labels = NULL, length = 1/8, horizontal = TRUE, border = FALSE, ...){ require(plotrix) f1 - function(){ if(horizontal) arrows(from, at, to, at, length = length, code = 3, ...) else arrows(at, from, at, to, length = length, code = 3, ...) } f2 - function(){ f1() if(horizontal){ x - rowMeans(cbind(from, to)) y - at }else{ x - at y - rowMeans(cbind(from, to)) } boxed.labels(x, y, labels = labels, border = border) } if(is.null(text)) f1() else f2() } # Make up some test data set.seed(9365) z - zoo(cumsum(rnorm(365))) index(z) - Sys.Date() + 1:365 start - as.Date(c(2012-12-21, 2013-03-21, 2013-06-22)) end - as.Date(c(2013-03-20, 2013-06-21, 2013-09-21)) seasons - c(Winter, Spring, Summer) plot(z) abline(v = c(start[1], end)) Hope this helps, Rui Barradas Em 20-10-2012 07:33, Christof Kluß escreveu: Hi is there a comfortable way to mark periods on time chart (axis.Date)? To do it with arrows(...), seems to be irritating. I want to have something like ---winterspringsummer-- thx Christof __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] fit a threshold function with nls
Thank you all for your help. I think I now understand the issue. I tried to write a likelihood function for my binomial model. Please excuse my ignorance if I am not doing this right; I do not have any statistical background. #Example data x - seq(0, 1000) y - ifelse(x 300, 0, ifelse(x700, 0, 1)) #Function assuming binomial errors fn - function(par) { y.pred = ifelse(xpar[1], 0, ifelse(xpar[2], 0, 1)) -sum(dbinom(1-abs(y-y.pred), prob = mean(y.pred == y), size = 1, log = T)) } optim(par=c(300,700), fn) Would the Nelder-Mead optimization method be appropriate? On Tue, Oct 16, 2012 at 11:28 AM, Ravi Varadhan ravi.varad...@jhu.eduwrote: Veronique, Can you write down the likelihood function in R and send it to me (this is very easy to do, but I don't want to do your work)? Also send me the code for simulating the data. I will show you how to fit such models using optimization tools. Best, Ravi From: Vito Muggeo [vito.mug...@unipa.it] Sent: Tuesday, October 16, 2012 9:55 AM To: Bert Gunter Cc: Véronique Boucher Lalonde; r-help@r-project.org; Ravi Varadhan Subject: Re: [R] fit a threshold function with nls Véronique, in addition to Bert's comments, I would like to bring to your attention that there are several packages that perform threshold/breakpoint/changepoint estimation in R, including cumSeg, segmented, strucchange, and bcp for a Bayesian approach Moreover some packages, such as cghFLasso, perfom smoothing with a L1 penalty that can yield a step-function fit. I hope this helps you, vito Il 15/10/2012 22.21, Bert Gunter ha scritto: Véronique: I've cc'ed this to a true expert (Ravi Varadhan) who is one of those who can give you a definitive response, but I **believe** the problem is that threshhold type function fits have objective functions whose derivatives are discontinuous,and hence gradient -based methods can run into the sorts of problems that you see. **If** this is so, then you might do better using an explicit non-gradient optimizer = rss minimizer via one of the optim() suite of functions (maybe even the default Nelder-Mead); but this is definitely where the counsel of an expert would be valuable. Also check out the CRAN Optimization Task View for advice on optimization options. Cheers, Bert On Mon, Oct 15, 2012 at 9:43 AM, Véronique Boucher Lalonde veronique.boucher.lalo...@gmail.com wrote: I am trying to model a dependent variable as a threshold function of my independent variable. What I mean is that I want to fit different intercepts to y following 2 breakpoints, but with fixed slopes. I am trying to do this with using ifelse statements in the nls function. Perhaps, this is not an appropriate approach. I have created a very simple example to illustrate what I am trying to do. #Creating my independent variable x - seq(0, 1000) #Creating my dependent variable, where all values below threshold #1 are 0, all values above threshold #2 are 0 and all values within the two thresholds are 1 y - ifelse(x 300, 0, ifelse(x700, 0, 1)) #Adding noise to my dependent variable y - y + rnorm(length(y), 0, 0.01) #I am having trouble clearly explaining the model I want to fit but perhaps the plot is self explanatory plot(x, y) #Now I am trying to adjust a nonlinear model to fit the two breakpoints, with known slopes between the breakpoints (here, respectively 0, 1 and 0) threshold.model - nls(y~ifelse(xb1, 0, ifelse(xb2, 0, 1)), start=list(b1=300, b2=700), trace=T) I have played around with this function quite a bit and systematically get an error saying: singular gradient matrix at initial parameter estimates I admit that I don't fully understand what a singular gradient matrix implies. But it seems to me that both my model and starting values are sensible given the data, and that the break points are in fact estimable. Could someone point to what I am doing wrong? More generally, I am interested in knowing (1) whether I can use such ifelse statements in the function nls (1) whether I can even use nls for this model (3) whether I can model this with a function that would allow me to assume that the errors are binomial, rather than normally distributed (ultimately I want to use such a model on binomial data) I am using R version 2.15.1 on 64-bit Windows 7 Any guidance would be greatly appreciated. Veronique [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128
Re: [R] MLE of negative binomial distribution parameters
Zoraida zmorales at ingellicom.com writes: I need to estimate the parameters for negative binomial distribution (pdf) using maximun likelihood, I also need to estimate the parameter for the Poisson by ML, which can be done by hand, but later I need to conduct a likelihood ratio test between these two distributions and I don't know how to start! I'm not an expert programmer in R. Please help It sounds like you might need some local help. If you're trying to fit the parameters to a single data set (i.e. no predictor variables, just a set of values), then you probably want fitdistr() from the MASS package: modified from ?fitdistr: library(MASS) set.seed(123) x4 - rnegbin(500, mu = 5, theta = 4) ff - fitdistr(x4, Negative Binomial) ff2 - fitdistr(x4, Poisson) ff size mu 4.2159071 4.9447685 (0.5043658) (0.1466082) ff2 lambda 4.9440 (0.09943842) logLik(ff) 'log Lik.' -1250.121 (df=2) logLik(ff2) 'log Lik.' -1350.088 (df=1) You can use the pchisq() function to compute the p-value for the likelihood ratio test (hint: use lower.tail=FALSE to compute the upper tail area ...) If you want to fit and compare negative binomial or Poisson models with covariates, use glm and MASS::glm.nb, or mle2 from the bbmle packages ... __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] supply methods to read.table
Brian zenlines at gmail.com writes: Hi List, I would like to optimize some data reading as well as clean up some code. The manual tells me to supply methods to colClasses but the manual and the methods documentation aren't helping... Can someone provide me an example please? Your post to the list from a minute before this seems to give an example ... do you still have a question, or is it resolved? Can you give us a reproducible example of what you're trying to do (e.g. the first few lines of your data file, what you tried, what your results were, and (if not obvious) why the results weren't what you wanted/expected)? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] fit a threshold function with nls
Inline. -- Bert On Sat, Oct 20, 2012 at 6:44 AM, Véronique Boucher Lalonde veronique.boucher.lalo...@gmail.com wrote: Thank you all for your help. I think I now understand the issue. I tried to write a likelihood function for my binomial model. Please excuse my ignorance if I am not doing this right; I do not have any statistical background. Ignorance is excusable; failure to deal with it sensibly is not. If you are not able to work with Ravi, get local statistical help. Also. for such help (which this primarily seems to be) post on a statistical help list like stats.stackexchange.com, not here. -- Bert #Example data x - seq(0, 1000) y - ifelse(x 300, 0, ifelse(x700, 0, 1)) #Function assuming binomial errors fn - function(par) { y.pred = ifelse(xpar[1], 0, ifelse(xpar[2], 0, 1)) -sum(dbinom(1-abs(y-y.pred), prob = mean(y.pred == y), size = 1, log = T)) } optim(par=c(300,700), fn) Would the Nelder-Mead optimization method be appropriate? On Tue, Oct 16, 2012 at 11:28 AM, Ravi Varadhan ravi.varad...@jhu.edu wrote: Veronique, Can you write down the likelihood function in R and send it to me (this is very easy to do, but I don't want to do your work)? Also send me the code for simulating the data. I will show you how to fit such models using optimization tools. Best, Ravi From: Vito Muggeo [vito.mug...@unipa.it] Sent: Tuesday, October 16, 2012 9:55 AM To: Bert Gunter Cc: Véronique Boucher Lalonde; r-help@r-project.org; Ravi Varadhan Subject: Re: [R] fit a threshold function with nls Véronique, in addition to Bert's comments, I would like to bring to your attention that there are several packages that perform threshold/breakpoint/changepoint estimation in R, including cumSeg, segmented, strucchange, and bcp for a Bayesian approach Moreover some packages, such as cghFLasso, perfom smoothing with a L1 penalty that can yield a step-function fit. I hope this helps you, vito Il 15/10/2012 22.21, Bert Gunter ha scritto: Véronique: I've cc'ed this to a true expert (Ravi Varadhan) who is one of those who can give you a definitive response, but I **believe** the problem is that threshhold type function fits have objective functions whose derivatives are discontinuous,and hence gradient -based methods can run into the sorts of problems that you see. **If** this is so, then you might do better using an explicit non-gradient optimizer = rss minimizer via one of the optim() suite of functions (maybe even the default Nelder-Mead); but this is definitely where the counsel of an expert would be valuable. Also check out the CRAN Optimization Task View for advice on optimization options. Cheers, Bert On Mon, Oct 15, 2012 at 9:43 AM, Véronique Boucher Lalonde veronique.boucher.lalo...@gmail.com wrote: I am trying to model a dependent variable as a threshold function of my independent variable. What I mean is that I want to fit different intercepts to y following 2 breakpoints, but with fixed slopes. I am trying to do this with using ifelse statements in the nls function. Perhaps, this is not an appropriate approach. I have created a very simple example to illustrate what I am trying to do. #Creating my independent variable x - seq(0, 1000) #Creating my dependent variable, where all values below threshold #1 are 0, all values above threshold #2 are 0 and all values within the two thresholds are 1 y - ifelse(x 300, 0, ifelse(x700, 0, 1)) #Adding noise to my dependent variable y - y + rnorm(length(y), 0, 0.01) #I am having trouble clearly explaining the model I want to fit but perhaps the plot is self explanatory plot(x, y) #Now I am trying to adjust a nonlinear model to fit the two breakpoints, with known slopes between the breakpoints (here, respectively 0, 1 and 0) threshold.model - nls(y~ifelse(xb1, 0, ifelse(xb2, 0, 1)), start=list(b1=300, b2=700), trace=T) I have played around with this function quite a bit and systematically get an error saying: singular gradient matrix at initial parameter estimates I admit that I don't fully understand what a singular gradient matrix implies. But it seems to me that both my model and starting values are sensible given the data, and that the break points are in fact estimable. Could someone point to what I am doing wrong? More generally, I am interested in knowing (1) whether I can use such ifelse statements in the function nls (1) whether I can even use nls for this model (3) whether I can model this with a function that would allow me to assume that the errors are binomial, rather than normally distributed (ultimately I want to use such a model on binomial data) I am using R version 2.15.1 on 64-bit Windows 7 Any guidance would be greatly appreciated. Veronique [[alternative HTML version deleted]]
Re: [R] R2OpenBUGS quesion
Received the same error with OpenBUGS322 trying to run the schools example under Wine 1.5.15, R 2.15.1, R2OpenBUGS_3.2-1.4, on a mac pro with Snow Leopard . I then tried with OpenBUGS321 and it ran fine. -- View this message in context: http://r.789695.n4.nabble.com/R2OpenBUGS-quesion-tp4636392p4646855.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Can I please be taken off the mailing list
[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can I please be taken off the mailing list
On Oct 20, 2012, at 7:50 AM, Jonathan Brown wrote: [[alternative HTML version deleted]] __ R-help@r-project.org mailing list The link below is where you most likely subscribed and where you also need to go to unsubscribe. None of us have you password. https://stat.ethz.ch/mailman/listinfo/r-help ^^ PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] dotchart ordering problem
I've got another problem. What if I have another dotchart with same categories, but one of the data are missing (control A)?? So I want to leave out from the dotchart. In this case in the control category should have only one dotline, because the other one is empty. Or mark somehow that it's missing, destroyed, etc. On the same example: x = c(39, NA, 23, 35, 30, 26, 30, 30, 29, 29, 26, 29, 34, 33) y = c(Control, DMSO, 0,1 mg/L, 0,3 mg/L, 1 mg/L, 3 mg/L, 10 mg/L) a = matrix(data= x, nrow=2) rownames(a) = c(A,B); colnames(a) = y a dotchart(a, main=Dotchart, xlim=c(0,50)) -- View this message in context: http://r.789695.n4.nabble.com/dotchart-ordering-problem-tp4646038p4646856.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Trouble returning 2D array into R from Fortran
Hello, I have been trying to use a collection of Fortran subroutines to return a 2D array of calculated values to my R code, calling a Fortran wrapper subroutine from R. I've done this successfully before with C C++ code. The Fortran wrapper subroutine which is to be called by R takes a set of input arguments then should return an array of 2 columns the specified number of rows. I've tested the wrapping subroutine from another Fortan main program it does work as expected, so the Fortran works, but my problem has been with the correct syntax for retrieving the output double array from within R. The wrapping Fortran subroutine is defined; subroutine xypos_parallax_r(k,year,ra,dec,ti,t0,tE,alpha,u0,piee,pien,y) k is the number of rows y an assumed size double array, y(2,*), the other arguments are other input variables for the Fortran. I've tried putting the Fortran function call inside an R function, with this syntax; #Source positions in lens plane under annual parallax xypos_parallax - function(year,ra,dec,ti,model_par) { if (!is.loaded('xypos_parallax_r')){ dyn.load('parallax.so')} returned_data = .Fortran('xypos_parallax_r', as.integer(length(ti)), as.integer(year), as.double(ra), as.double(dec), as.double(ti), as.double(model_par$t0), as.double(model_par$tE), as.double(model_par$alpha), as.double(model_par$u0), as.double(model_par$piee), as.double(model_par$pien), as.double(array(double,dim=c(2,length(ti )[[12]] } The input arguments to the Fortran call include the number of rows of the output array, with the final argument in the .Fortran call intended to be the output array itself. ti is a vector of same length as the number of rows needed in the output array. Testing this R function calling the Fortran code returns; Error in xypos_parallax(year, ra, dec, ti, model_par) : NA/NaN/Inf in foreign function call (arg 12) In addition: Warning message: In xypos_parallax(year, ra, dec, ti, model_par) : NAs introduced by coercion The Fortran code does work as intended, so the problem must be with how I've written the R function making the Fortran call, but I can't see where I've gone wrong. Would anyone have any idea, or experience with returning multi-dimensional arrays from external code into R? Thanks in advance -- View this message in context: http://r.789695.n4.nabble.com/Trouble-returning-2D-array-into-R-from-Fortran-tp4646862.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating a new by variable in a dataframe
d$flag-unlist(rbind(lapply(split(d,d$date),function(x) x[3]==max(x[3] I think that line is unnecessarily complicated. lapply() returns a list and rbind applied to one argument, L, mainly adds dimensions c(length(L),1) to it (it also changes its names to rownames). unlist doesn't care about the dimensions, so you may as well leave out the rbind. The only difference in the results with and without calling rbind is that the rbind version omits the names from flag. Use the more direct unname() on split's output or unlists's output if that concerns you. Also, if you are interested in saving time and memory when the input, d, is large, you will be better off applying split() to just the column of the data.frame that you want split instead of to the entire data.frame. d$flag2 - unlist(lapply(unname(split(d[[3]], d$date), function(x)x==max(x (I used d[[3]] instead of the more readable d$time to follow your original more closely.) You ought to check that the data is sorted by date: otherwise these give the wrong answer. What result do you want when there are several transactions at the last time in the day? Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of arun Sent: Friday, October 19, 2012 7:49 PM To: Flavio Barros Cc: R help; ramoss Subject: Re: [R] Creating a new by variable in a dataframe HI, Without using ifelse() on the same example dataset. d - data.frame(stringsAsFactors = FALSE, transaction = c(T01, T02, T03, T04, T05, T06, T07, T08, T09, T10),date = c(2012-10-19, 2012-10-19, 2012-10-19, 2012-10-19, 2012-10-22, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23),time = c(08:00, 09:00, 10:00, 11:00, 12:00, 13:00, 14:00, 15:00, 16:00, 17:00)) d$date - as.Date(d$date,format=%Y-%m-%d) d$time-strptime(d$time,format=%H:%M)$hour d$flag-unlist(rbind(lapply(split(d,d$date),function(x) x[3]==max(x[3] d$datetime-as.POSIXct(paste(d$date,d$time, ),format=%Y-%m-%d %H) d1-d[,c(1,5,4)] d1 # transaction datetime flag #1 T01 2012-10-19 08:00:00 FALSE #2 T02 2012-10-19 09:00:00 FALSE #3 T03 2012-10-19 10:00:00 FALSE #4 T04 2012-10-19 11:00:00 TRUE #5 T05 2012-10-22 12:00:00 TRUE #6 T06 2012-10-23 13:00:00 FALSE #7 T07 2012-10-23 14:00:00 FALSE #8 T08 2012-10-23 15:00:00 FALSE #9 T09 2012-10-23 16:00:00 FALSE #10 T10 2012-10-23 17:00:00 TRUE str(d1) #'data.frame': 10 obs. of 3 variables: # $ transaction: chr T01 T02 T03 T04 ... # $ datetime : POSIXct, format: 2012-10-19 08:00:00 2012-10-19 09:00:00 ... # $ flag : logi FALSE FALSE FALSE TRUE TRUE FALSE ... A.K. - Original Message - From: Flavio Barros flaviomargar...@gmail.com To: William Dunlap wdun...@tibco.com Cc: r-help@r-project.org r-help@r-project.org; ramoss ramine.mossad...@finra.org Sent: Friday, October 19, 2012 4:24 PM Subject: Re: [R] Creating a new by variable in a dataframe I think i have a better solution *## Example data.frame* d - data.frame(stringsAsFactors = FALSE, transaction = c(T01, T02, T03, T04, T05, T06, T07, T08, T09, T10),date = c(2012-10-19, 2012-10-19, 2012-10-19, 2012-10-19, 2012-10-22, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23),time = c(08:00, 09:00, 10:00, 11:00, 12:00, 13:00, 14:00, 15:00, 16:00, 17:00)) *## As date tranfomation* d$date - as.Date(d$date) d$time - strptime(d$time, format='%H') library(reshape) *## Create factor to split the data* fdate - factor(format(d$date, '%D')) *## Create a list with logical TRUE when is the last transaction* ex - sapply(split(d, fdate), function(x) ifelse(as.numeric(x[,'time'])==max(as.numeric(x[,'time'])),T,F)) *## Coerce to logical vector* flag - unlist(rbind(ex)) *## With reshape we have the transform function e can add the flag column * d - transform(d, flag = flag) On Fri, Oct 19, 2012 at 3:51 PM, William Dunlap wdun...@tibco.com wrote: Suppose your data frame is d - data.frame( stringsAsFactors = FALSE, transaction = c(T01, T02, T03, T04, T05, T06, T07, T08, T09, T10), date = c(2012-10-19, 2012-10-19, 2012-10-19, 2012-10-19, 2012-10-22, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23), time = c(08:00, 09:00, 10:00, 11:00, 12:00, 13:00, 14:00, 15:00, 16:00, 17:00 )) (Convert the date and time to your favorite classes, it doesn't matter here.) A general way to say if an item is the last of its group is: isLastInGroup - function(...) ave(logical(length(..1)), ..., FUN=function(x)seq_along(x)==length(x)) is_last_of_dayA - with(d, isLastInGroup(date)) If you know your data is sorted by date you could save a little time for large datasets by using isLastInRun - function(x)
[R] system.time question
Hi : I looked at the help for system.time but I still have the following question. Can someone explain the output following output of system.time : user system elapsed 12399.681 5632.352 56935.647 Here's my take based on the fact that I was doing ps -aux | grep R off and on and the total amount of CPU minutes that got allotted before the job ended was about 5 hours and the total actual time that the job took was about 15 hours. Does elapsed = total actual time job taken ? That seems to be the case or a strange coincidence. Does user + system = CPU time from ps -aux | grep R ? That seems to be the case also or a weird coincidence. Finally, why can't the CPU get a higher percentage ? It's seems like it's always around 30% which would make sense since 5 is ~ 30% of 15 hours. Also, assuming my take above is correct, when talking about timing of algorithms, in this case, does one say the job took 5 hours or 15 hours ? I'm trying to see how fast an algorithm is compared to others and I'm not sure what the standard is. I'm on fedora 16.0 and using R 2.15. Thanks. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Trouble returning 2D array into R from Fortran
See inline. On 20-10-2012, at 17:18, paulfjbrowne paulfj.bro...@gmail.com wrote: Hello, I have been trying to use a collection of Fortran subroutines to return a 2D array of calculated values to my R code, calling a Fortran wrapper subroutine from R. I've done this successfully before with C C++ code. The Fortran wrapper subroutine which is to be called by R takes a set of input arguments then should return an array of 2 columns the specified number of rows. I've tested the wrapping subroutine from another Fortan main program it does work as expected, so the Fortran works, but my problem has been with the correct syntax for retrieving the output double array from within R. The wrapping Fortran subroutine is defined; subroutine xypos_parallax_r(k,year,ra,dec,ti,t0,tE,alpha,u0,piee,pien,y) k is the number of rows y an assumed size double array, y(2,*), the other arguments are other input variables for the Fortran. I've tried putting the Fortran function call inside an R function, with this syntax; #Source positions in lens plane under annual parallax xypos_parallax - function(year,ra,dec,ti,model_par) { if (!is.loaded('xypos_parallax_r')){ dyn.load('parallax.so')} returned_data = .Fortran('xypos_parallax_r', as.integer(length(ti)), as.integer(year), as.double(ra), as.double(dec), as.double(ti), as.double(model_par$t0), as.double(model_par$tE), as.double(model_par$alpha), as.double(model_par$u0), as.double(model_par$piee), as.double(model_par$pien), as.double(array(double,dim=c(2,length(ti )[[12]] } Why as.double(array(double,dim=c(2,length(ti))? You are passing a character array to your fortran subroutine. Use matrix(0,nrow=2,ncol=length(ti)) which will be a matrix of double precision's. Berend The input arguments to the Fortran call include the number of rows of the output array, with the final argument in the .Fortran call intended to be the output array itself. ti is a vector of same length as the number of rows needed in the output array. Testing this R function calling the Fortran code returns; Error in xypos_parallax(year, ra, dec, ti, model_par) : NA/NaN/Inf in foreign function call (arg 12) In addition: Warning message: In xypos_parallax(year, ra, dec, ti, model_par) : NAs introduced by coercion The Fortran code does work as intended, so the problem must be with how I've written the R function making the Fortran call, but I can't see where I've gone wrong. Would anyone have any idea, or experience with returning multi-dimensional arrays from external code into R? Thanks in advance -- View this message in context: http://r.789695.n4.nabble.com/Trouble-returning-2D-array-into-R-from-Fortran-tp4646862.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] system.time question
You asked several questions. Elapsed: yes User + System = CPU: yes Finally: You have to look at the load and/or cpu core count. Unless you setup your code to take advantage of multiple cores, R runs on a single core. Also: Do you really need to ask that question? --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. Mark Leeds marklee...@gmail.com wrote: Hi : I looked at the help for system.time but I still have the following question. Can someone explain the output following output of system.time : user system elapsed 12399.681 5632.352 56935.647 Here's my take based on the fact that I was doing ps -aux | grep R off and on and the total amount of CPU minutes that got allotted before the job ended was about 5 hours and the total actual time that the job took was about 15 hours. Does elapsed = total actual time job taken ? That seems to be the case or a strange coincidence. Does user + system = CPU time from ps -aux | grep R ? That seems to be the case also or a weird coincidence. Finally, why can't the CPU get a higher percentage ? It's seems like it's always around 30% which would make sense since 5 is ~ 30% of 15 hours. Also, assuming my take above is correct, when talking about timing of algorithms, in this case, does one say the job took 5 hours or 15 hours ? I'm trying to see how fast an algorithm is compared to others and I'm not sure what the standard is. I'm on fedora 16.0 and using R 2.15. Thanks. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] system.time question
On 20/10/2012 17:16, Mark Leeds wrote: Hi : I looked at the help for system.time but I still have the following question. Can someone explain the output following output of system.time : user system elapsed 12399.681 5632.352 56935.647 Yes, the help page can, via ?proc.time. As it says, it depends on the OS Here's my take based on the fact that I was doing ps -aux | grep R off and on and the total amount of CPU minutes that got allotted before the job ended was about 5 hours and the total actual time that the job took was about 15 hours. Does elapsed = total actual time job taken ? That seems to be the case or a strange coincidence. Does user + system = CPU time from ps -aux | grep R ? That seems to be the case also or a weird coincidence. On Fedora Linux, yes. Not in general (and what ps gives is pretty OS-specific: for example, does it include time from child processes or not -- system.time should but the OS calls used do not always do so, I find less reliably so in Fedora 16 than 14). Finally, why can't the CPU get a higher percentage ? It's seems like it's always around 30% which would make sense since 5 is ~ 30% of 15 hours. Many, many reasons. Most likely - other things are running, and some of them have a higher priority, or equal or lower priority and get lots of time slices - R the process is waiting for resources, such as memory, discs, network access Also, assuming my take above is correct, when talking about timing of algorithms, in this case, does one say the job took 5 hours or 15 hours ? I'm trying to see how fast an algorithm is compared to others and I'm not sure what the standard is. I'm on fedora 16.0 and using R 2.15. Thanks. It depends on the purpose. CRAN's check farm cares most about CPU usage: someone waiting for results cares about elapsed time. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mark sections on a time chart
Hello Rui Barradas thank you very much. That is exactly what I was looking for. Christof Am 20-10-2012 14:19, schrieb Rui Barradas: library(zoo) arrowsRange - function(from, to, at = 1, labels = NULL, length = 1/8, horizontal = TRUE, border = FALSE, ...){ require(plotrix) f1 - function(){ if(horizontal) arrows(from, at, to, at, length = length, code = 3, ...) else arrows(at, from, at, to, length = length, code = 3, ...) } f2 - function(){ f1() if(horizontal){ x - rowMeans(cbind(from, to)) y - at }else{ x - at y - rowMeans(cbind(from, to)) } boxed.labels(x, y, labels = labels, border = border) } if(is.null(text)) f1() else f2() } # Make up some test data set.seed(9365) z - zoo(cumsum(rnorm(365))) index(z) - Sys.Date() + 1:365 start - as.Date(c(2012-12-21, 2013-03-21, 2013-06-22)) end - as.Date(c(2013-03-20, 2013-06-21, 2013-09-21)) seasons - c(Winter, Spring, Summer) plot(z) abline(v = c(start[1], end)) arrowsRange(start, end, at = 0, labels = seasons) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating a new by variable in a dataframe
d$flag2 - unlist(lapply(unname(split(d[[3]], d$date), function(x)x==max(x I'm sorry, I stuck in the unname() in the mail but did not run it - its closing parenthesis should be after split's closing parenthisis, not at the end. d$flag2 - unlist(lapply(unname(split(d[[3]], d$date)), function(x)x==max(x))) identical(d$flag , d$flag2) [1] TRUE Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: arun [mailto:smartpink...@yahoo.com] Sent: Saturday, October 20, 2012 9:29 AM To: William Dunlap Cc: R help; Flavio Barros; ramoss Subject: Re: [R] Creating a new by variable in a dataframe HI Bill, Thanks for the reply. It was unnecessarily complicated. d$flag-unlist(lapply(split(d,d$date),function(x) x[3]==max(x[3])),use.names=FALSE) #or d$flag-unlist(lapply(split(d,d$date),function(x) x[3]==max(x[3]))) should have done the same job. str(d) #'data.frame': 10 obs. of 4 variables: # $ transaction: chr T01 T02 T03 T04 ... # $ date : Date, format: 2012-10-19 2012-10-19 ... # $ time : int 8 9 10 11 12 13 14 15 16 17 #$ flag : logi FALSE FALSE FALSE TRUE TRUE FALSE ... I am getting error messages with: d$flag2 - unlist(lapply(unname(split(d[[3]], d$date), function(x)x==max(x Error in match.fun(FUN) : argument FUN is missing, with no default A.K. - Original Message - From: William Dunlap wdun...@tibco.com To: arun smartpink...@yahoo.com; Flavio Barros flaviomargar...@gmail.com Cc: R help r-help@r-project.org; ramoss ramine.mossad...@finra.org Sent: Saturday, October 20, 2012 12:04 PM Subject: RE: [R] Creating a new by variable in a dataframe d$flag-unlist(rbind(lapply(split(d,d$date),function(x) x[3]==max(x[3] I think that line is unnecessarily complicated. lapply() returns a list and rbind applied to one argument, L, mainly adds dimensions c(length(L),1) to it (it also changes its names to rownames). unlist doesn't care about the dimensions, so you may as well leave out the rbind. The only difference in the results with and without calling rbind is that the rbind version omits the names from flag. Use the more direct unname() on split's output or unlists's output if that concerns you. Also, if you are interested in saving time and memory when the input, d, is large, you will be better off applying split() to just the column of the data.frame that you want split instead of to the entire data.frame. d$flag2 - unlist(lapply(unname(split(d[[3]], d$date), function(x)x==max(x (I used d[[3]] instead of the more readable d$time to follow your original more closely.) You ought to check that the data is sorted by date: otherwise these give the wrong answer. What result do you want when there are several transactions at the last time in the day? Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of arun Sent: Friday, October 19, 2012 7:49 PM To: Flavio Barros Cc: R help; ramoss Subject: Re: [R] Creating a new by variable in a dataframe HI, Without using ifelse() on the same example dataset. d - data.frame(stringsAsFactors = FALSE, transaction = c(T01, T02, T03, T04, T05, T06, T07, T08, T09, T10),date = c(2012-10-19, 2012-10-19, 2012-10-19, 2012-10-19, 2012-10-22, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23),time = c(08:00, 09:00, 10:00, 11:00, 12:00, 13:00, 14:00, 15:00, 16:00, 17:00)) d$date - as.Date(d$date,format=%Y-%m-%d) d$time-strptime(d$time,format=%H:%M)$hour d$flag-unlist(rbind(lapply(split(d,d$date),function(x) x[3]==max(x[3] d$datetime-as.POSIXct(paste(d$date,d$time, ),format=%Y-%m-%d %H) d1-d[,c(1,5,4)] d1 # transaction datetime flag #1 T01 2012-10-19 08:00:00 FALSE #2 T02 2012-10-19 09:00:00 FALSE #3 T03 2012-10-19 10:00:00 FALSE #4 T04 2012-10-19 11:00:00 TRUE #5 T05 2012-10-22 12:00:00 TRUE #6 T06 2012-10-23 13:00:00 FALSE #7 T07 2012-10-23 14:00:00 FALSE #8 T08 2012-10-23 15:00:00 FALSE #9 T09 2012-10-23 16:00:00 FALSE #10 T10 2012-10-23 17:00:00 TRUE str(d1) #'data.frame': 10 obs. of 3 variables: # $ transaction: chr T01 T02 T03 T04 ... # $ datetime : POSIXct, format: 2012-10-19 08:00:00 2012-10-19 09:00:00 ... # $ flag : logi FALSE FALSE FALSE TRUE TRUE FALSE ... A.K. - Original Message - From: Flavio Barros flaviomargar...@gmail.com To: William Dunlap wdun...@tibco.com Cc: r-help@r-project.org r-help@r-project.org; ramoss ramine.mossad...@finra.org Sent: Friday, October 19, 2012 4:24 PM Subject: Re: [R] Creating a new by variable in a dataframe I think i have a better solution *## Example data.frame* d - data.frame(stringsAsFactors = FALSE,
Re: [R] saving to docx
Just to let people know On the Omegahat site (and source on github), there are packages for working with Office Open documents (and LibreOffice too), includinging RWordXML, RExcelXML and the generic package OOXML on which they rely. These are prototypes in the sense that they do not comprehensively cover the entire OOXML specification. Instead, the packages do have functionality for some common things to get data in and out of OO documents, and they have foundation functions for building new features. D. On 10/19/12 3:19 PM, David Winsemius wrote: On Oct 19, 2012, at 2:48 PM, Daróczi Gergely wrote: Hi Javad, saving R output to jpeg depends on what you want to save. For example saving an `lm` object to an image would be fun :) But you could export that quite easily to e.g. docx after installing Pandoc[1] and pander[2] package. You can find some examples in the README[3]. Best, Gergely [1] http://johnmacfarlane.net/pandoc/installing.html [2] http://cran.r-project.org/web/packages/pander/index.html [3a] brew syntax: http://rapporter.github.com/pander/#brew-to-pandoc [3b] in a live R session: http://rapporter.github.com/pander/#live-report-generation I guess I need to retract my comment that such packages only existed on Windows. Despite 'pander' not passing its CRAN package check for Mac, it does build from source and the Pandoc installer does succeed inSnow Leapard and R 2.15.1. Thank you for writing the pander package, Daróczi. On Fri, Oct 19, 2012 at 9:54 PM, javad bayat j.bayat...@gmail.com wrote: hi all, how can i saving R output to docx or Jpeg format? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error in integrate(integrand, 0, Inf) : non-finite function value
Dear R users, When I run the code below, I get the error Error in integrate(integrand, 0, Inf) : non-finite function value. The code works if the function returns only sum(integ). However, I want to add cmh to it. When I add cmh I get that error. I can't figure out why this is happening because my integrate function has nothing to do with cmh. I tried to integrate from 0 to 1000, and still same error. Any suggestion is greatly appreciated. Thank you in advance! d-matrix(c(1,1,0,0,0,0,0,0,2,1,0,0,1,1,0,1,2,2,1,0),nrow=10,ncol=2) h-matrix(runif(20,0,1),10) delta-matrix(c(2,1,0,1,0,1,0,0,2,1,0,0,1,1,1,1,0,2,1,0),nrow=10,ncol=2) out-vector(numeric,length(1:2)) integ-vector(numeric,length(1:2)) for (k in 1:2){ ll-function(p){ cmh-delta[,k]*(h[,k]*log(0.5))+p*log(gamma(1+1/p)) for(s in 1:10){ integrand-function(x) x^d[s,k]*exp(-x*gamma(1+1/p))^p*p*x^(p-1)*exp(-x*h[s,k]) } integ-integrate(integrand,0,Inf)$value cmhn-as.vector(cmh) lik-sum(cmhn+integ) -lik } initial-c(1) t-nlm(ll,initial) out[k]-t$estimate } est-as.vector(out) -- View this message in context: http://r.789695.n4.nabble.com/Error-in-integrate-integrand-0-Inf-non-finite-function-value-tp4646868.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] se's and CI's for fitted lines in multivariate regression analysis
Hi again, Thank you so much for the script. Unfortunately, I feel like I might not have explained things clearly enough from the start. What I’m looking for is the st. errors or CI intervals for the estimate the parameter for slope and intercept for each level of each factor. From the summary table I can find the intercept and slope for all treatments by adding them up. Here the printout from the summary table: (I’ll just do for level A and B to illustrate) Estimate Std. Error t value Pr(|t|) (Intercept) 18.00299 17.43720 1.032 0.307146 rowpos-2.887231.26369 -2.285 0.026886 * colpos-0.085663.19131 -0.027 0.978699 treatmentB 1.22690 22.73203 0.054 0.957186 : colpos:treatmentB 0.394024.50194 0.088 0.930628 From this I can find the lines of A and B by adding up the parameters A, intercept: 18.00299+(-2.88723)= *15.14267* A, slope: *-0.08566* B, intercept: 18.00299+(-2.88723)+1.22690)= *16.36957* B, slope: -0.08566+0.39402=-*0.46258* It would be nice if I could get the intercept and slope done for me, like this. lm(formula = decrease[treatment == B] ~ colpos[treatment == B]) Coefficients: (Intercept) colpos[treatment == B] 5.0.5833 but while I do that with my more complicated model, I get an error message. lm(decrease[treatment==B]~rowpos[treatment==B]+colpos[treatment==B]+treatment[treatment==B]+treatment:colpos[treatment==B]) Error in model.frame.default(formula = decrease[treatment == B] ~ rowpos[treatment == : variable lengths differ (found for 'treatment') So back to the actual problem; how do I get the standard error of intercept and slope or CI of those lines (AB) calculated above? As the coefficients given in the model are the difference between each factor from ‘A’ it does not make sense to add them up in the same way as the parameters themselves -- View this message in context: http://r.789695.n4.nabble.com/se-s-and-CI-s-for-fitted-lines-in-multivariate-regression-analysis-tp4645703p4646878.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating a new by variable in a dataframe
HI Bill, Thanks for the reply. It was unnecessarily complicated. d$flag-unlist(lapply(split(d,d$date),function(x) x[3]==max(x[3])),use.names=FALSE) #or d$flag-unlist(lapply(split(d,d$date),function(x) x[3]==max(x[3]))) should have done the same job. str(d) #'data.frame': 10 obs. of 4 variables: # $ transaction: chr T01 T02 T03 T04 ... # $ date : Date, format: 2012-10-19 2012-10-19 ... # $ time : int 8 9 10 11 12 13 14 15 16 17 #$ flag : logi FALSE FALSE FALSE TRUE TRUE FALSE ... I am getting error messages with: d$flag2 - unlist(lapply(unname(split(d[[3]], d$date), function(x)x==max(x Error in match.fun(FUN) : argument FUN is missing, with no default A.K. - Original Message - From: William Dunlap wdun...@tibco.com To: arun smartpink...@yahoo.com; Flavio Barros flaviomargar...@gmail.com Cc: R help r-help@r-project.org; ramoss ramine.mossad...@finra.org Sent: Saturday, October 20, 2012 12:04 PM Subject: RE: [R] Creating a new by variable in a dataframe d$flag-unlist(rbind(lapply(split(d,d$date),function(x) x[3]==max(x[3] I think that line is unnecessarily complicated. lapply() returns a list and rbind applied to one argument, L, mainly adds dimensions c(length(L),1) to it (it also changes its names to rownames). unlist doesn't care about the dimensions, so you may as well leave out the rbind. The only difference in the results with and without calling rbind is that the rbind version omits the names from flag. Use the more direct unname() on split's output or unlists's output if that concerns you. Also, if you are interested in saving time and memory when the input, d, is large, you will be better off applying split() to just the column of the data.frame that you want split instead of to the entire data.frame. d$flag2 - unlist(lapply(unname(split(d[[3]], d$date), function(x)x==max(x (I used d[[3]] instead of the more readable d$time to follow your original more closely.) You ought to check that the data is sorted by date: otherwise these give the wrong answer. What result do you want when there are several transactions at the last time in the day? Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of arun Sent: Friday, October 19, 2012 7:49 PM To: Flavio Barros Cc: R help; ramoss Subject: Re: [R] Creating a new by variable in a dataframe HI, Without using ifelse() on the same example dataset. d - data.frame(stringsAsFactors = FALSE, transaction = c(T01, T02, T03, T04, T05, T06, T07, T08, T09, T10),date = c(2012-10-19, 2012-10-19, 2012-10-19, 2012-10-19, 2012-10-22, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23),time = c(08:00, 09:00, 10:00, 11:00, 12:00, 13:00, 14:00, 15:00, 16:00, 17:00)) d$date - as.Date(d$date,format=%Y-%m-%d) d$time-strptime(d$time,format=%H:%M)$hour d$flag-unlist(rbind(lapply(split(d,d$date),function(x) x[3]==max(x[3] d$datetime-as.POSIXct(paste(d$date,d$time, ),format=%Y-%m-%d %H) d1-d[,c(1,5,4)] d1 # transaction datetime flag #1 T01 2012-10-19 08:00:00 FALSE #2 T02 2012-10-19 09:00:00 FALSE #3 T03 2012-10-19 10:00:00 FALSE #4 T04 2012-10-19 11:00:00 TRUE #5 T05 2012-10-22 12:00:00 TRUE #6 T06 2012-10-23 13:00:00 FALSE #7 T07 2012-10-23 14:00:00 FALSE #8 T08 2012-10-23 15:00:00 FALSE #9 T09 2012-10-23 16:00:00 FALSE #10 T10 2012-10-23 17:00:00 TRUE str(d1) #'data.frame': 10 obs. of 3 variables: # $ transaction: chr T01 T02 T03 T04 ... # $ datetime : POSIXct, format: 2012-10-19 08:00:00 2012-10-19 09:00:00 ... # $ flag : logi FALSE FALSE FALSE TRUE TRUE FALSE ... A.K. - Original Message - From: Flavio Barros flaviomargar...@gmail.com To: William Dunlap wdun...@tibco.com Cc: r-help@r-project.org r-help@r-project.org; ramoss ramine.mossad...@finra.org Sent: Friday, October 19, 2012 4:24 PM Subject: Re: [R] Creating a new by variable in a dataframe I think i have a better solution *## Example data.frame* d - data.frame(stringsAsFactors = FALSE, transaction = c(T01, T02, T03, T04, T05, T06, T07, T08, T09, T10),date = c(2012-10-19, 2012-10-19, 2012-10-19, 2012-10-19, 2012-10-22, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23),time = c(08:00, 09:00, 10:00, 11:00, 12:00, 13:00, 14:00, 15:00, 16:00, 17:00)) *## As date tranfomation* d$date - as.Date(d$date) d$time - strptime(d$time, format='%H') library(reshape) *## Create factor to split the data* fdate - factor(format(d$date, '%D')) *## Create a list with logical TRUE when is the last transaction* ex - sapply(split(d, fdate), function(x) ifelse(as.numeric(x[,'time'])==max(as.numeric(x[,'time'])),T,F)) *## Coerce to logical vector* flag - unlist(rbind(ex)) *## With reshape we have the transform function
Re: [R] Creating a new by variable in a dataframe
HI Bill, I figured it out. d$flag2-unlist(lapply(unname(split(d[[3]],d$date)),function(x) x==max(x))) # [1] FALSE FALSE FALSE TRUE TRUE FALSE FALSE FALSE FALSE TRUE ) created the error. A.K. - Original Message - From: William Dunlap wdun...@tibco.com To: arun smartpink...@yahoo.com; Flavio Barros flaviomargar...@gmail.com Cc: R help r-help@r-project.org; ramoss ramine.mossad...@finra.org Sent: Saturday, October 20, 2012 12:04 PM Subject: RE: [R] Creating a new by variable in a dataframe d$flag-unlist(rbind(lapply(split(d,d$date),function(x) x[3]==max(x[3] I think that line is unnecessarily complicated. lapply() returns a list and rbind applied to one argument, L, mainly adds dimensions c(length(L),1) to it (it also changes its names to rownames). unlist doesn't care about the dimensions, so you may as well leave out the rbind. The only difference in the results with and without calling rbind is that the rbind version omits the names from flag. Use the more direct unname() on split's output or unlists's output if that concerns you. Also, if you are interested in saving time and memory when the input, d, is large, you will be better off applying split() to just the column of the data.frame that you want split instead of to the entire data.frame. d$flag2 - unlist(lapply(unname(split(d[[3]], d$date), function(x)x==max(x (I used d[[3]] instead of the more readable d$time to follow your original more closely.) You ought to check that the data is sorted by date: otherwise these give the wrong answer. What result do you want when there are several transactions at the last time in the day? Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of arun Sent: Friday, October 19, 2012 7:49 PM To: Flavio Barros Cc: R help; ramoss Subject: Re: [R] Creating a new by variable in a dataframe HI, Without using ifelse() on the same example dataset. d - data.frame(stringsAsFactors = FALSE, transaction = c(T01, T02, T03, T04, T05, T06, T07, T08, T09, T10),date = c(2012-10-19, 2012-10-19, 2012-10-19, 2012-10-19, 2012-10-22, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23),time = c(08:00, 09:00, 10:00, 11:00, 12:00, 13:00, 14:00, 15:00, 16:00, 17:00)) d$date - as.Date(d$date,format=%Y-%m-%d) d$time-strptime(d$time,format=%H:%M)$hour d$flag-unlist(rbind(lapply(split(d,d$date),function(x) x[3]==max(x[3] d$datetime-as.POSIXct(paste(d$date,d$time, ),format=%Y-%m-%d %H) d1-d[,c(1,5,4)] d1 # transaction datetime flag #1 T01 2012-10-19 08:00:00 FALSE #2 T02 2012-10-19 09:00:00 FALSE #3 T03 2012-10-19 10:00:00 FALSE #4 T04 2012-10-19 11:00:00 TRUE #5 T05 2012-10-22 12:00:00 TRUE #6 T06 2012-10-23 13:00:00 FALSE #7 T07 2012-10-23 14:00:00 FALSE #8 T08 2012-10-23 15:00:00 FALSE #9 T09 2012-10-23 16:00:00 FALSE #10 T10 2012-10-23 17:00:00 TRUE str(d1) #'data.frame': 10 obs. of 3 variables: # $ transaction: chr T01 T02 T03 T04 ... # $ datetime : POSIXct, format: 2012-10-19 08:00:00 2012-10-19 09:00:00 ... # $ flag : logi FALSE FALSE FALSE TRUE TRUE FALSE ... A.K. - Original Message - From: Flavio Barros flaviomargar...@gmail.com To: William Dunlap wdun...@tibco.com Cc: r-help@r-project.org r-help@r-project.org; ramoss ramine.mossad...@finra.org Sent: Friday, October 19, 2012 4:24 PM Subject: Re: [R] Creating a new by variable in a dataframe I think i have a better solution *## Example data.frame* d - data.frame(stringsAsFactors = FALSE, transaction = c(T01, T02, T03, T04, T05, T06, T07, T08, T09, T10),date = c(2012-10-19, 2012-10-19, 2012-10-19, 2012-10-19, 2012-10-22, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23),time = c(08:00, 09:00, 10:00, 11:00, 12:00, 13:00, 14:00, 15:00, 16:00, 17:00)) *## As date tranfomation* d$date - as.Date(d$date) d$time - strptime(d$time, format='%H') library(reshape) *## Create factor to split the data* fdate - factor(format(d$date, '%D')) *## Create a list with logical TRUE when is the last transaction* ex - sapply(split(d, fdate), function(x) ifelse(as.numeric(x[,'time'])==max(as.numeric(x[,'time'])),T,F)) *## Coerce to logical vector* flag - unlist(rbind(ex)) *## With reshape we have the transform function e can add the flag column * d - transform(d, flag = flag) On Fri, Oct 19, 2012 at 3:51 PM, William Dunlap wdun...@tibco.com wrote: Suppose your data frame is d - data.frame( stringsAsFactors = FALSE, transaction = c(T01, T02, T03, T04, T05, T06, T07, T08, T09, T10), date = c(2012-10-19, 2012-10-19, 2012-10-19, 2012-10-19, 2012-10-22, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23, 2012-10-23), time = c(08:00, 09:00,
Re: [R] Trouble returning 2D array into R from Fortran
I will look into using inline, but since the Fortran code is several thousand lines long is comprised of multiple subroutines, compiling it into a shared object dynamically loading it into R is probably the easier solution. I have also noticed a strange numerical problem when calling the routine from within R as opposed to in its native Fortran. For example, for the same set of input parameters, R will output; (0.0315031507927081, 0.220339391742628) Whereas the Fortran code internally outputs; (0.0350479965640488472, 0.0883087569138) All variables are defined in the Fortran code as double, and are passed in as double in the .Fortran call in R. I am a bit puzzled at the discrepancy between the two calls' output. -- View this message in context: http://r.789695.n4.nabble.com/Trouble-returning-2D-array-into-R-from-Fortran-tp4646862p4646874.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] [R-pkgs] ggmcmc 0.2 has been released
Dear all, ggmcmc-0.2 has been released. ggmcmc is a tool for assessing and diagnosing convergence of Markov Chain Monte Carlo simulations, as well as for graphically display results from full MCMC analysis. The package also facilitates the graphical interpretation of models by providing flexible functions to plot the results against observed variables. There is an example of its use at: http://xavier-fim.net/packages/ggmcmc ggmcmc is being developed at github: https://github.com/xfim/ggmcmc Best regards, -- - Xavier - ___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Trouble returning 2D array into R from Fortran
Look at my comments in between your post. On 20-10-2012, at 19:18, paulfjbrowne paulfj.bro...@gmail.com wrote: I will look into using inline, but since the Fortran code is several thousand lines long is comprised of multiple subroutines, compiling it into a shared object dynamically loading it into R is probably the easier solution. You misread my see inline. I did not mean look at the inline package. I meant that my comments were inline (I try never to top post). I have also noticed a strange numerical problem when calling the routine from within R as opposed to in its native Fortran. For example, for the same set of input parameters, R will output; (0.0315031507927081, 0.220339391742628) Whereas the Fortran code internally outputs; (0.0350479965640488472, 0.0883087569138) All variables are defined in the Fortran code as double, and are passed in as double in the .Fortran call in R. I am a bit puzzled at the discrepancy between the two calls' output. Without further information and code, it is impossible to give any kind of sensible advice. In the Fortran code you could try to check the input arguments. Are you sure that from R you are passing exactly the same values as in its native Fortran. Berend -- View this message in context: http://r.789695.n4.nabble.com/Trouble-returning-2D-array-into-R-from-Fortran-tp4646862p4646874.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] se's and CI's for fitted lines in multivariate regression analysis
On Oct 16, 2012, at 11:58 AM, Sigrid wrote: Okay, I've now tried to the predict function and get the SE, although it seem to calculate SE for each observation from the line (I assume), while I want the CI-interval and SE for each line fitted line for the treatment. I do not really understand what parameter mean these SEs are calculated from when there would be several means along the line...?. This is what I get with predict: predict(model, se.fit = TRUE, interval = confidence) Another way I can think of to show how well the lines fit the data is to look at the intercepts and slopes instead. I can specify the line for each level and would then get the estimate of slope and intercept, although I do not know how I show the standard errors of the slope and intercept. lm(decrease[treatment==A]~colpos[treatment==A]) Call: lm(formula = decrease[treatment == A] ~ colpos[treatment == A]) Coefficients: (Intercept) colpos[treatment == A] 2.53570.4643 Please let me know if you know how to find st. errors for (or st. error for slope and intercept) of lines for each factor of a treatment. The SE's for treatment slope will vary depending on the colpos values. Using `predict`, pick the mid-point of the colpos and rowpos variables (which is where the SE of the estimates will be minimized). This should be covered in any basic regression text. model-lm(decrease ~ rowpos + colpos + treatment + treatment:colpos + 0, data=OrchardSprays) # I do not think the use of the non-intercept version matters here and it's not in general a good practice, but it allows all the parameters to be labeled as you apparently expect. predict(model, newdata=data.frame(colpos=4.5, treatment=unique(OrchardSprays$treatment), rowpos=mean(OrchardSprays$rowpos) ), se.fit = TRUE, interval = confidence) $fit fitlwr upr 1 35.000 20.331646 49.66835 2 63.125 48.456646 77.79335 3 7.625 -7.043354 22.29335 4 90.250 75.581646 104.91835 5 68.500 53.831646 83.16835 6 69.000 54.331646 83.66835 7 25.250 10.581646 39.91835 8 4.625 -10.043354 19.29335 $se.fit 12345678 7.291375 7.291375 7.291375 7.291375 7.291375 7.291375 7.291375 7.291375 $df [1] 47 $residual.scale [1] 20.62312 unique(OrchardSprays$treatment) [1] D E B H G F C A Levels: A B C D E F G H with(OrchardSprays, tapply(decrease, treatment, mean) ) A B C D E F G H 4.625 7.625 25.250 35.000 63.125 69.000 68.500 90.250 -- David Winsemius, MD Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error in integrate(integrand, 0, Inf) : non-finite function value
On Oct 20, 2012, at 9:23 AM, stats12 wrote: Dear R users, When I run the code below, I get the error Error in integrate(integrand, 0, Inf) : non-finite function value. The code works if the function returns only sum(integ). But you never showed us the working code. However, I want to add cmh to it. When I add cmh I get that error. I can't figure out why this is happening because my integrate function has nothing to do with cmh. I tried to integrate from 0 to 1000, and still same error. Any suggestion is greatly appreciated. Thank you in advance! d-matrix(c(1,1,0,0,0,0,0,0,2,1,0,0,1,1,0,1,2,2,1,0),nrow=10,ncol=2) h-matrix(runif(20,0,1),10) delta-matrix(c(2,1,0,1,0,1,0,0,2,1,0,0,1,1,1,1,0,2,1,0),nrow=10,ncol=2) out-vector(numeric,length(1:2)) integ-vector(numeric,length(1:2)) for (k in 1:2){ ll-function(p){ cmh-delta[,k]*(h[,k]*log(0.5))+p*log(gamma(1+1/p)) This inner loop appears to define a function 10 times, but does nothing else: for(s in 1:10){ integrand-function(x) x^d[s,k]*exp(-x*gamma(1+1/p))^p*p*x^(p-1)*exp(-x*h[s,k]) } # end of s-loop -- integ-integrate(integrand,0,Inf)$value cmhn-as.vector(cmh) lik-sum(cmhn+integ) -lik } # end of ll()-function initial-c(1) t-nlm(ll,initial) out[k]-t$estimate } # end of k-loop est-as.vector(out) Your uncommented code seems to have problems with organization, but since you never really described your overall strategy or goal was, it's hard to know. -- David Winsemius, MD Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Logistic regression/Cut point? predict ??
I am new to R and I am trying to do a monte carlo simulation where I generate data and interject error then test various cut points; however, my output was garbage (at x equal zero, I did not get .50) I am basically testing the performance of classifiers. Here is the code: n - 1000; # Sample size fitglm - function(sigma,tau){ x - rnorm(n,0,sigma) intercept - 0 beta - 5 * ystar - intercept+beta*x* * z - rbinom(n,1,plogis(ystar))**# I believe plogis accepts the a +bx augments and return the e^x/(1+e^x) which is then used to generate 0 and 1 data* xerr - x + rnorm(n,0,tau)# error is added here model-glm(z ~ xerr, family=binomial(logit)) int-coef(model)[1] slope-coef(model)[2] pred-predict(model) #this gives me the a+bx data for new error? I know I can add type= response to get the probab. but only e^x not *e^x/(1+e^x) * pi1hat-length(z[which(z==1)]/length(z)) My cut point is calculated is the proportion of 0s to 1. pi0hat-length(z[which(z==0)]/length(z)) cutmid - log(pi0hat/pi1hat) pred-ifelse(predcutmid,1,0) * I am not sure if I need to compare these two. I think this is an error. * accuracy-length(which(pred==z))/length(z) accuracy rocpreds-prediction(pred,z) auc-performance(rocpreds,auc)@y.values output-c(int,slope,cutmid,accuracy,auc) names(output)-c(Intercept,Slope,CutPoint,Accuracy,AUC) return(output) } y-fitglm(.05,1) y nreps - 500; output-data.frame(matrix(rep(NA),nreps,6,ncol=6)) mysigma-.5 mytau-.1 i-1 for(j in 1:nreps) { output[j,1:5]-fitglm(mysigma,mytau) output[j,6]-j } names(output)-c(Intercept,Slope,CutPoint,Accuracy,AUC,Iteration) apply(output,2, mean) apply(output,2, var) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Credit Scoring in R - Weight of Evidence
Dear List, I couldn't find any package that performs the weight of evidence of predictors (a transformation usually performed in credit scoring applications). Is there any that you know? Thanks, Axel. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Expected number of events, Andersen-Gill model fit via coxph in package survival
I have a follow-up question (for either Dr. Therneau, or anybody who might know). sum(zz) (see below) estimates the number of events for the cohort. Now, how can I compute a confidence interval for sum(zz)? Or a standard error? My obvious choice, square root of the sum of the squares of the standard errors for zz provided by predict.coxph, turns out to be too small. Thank you for any suggestions. Omar. On Thu, Oct 11, 2012 at 1:26 PM, Omar De la Cruz C. odelacr...@gmail.com wrote: Thank you, Dr. Therneau, that was very helpful. Best regards, Omar. On Mon, Oct 8, 2012 at 9:58 AM, Terry Therneau thern...@mayo.edu wrote: I am interested in producing the expected number of events, in a recurring events setting. I am using the Andersen-Gill model, as fit by the function coxph in the package survival. I need to produce expected numbers of events for a cohort, cumulatively, at several fixed times. My ultimate goal is: To fit an AG model to a reference sample, then use that fitted model to generate expected numbers of events for a new cohort; then, comparing the expected vs. the observed numbers of events would give us some idea of whether the new cohort differs from the reference one. From my reading of the documentation and the text by Therneau and Grambsch, it seems that the function survexp is what I need. But using it I am not able to obtain expected numbers of events that match reasonably well the observed numbers *even for the same reference population.* So, I think I am misunderstanding something quite badly. You've hit a common confusion. Observed versus expected events computations are done on a cumulative hazard scale H, not the surivival scale S; S = exp(-H). Relating this back to simple Poisson models H(t) would be the expected number of events by time t and S(t) the probability of no events before time t. G. Berry (Biometrics 1983) has a classic ane readable article on this (especially if you ignore the proofs). Using your example: cphfit - coxph(Surv(start,stop,event)~rx+number+size+cluster(id),data=bladder2) zz - predict(cphfit, type='expected') c(sum(zz), sum(bladder2$event)) [1] 112 112 tdata - bladder2[1:10] #new data set (lazy way) predict(cphfit, type='expected', newdata=tdata) [1] 0.0324089 0.3226540 0.4213402 1.0560768 0.6702130 0.2163531 0.6490665 [8] 0.8864808 0.2932915 0.5190647 You can also do this using survexp and the cohort=FALSE argument, which would return S(t) for each subject and we would then use -log(result) to get H. This is how it was done when I wrote the book, but the newer predict function is easier. Terry Therneau __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Logistic regression/Cut point? predict ??
What do you mean by at x equal zero? On Sun, Oct 21, 2012 at 8:37 AM, Adel Powell powella...@gmail.com wrote: I am new to R and I am trying to do a monte carlo simulation where I generate data and interject error then test various cut points; however, my output was garbage (at x equal zero, I did not get .50) I am basically testing the performance of classifiers. Here is the code: n - 1000; # Sample size fitglm - function(sigma,tau){ x - rnorm(n,0,sigma) intercept - 0 beta - 5 * ystar - intercept+beta*x* * z - rbinom(n,1,plogis(ystar))**# I believe plogis accepts the a +bx augments and return the e^x/(1+e^x) which is then used to generate 0 and 1 data* xerr - x + rnorm(n,0,tau)# error is added here model-glm(z ~ xerr, family=binomial(logit)) int-coef(model)[1] slope-coef(model)[2] pred-predict(model) #this gives me the a+bx data for new error? I know I can add type= response to get the probab. but only e^x not *e^x/(1+e^x) * pi1hat-length(z[which(z==1)]/length(z)) My cut point is calculated is the proportion of 0s to 1. pi0hat-length(z[which(z==0)]/length(z)) cutmid - log(pi0hat/pi1hat) pred-ifelse(predcutmid,1,0) * I am not sure if I need to compare these two. I think this is an error. * accuracy-length(which(pred==z))/length(z) accuracy rocpreds-prediction(pred,z) auc-performance(rocpreds,auc)@y.values output-c(int,slope,cutmid,accuracy,auc) names(output)-c(Intercept,Slope,CutPoint,Accuracy,AUC) return(output) } y-fitglm(.05,1) y nreps - 500; output-data.frame(matrix(rep(NA),nreps,6,ncol=6)) mysigma-.5 mytau-.1 i-1 for(j in 1:nreps) { output[j,1:5]-fitglm(mysigma,mytau) output[j,6]-j } names(output)-c(Intercept,Slope,CutPoint,Accuracy,AUC,Iteration) apply(output,2, mean) apply(output,2, var) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Centering labels on X-axis
Thanks Jim and Rui. My apologies, i did not give enough info on my plot. I am using : plot(x,y) for a line plot. I want to center the labels on the x-axis for each tick. Thanks. -- View this message in context: http://r.789695.n4.nabble.com/Centering-labels-on-X-axis-tp4646761p4646888.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] rms plot.Predict question: swapping x- and y- axis for categorical predictors
Hello all, I'm trying to plot the effects of variables estimated by a regression model fit individually, and for categorical predictors, the independent variable shows up on the y-axis, with the dependent variable on the x-axis. Is there a way to prevent this reversal? Sample code with dummy data: # make dummy data set.seed(1) x1 - runif(200) x2 - sample(c(1,2),200, TRUE) x3 - sample(c(0,1),200,T) x4 - runif(200) # the dependent variable: distance - (x1/3 + x2 + rnorm(200)^2 - x3 - x4/2) # factor two vars, and add to datadist: x3 - factor(x3) x2 - factor(x2) d - datadist(x1,x2,x3,x4) options(datadist=d) # Make a simple model: f - ols(distance ~ x1 + x2 + x4+ x3, x=T) # plot variable effect of a categorical variable: plot(Predict(f, x2)) ^ above step generates a plot with x2 on the y-axis and distance on the x-axis, which is the opposite of what I'm aiming for. The continuous variables do not have this problem; nor does the plot(Predict(f)) function to plot all of the effects at once. Thank you so much in advance for your replies! My apologies if this question has been answered already; I've tried searching to no avail. Best, Stephanie (Stanford University, Department of Linguistics) -- View this message in context: http://r.789695.n4.nabble.com/rms-plot-Predict-question-swapping-x-and-y-axis-for-categorical-predictors-tp4646891.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] FreeBSD installation problems
R Compiliing or installation failure on FreeBSD 9.0-RELEASE FreeBSD ; I386 box. configuring with the default settings, no 'options' = ./configure ... ... ... checking whether gcc -std=gnu99 supports -c -o FILE.lo... yes checking for gcc -std=gnu99 option to support OpenMP... -fopenmp checking how to get verbose linking output from fc... configure: WARNING: compilation failed checking for Fortran 77 libraries of fc... checking how to get verbose linking output from gcc -std=gnu99... -v checking for C libraries of gcc -std=gnu99... -L/usr/local/lib -L/usr/lib -lgcc_s checking for dummy main to link with Fortran 77 libraries... none checking for Fortran 77 name-mangling scheme... configure: error: in `/usr/local/src/R-2.15.1': configure: error: cannot compile a simple Fortran program See `config.log' for more details jaymax-#393:# ls config.log config.log jaymax-#394:# wc config.log 3859 14227 118427 config.log jaymax-#395:# tail -30 config.log ... ... ... #define HAVE_INTPTR_T 1 #define HAVE_UINTPTR_T 1 #define R_INLINE inline #define SIZEOF_INT 4 #define INT_32_BITS 1 #define SIZEOF_LONG 4 #define SIZEOF_LONG_LONG 8 #define SIZEOF_DOUBLE 8 #define SIZEOF_LONG_DOUBLE 12 #define SIZEOF_SIZE_T 4 configure: exit 1 jaymax-#396:# I then provided the compilers explicitly ./configure cc=gcc47 F77=gfortran47 CXX=g++47 FC=gfortran47 which seemed to run to completion w/ 1 SNAFU R is now configured for i386-unknown-freebsd9.0 Source directory: . Installation directory:/usr/local C compiler:gcc -std=gnu99 -g -O2 Fortran 77 compiler: gfortran47 -g -O2 C++ compiler: g++47 -g -O2 Fortran 90/95 compiler:gfortran47 -g -O2 Obj-C compiler: Interfaces supported: X11 External libraries:readline, ICU, lzma Additional capabilities: PNG, JPEG, TIFF, NLS, cairo Options enabled: shared BLAS, R profiling, Java Recommended packages: yes configure: WARNING: inconsolata.sty not found: PDF vignettes and package manuals will not be rendered optimally However, jaymax-#492:# make install installing doc ... install: NEWS.rds: No such file or directory *** Error code 71 Stop in /usr/local/src/R-2.15.1/doc. *** Error code 1 Stop in /usr/local/src/R-2.15.1. And jaymax-#493:# make check ../../bin/R: not found *** Error code 127 Stop in /usr/local/src/R-2.15.1/tests/Examples. *** Error code 1 Stop in /usr/local/src/R-2.15.1/tests. *** Error code 1 Stop in /usr/local/src/R-2.15.1/tests. *** Error code 1 Stop in /usr/local/src/R-2.15.1. jaymax-#494:# I am at a bit of a loss figuring out the problem here, the log file does not seem to give any info Need some direction/help Thanks in advance. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Centering labels on X-axis
On Oct 20, 2012, at 3:03 PM, YAddo wrote: Thanks Jim and Rui. My apologies, i did not give enough info on my plot. Nor do you even now. Offer a data example. I am using : plot(x,y) for a line plot. I want to center the labels on the x-axis for each tick. Code. We want code. -- David Winsemius, MD Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error in integrate(integrand, 0, Inf) : non-finite function value
Hi, Thank you for your comment. I worked on the code again and was able to make it work. The only problem I am having right now is that nlm depends on the initial value. When the initial value is 1, I get the following estimates 0.1230414 19.6271029 when it is 2, I get the following 29.46874 20.01679 d-matrix(c(1,1,0,0,0,0,0,0,2,1,0,0,1,1,0,1,2,2,1,0),nrow=10,ncol=2) h-matrix(runif(20,0,1),10) delta-matrix(c(2,1,0,1,0,1,0,0,2,1,0,0,1,1,1,1,0,2,1,0),nrow=10,ncol=2) out-vector(numeric,length(1:2)) integ-vector(numeric,length(1:10)) for (k in 1:2){ ll-function(p){ cmh-delta[,k]*(h[,k]*log(0.5))+p*log(gamma(1+1/p)) for(s in 1:10){ integrand-function(x) x^d[s,k]*exp(-x*gamma(1+1/p))^p*p*x^(p-1)*exp(-x*h[s,k]) integ-integrate(integrand,0,Inf)$value return(integ) } lik-sum(cmh+log(integ)) -lik } initial-c(1) t-nlm(ll,initial) out[k]-t$estimate } est-as.vector(out) -- View this message in context: http://r.789695.n4.nabble.com/Error-in-integrate-integrand-0-Inf-non-finite-function-value-tp4646868p4646896.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.