[R] Profiler for R ?
Hi, is there such a thing as a profiler for R that reports a) how much processing time is used by particular functions and commands and b) how much memory is used for creating how many objects (or types of data structures)? In a way I am looking for something similar to the Java profiler (which is started from the command line and provides profiling information collected from the run of a particular program). Is there such a tool for the R command line or RGui? Are there profilers available for Eclipse StatET or through another package or extension? Thanks, Ralf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fast String operations in R ? Cost of String operations
Hi experts, I am currently developing some code that checks a large number of strings for the existence of sub-strings and patterns (detecting sub-strings within URLs). I wonder if there is information about how well particular string operations perform in R, together with comparisons. Are there recommendations (based on such information) regarding which operations should be used and which should be avoided? Are there libraries and functions that provide optimized string operations for such needs, or is R simply not the right choice for that? Best, Ralf
Re: [R] UK map in R
On Sun, Jul 4, 2010 at 9:10 PM, happy naren narender.ku...@gmail.com wrote:
> Hi, I am currently working on a problem where I need to plot latitude and longitude data on the respective county of the UK. After this I want to plot altitude data to make a 3D surface, on which I then have to plot my corresponding data.

Have you read the Spatial Task View on CRAN? http://ftp.heanet.ie/mirrors/cran.r-project.org/web/views/Spatial.html

Do you want to plot point data on a map of the whole of the UK with county boundaries? These boundaries do change, so you'll need to specify a year if you need that precision. There's a set here: http://gadm.org/ but they might not be counties; they may be EU NUTS areas, which are close to counties at one level. Obviously Scotland (still part of the UK) doesn't have counties at all. See http://en.wikipedia.org/wiki/Counties_of_the_United_Kingdom for all the details on UK administrative regions, historic counties etc.

For your 3D plot, do you need an accurate elevation model of the UK? At what precision? The best free elevation model I know is the SRTM data, and you'd have to download the relevant section. To do 3D plots, use the rgl package; do library(rgl) ; example(terrain3d) for the kind of thing.

Any other questions are probably best sent to the R-spatial mailing list, but you should probably do a bit more research first. Barry
Re: [R] LatticeExtra Parallel
Hi Ben, You can also experiment with matlines. Tal

On Mon, Jul 5, 2010 at 5:49 AM, Deepayan Sarkar deepayan.sar...@gmail.com wrote:
> On Sun, Jul 4, 2010 at 12:59 PM, Ben Wilkinson bjlwilkin...@gmail.com wrote:
>> I have put together a chart of 1,000 monthly data series using parallel and I really like the way it displays the data. Is there a way to achieve something similar in terms of display using the actual scale (consistently across all the data) as opposed to min/max?
> You mean like parallel(iris, common.scale = TRUE)? -Deepayan
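A minimal sketch of Deepayan's suggestion, assuming a current lattice release, where the function that used to be called parallel() is available as parallelplot(). The built-in iris data and the choice of its first four columns are only for illustration.

```r
library(lattice)

# Keep only the numeric measurement columns of the built-in iris data.
measurements <- iris[1:4]

# Default behaviour: each variable is rescaled to its own min/max.
p_minmax <- parallelplot(~ measurements, data = iris, groups = Species)

# common.scale = TRUE draws all variables on one shared scale instead.
p_common <- parallelplot(~ measurements, data = iris, groups = Species,
                         common.scale = TRUE)

print(p_common)
```

With a shared scale, variables with a small range are compressed toward one end of the axis, which is usually the point when the series are measured in the same units.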
Re: [R] Profiler for R ?
Perhaps ?Rprof

HTH, Josh

On Sun, Jul 4, 2010 at 11:26 PM, Ralf B ralf.bie...@gmail.com wrote:
> Hi, is there such a thing as a profiler for R that informs about a) how much processing time is used by particular functions and commands and b) how much memory is used for creating how many objects (or types of data structures)? In a way I am looking for something similar to the java profiler (which is started by command line and provides profiling information collected from the run of a particular program). Is there such a tool through the R command line or RGUI? Are there profilers available for the Eclipse StatET or through another package or extension? Thanks, Ralf

-- Joshua Wiley, Ph.D. Student, Health Psychology, University of California, Los Angeles http://www.joshuawiley.com/
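A minimal sketch of what ?Rprof describes, covering both parts of the question (time per function, and memory if the R build supports it). The function busy_work() and the one-second duration are hypothetical, chosen only so that the sampling profiler has something to record.

```r
# Hypothetical workload: repeated string building for about `seconds` seconds.
busy_work <- function(seconds = 1) {
  stop_at <- Sys.time() + seconds
  total <- 0
  while (Sys.time() < stop_at) {
    x <- paste(sample(letters, 1000, replace = TRUE), collapse = "")
    total <- total + nchar(x)
  }
  total
}

prof_file <- tempfile(fileext = ".out")

# Memory profiling is only available if R was built with support for it.
with_memory <- isTRUE(unname(capabilities("profmem")))

Rprof(prof_file, memory.profiling = with_memory)  # start sampling
result <- busy_work(1)
Rprof(NULL)                                       # stop sampling

# Summarise time (and memory, when recorded) by function.
prof <- summaryRprof(prof_file,
                     memory = if (with_memory) "both" else "none")
print(head(prof$by.self))
```

Rprof() is a sampling profiler, so very short runs may record few or no samples; profiling a run of at least a second or so gives more stable numbers.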
[R] timeseries
Dear useRs, I am trying to construct a time series using the as.ts function. Surprisingly, when I plot the data the x axis does not show the time in years; however, if I use ts(data), time in years is shown on the x axis. Why is there such a difference between the results of the two commands? Thanks, nuncio -- Nuncio.M, Research Scientist, National Center for Antarctic and Ocean Research, Head land Sada, Vasco da Gamma, Goa-403804
[R] repeated measures with missing data
Dear R help group, I am teaching myself linear mixed models with missing data since I would like to analyze a stats design with this kind of model. The textbook example is for the procedure proc MIXED in SAS, but I would like to know if there is an equivalent in R. This example only includes two time-measurements across subjects (a t-test with missing values), but I will need to do this with three time-measurements (repeated measures ANOVA with missing values):

Patient  Treatment A  Treatment B
1        20           12
2        26           24
3        16           17
4        29           21
5        22           N/A
6        N/A          12

I have tried this analysis using the instructions below with the help of Mixed-Effects Models in S and S-Plus, but have not been able to get around the missing data issue, as follows:

tmtA <- c(20,26,16,29,22,NA)
tmtB <- c(12,24,17,21,NA,17)
require(lme4)
dv <- c(20,12,26,24,16,17,29,21,22,17)
subject <- rep(c("s1","s2","s3","s4","s5","s6"), each=2)
subject <- subject[-c(10,11)]
myfactor <- rep(c("f1","f2"), 6)
myfactor <- myfactor[-c(10,11)]
mydata <- data.frame(dv, subject, myfactor)
am2 <- lmer(dv ~ myfactor + (1|subject)), data = mydata)
summary(am2)
anova(am2)
subject <- subject[-c(10,11)]

Any help would be greatly appreciated. Thank you, Rafael Diaz, Assistant Professor, Math and Stats Dept, California State University Sacramento
[R] Data Labels in a barchart (Lattice or otherwise)
Hi, Can anyone please help me with how I could add labels with the value for each bar in a barchart (similar to how data labels can be added in Excel)? I have done a lot of searching but haven't been lucky. Thanks, Raoul
Re: [R] repeated measures with missing data
Dear Rafael, The line in your code had one closing bracket too many. The line below should work.

am2 <- lmer(dv ~ myfactor + (1|subject), data = mydata)

Furthermore I would advise changing myfactor from a character variable to a factor. HTH, Thierry

ir. Thierry Onkelinx, Research Institute for Nature and Forest (INBO), team Biometrics & Quality Assurance, Gaverstraat 4, 9500 Geraardsbergen, Belgium, thierry.onkel...@inbo.be www.inbo.be

-----Original message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Rafael Diaz Sent: Monday 5 July 2010 3:37 To: r-help@r-project.org Subject: [R] repeated measures with missing data

> Dear R help group, I am teaching myself linear mixed models with missing data since I would like to analyze a stats design with these kind of models. The textbook example is for the procedure proc MIXED in SAS, but I would like to know if there is an equivalent in R. This example only includes two time-measurements across subjects (a t-test with missing values), but I will need to do this with three time-measurements (repeated measures ANOVA with missing values):
>
> Patient  Treatment A  Treatment B
> 1        20           12
> 2        26           24
> 3        16           17
> 4        29           21
> 5        22           N/A
> 6        N/A          12
>
> I have tried this analysis using the instructions below with the help of Mixed-Effects Models in S and S-Plus, but have not been able to go around the missing data issue as follows:
>
> tmtA <- c(20,26,16,29,22,NA)
> tmtB <- c(12,24,17,21,NA,17)
> require(lme4)
> dv <- c(20,12,26,24,16,17,29,21,22,17)
> subject <- rep(c("s1","s2","s3","s4","s5","s6"), each=2)
> subject <- subject[-c(10,11)]
> myfactor <- rep(c("f1","f2"), 6)
> myfactor <- myfactor[-c(10,11)]
> mydata <- data.frame(dv, subject, myfactor)
> am2 <- lmer(dv ~ myfactor + (1|subject)), data = mydata)
> summary(am2)
> anova(am2)
> subject <- subject[-c(10,11)]
>
> Any help would be greatly appreciated. Thank you, Rafael Diaz, Assistant Professor, Math and Stats Dept, California State University Sacramento
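Both of Thierry's points (the balanced brackets and the conversion to factors) can be sketched as follows. This is only an illustration built from the numbers in the original post; the lmer() call is guarded because it assumes the lme4 package is installed.

```r
# Long-format data from the original post: one row per observed value.
# Dropping positions 10 and 11 removes the two missing measurements
# (s5 under f2 and s6 under f1).
dv       <- c(20, 12, 26, 24, 16, 17, 29, 21, 22, 17)
subject  <- rep(c("s1", "s2", "s3", "s4", "s5", "s6"), each = 2)[-c(10, 11)]
myfactor <- rep(c("f1", "f2"), 6)[-c(10, 11)]

# Store the grouping variables explicitly as factors, as advised.
mydata <- data.frame(dv       = dv,
                     subject  = factor(subject),
                     myfactor = factor(myfactor))

# Fit the random-intercept model only if lme4 is available (assumption).
if (requireNamespace("lme4", quietly = TRUE)) {
  am2 <- lme4::lmer(dv ~ myfactor + (1 | subject), data = mydata)
  print(summary(am2))
}
```

Because the data are in long format, the two subjects with a single measurement simply contribute one row each; no special handling of the missing cells is needed.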
Re: [R] question concerning VGAM
>>>>> Martin Spindler martin.spind...@gmx.de on Mon, 5 Jul 2010 07:48:42 +0200 writes:

> Hello everyone, using the VGAM package and the following code
>
> library(VGAM)
> bp1 <- vglm(cbind(daten$anzahl_b, daten$deckung_b) ~ ., binom2.rho, data=daten1)
> summary(bp1)
> coef(bp1, matrix=TRUE)
>
> produced this error message: error in object$coefficients : $ operator not defined for this S4 class
> I am a bit confused because some days ago this error message did not show up and everything was fine. Thank you very much in advance for your help. Best, Martin
> __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help
> --- PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> --- and provide commented, minimal, self-contained, reproducible code.

Hmm, and which part of the two lines above did you not understand? example(vglm) already contains uses of coef() which do work fine; so it must be you, or your setup, which breaks things. Martin Maechler, ETH Zurich
[R] Command run
Hi Sir/Madam, I am calling R from an ASP.NET project. The outputs are a chart and cor(P1,P2). The exe file is c:\program files\R\R-2.11.1\bin\rscript.exe. When the ASP.NET project opens cmd.exe and runs r --vanilla r_sample.r, it says that r is not recognized as a defined command (r_sample.r is located in the bin directory of R). I also registered Rscript.exe as an exe file in Windows, but it still shows an error. I would really appreciate a hint. Regards, Mahdieh
[R] lc2 Model
Dear developing team, I am a graduate student trying to fit a dose response curve for my thesis. I found one publication talking about the lc2 model in the drm function (drc package), but I didn't find any related info on how to set up my data. fct = lc.2() was not found by my R. How do I get any info on that, please? First, I tried the LL.2 model, but it does not really fit, as my data are not on a logarithmic but on a linear scale (values 4, 8, 12, 16, 20, 24, 28). Trying to contact the author of the publication failed and I couldn't think of any other way, so I hope you can help me. Thanks a lot, Sincerely, Kathrin Schreglmann
[R] R squared from cv.lm
Hello, I used the CVlm function to validate a linear regression model without intercept: fit <- lm(y ~ x1+x2+x3+x4+0, data=mydata). I tried to validate the model by performing a leave-one-out cross-validation procedure using the CVlm function: CVlm(df=mydata, fit, m=196). But how can I get the adjusted R² from the output of this function? Or is there any other function to perform a leave-one-out cross-validation procedure? Thank you very much in advance. A
[R] Stoch Prog in R
Can you please let me know if there are any packages for stochastic linear programming (SLP) in R? Thanks in advance, Sudhakar Achath
[R] r code exchange site?
Does there exist a site where snippets of R code examples can be deposited, such as the one that exists for MATLAB? http://www.mathworks.com/matlabcentral/fileexchange/ PS: I also noted that on the main R site http://www.r-project.org/, when you click on the nabble link under the search link, you end up here http://e-nvf.vvvay.net/-td13672.html#a13819 which I don't think has anything to do with R as far as I can tell (but my Russian is not that hot). Yours hopefully, pb
Re: [R] r code exchange site?
Hello, There is http://www.r-cookbook.com/, but I'm not sure that it is what you're looking for. Liviu

On Mon, Jul 5, 2010 at 10:54 AM, pdb ph...@philbrierley.com wrote:
> Does there exist a site where snippets of r code examples can be deposited, such as the one that exists for matlab? http://www.mathworks.com/matlabcentral/fileexchange/ ps I also noted from the main r site http://www.r-project.org/ when you click on the nabble link under the search link, I end up here http://e-nvf.vvvay.net/-td13672.html#a13819 which I don't think is anything to do with R as far as I can tell (but my Russian is not that hot) Yours Hopefully, pb
Re: [R] repeated measures with missing data
On Mon, Jul 5, 2010 at 4:00 AM, ONKELINX, Thierry thierry.onkel...@inbo.be wrote:
> Dear Rafael, The line below had one closing bracket too many. The line below should work. am2 <- lmer(dv ~ myfactor + (1|subject), data = mydata) Furthermore I would advise to change myfactor from a character variable to a factor.

In addition, you can simplify the data manipulation:

wide <- matrix(c(20, 26, 16, 29, 22, NA, 12, 24, 17, 21, NA, 17), nrow = 6, dimnames = list(subject = paste("s", 1:6, sep = ""), myfactor = c("f1", "f2")))
long <- as.data.frame.table(wide, responseName = "dv")
am2 <- lmer(dv ~ myfactor + (1|subject), data = long)
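The wide-to-long step suggested here uses only base R and can be checked without lme4. A small sketch using the values from the thread; the extra na.omit() call is an addition for illustration, to show which rows a model fit would actually use.

```r
# Wide layout: one row per subject, one column per treatment level.
wide <- matrix(c(20, 26, 16, 29, 22, NA,
                 12, 24, 17, 21, NA, 17),
               nrow = 6,
               dimnames = list(subject  = paste("s", 1:6, sep = ""),
                               myfactor = c("f1", "f2")))

# Long layout: one row per subject/treatment cell, with the named
# dimnames turned into factor columns automatically.
long <- as.data.frame.table(wide, responseName = "dv")

# Rows with missing responses, dropped here explicitly for inspection.
long_complete <- na.omit(long)

print(long_complete)
```

A side effect of going through as.data.frame.table() is that subject and myfactor arrive as factors, which also takes care of Thierry's advice.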
Re: [R] lattice xyplot with bty=l
Hi all, Back in 2007, Deepayan and Patrick had an exchange about how to modify axes for lattice plots (pasted below). I need something similar, but I also need to produce ticks on the axes. Deepayan quickly coded up substitute gridlines because they needed to make the default box transparent. The code works, but it lacks the ticks, and I could not google up how to add them. (I would be taken aback if you could even help me put this option into a theme for the latticeExtra package; the grid options there do not seem to allow left and bottom axes at the same time. Not even the asTheEconomist command, though I thought I had seen plots with bty="l" in The Economist...) I understand that panels in lattice were not originally intended for such use, but my team wants to use this because of the many advanced features of the lattice package. The specific goal would be to produce graphs in R like those here: http://obs.rc.fas.harvard.edu/chetty/denmark_adjcost_slides.pdf . Any help would be greatly appreciated. Thank you very much, Laszlo

László Sándor, graduate student, Department of Economics, Harvard University

Deepayan Sarkar [EMAIL PROTECTED] writes:
> On 9/4/07, Patrick Drechsler [EMAIL PROTECTED] wrote:
>> what is the correct way of removing the top and right axes completely from a lattice xyplot? I would like to have a plot similar to using the bty="l" option for traditional plots.
> There is no direct analog (and I think it would be weird in a multipanel plot).

I agree that this is not very useful for multipanel plots.

> Combining a few different features, you can do:
>
> library(grid)
> xyplot(1:10 ~ 1:10, scales = list(col = "black", tck = c(1, 0)),
>        par.settings = list(axis.line = list(col = "transparent")),
>        axis = function(side, ...) {
>            if (side == "left")
>                grid.lines(x = c(0, 0), y = c(0, 1), default.units = "npc")
>            else if (side == "bottom")
>                grid.lines(x = c(0, 1), y = c(0, 0), default.units = "npc")
>            axis.default(side = side, ...)
>        })
>
> -Deepayan

Thank you very much Deepayan, this is exactly what I was looking for! Cheers, Patrick
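For convenience, the snippet from the 2007 exchange as a self-contained script. The archive had stripped the string quotes, so the quoted values ("black", "transparent", "left", "bottom", "npc") are restored here on the assumption that they were plain strings; nothing else is changed.

```r
library(lattice)
library(grid)

p <- xyplot(1:10 ~ 1:10,
            # Black tick marks and labels; tck = c(1, 0) keeps ticks on
            # the left/bottom sides and suppresses them on top/right.
            scales = list(col = "black", tck = c(1, 0)),
            # Hide the default box around the panel.
            par.settings = list(axis.line = list(col = "transparent")),
            # Draw replacement axis lines on the left and bottom only,
            # then let the default axis function add ticks and labels.
            axis = function(side, ...) {
              if (side == "left")
                grid.lines(x = c(0, 0), y = c(0, 1), default.units = "npc")
              else if (side == "bottom")
                grid.lines(x = c(0, 1), y = c(0, 0), default.units = "npc")
              axis.default(side = side, ...)
            })

print(p)
```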
[R] Execute commands in 'R' within PERL Program
Hi, I wrote a program in Perl which creates a file with a .net extension (e.g. xyz.net). I want to call R from within my Perl program, execute a 3-line command in R, store the output and get back to the Perl program. RSPerl from Omegahat is good software that creates an interface between R and Perl, but unfortunately it doesn't work on my Windows XP. I used the following command in Perl to call R:

@args = ('C:\Program Files\R\R-2.9.0\bin\Rgui.exe');
system(@args) == 0 or die "system @args failed: $!";

which opens the R console. Now I want to run the following commands in R, close R and get back to the original Perl program:

g <- read.graph("f://xyz.net", "pajek")
d <- degree(g, mode="in")
power.law.fit(d+1)

I am stuck at this point; could anyone help me out? In the past, there was a discussion in this forum titled "Problem calling R from within perl script on Windows", from which I picked up the system command to call R from within Perl. Thanks Barry! I am hoping that somebody from that thread could help me. Thanks, Chakri
[R] adding a row of names to data.frame
Relative noob here. I have a data.frame and simply want to add an explicit column of names in column 1 of the form trial_number01 for row 1, trial_number02 for row 2, etc. It is simply for visual purposes and to explain the data to others. I've tried using row.names and other approaches but still no luck. I am sure it has been covered but I can't find it; can you please point me in the right direction? Thanks, Jim === Dr. Jim Maas, University of East Anglia, Norwich, UK
[R] Creating DataFrame of Vectors Data Structure for Classification
Dear Experts, I have an input file that looks like this:

-0.438185,svm,1
-0.766791,svm,1
0.695282,svm,-1
0.759100,svm,-1
0.034400,svm,1
0.524807,svm,1
-0.27647800,nn,1
-0.16120810,nn,-1
0.63911350,nn,1
0.400554110,nn,1
0.429192240,nn,-1
0.454239140,nn,1

How can I create a data structure in R so that it gives this:

print(a_data_structure)
$hiv.svm
$hiv.svm$predictions
$hiv.svm$predictions[[1]]
[1] -0.438185 -0.766791 0.695282
$hiv.svm$predictions[[2]]
[1] 0.759100 0.034400 0.524807
$hiv.svm$labels
$hiv.svm$labels[[1]]
[1] 1 1 -1
$hiv.svm$labels[[2]]
[1] -1 1 1
$hiv.nn
$hiv.nn$predictions
$hiv.nn$predictions[[1]]
[1] -0.27647800 -0.16120810 0.63911350
$hiv.nn$predictions[[2]]
[1] 0.400554110 0.429192240 0.454239140
$hiv.nn$labels
$hiv.nn$labels[[1]]
[1] 1 -1 1
$hiv.nn$labels[[2]]
[1] 1 -1 1

I'm new to R and truly need help. Regards, G. Viswanath
Re: [R] Assigning entries to categories
OK, thanks for the help! Here is a more complex example:

a=c("x","y","z")
b=c(8,14,19)
c=c(200010,535388,19929)
data=data.frame(a,b,c)
d=c("cat1","cat2","cat3","cat4","cat5","cat6")
b1=c(14,5,8,20,19,1)
c_start=c(50,50,20,20,18000,60)
c_stop=c(55,55,201000,201000,2,70)
category=data.frame(d,b1,c_start,c_stop)

Again I want to create a new variable which automatically assigns the category to the data based on matching b == b1 and c >= c_start and c <= c_stop. I hope this explains my problem more explicitly. Thanks!
Re: [R] adding a row of names to data.frame
Jim, Is this what you need?

# create data
Lines <- "Drug1 Drug2 Drug3 Drug4
153 133 145 111
189 177 200 170
221 241 187 243
215 228 201 178
302 283 292 248
223 255 220 202
201 238 233 163
173 164 172 139
121 128 119 120
100 200 300 400"
# read in data
d <- read.table(textConnection(Lines), header = TRUE)
# add row.names
row.names(d) = paste("trial_number", sprintf("%02d", as.numeric(row.names(d))), sep="")
# view output
print(d)

HTH, Pete
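A runnable variant of Pete's answer. The archive had stripped the quotes and line breaks, so the layout of the data (ten rows of four values) is reconstructed from the header. The final cbind() step is an addition, since the question asked for the names as an explicit first column; the column name trial is hypothetical.

```r
# Example data: four drug columns, ten trials.
lines_txt <- "Drug1 Drug2 Drug3 Drug4
153 133 145 111
189 177 200 170
221 241 187 243
215 228 201 178
302 283 292 248
223 255 220 202
201 238 233 163
173 164 172 139
121 128 119 120
100 200 300 400"

d <- read.table(text = lines_txt, header = TRUE)

# Zero-padded labels: trial_number01, trial_number02, ...
trial_labels <- paste("trial_number",
                      sprintf("%02d", seq_len(nrow(d))),
                      sep = "")

# Option 1: use the labels as row names (Pete's approach).
row.names(d) <- trial_labels

# Option 2: keep them as an explicit first column instead.
d_with_col <- cbind(trial = trial_labels, d)

print(d)
```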
[R] Linux-Windows problem
Dear All, I faced the following problem. With the same data.frame the results are different under Linux and Windows. Could you help on this topic? Thanks in advance, Ildiko

Linux:

d = read.csv("CRP.csv")
d$drugCode = as.numeric(d$drug)
cor(d, use="pairwise.complete.obs")
          PATIENT     BL.CRP   X24HR.CRP   X48HR.CRP drug   drugCode
PATIENT        NA         NA          NA          NA   NA         NA
BL.CRP         NA  1.000      0.84324880 -0.05699590   NA -0.3367147
X24HR.CRP      NA  0.8432488  1.         -0.06162383   NA -0.3557316
X48HR.CRP      NA -0.0569959 -0.06162383  1.           NA  0.1553356
drug           NA         NA          NA          NA   NA         NA
drugCode       NA -0.3367147 -0.35573159  0.15533562   NA  1.000
Warning message: In cor(d, use = "pairwise.complete.obs") : NAs introduced by coercion
str(d)
'data.frame': 41 obs. of 6 variables:
 $ PATIENT  : Factor w/ 41 levels "RV13","RV14",..: 2 3 4 6 7 12 13 14 15 17 ...
 $ BL.CRP   : num 7.3 31.2 4.2 6.7 1.6 7.7 5.3 38.9 1 7.3 ...
 $ X24HR.CRP: num 6.1 24.9 11.1 4.9 1 5 3.7 18 1 7.3 ...
 $ X48HR.CRP: num 121.5 40 28.4 34.5 33.3 ...
 $ drug     : Factor w/ 2 levels "active","placebo": 1 1 1 1 1 1 1 1 1 1 ...
 $ drugCode : num 1 1 1 1 1 1 1 1 1 1 ...

Windows:

d = read.csv("CRP.csv")
d$drugCode = as.numeric(d$drug)
cor(d, use="pairwise.complete.obs")
Error in cor(d, use = "pairwise.complete.obs") : 'x' must be numeric
str(d)
'data.frame': 41 obs. of 6 variables:
 $ PATIENT  : Factor w/ 41 levels "RV13","RV14",..: 2 3 4 6 7 12 13 14 15 17 ...
 $ BL.CRP   : num 7.3 31.2 4.2 6.7 1.6 7.7 5.3 38.9 1 7.3 ...
 $ X24HR.CRP: num 6.1 24.9 11.1 4.9 1 5 3.7 18 1 7.3 ...
 $ X48HR.CRP: num 121.5 40 28.4 34.5 33.3 ...
 $ drug     : Factor w/ 2 levels "active","placebo": 1 1 1 1 1 1 1 1 1 1 ...
 $ drugCode : num 1 1 1 1 1 1 1 1 1 1 ...
Re: [R] Linux-Windows problem
On 05.07.2010 14:31, Ildiko Varga wrote:
> Dear All, I faced the following problem. With the same data.frame the results are different under Linux and Windows. Could you help on this topic?

I guess you read in the data differently, since you have different default encodings on the two platforms (e.g. latin1 vs. UTF-8) and your data is probably not plain ASCII. Best, Uwe Ligges

> Thanks in advance, Ildiko
>
> Linux:
>
> d = read.csv("CRP.csv")
> d$drugCode = as.numeric(d$drug)
> cor(d, use="pairwise.complete.obs")
>           PATIENT     BL.CRP   X24HR.CRP   X48HR.CRP drug   drugCode
> PATIENT        NA         NA          NA          NA   NA         NA
> BL.CRP         NA  1.000      0.84324880 -0.05699590   NA -0.3367147
> X24HR.CRP      NA  0.8432488  1.         -0.06162383   NA -0.3557316
> X48HR.CRP      NA -0.0569959 -0.06162383  1.           NA  0.1553356
> drug           NA         NA          NA          NA   NA         NA
> drugCode       NA -0.3367147 -0.35573159  0.15533562   NA  1.000
> Warning message: In cor(d, use = "pairwise.complete.obs") : NAs introduced by coercion
> str(d)
> 'data.frame': 41 obs. of 6 variables:
>  $ PATIENT  : Factor w/ 41 levels "RV13","RV14",..: 2 3 4 6 7 12 13 14 15 17 ...
>  $ BL.CRP   : num 7.3 31.2 4.2 6.7 1.6 7.7 5.3 38.9 1 7.3 ...
>  $ X24HR.CRP: num 6.1 24.9 11.1 4.9 1 5 3.7 18 1 7.3 ...
>  $ X48HR.CRP: num 121.5 40 28.4 34.5 33.3 ...
>  $ drug     : Factor w/ 2 levels "active","placebo": 1 1 1 1 1 1 1 1 1 1 ...
>  $ drugCode : num 1 1 1 1 1 1 1 1 1 1 ...
>
> Windows:
>
> d = read.csv("CRP.csv")
> d$drugCode = as.numeric(d$drug)
> cor(d, use="pairwise.complete.obs")
> Error in cor(d, use = "pairwise.complete.obs") : 'x' must be numeric
> str(d)
> 'data.frame': 41 obs. of 6 variables:
>  $ PATIENT  : Factor w/ 41 levels "RV13","RV14",..: 2 3 4 6 7 12 13 14 15 17 ...
>  $ BL.CRP   : num 7.3 31.2 4.2 6.7 1.6 7.7 5.3 38.9 1 7.3 ...
>  $ X24HR.CRP: num 6.1 24.9 11.1 4.9 1 5 3.7 18 1 7.3 ...
>  $ X48HR.CRP: num 121.5 40 28.4 34.5 33.3 ...
>  $ drug     : Factor w/ 2 levels "active","placebo": 1 1 1 1 1 1 1 1 1 1 ...
>  $ drugCode : num 1 1 1 1 1 1 1 1 1 1 ...
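A hedged sketch of two things one might try on both platforms: stating the file encoding explicitly when reading, as Uwe's guess suggests, and passing only the numeric columns to cor(), which sidesteps the "'x' must be numeric" error whatever its cause. The small CSV written below is a hypothetical stand-in for CRP.csv, and UTF-8 is an assumption; the thread does not say which encoding the real file uses.

```r
# Hypothetical stand-in for CRP.csv, written to a temporary file.
csv_file <- tempfile(fileext = ".csv")
writeLines(c("PATIENT,BL.CRP,X24HR.CRP,drug",
             "RV13,7.3,6.1,active",
             "RV14,31.2,24.9,active",
             "RV15,4.2,11.1,placebo",
             "RV16,6.7,4.9,placebo"),
           csv_file)

# Name the encoding explicitly instead of relying on the platform default.
d <- read.csv(csv_file, fileEncoding = "UTF-8", stringsAsFactors = TRUE)
d$drugCode <- as.numeric(d$drug)

# Correlate only the numeric columns; factor columns such as PATIENT
# and drug are left out rather than coerced.
num_cols <- sapply(d, is.numeric)
cm <- cor(d[, num_cols], use = "pairwise.complete.obs")

print(cm)
```

Comparing str(d) and sessionInfo() on the two machines would show whether the data or the R installation differs.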
[R] Counting defined character within String
Dear list, I'm looking for a way to count the number of | within an object. The character | is used to separate ids. Assume a data (d) structure like

Var
NA
NA
NA
NA
NA
1
1|2
1|22|45
3
4b|24789

I need to know the maximum number of ids within one object. In this case 3 (1|22|45). Does anybody know a better way? Thanks. Kind regards, Andreas Kunzler, Bundeszahnärztekammer (BZÄK), Chausseestraße 13, 10115 Berlin, E-Mail: a.kunz...@bzaek.de
Re: [R] Counting defined character within String
Try this:

    sapply(strsplit(as.character(Var$Var), "\\|"), length)

On Mon, Jul 5, 2010 at 11:04 AM, Kunzler, Andreas a.kunz...@bzaek.de wrote:

Dear list, I'm looking for a way to count the number of | within an object. The character | is used to separate ids. Assume a data (d) structure like

    Var
    NA
    NA
    NA
    NA
    NA
    1
    1|2
    1|22|45
    3
    4b|24789

I need to know the maximum number of ids within one object. In this case 3 (1|22|45). Does anybody know a better way? Thanks

Kind regards, Andreas Kunzler, Bundeszahnärztekammer (BZÄK), Chausseestraße 13, 10115 Berlin, Tel.: 030 40005-113, Fax: 030 40005-119, E-Mail: a.kunz...@bzaek.de

-- Henrique Dallazuanna, Curitiba-Paraná-Brasil, 25° 25' 40" S 49° 16' 22" O
[R] XTFEVD implementation in R
Is there a package in R for the XTFEVD procedure (Plümper & Troeger)? Also, are there any examples of a Hausman-Taylor implementation in R? I understand that it can be done using the plm package, but I could not find examples with actual data.

Thank you, Suresh Singh, Fisher College of Business, The Ohio State University
Re: [R] Counting defined character within String
On Jul 5, 2010, at 9:04 AM, Kunzler, Andreas wrote:

Dear list, I'm looking for a way to count the number of | within an object. The character | is used to separate ids. Assume a data (d) structure like

    Var
    NA
    NA
    NA
    NA
    NA
    1
    1|2
    1|22|45
    3
    4b|24789

I need to know the maximum number of ids within one object. In this case 3 (1|22|45). Does anybody know a better way? Thanks

Presuming that your column is in a data frame called 'DF', where the 'Var' column is likely imported as a factor:

    > DF
            Var
    1        NA
    2        NA
    3        NA
    4        NA
    5        NA
    6         1
    7       1|2
    8   1|22|45
    9         3
    10 4b|24789

    > max(sapply(strsplit(as.character(DF$Var), split = "\\|"), length))
    [1] 3

The above uses strsplit() to split each line using the | as the split character. Since | has a special meaning for regular expressions, it needs to be escaped using the double backslash:

    > strsplit(as.character(DF$Var), split = "\\|")
    [[1]]
    [1] NA
    [[2]]
    [1] NA
    [[3]]
    [1] NA
    [[4]]
    [1] NA
    [[5]]
    [1] NA
    [[6]]
    [1] "1"
    [[7]]
    [1] "1" "2"
    [[8]]
    [1] "1"  "22" "45"
    [[9]]
    [1] "3"
    [[10]]
    [1] "4b"    "24789"

Then you just loop through each line getting the length:

    > sapply(strsplit(as.character(DF$Var), split = "\\|"), length)
    [1] 1 1 1 1 1 1 2 3 1 2

and of course get the max value.

HTH, Marc Schwartz
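The strsplit() approach described in this reply can be sketched as a self-contained script. The data frame below is rebuilt from the values in the question; using fixed = TRUE instead of escaping the pipe, and counting NA rows as zero ids, are choices made for this illustration and not part of the original answer:

```r
# Example column mirroring the values in the question (assumed to be a factor).
DF <- data.frame(Var = factor(c(NA, NA, NA, NA, NA,
                                "1", "1|2", "1|22|45", "3", "4b|24789")))

# fixed = TRUE treats "|" as a literal character, so no regex escaping is needed.
ids <- strsplit(as.character(DF$Var), split = "|", fixed = TRUE)

n_ids <- lengths(ids)          # number of pieces per row
n_ids[is.na(DF$Var)] <- 0L     # strsplit() returns a length-1 NA for NA input; count it as zero
n_ids
max(n_ids)
```

Whether an NA row should count as one id or none depends on the data; the original answer counts it as one.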
Re: [R] Data Labels in a barchart (Lattice or otherwise)
On Jul 4, 2010, at 11:43 PM, RaoulD wrote:

Hi, Can anyone please help me with how I could add labels with the value for each bar in a barchart? (similar to how data labels can be added in Excel) I have done a lot of searching but haven't been lucky.

This is generally pretty easy with text(), at least if you are using base graphics. If it is not clear after reading the help page, then post an example with whatever barchart function you have chosen to use. If it's the lattice barchart, there is an ltext example immediately before the barchart example that can quickly be grafted into the barchart code.

David Winsemius, MD, West Hartford, CT
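For the base-graphics route mentioned in this reply, a minimal sketch, assuming invented bar heights in `counts`: barplot() returns the x positions of the bar midpoints, and text() can place a label at each of them.

```r
pdf(NULL)  # null device so the sketch runs without a display; drop this line to see the plot

counts <- c(A = 12, B = 30, C = 21)   # invented example values

# barplot() returns the bar midpoints; leave headroom on the y axis for the labels.
mids <- barplot(counts, ylim = c(0, max(counts) * 1.15))

# pos = 3 places each label just above the given (x, y) point.
text(x = mids, y = counts, labels = counts, pos = 3)

dev.off()
```

The same idea should carry over to horizontal bars by swapping x and y and using pos = 4.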
Re: [R] r code exchange site?
On Jul 5, 2010, at 5:54 AM, pdb wrote:

Does there exist a site where snippets of r code examples can be deposited, such as the one that exists for matlab?

From the main R-project page, the Wiki link takes you here: http://rwiki.sciviews.org/doku.php

There is also an R section on stack overflow.

-- David Winsemius, MD, West Hartford, CT
Re: [R] r code exchange site?
I've used www.pastebin.com before, with C as the code. Cheers!! Albert-Jan

~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~

--- On Mon, 7/5/10, David Winsemius dwinsem...@comcast.net wrote:

From: David Winsemius dwinsem...@comcast.net
Subject: Re: [R] r code exchange site?
To: pdb ph...@philbrierley.com
Cc: r-help@r-project.org
Date: Monday, July 5, 2010, 4:25 PM

On Jul 5, 2010, at 5:54 AM, pdb wrote:

Does there exist a site where snippets of r code examples can be deposited, such as the one that exists for matlab?

From the main R-project page, the Wiki link takes you here: http://rwiki.sciviews.org/doku.php

There is also an R section on stack overflow.

-- David Winsemius, MD, West Hartford, CT
Re: [R] Assigning entries to categories
On Jul 5, 2010, at 8:54 AM, LogLord wrote:

OK, thanks for the help! Here a more complex example:

    a=c("x","y","z")
    b=c(8,14,19)
    c=c(200010,535388,19929)
    data=data.frame(a,b,c)
    d=c("cat1","cat2","cat3","cat4","cat5","cat6")
    b1=c(14,5,8,20,19,1)
    c_start=c(50,50,20,20,18000,60)
    c_stop=c(55,55,201000,201000,2,70)
    category=data.frame(d,b1,c_start,c_stop)

Again I want to create a new variable, which automatically assigns the category to the data based on matching b = b1 and c >= c_start and c <= c_stop.

Probably not the most elegant solution. For each data row, see which one or more rows of category satisfy the conditions. Not tested for the possibility of a non-hit:

    for (i in 1:nrow(data)) print(
      category[ which(apply(category[, -1], 1, function(x) {
        data$b[i]==x[1] & data$c[i] > x[2] & x[3] > data$c[i]})), 1] )

    [1] cat3
    Levels: cat1 cat2 cat3 cat4 cat5 cat6
    [1] cat1
    Levels: cat1 cat2 cat3 cat4 cat5 cat6
    [1] cat5
    Levels: cat1 cat2 cat3 cat4 cat5 cat6

A couple of points. It is bad practice to name variables or objects "c", and also bad practice to name objects "data". Both are common R function names.

I hope this explains my problem more explicitly. Thanks!

David Winsemius, MD, West Hartford, CT
Re: [R] Assigning entries to categories
On Mon, Jul 5, 2010 at 8:54 AM, LogLord nils.sch...@web.de wrote:

OK, thanks for the help! Here a more complex example:

    a=c("x","y","z")
    b=c(8,14,19)
    c=c(200010,535388,19929)
    data=data.frame(a,b,c)
    d=c("cat1","cat2","cat3","cat4","cat5","cat6")
    b1=c(14,5,8,20,19,1)
    c_start=c(50,50,20,20,18000,60)
    c_stop=c(55,55,201000,201000,2,70)
    category=data.frame(d,b1,c_start,c_stop)

Again I want to create a new variable, which automatically assigns the category to the data based on matching b = b1 and c >= c_start and c <= c_stop.

Try this:

    library(sqldf)
    sqldf("select data.*, d from data, category
           where data.b = category.b1 and c >= c_start and c <= c_stop")

      a  b      c    d
    1 x  8 200010 cat3
    2 y 14 535388 cat1
    3 z 19  19929 cat5
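The same range join can be sketched in base R without sqldf: merge on the key, then filter on the interval. This is an illustration under assumptions, not the poster's code: the interval bounds in `c_start` and `c_stop` below are assumed values chosen so that the three rows match as in the reported output, and the objects are renamed `dat` and `val` to avoid masking the functions data() and c().

```r
# Illustrative data; the interval bounds are assumptions, not the poster's exact values.
dat <- data.frame(a = c("x", "y", "z"),
                  b = c(8, 14, 19),
                  val = c(200010, 535388, 19929))
category <- data.frame(d = c("cat1", "cat2", "cat3", "cat4", "cat5", "cat6"),
                       b1 = c(14, 5, 8, 20, 19, 1),
                       c_start = c(500000, 500000, 200000, 200000, 18000, 600000),
                       c_stop  = c(550000, 550000, 201000, 201000, 20000, 700000))

# Join on b == b1, then keep rows whose value falls inside [c_start, c_stop].
m <- merge(dat, category, by.x = "b", by.y = "b1")
res <- m[m$val >= m$c_start & m$val <= m$c_stop, c("a", "b", "val", "d")]
res
```

Rows of `dat` with no matching category are dropped by this approach; passing all.x = TRUE to merge() and handling the resulting NA bounds would be one way to keep them.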
Re: [R] Assigning entries to categories
Gabor Grothendieck wrote:

On Mon, Jul 5, 2010 at 8:54 AM, LogLord nils.sch...@web.de wrote:

OK, thanks for the help! Here a more complex example:

    a=c("x","y","z")
    b=c(8,14,19)
    c=c(200010,535388,19929)
    data=data.frame(a,b,c)
    d=c("cat1","cat2","cat3","cat4","cat5","cat6")
    b1=c(14,5,8,20,19,1)
    c_start=c(50,50,20,20,18000,60)
    c_stop=c(55,55,201000,201000,2,70)
    category=data.frame(d,b1,c_start,c_stop)

Again I want to create a new variable, which automatically assigns the category to the data based on matching b = b1 and c >= c_start and c <= c_stop.

Try this:

    library(sqldf)
    sqldf("select data.*, d from data, category
           where data.b = category.b1 and c >= c_start and c <= c_stop")

      a  b      c    d
    1 x  8 200010 cat3
    2 y 14 535388 cat1
    3 z 19  19929 cat5

Great! That's what I need! Seems like I need to extend my SQL knowledge urgently... Thanks a lot!
[R] Plot with whispers
Hello! I need to make a plot with whiskers that does the following. It reads in 50 files, each file containing 200 data points. A file looks like this:

    base100.log
    Send Receive
    10.5 100.3
    15.0 102.4
    ...

There are 100 lines, each with two data points. I need to read in the 50 files and plot three lines. The first line is the mean of the Send column, with whiskers indicating standard deviation (each file represents one data point). The second line is the mean of the Receive column, as above. The final line is the mean of the two summed, with whiskers as above. There will be 50 data points on the final graph, one for each file.

I've done this sort of thing before, but I really can't figure out how to handle the different columns. If I use read.table:

    x1 <- read.table("updateToSink1010.log")

then x1 becomes a matrix with two columns and 101 rows, including Send, Receive.

Anyways, I'd appreciate a push in some direction - hopefully the right one :).

-- Ian Bentley, M.Sc. Candidate, Queen's University, Kingston, Ontario
Re: [R] Plot with whispers
It looks like read.table is reading the first line as a data value, which is the default for read.table. Try using read.table with the argument header=TRUE. Also, consider using a box and whiskers plot for these data (?boxplot, ?lattice::bwplot). -Matt

On Mon, 2010-07-05 at 12:08 -0400, Ian Bentley wrote:

Hello! I need to make a plot with whiskers that does the following. It reads in 50 files, each file containing 200 data points. A file looks like this:

    base100.log
    Send Receive
    10.5 100.3
    15.0 102.4
    ...

There are 100 lines, each with two data points. I need to read in the 50 files and plot three lines. The first line is the mean of the Send column, with whiskers indicating standard deviation (each file represents one data point). The second line is the mean of the Receive column, as above. The final line is the mean of the two summed, with whiskers as above. There will be 50 data points on the final graph, one for each file.

I've done this sort of thing before, but I really can't figure out how to handle the different columns. If I use read.table:

    x1 <- read.table("updateToSink1010.log")

then x1 becomes a matrix with two columns and 101 rows, including Send, Receive.

Anyways, I'd appreciate a push in some direction - hopefully the right one :).

-- Matthew S. Shotwell, Graduate Student, Division of Biostatistics and Epidemiology, Medical University of South Carolina, http://biostatmatt.com
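A minimal sketch of the header=TRUE suggestion combined with mean and standard-deviation whiskers, assuming two small invented log files written to a temporary directory (the file names and numbers are made up; the real task has 50 files). Only the Send column is plotted here; Receive and the sum would follow the same pattern. The whiskers are drawn with arrows(), which is one common base-graphics idiom and not something prescribed in the thread.

```r
# Create two small example files in a temporary directory (invented numbers).
log_dir <- tempfile("logs")
dir.create(log_dir)
write.table(data.frame(Send = c(10.5, 15.0, 12.0), Receive = c(100.3, 102.4, 101.0)),
            file.path(log_dir, "base100.log"), row.names = FALSE, quote = FALSE)
write.table(data.frame(Send = c(11.0, 14.0, 13.5), Receive = c(99.0, 103.0, 100.5)),
            file.path(log_dir, "base200.log"), row.names = FALSE, quote = FALSE)

files <- sort(list.files(log_dir, pattern = "\\.log$", full.names = TRUE))

# One row of summary statistics per file.
file_stats <- t(sapply(files, function(f) {
  x <- read.table(f, header = TRUE)   # header = TRUE keeps "Send Receive" out of the data
  c(send_mean = mean(x$Send), send_sd = sd(x$Send))
}))

pdf(NULL)  # null device; drop this line to see the plot
i <- seq_len(nrow(file_stats))
lo <- file_stats[, "send_mean"] - file_stats[, "send_sd"]
hi <- file_stats[, "send_mean"] + file_stats[, "send_sd"]
plot(i, file_stats[, "send_mean"], type = "b", ylim = range(lo, hi),
     xlab = "file", ylab = "mean Send")
# angle = 90 with code = 3 draws flat caps at both ends, i.e. whiskers.
arrows(i, lo, i, hi, angle = 90, code = 3, length = 0.05)
dev.off()
```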
[R] Can anybody help me understand AIC and BIC and devise a new metric?
Hi all, Could anybody please help me understand AIC and BIC and especially why do they make sense? Furthermore, I am trying to devise a new metric related to the model selection in the financial asset management industry. As you know the industry uses Sharpe Ratio as the main performance benchmark, which is the annualized mean of returns divided by the annualized standard deviation of returns. In model selection, we would like to choose a model that yields the highest Sharpe Ratio. However, the more parameters you use, the higher Sharpe Ratio you might potentially get, and the higher risk that your model is overfitted. I am trying to think of a AIC or BIC version of the Sharpe Ratio that facilitates the model selection... Anybody could you please give me some pointers? Thanks a lot!
Re: [R] unable to get bigglm working, ATTN: Thomas Lumley
The model fails to converge after more than 3 hours (I went home, so I don't know how long it took).

    > bigglm (formula = resp ~ relage+I(relage^2)+termfac+ri ,
    + data = a, family = binomial(link='logit'));
    Large data regression model: bigglm(formula = resp ~ relage + I(relage^2) + termfac + ri, data = a, family = binomial(link = "logit"))
    Sample size = 12758187
    failed to converge after 8 iterations
    Warning message:
    In bigglm.function(formula = resp ~ relage + I(relage^2) + termfac + :
      ran out of iterations and failed to converge

SAS converges:

    NOTE: PROC LOGISTIC is modeling the probability that resp='1'.
    NOTE: Convergence criterion (GCONV=1E-8) satisfied.
    NOTE: There were 12758187 observations read from the data set SRVRUSER.COMMIT.
    NOTE: The data set WORK.OUT3 has 11 observations and 15 variables.
    NOTE: PROCEDURE LOGISTIC used (Total process time):
          real time 2:25.42
          cpu time  1:16.79

I did not see a trace argument in bigglm. Is there another way to see what is happening? Thank you.
[R] plotting data with ellipse confidence intervals
Hi, I would like to plot a set of paired means (as X Y data) with unique confidence intervals for each (creating a set of ellipses, each with its own centre point and shape). Would appreciate any advice out there! Cheers, Ged
[R] Help reg Genome view
Hi, I have a set of genes and their physical chromosomal positions in a text file. I want to view those genes on the chromosome using the R package GenePlotter. Could anyone please tell me how to view this? Thanks in advance. Yours sincerely, S.Mahalakshmi
[R] Issue with write.table and read.table : I'm not getting out what I put in
Hello, I am trying to save a large matrix of values in a file. My problem is that I am writing

    write.table(allpos,'control_chr1.txt', dec=".")

and then I want to check it with

    test2=read.table('control_chr1.txt')
    sum(test2[,2]==allpos[,2])

This last number is lower than the length of the test2[,2] vector. This is really annoying me because I can't figure out why I don't get out the same thing that I put in.
Re: [R] Passing the parameter (file name) to png()
Thanks a lot! Regards, Maulik

On Sat, Jun 26, 2010 at 4:43 AM, jim holtman jholt...@gmail.com wrote:

    b <- paste("C:\\rphp\\", arg, sep='')

On Sat, Jun 26, 2010 at 12:55 AM, Maulik Shah maulik.shah2...@gmail.com wrote:

I am fitting a 3-parameter model to my response matrix and want to generate an item characteristic curve. I want to specify the file name for saving the item characteristic curve by passing it as an external parameter to the R batch script. The following is the code I have written for this.

R Script:

    library(ltm)
    cmd_args = commandArgs();
    for (arg in cmd_args) cat("  ", arg, "\n", sep="")
    respmat <- read.table("C:\\rphp\\responsedata.txt")
    fit3pl <- tpm(respmat)
    cat("  ", arg, "\n", sep="")
    b <- c("C:\\rphp\\",arg)
    png(file=b, bg="transparent")
    plot(fit3pl,items=c,lwd=3)
    dev.off()
    rm(respmat,fit3pl,b)
    q()

Could you please help me in doing so? I get an error message when R executes png(). Thanks and Regards, Maulik

-- Jim Holtman, Cincinnati, OH, +1 513 646 9390. What is the problem that you are trying to solve?
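The difference between the two versions of `b` in this thread is that c() produces a character vector of length two, whereas paste() (or file.path()) produces the single string a graphics device needs as its file name. A minimal sketch, where `build_plot_path` is a hypothetical helper and "icc.png" an invented file name:

```r
# Hypothetical helper: join an output directory and a file name into one string.
build_plot_path <- function(file_name, out_dir = "C:/rphp") {
  # file.path() inserts "/" between its arguments; forward slashes also work on Windows.
  file.path(out_dir, file_name)
}

# In a batch script the name would typically come from the command line, e.g.
#   args <- commandArgs(trailingOnly = TRUE)
#   out_file <- build_plot_path(args[1])
out_file <- build_plot_path("icc.png")
out_file

# For comparison: c() keeps the two pieces as separate elements.
as_vector <- c("C:/rphp/", "icc.png")
length(out_file)    # one string
length(as_vector)   # two strings
```

With a single string in hand, a call such as png(file = out_file, bg = "transparent") followed by the plot and dev.off() should work as in the original script. Using commandArgs(trailingOnly = TRUE) avoids looping over R's own startup arguments.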
Re: [R] Patch for legend.position={left,top,bottom} in ggplot2
Thank you for this nice patch! To incorporate it you have to open the ggplot2 file in path to your R packages\ggplot2\R, search for the first line of code and replace it with the patch. Don't forget to delete the lines with - and the + in front of the new code.
[R] To detect the location of duplicate values
Dear R family, I have a question about how to detect some duplicate numeric observations. Suppose that I have a dataset with two variables.

    order value
     1    0.52
     2    0.23
     3    0.43
     4    0.21
     5    0.32
     6    0.32
     7    0.32
     8    0.32
     9    0.32
    10    0.12
    11    0.46
    12    0.09
    13    0.32
    14    0.25
    ;

Could you help me indicate where the duplicate observations in a row (e.g., 0.32) are? best, moohwan
Re: [R] Issue with write.table and read.table : I'm not getting out what I put in
Why not use 'save' and 'load'?

On Mon, Jul 5, 2010 at 11:49 AM, Irina irina.kr...@epfl.ch wrote:

Hello, I am trying to save a large matrix of values in a file. My problem is that I am writing

    write.table(allpos,'control_chr1.txt', dec=".")

and then I want to check it with

    test2=read.table('control_chr1.txt')
    sum(test2[,2]==allpos[,2])

This last number is lower than the length of the test2[,2] vector. This is really annoying me because I can't figure out why I don't get out the same thing that I put in.

-- Jim Holtman, Cincinnati, OH, +1 513 646 9390. What is the problem that you are trying to solve?
Re: [R] To detect the location of duplicate values
Try this:

    DF[duplicated(DF$value),]

On Mon, Jul 5, 2010 at 1:31 PM, Moohwan Kim kmhl...@gmail.com wrote:

Dear R family, I have a question about how to detect some duplicate numeric observations. Suppose that I have a dataset with two variables.

    order value
     1    0.52
     2    0.23
     3    0.43
     4    0.21
     5    0.32
     6    0.32
     7    0.32
     8    0.32
     9    0.32
    10    0.12
    11    0.46
    12    0.09
    13    0.32
    14    0.25
    ;

Could you help me indicate where the duplicate observations in a row (e.g., 0.32) are? best, moohwan

-- Henrique Dallazuanna, Curitiba-Paraná-Brasil, 25° 25' 40" S 49° 16' 22" O
Re: [R] To detect the location of duplicate values
Try this:

    > x
       order value
    1      1  0.52
    2      2  0.23
    3      3  0.43
    4      4  0.21
    5      5  0.32
    6      6  0.32
    7      7  0.32
    8      8  0.32
    9      9  0.32
    10    10  0.12
    11    11  0.46
    12    12  0.09
    13    13  0.32
    14    14  0.25
    > # go both ways to capture all duplicates
    > which(duplicated(x$value) | duplicated(x$value, fromLast=TRUE))
    [1]  5  6  7  8  9 13

On Mon, Jul 5, 2010 at 12:31 PM, Moohwan Kim kmhl...@gmail.com wrote:

Dear R family, I have a question about how to detect some duplicate numeric observations. Suppose that I have a dataset with two variables.

    order value
     1    0.52
     2    0.23
     3    0.43
     4    0.21
     5    0.32
     6    0.32
     7    0.32
     8    0.32
     9    0.32
    10    0.12
    11    0.46
    12    0.09
    13    0.32
    14    0.25
    ;

Could you help me indicate where the duplicate observations in a row (e.g., 0.32) are? best, moohwan

-- Jim Holtman, Cincinnati, OH, +1 513 646 9390. What is the problem that you are trying to solve?
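To make the "go both ways" idea in this reply concrete, here is a self-contained sketch using the values from the question. It contrasts duplicated() on its own, which skips the first occurrence of each repeated value, with the two-directional scan:

```r
x <- data.frame(order = 1:14,
                value = c(0.52, 0.23, 0.43, 0.21, 0.32, 0.32, 0.32,
                          0.32, 0.32, 0.12, 0.46, 0.09, 0.32, 0.25))

# duplicated() alone flags only the second and later occurrences.
later_only <- which(duplicated(x$value))

# Scanning from both ends flags every member of a duplicated group.
all_dups <- which(duplicated(x$value) | duplicated(x$value, fromLast = TRUE))

later_only
all_dups
```

Here `later_only` omits position 5, the first 0.32, while `all_dups` includes it.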
Re: [R] To detect the location of duplicate values
Hello Moohwan, Look at ?duplicated, for example:

    > x
    [1] 1 1 2 2 3 3
    > duplicated(x)
    [1] FALSE  TRUE FALSE  TRUE FALSE  TRUE

If your end goal is to get rid of the duplicates, take a look at ?unique

    > unique(x)
    [1] 1 2 3

Best Regards, Josh

On Mon, Jul 5, 2010 at 9:31 AM, Moohwan Kim kmhl...@gmail.com wrote:

Dear R family, I have a question about how to detect some duplicate numeric observations. Suppose that I have a dataset with two variables.

    order value
     1    0.52
     2    0.23
     3    0.43
     4    0.21
     5    0.32
     6    0.32
     7    0.32
     8    0.32
     9    0.32
    10    0.12
    11    0.46
    12    0.09
    13    0.32
    14    0.25
    ;

Could you help me indicate where the duplicate observations in a row (e.g., 0.32) are? best, moohwan

-- Joshua Wiley, Ph.D. Student, Health Psychology, University of California, Los Angeles, http://www.joshuawiley.com/
Re: [R] Help reg Genome view
On 07/05/2010 08:51 AM, mahalakshmi sivamani wrote:

Hi, I have a set of genes and its chromosomal physical position in a text file. I want to view those genes in the chromosome using R package GenePlotter. Could any one please tell how to view this.

Hi S. Mahalakshmi, The package is geneplotter.

a) read the vignette: browseVignettes('geneplotter')
b) ask on the Bioconductor mailing list: http://bioconductor.org/docs/mailList.html
c) see additional packages, esp. GenomeGraphs, at http://bioconductor.org/packages/release/Software.html

Martin

Thanks in advance. Yours sincerely, S.Mahalakshmi

-- Martin Morgan, Computational Biology / Fred Hutchinson Cancer Research Center, 1100 Fairview Ave. N., PO Box 19024, Seattle, WA 98109. Location: Arnold Building M1 B861. Phone: (206) 667-2793
Re: [R] plotting data with ellipse confidence intervals
On 2010-07-05 7:48, web reg wrote:

Hi, I would like to plot a set of paired means (as X Y data) with unique confidence intervals for each (creating a set of ellipses, each with its own centre point and shape). Would appreciate any advice out there! Cheers, Ged

If you have only the means, then you can't plot CIs and ellipses. If you have the original paired data, then have a look at the car::ellipse function and friends.

-Peter Ehlers
Re: [R] Can anybody help me understand AIC and BIC and devise a new metric?
On Jul 5, 2010, at 10:35 AM, LosemindL wrote:

Hi all, Could anybody please help me understand AIC and BIC and especially why do they make sense? Furthermore, I am trying to devise a new metric related to the model selection in the financial asset management industry. As you know the industry uses Sharpe Ratio as the main performance benchmark, which is the annualized mean of returns divided by the annualized standard deviation of returns. In model selection, we would like to choose a model that yields the highest Sharpe Ratio. However, the more parameters you use, the higher Sharpe Ratio you might potentially get, and the higher risk that your model is overfitted. I am trying to think of a AIC or BIC version of the Sharpe Ratio that facilitates the model selection... Anybody could you please give me some pointers?

From http://www.R-project.org/posting-guide.html: "Basic statistics and classroom homework: R-help is not intended for these." Perhaps following some link on Wikipedia, instead?

-- David Winsemius, MD, West Hartford, CT
Re: [R] Issue with write.table and read.table : I'm not getting out what I put in
On Jul 5, 2010, at 11:49 AM, Irina wrote:

Hello, I am trying to save a large matrix of values in a file. My problem is that I am writing

    write.table(allpos,'control_chr1.txt', dec=".")

and then I want to check it with

    test2=read.table('control_chr1.txt')
    sum(test2[,2]==allpos[,2])

This last number is lower than the length of the test2[,2] vector. This is really annoying me because I can't figure out why I don't get out the same thing that I put in.

Many potential problems could underlie getting FALSE for an == test. One might be FAQ 7.31. Another might be encoding or locale issues related to the decimal separator. Why not post a reproducible example?

-- David Winsemius, MD, West Hartford, CT
Re: [R] Patch for legend.position={left,top,bottom} in ggplot2
Or wait a couple of days for the next release of ggplot2... Hadley

On Mon, Jul 5, 2010 at 11:28 AM, Sebastian Wurster sebastian.wurs...@gmx.de wrote:

Thank you for this nice patch! To incorporate it you have to open the ggplot2 file in path to your R packages\ggplot2\R, search for the first line of code and replace it with the patch. Don't forget to delete the lines with - and the + in front of the new code.

-- Assistant Professor / Dobelman Family Junior Chair, Department of Statistics / Rice University, http://had.co.nz/
Re: [R] Issue with write.table and read.table : I'm not getting out what I put in
On 2010-07-05 11:30, David Winsemius wrote:

On Jul 5, 2010, at 11:49 AM, Irina wrote:

Hello, I am trying to save a large matrix of values in a file. My problem is that I am writing

    write.table(allpos,'control_chr1.txt', dec=".")

and then I want to check it with

    test2=read.table('control_chr1.txt')
    sum(test2[,2]==allpos[,2])

This last number is lower than the length of the test2[,2] vector. This is really annoying me because I can't figure out why I don't get out the same thing that I put in.

Many potential problems could underlie getting FALSE for an == test. One might be FAQ 7.31. Another might be encoding or locale issues related to the decimal separator. Why not post a reproducible example?

David's advice is spot-on. (As is Jim's: using save/load is better.) I have no problem replicating your 'problem' with random data, showing once again the futility of using == in situations such as this. Try instead:

    all.equal(test2, allpos, check.attributes = FALSE)

(why the check.attributes argument may be needed is left as an exercise)

-Peter Ehlers
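A minimal sketch of the round trip discussed in this thread, assuming a small random matrix in place of the poster's `allpos` and a temporary file in place of control_chr1.txt. One deviation from the all.equal() call quoted above: the data frame returned by read.table() is converted with as.matrix() first, so that both sides are numeric matrices. That conversion is an adjustment made for this sketch.

```r
set.seed(1)
allpos <- matrix(rnorm(20), ncol = 2)     # stand-in for the poster's matrix

f <- tempfile(fileext = ".txt")
write.table(allpos, f, dec = ".")         # text output keeps a limited number of significant digits
test2 <- read.table(f)

# Exact comparison: may count fewer matches than there are rows,
# because values can change in the last digits when written as text.
exact_matches <- sum(test2[, 2] == allpos[, 2])

# Tolerance-based comparison; dimnames differ, hence check.attributes = FALSE.
close_enough <- all.equal(as.matrix(test2), allpos, check.attributes = FALSE)

exact_matches
close_enough
```

When an exact copy is needed, save()/load() (or saveRDS()/readRDS()) store the binary representation and avoid the text conversion altogether.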
Re: [R] Counting defined character within String
On Mon, 5 Jul 2010, Kunzler, Andreas wrote:
> Dear list,
>
> I'm looking for a way to count the number of | within an object. The character | is used to separate ids. Assume a data (d) structure like
>
>     Var
>     NA
>     NA
>     NA
>     NA
>     NA
>     1
>     1|2
>     1|22|45
>     3
>     4b|24789
>
> I need to know the maximum number of ids within one object. In this case 3 (1|22|45). Does anybody know a better way?

See

    ?max
    ?count.fields

and, if you are not using this on a text file,

    ?textConnection

    > count.fields(textConnection("
    + Var
    + NA
    + NA
    + NA
    + NA
    + NA
    + 1
    + 1|2
    + 1|22|45
    + 3
    + 4b|24789
    + "), sep="|")
     [1] 1 1 1 1 1 1 1 2 3 1 2

HTH,
Chuck

> Thanks
>
> Kind regards
> Andreas Kunzler
> Bundeszahnärztekammer (BZÄK)
> Chausseestraße 13, 10115 Berlin
> Tel.: 030 40005-113, Fax: 030 40005-119
> E-Mail: a.kunz...@bzaek.de

Charles C. Berry, (858) 534-2098
Dept of Family/Preventive Medicine, UC San Diego
La Jolla, San Diego 92093-0901
E mailto:cbe...@tajo.ucsd.edu
http://famprevmed.ucsd.edu/faculty/cberry/
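For data that already sit in a character vector, the same count can be sketched without a text connection. This is an alternative to count.fields(), not what Chuck posted; the vector name var is an assumption.

```r
# The ids from the question, as a character vector (name 'var' is assumed).
var <- c(NA, NA, "1", "1|2", "1|22|45", "3", "4b|24789")

# Split each element on the literal "|" and count the pieces.
# An NA element splits to a single NA, so it counts as 1, which matches
# the count.fields() result shown in the thread.
n_ids <- lengths(strsplit(var, "|", fixed = TRUE))

# Maximum number of ids in any one element.
max_ids <- max(n_ids)
```

fixed = TRUE matters here: without it "|" would be read as the regular-expression alternation operator rather than a literal character.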
Re: [R] To detect the location of duplicate values
On Mon, 5 Jul 2010, Moohwan Kim wrote:
> Dear R family,
>
> I have a question about how to detect some duplicate numeric observations. Suppose that I have a dataset with two variables.
>
>     order value
>      1    0.52
>      2    0.23
>      3    0.43
>      4    0.21
>      5    0.32
>      6    0.32
>      7    0.32
>      8    0.32
>      9    0.32
>     10    0.12
>     11    0.46
>     12    0.09
>     13    0.32
>     14    0.25
>     ;
>
> Could you help me indicate where the duplicate observations in a row (e.g., 0.32) are?

I see you already have replies about duplicated() and unique(), which are very handy for the 'detect' part of your query. But to list the locations of the duplicated elements, you might also benefit from using split() and Filter() like this:

    > Filter( function(x) length(x) > 1, split(order, value) )
    $`0.32`
    [1]  5  6  7  8  9 13

HTH,
Chuck

> best,
> moohwan

Charles C. Berry, (858) 534-2098
Dept of Family/Preventive Medicine, UC San Diego
La Jolla, San Diego 92093-0901
E mailto:cbe...@tajo.ucsd.edu
http://famprevmed.ucsd.edu/faculty/cberry/
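The split()/Filter() idea can be run on the posted data as follows. The second half, which uses rle() to pick out only consecutive repeats ("in a row"), is an addition that was not in the thread; the names ord, dups and runs are assumptions.

```r
# Data from the question ('ord' is used instead of 'order' to avoid
# masking the base function of that name).
ord   <- 1:14
value <- c(0.52, 0.23, 0.43, 0.21, 0.32, 0.32, 0.32,
           0.32, 0.32, 0.12, 0.46, 0.09, 0.32, 0.25)

# Positions of every value that occurs more than once, anywhere.
dups <- Filter(function(x) length(x) > 1, split(ord, value))

# Only consecutive repeats: run-length encode, then keep runs longer than 1.
r      <- rle(value)
ends   <- cumsum(r$lengths)
starts <- ends - r$lengths + 1
runs   <- data.frame(value = r$values, start = starts, end = ends)[r$lengths > 1, ]
```

Here dups also lists position 13, while runs reports only the block from position 5 to 9, so which one fits depends on what "in a row" is meant to cover.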
[R] export VTK from R : impossible to write data as float
Hello,

I've written a short script (below) to write a 3D unstructured grid to binary VTK files from R.

Problem: I can only write integers, with the commands:

    data <- as.numeric(c(3.3))
    storage.mode(data) <- 'integer'
    writeBin(data, bfile_celldata, endian="swap")

The function storage.mode(data) <- 'long' looks fine, but the VTK file is not readable.

Thanks for any help or comments.

    #- R SCRIPT TO WRITE A CUBE IN VTK FORMAT
    cat('# vtk DataFile Version 3.0\n', file=vtk_header)
    cat('R Binary Export v3.0 of inversion model\nBINARY\n', file=vtk_header, append=TRUE)
    cat('DATASET UNSTRUCTURED_GRID\n', file=vtk_header, append=TRUE)
    #placed here instead of top of vtk_points, since npoints is not known before
    cat('POINTS', 8, 'int\n', file=vtk_header, append=TRUE)
    #write points
    writeBin(as.integer(c(0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1)), bfile_points, endian="swap")
    #cells header
    cat('\nCELLS', 1, 9, '\n', file=cells_header)
    #write cells
    writeBin(as.integer(c(8, 0, 1, 3, 2, 4, 5, 7, 6)), bfile_cells, endian="swap")
    #cell types header
    cat('\nCELL_TYPES', 1, '\n', file=celltypes_header)
    #write cell types
    writeBin(as.integer(c(12)), bfile_celltypes, endian="swap")
    #cell data header
    cat('\nCELL_DATA', 1, '\n', file=celldata_header)
    cat('SCALARS R float 1', '\n', file=celldata_header, append=TRUE)
    cat('LOOKUP_TABLE default\n', file=celldata_header, append=TRUE)
    #write cell data
    data <- as.numeric(c(3.3))
    storage.mode(data) <- 'integer'
    writeBin(data, bfile_celldata, endian="swap")
    #close binary connections
    close(bfile_points)
    close(bfile_cells)
    close(bfile_celltypes)
    close(bfile_celldata)
    #concatenate files to produce VTK
    system(paste('cat', vtk_header, bfile_points, cells_header, bfile_cells, celltypes_header, bfile_celltypes, celldata_header, bfile_celldata, ">", "testb_unstructured.vtk", sep=" "))
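One possible way around the integer-only problem, sketched under the assumption that the reader expects 4-byte big-endian floats (the usual byte order for legacy binary VTK): writeBin() has a size argument, and a double written with size = 4 is stored as a single-precision float. The temporary file and variable names are illustrative only, and this was not tested against a VTK reader.

```r
# Write the value 3.3 as one 4-byte, big-endian float.
tmp <- tempfile(fileext = ".bin")
con <- file(tmp, "wb")
writeBin(3.3, con, size = 4, endian = "big")   # no storage.mode() change needed
close(con)

# Read it back the same way to check what was stored.
con  <- file(tmp, "rb")
back <- readBin(con, what = "numeric", n = 1, size = 4, endian = "big")
close(con)

n_bytes <- file.info(tmp)$size   # bytes on disk for the single value
unlink(tmp)
```

Setting storage.mode to 'integer' truncates 3.3 to 3 before anything is written, which is why only integers came out of the original script. The value read back differs from 3.3 in the last digits because single precision keeps only about seven significant digits.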
Re: [R] Plot with whispers
Thanks Matt,

I've been trying to get the data into a format that boxplot will accept, but I'm having trouble. If I read in my file directly

    data <- read.table("base100.log")
    plot(data)

it plots the Send data against the Receive data, using one as x and one as y. That's not too surprising, so I tried to plot just the Send data

    plot(data[1])

It plots all the points horizontally along the x axis, with no associated y data. This is very similar to what I would like to do: associate all 100 points with a y co-ordinate. I couldn't get this step to work though. I tried a number of things;

    ydata <- seq(length=100, from=1, by=0)
    p1 <- c(data[1], ydata)

seems to be close to what I want, but it isn't quite right.

Can anyone give me an idea how to associate the 100 data points with a y-coord, so that I can then use them in a boxplot/whiskerplot?

Thanks
ian

On 5 July 2010 12:31, Matt Shotwell shotw...@musc.edu wrote:
> It looks like read.table is reading the first line as a data value, which is the default for read.table. Try using read.table with the argument header=TRUE. Also, consider using a box and whiskers plot for these data (?boxplot, ?lattice::bwplot).
>
> -Matt
>
> On Mon, 2010-07-05 at 12:08 -0400, Ian Bentley wrote:
>> Hello!
>>
>> I need to make a plot with whiskers that does the following. Reads in 50 files, each file containing 200 data points. A file looks like this:
>>
>>     base100.log
>>     Send Receive
>>     10.5 100.3
>>     15.0 102.4
>>     ...
>>
>> There are 100 lines, each with two data points.
>>
>> I need to read in the 50 files, and plot three lines. The first line is the mean of the send column with whiskers indicating standard deviation (each file represents one data point). The second line is the mean of the receive column, as above. The final plot is the mean of the two summed, with whiskers as above. There will be 50 data points on the final graph, one for each file.
>>
>> I've done this sort of a thing before, but I really can't figure out how to handle the different columns. If I use read.table:
>>
>>     x1 <- read.table("updateToSink1010.log")
>>
>> then x1 becomes a matrix, with two columns and 101 rows, including Send, Receive.
>>
>> Anyways, I'd appreciate a push in some direction - hopefully the right one :).
>
> -- 
> Matthew S. Shotwell
> Graduate Student
> Division of Biostatistics and Epidemiology
> Medical University of South Carolina
> http://biostatmatt.com

-- 
Ian Bentley
M.Sc. Candidate
Queen's University
Kingston, Ontario
Re: [R] Plot with whispers
Hi:

This sounds like your standard error bar plot. Here's one way to get it, using lists, melt() from the reshape package and ggplot2.

    library(reshape)
    library(plyr)
    library(ggplot2)

    # Generate 50 fake data sets with 200 rows and variables send, receive:
    for(i in seq_len(50)) assign(paste('df', i, sep = ''),
        data.frame(send = rnorm(200, 10, 5), receive = rnorm(200, 105, 10)))

    # Generate a vector of data frame names
    dnames <- paste('df', 1:50, sep = '')

    # Create the function for processing one data frame. In this case, we
    # want to melt the data first so that the variable names become factor
    # levels and the data values are correspondingly stacked. We then use
    # ddply() from package plyr to produce the mean and standard deviation
    # from each variable.
    f <- function(df) {
        u <- melt(df)
        ddply(u, .(variable), summarise, m = mean(value), s = sd(value))
    }

    # Slurp the data frames into a list and then add send and receive together
    # This can probably be done without a loop using sapply, but the loop
    # should be about as fast.
    l <- vector('list', length(dnames))
    for(i in seq_along(dnames)) {
        l[[i]] <- get(dnames[[i]])
        l[[i]]$both = with(l[[i]], send + receive)
    }

    # Now, apply the function f to each data frame in the list, and rbind the
    # results together. Afterward, create dsn to distinguish the different
    # data frames (I chose to use the numbers only as they can be used
    # as the x-axis in the plots below.)
    out <- do.call(rbind, lapply(l, f))
    out$dsn <- rep(1:length(dnames), each = 3)

    # Create the error bar plots for each of the 50 data frames by each
    # variable, where the dot represents the mean and the ends of the
    # segments represent a 1 SD distance from the mean.
    p <- ggplot(out, aes(x = dsn, y = m, ymin = m - s, ymax = m + s))
    p + geom_point(size = 2) + geom_errorbar(width = 0) +
        facet_grid(variable ~ ., scales = 'free_y') + xlab('Data set number')

Substitute your actual data frames for the fake ones (in particular, redefine dnames) and you should be good to go if you like the plot.

HTH,
Dennis
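For readers without reshape, plyr or ggplot2, the same per-file summary can be sketched in base R. This is a swapped-in alternative, not Dennis's method; the simulated data frames stand in for the 50 log files, and the names logs, summarise_log and plot_whiskers are all hypothetical.

```r
set.seed(123)

# Stand-ins for the 50 log files (hypothetical data; with real files each
# element would come from read.table(path, header = TRUE) instead).
logs <- lapply(1:50, function(i)
  data.frame(Send = rnorm(100, 10, 5), Receive = rnorm(100, 105, 10)))

# One file -> mean and sd of Send, Receive and their row-wise sum.
summarise_log <- function(d) {
  d$Both <- d$Send + d$Receive
  c(mean = sapply(d, mean), sd = sapply(d, sd))
}

# One row per file; columns mean.Send, mean.Receive, mean.Both, sd.Send, ...
stats <- as.data.frame(t(sapply(logs, summarise_log)))

# Draw the means for one variable with whiskers of +/- 1 sd.
plot_whiskers <- function(stats, var) {
  m <- stats[[paste0("mean.", var)]]
  s <- stats[[paste0("sd.", var)]]
  x <- seq_along(m)
  plot(x, m, type = "b", ylim = range(m - s, m + s),
       xlab = "File number", ylab = paste("Mean of", var))
  arrows(x, m - s, x, m + s, angle = 90, code = 3, length = 0.03)
}
```

Calling plot_whiskers(stats, "Send"), plot_whiskers(stats, "Receive") and plot_whiskers(stats, "Both") would then give the three requested lines, one plot each.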
Re: [R] Some questions about R's modelling algebra
On Fri, Jul 2, 2010 at 8:16 AM, Hadley Wickham had...@rice.edu wrote:
>> ?formula in R 2.9.2 says in para 2: The %in% operator indicates that the terms on its left are nested within those on the right. For example a + b %in% a expands to the formula a + a:b.
>
> Ooops, missed that. So b %in% a = a:b, and that's what's meant by different coding.

Or would this be true only if b %in% a was preceded by a?

    attr(terms(y ~ B %in% A), 'term.labels')
    #[1] "B:A"
    attr(terms(y ~ B + B %in% A), 'term.labels')
    #[1] "B"   "B:A"
    attr(terms(y ~ A + B %in% A), 'term.labels')
    #[1] "A"   "A:B"

suggesting a documentation buglet in Sec 11.1 of An Introduction to R, where it states:

\begin{quote}
    y ~ A*B
    y ~ A + B + A:B
    y ~ B %in% A
    y ~ A/B
Two factor non-additive model of y on A and B. The first two specify the same crossed classification and the second two specify the same nested classification. In abstract terms all four specify the same model subspace.
\end{quote}

I think y ~ B %in% A should be changed to y ~ A + B %in% A, since

    attr(terms(y ~ A/B), 'term.labels')
    #[1] "A"   "A:B"

Or am I missing something?

Kingsford

> Hadley
>
> -- 
> Assistant Professor / Dobelman Family Junior Chair
> Department of Statistics / Rice University
> http://had.co.nz/
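The comparison in this message can be replayed with a small helper; no expected labels are shown for the %in% forms, since how terms() prints them may differ between R versions. The helper labels_of and the toy factors are assumptions. The rank check illustrates the "same model subspace" claim for the crossed and nested forms only.

```r
# Term labels of a formula, as used in the message above.
labels_of <- function(f) attr(terms(f), "term.labels")

crossed  <- labels_of(y ~ A * B)          # A, B and their interaction
nested   <- labels_of(y ~ A / B)          # A, then B within A
in_alone <- labels_of(y ~ B %in% A)       # inspect interactively
in_after <- labels_of(y ~ A + B %in% A)   # inspect interactively

# Toy balanced 2 x 3 layout to compare the column spaces of A*B and A/B.
d <- expand.grid(A = factor(1:2), B = factor(1:3), rep = 1:2)
rank_crossed <- qr(model.matrix(~ A * B, d))$rank
rank_nested  <- qr(model.matrix(~ A / B, d))$rank
```

Equal ranks here mean both codings span one column per cell of the layout, even though the individual coefficients are parameterised differently.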
[R] selection of optim parameters
Hi all,

I am trying to reproduce the results of a study using a different data set. I'm using about 450 observations. The code I've written seems to work well, but I have some trouble minimizing the negative of the log-likelihood function using 5 free parameters.

As starting values I am using the results of the paper I am reproducing. One evaluation of the function takes about 0.65 sec (measured with system.time). Since the free parameters should lie within some boundaries, I am using the following command:

    optim(fn=calculateLogLikelyhood, c(0.4, 2, 0.4, 8, 0.8), lower=c(0,0,0,0,0), upper=c(1, 5, Inf, Inf, 1), control=list(trace=1, maxit=1000))

Unfortunately the result doesn't seem to be reasonable. 3 of the optimized parameters are on the boundaries.

Unfortunately I don't have much experience using optimization methods. That's why I am asking you. Do you have any hints for me about what should be taken into account when doing such an optimization?

Is there a good way to implement the boundaries in the code (instead of imposing them while optimizing)? I've read about parscale in the help section. Unfortunately I don't really know how to use it. And anyway, could this help? What other points/controls should be taken into account?

I know that this is rather little information about my current code, but I don't know what you need in order to give me some advice. Just let me know what you need to know.

Thanks
[R] data.frame: adding a column that is based on ranges of values in another column
Dear List,

I've been looking tirelessly for a solution to this dilemma but without success. Perhaps someone has an idea that will guide me in the right direction.

Suppose I have the following data.frame:

    DF = data.frame(X = c(114.5508, 114.6468, 114.6596, 114.6957, 114.6828, 114.8903, 114.9519, 114.8842, 114.8579, 114.8489),
                    Y = c(47.14094, 46.98874, 46.91235, 46.88265, 46.80584, 46.67022, 46.53264, 46.47727, 46.46457, 46.47032),
                    Date = as.Date(c('2009-01-01', '2009-01-03', '2009-01-05', '2009-01-10', '2009-01-14', '2009-01-15', '2009-01-16', '2009-01-17', '2009-01-22', '2009-01-29')))

    DF
              X        Y       Date
    1  114.5508 47.14094 2009-01-01
    2  114.6468 46.98874 2009-01-03
    3  114.6596 46.91235 2009-01-05
    4  114.6957 46.88265 2009-01-10
    5  114.6828 46.80584 2009-01-14
    6  114.8903 46.67022 2009-01-15
    7  114.9519 46.53264 2009-01-16
    8  114.8842 46.47727 2009-01-17
    9  114.8579 46.46457 2009-01-22
    10 114.8489 46.47032 2009-01-29

I also have two objects that contain the dates of the first and last fortnight of the month of January 2009.

    s.d1 = '2009-01-01'
    e.d1 = '2009-01-14'
    f.n1 = seq(from = as.Date(s.d1), to = as.Date(e.d1), by = 1)
    f.n1
    [1] "2009-01-01" "2009-01-02" "2009-01-03" "2009-01-04" "2009-01-05" "2009-01-06" "2009-01-07" "2009-01-08" "2009-01-09" "2009-01-10" "2009-01-11" "2009-01-12" "2009-01-13" "2009-01-14"

    s.d2 = '2009-01-15'
    e.d2 = '2009-01-31'
    f.n2 = seq(from = as.Date(s.d2), to = as.Date(e.d2), by = 1)
    f.n2
    [1] "2009-01-15" "2009-01-16" "2009-01-17" "2009-01-18" "2009-01-19" "2009-01-20" "2009-01-21" "2009-01-22" "2009-01-23" "2009-01-24" "2009-01-25" "2009-01-26" "2009-01-27" "2009-01-28" "2009-01-29" "2009-01-30" "2009-01-31"

I'm trying to add a column called Fortnight to the existing data.frame. The components of the new Fortnight column are based on the existing Date column so that if the value in Date falls within the first fortnight (f.n1) then the value of the new Fortnight column would be FN1, and if the value of the Date column falls within the second fortnight (f.n2), then the value of the Fortnight column would be FN2, and so on.

The end result should look like:

              X        Y       Date Fortnight
    1  114.5508 47.14094 2009-01-01       FN1
    2  114.6468 46.98874 2009-01-03       FN1
    3  114.6596 46.91235 2009-01-05       FN1
    4  114.6957 46.88265 2009-01-10       FN1
    5  114.6828 46.80584 2009-01-14       FN1
    6  114.8903 46.67022 2009-01-15       FN2
    7  114.9519 46.53264 2009-01-16       FN2
    8  114.8842 46.47727 2009-01-17       FN2
    9  114.8579 46.46457 2009-01-22       FN2
    10 114.8489 46.47032 2009-01-29       FN2

I manually entered the above values for the Fortnight column to illustrate my point; however, that would be quite tiresome for 500+ rows of data ;-)

The only other similar issue I found on the list was https://stat.ethz.ch/pipermail/r-help/2008-February/153995.html but that particular problem is slightly different from what I'm trying to accomplish here.

I appreciate your time and assistance. Thanks in advance.

Regards,
Hakim Abdi

_
Abdulhakim Abdi, M.Sc.
Research Intern
Conservation GIS/Remote Sensing Lab
Smithsonian Conservation Biology Institute
1500 Remount Road
Front Royal, VA 22630
phone: +1 540 635 6578
mobile: +1 747 224 7006
fax: +1 540 635 6506 (Attn:GIS Lab)
email: ab...@si.edu
http://nationalzoo.si.edu/SCBI/ConservationGIS/
Re: [R] Can anybody help me understand AIC and BIC and devise a new metric?
Hi:

On Mon, Jul 5, 2010 at 7:35 AM, LosemindL comtech@gmail.com wrote:
> Hi all,
>
> Could anybody please help me understand AIC and BIC and especially why do they make sense?

Any good text that discusses model selection in detail will have some discussion of AIC and BIC. Frank Harrell's book 'Regression Modeling Strategies' comes immediately to mind, along with Hastie, Tibshirani and Friedman (Elements of Statistical Learning) and Burnham and Anderson's book (Model Selection and Multi-Model Inference), but there are many other worthy texts that cover the topic. The gist is that AIC and BIC penalize the log likelihood of a model by subtracting different functions of its number of parameters. David's suggestion of Wikipedia is also on target.

> Furthermore, I am trying to devise a new metric related to the model selection in the financial asset management industry. As you know the industry uses Sharpe Ratio as the main performance benchmark, which is the annualized mean of returns divided by the annualized standard deviation of returns.

I didn't know, but thank you for the information. Isn't this simply a signal-to-noise ratio quantified on an annual basis?

> In model selection, we would like to choose a model that yields the highest Sharpe Ratio. However, the more parameters you use, the higher Sharpe Ratio you might potentially get, and the higher risk that your model is overfitted. I am trying to think of a AIC or BIC version of the Sharpe Ratio that facilitates the model selection...

You might be able to make some progress if you can express the (penalized) log likelihood as a function of the Sharpe ratio. But if you have several years of data in your model and the ratio is computed annually, then isn't it a random variable rather than a parameter? If so, it changes the nature of the problem, no? (Being unfamiliar with the Sharpe ratio, I fully recognize that I may be completely off-base in this suggestion, but I'll put it out there anyway :)

BTW, you might find the R-sig-finance list to be a more productive resource in this problem than R-help due to the specialized nature of the question.

HTH,
Dennis

> Anybody could you please give me some pointers? Thanks a lot!
Re: [R] Data Labels in a barchart (Lattice or otherwise)
Thank You David.

Yes, I am using the lattice barchart and have managed to add data labels; however, they tend to sit on the tip of each bar and are difficult to read as they are partially on the bar. Any help would be greatly appreciated.

This is the code I am using:

    levels(PR_SUMMARY$Bucket) = c("0-3 months", "3-9 months", "9-15 months", "15-18 months")
    barchart(PrimaryReason ~ cInteractions | Bucket + Type, data = PR_SUMMARY,
        layout = c(4, 2), col = "lightgreen",
        main = "COMPARISON - PRIMARY REASON", sub = "L R",
        xlab = "Number of Customers", ylab = "Primary Reasons",
        auto.key = list(title = "COMPARISON - PRIMARY REASON", columns = 2,
                        points = FALSE, rectangles = TRUE, space = "right"),
        scales = list(x = list(abbreviate = TRUE, minlength = 5, rot = 45)),
        panel = function(x, y, subscripts, groups, ...) {
            panel.barchart(x, y, ...)
            ltext(x, y, label = round(PR_SUMMARY$cInteractions, 1), cex = .99, rot = 45)
            border = "transparent"
        })

I don't really understand the ltext part; I found it in some other code, but it works.

Thanks again,
Raoul
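One way the labels might be moved clear of the bars, sketched on made-up data: label each panel from its own x values rather than from the whole data frame, and use pos = 4 so the text starts to the right of the bar end. The data frame d and its columns are hypothetical stand-ins for PR_SUMMARY, and the extra room in xlim is a guess.

```r
library(lattice)

# Hypothetical stand-in for PR_SUMMARY.
d <- data.frame(
  reason = factor(rep(c("Billing", "Coverage", "Price"), times = 2)),
  bucket = factor(rep(c("0-3 months", "3-9 months"), each = 3)),
  count  = c(12.4, 30.1, 7.8, 20.5, 15.2, 9.9)
)

p <- barchart(reason ~ count | bucket, data = d,
              col = "lightgreen", origin = 0,
              xlim = c(0, max(d$count) * 1.2),   # leave room for the labels
              xlab = "Number of Customers",
              panel = function(x, y, ...) {
                panel.barchart(x, y, ...)
                # x and y hold only this panel's data; pos = 4 puts the
                # label to the right of the point (x, y).
                panel.text(x, as.numeric(y), labels = round(x, 1),
                           pos = 4, cex = 0.8)
              })
```

Printing p draws the chart. In the posted code, ltext() is panel.text() under another name, and it received the full cInteractions column in every panel, which is worth checking if labels look mismatched.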
[R] How to determine if R is 64 bit compiled under Unix-alike?
Under MacOS I had an R64 executable, so it was clear. Under Ubuntu, on which I do not have administrative rights, there is only an R executable. It seems that I can allocate more than 3GB of memory; however, not everything seems to work the same way as with R64 under MacOS.

Pms.
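A quick check that needs no administrative rights, sketched from within R itself: the pointer size of the running build is 8 bytes on a 64-bit build and 4 on a 32-bit one. The variable names are illustrative.

```r
# Size of a pointer in the running R build, in bytes.
ptr_bytes <- .Machine$sizeof.pointer

# TRUE on a 64-bit build of R.
is_64bit <- ptr_bytes == 8

# Architecture string the build was configured for (for display).
arch <- R.version$arch
```

Typing is_64bit and arch at the prompt shows the result; the architecture string is typically something like x86_64 on 64-bit Linux builds.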
Re: [R] unable to get bigglm working, ATTN: Thomas Lumley
I decided to give it one more variable, which is strongly significant, to help the optimization, and it throws:

    > bigglm(formula = resp ~ relage+relage2+termfac+ri+sn,
    +        data = a, family = binomial(link='logit'));
    Error in bigglm.function(formula = resp ~ relage + relage2 + termfac + :
      model matrices incompatible
Re: [R] selection of optim parameters
On Mon, 5 Jul 2010, Fabian Gehring wrote:
> Hi all,
>
> I am trying to rebuild the results of a study using a different data set. I'm using about 450 observations. The code I've written seems to work well, but I have some troubles minimizing the negative of the LogLikelyhood function using 5 free parameters.
>
> As starting values I am using the result of the paper I am rebuilding. The system.time of the calculation of the function is about 0.65 sec. Since the free parameters should be within some boundaries I am using the following command:
>
>     optim(fn=calculateLogLikelyhood, c(0.4, 2, 0.4, 8, 0.8), lower=c(0,0,0,0,0), upper=c(1, 5, Inf, Inf, 1), control=list(trace=1, maxit=1000))
>
> Unfortunately the result doesn't seem to be reasonable. 3 of the optimized parameters are on the boundaries.

You haven't said why this is unreasonable.

A suggestion: Profile the loglikelihood around the starting value and around the putative maximum. (This might help with the parscale issue.)

Also, you might try something like

    apply(rbind(0, eps*as.matrix(expand.grid(rep(list(c(-1,0,1)),5)))), 1,
          function(x) calculateLogLikelyhood( x + y ) )

where y is the starting value (or the value achieved by optim()) and 'eps' is small enough to make small changes in the function value. This might help you see what gives. (It might be necessary to scale each column in as.matrix(...) separately, 'though.)

In addition to inspecting the results by eyeball, you can fit the results to a quadratic form in rbind(...) using lm() and then figure out roughly where to go to find the maximum of your function.

If this isn't enough to get you started, at least it might enable you to say more clearly what is not reasonable about your results.

HTH,
Chuck

Charles C. Berry, (858) 534-2098
Dept of Family/Preventive Medicine, UC San Diego
La Jolla, San Diego 92093-0901
E mailto:cbe...@tajo.ucsd.edu
http://famprevmed.ucsd.edu/faculty/cberry/
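On the two open questions in this thread (how to use parscale, and how to build the bounds into the code), here is a hedged sketch on a toy problem. A normal likelihood with a mean and a standard deviation stands in for calculateLogLikelyhood, which was never posted; negll, negll_u and the simulated data are assumptions.

```r
set.seed(42)
x <- rnorm(450, mean = 3, sd = 2)   # simulated data, 450 observations

# Toy negative log-likelihood; p = c(mean, sd).
negll <- function(p) -sum(dnorm(x, mean = p[1], sd = p[2], log = TRUE))

# (a) Box constraints handled by the optimizer. Bounds require
#     method = "L-BFGS-B". parscale gives the rough magnitude of each
#     parameter so that the optimizer works on par / parscale.
fit_box <- optim(c(1, 1), negll, method = "L-BFGS-B",
                 lower = c(-Inf, 0.01), upper = c(Inf, Inf),
                 control = list(parscale = c(3, 2), maxit = 1000))

# (b) Bounds built into the code: optimise an unconstrained q and map it
#     into the allowed range, here sd = exp(q[2]) > 0. For a parameter
#     confined to (0, 1), plogis(q) would play the same role.
negll_u <- function(q) negll(c(q[1], exp(q[2])))
fit_u   <- optim(c(1, 0), negll_u, method = "BFGS")
est_u   <- c(fit_u$par[1], exp(fit_u$par[2]))

# Closed-form maximum likelihood estimates for comparison.
mle <- c(mean(x), sqrt(mean((x - mean(x))^2)))
```

With a transformation as in (b) the estimates can approach a boundary but never sit exactly on it, so estimates that still drift toward a limit are a hint that the likelihood really does favour the edge of the allowed region.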
Re: [R] Can anybody help me understand AIC and BIC and devise a new metric?
You should have a look at:

    Model Selection and Model Averaging
    Gerda Claeskens, K.U. Leuven
    Nils Lid Hjort, University of Oslo

Among other things, this will explain that AIC and BIC really aim at different goals.
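To make the penalty idea from this thread concrete, a small sketch that computes both criteria by hand and can be compared against R's AIC() and BIC(). The models on the built-in mtcars data are arbitrary illustrations, and ic_by_hand is a hypothetical helper.

```r
# Two nested linear models on a built-in data set (arbitrary choice).
fit_small <- lm(mpg ~ wt, data = mtcars)
fit_big   <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

# AIC = -2 logLik + 2 k,  BIC = -2 logLik + log(n) k
ic_by_hand <- function(fit) {
  ll <- logLik(fit)
  k  <- attr(ll, "df")   # estimated parameters, including the error variance
  n  <- nobs(fit)
  c(AIC = -2 * as.numeric(ll) + 2 * k,
    BIC = -2 * as.numeric(ll) + log(n) * k)
}

ic_small <- ic_by_hand(fit_small)
ic_big   <- ic_by_hand(fit_big)
```

Both criteria start from the same log likelihood and differ only in the per-parameter charge, 2 for AIC and log(n) for BIC, so BIC penalises extra parameters more heavily once n exceeds about 7.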
Re: [R] data.frame: adding a column that is based on ranges of values in another column
use 'merge':

    > DF = data.frame(X = c(114.5508, 114.6468, 114.6596, 114.6957, 114.6828, 114.8903, 114.9519, 114.8842,
    +     114.8579, 114.8489), Y = c(47.14094, 46.98874, 46.91235, 46.88265, 46.80584, 46.67022, 46.53264, 46.47727,
    +     46.46457, 46.47032), Date = as.Date(c('2009-01-01', '2009-01-03', '2009-01-05', '2009-01-10', '2009-01-14',
    +     '2009-01-15', '2009-01-16', '2009-01-17', '2009-01-22', '2009-01-29')))
    > s.d1 = '2009-01-01'
    > e.d1 = '2009-01-14'
    > f.n1 = seq(from = as.Date(s.d1), to = as.Date(e.d1), by = 1)
    > s.d2 = '2009-01-15'
    > e.d2 = '2009-01-31'
    > f.n2 = seq(from = as.Date(s.d2), to = as.Date(e.d2), by = 1)
    > x.new <- data.frame(Date=c(f.n1, f.n2),
    +     Fortnight=c(rep("FN1", length(f.n1)), rep("FN2", length(f.n2))))
    > merge(DF, x.new, all.x=TRUE)
             Date        X        Y Fortnight
    1  2009-01-01 114.5508 47.14094       FN1
    2  2009-01-03 114.6468 46.98874       FN1
    3  2009-01-05 114.6596 46.91235       FN1
    4  2009-01-10 114.6957 46.88265       FN1
    5  2009-01-14 114.6828 46.80584       FN1
    6  2009-01-15 114.8903 46.67022       FN2
    7  2009-01-16 114.9519 46.53264       FN2
    8  2009-01-17 114.8842 46.47727       FN2
    9  2009-01-22 114.8579 46.46457       FN2
    10 2009-01-29 114.8489 46.47032       FN2
Re: [R] data.frame: adding a column that is based on ranges of values in another column
Hi:

Since you've been looking tirelessly :)

For your stated problem, the following will work:

    DF$Fortnight <- with(DF, ifelse(Date %in% f.n1, 'FN1',
                             ifelse(Date %in% f.n2, 'FN2', 'FN3')))

However, if you have a number of fortnights (perhaps stretching over several years), you need a different approach. The idea is to use the last day of 2008 as the origin in your example and compute the number of days past it for each observation. We then integer divide the number of days past the origin by 14 and add one to get the fortnight number.

    origin <- as.Date('2008-12-31')           # set origin
    DF$doy <- DF$Date - origin                # days past origin
    DF$FN <- 1 + as.numeric(DF$doy) %/% 14    # fortnight number past origin
    DF

              X        Y       Date Fortnight     doy FN
    1  114.5508 47.14094 2009-01-01       FN1  1 days  1
    2  114.6468 46.98874 2009-01-03       FN1  3 days  1
    3  114.6596 46.91235 2009-01-05       FN1  5 days  1
    4  114.6957 46.88265 2009-01-10       FN1 10 days  1
    5  114.6828 46.80584 2009-01-14       FN1 14 days  2
    6  114.8903 46.67022 2009-01-15       FN2 15 days  2
    7  114.9519 46.53264 2009-01-16       FN2 16 days  2
    8  114.8842 46.47727 2009-01-17       FN2 17 days  2
    9  114.8579 46.46457 2009-01-22       FN2 22 days  2
    10 114.8489 46.47032 2009-01-29       FN3 29 days  3

This should be less tedious than the ifelse() approach.

HTH,
Dennis

On Mon, Jul 5, 2010 at 1:01 PM, Abdi, Abdulhakim ab...@si.edu wrote:

> Dear List, I've been looking tirelessly for a solution to this dilemma but without success. Perhaps someone has an idea that will guide me in the right direction.
> [snip]
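A hedged aside on the day-counting idea in this reply: with the origin at 2008-12-31, 14 January is day 14, and 14 %/% 14 is 1, so that date lands in fortnight 2, as the printed FN column shows. If the fortnights are defined by known start dates instead (as in the original question, where the second one runs from the 15th to the 31st), a minimal sketch using base R's findInterval() could look like this. The vector `starts` and the label format are illustrative assumptions, not something from the thread.

```r
# Dates taken from the thread's example data
dates <- as.Date(c('2009-01-01', '2009-01-03', '2009-01-05', '2009-01-10',
                   '2009-01-14', '2009-01-15', '2009-01-16', '2009-01-17',
                   '2009-01-22', '2009-01-29'))

# Assumed start date of each fortnight (hypothetical; must be sorted, extend as needed)
starts <- as.Date(c('2009-01-01', '2009-01-15'))

# For each date, findInterval() gives the index of the last start date <= that date
idx <- findInterval(as.numeric(dates), as.numeric(starts))
fortnight <- paste0('FN', idx)

print(data.frame(Date = dates, Fortnight = fortnight))
```

A date earlier than the first entry of `starts` would get index 0 here, so real data may need a check for that case.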
[R] nested for loops
Dear Admin,

I would appreciate it if you could advise me on an effective way to write the following R code, which consists of nested for loops. I cannot do it with the expand.grid function because that results in memory allocation problems. Thanks for your time and consideration.

    for(d1 in 0:n){ for(d2 in 0:n){ for(d3 in 0:n){ for(d4 in 0:n){ for(d5 in 0:n){
    for(d6 in 0:n){ for(d7 in 0:n){ for(d8 in 0:n){ for(d9 in 0:n){ for(d10 in 0:n){
    for(d11 in 0:n){ for(d12 in 0:n){ for(d13 in 0:n){ for(d14 in 0:n){ for(d15 in 0:n){
    for(d16 in 0:n){ for(d17 in 0:n){ for(d18 in 0:n){ for(d19 in 0:n){ for(d20 in 0:n){
      list=c(d1,d2,d3,d4,d5,d6,d7,d8,d9,d10,d11,d12,d13,d14,d15,d16,d17,d18,d19,d20)
    }}}}}}}}}}}}}}}}}}}}
Re: [R] How to determine if R is 64 bit compiled under Unix-alike?
On Mon, 2010-07-05 at 19:25 +0200, Przemek Grabowicz wrote:

> Under MacOS I had an R64 executable and it was clear. Under Ubuntu, where I do not have administrative rights, there is only an R executable. It seems that I can allocate more than 3GB of memory, however not everything seems to work the same/right as with R64 under MacOS.
> Pms.

Type

    .Machine$sizeof.pointer

If the response is 8, your R is 64-bit.

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil
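The check above can be wrapped in a couple of lines. This is only a small sketch; the variable name `ptr_bytes` is introduced here for illustration, and R.version$arch is shown as a second, independent hint about the build.

```r
# Pointer size in bytes: 8 on a 64-bit build of R, 4 on a 32-bit build
ptr_bytes <- .Machine$sizeof.pointer
cat("This R build is", 8 * ptr_bytes, "bit\n")

# Architecture string that R reports for this build (for example "x86_64")
cat("Architecture:", R.version$arch, "\n")
```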
Re: [R] To detect the location of duplicate values
Charles C. Berry cberry at tajo.ucsd.edu writes:

On Mon, 5 Jul 2010, Moohwan Kim wrote:

Dear R family, I have a question about how to detect some duplicate numeric observations. Suppose that I have a dataset with two variables.

    order value
        1  0.52
        2  0.23
        3  0.43
        4  0.21
        5  0.32
        6  0.32
        7  0.32
        8  0.32
        9  0.32
       10  0.12
       11  0.46
       12  0.09
       13  0.32
       14  0.25

Could you help me indicate where the duplicate observations in a row (e.g., 0.32) are?

I see you already have replies about duplicated() and unique(), which are very handy for the 'detect' part of your query. But to list the locations of the duplicated elements, you might also benefit from using split() and Filter() like this:

    Filter( function(x) length(x) > 1, split(order, value) )
    $`0.32`
    [1]  5  6  7  8  9 13

Mark Leeds kindly pointed out (in private correspondence) that this needs a bit more explanation. If the above 'dataset' is in fact a data.frame called 'dat' then either

    attach(dat)
    Filter( function(x) length(x) > 1, split(order, value) )

or

    Filter( function(x) length(x) > 1, split(dat$order, dat$value) )

or

    with( dat, Filter( function(x) length(x) > 1, split(order, value) ) )

should do it. Thanks Mark!

[snip]
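For anyone who wants to run the split()/Filter() idea end to end, here is a self-contained sketch. The data frame name `dat` follows the message above; the result name `dups` is an assumption made for this example.

```r
# The example data from the question, as a data frame
dat <- data.frame(
  order = 1:14,
  value = c(0.52, 0.23, 0.43, 0.21, 0.32, 0.32, 0.32,
            0.32, 0.32, 0.12, 0.46, 0.09, 0.32, 0.25)
)

# split() groups the positions by value; Filter() keeps only the groups
# that contain more than one position, i.e. values that occur repeatedly
dups <- with(dat, Filter(function(x) length(x) > 1, split(order, value)))
print(dups)
```

Because split() groups on the printed form of each number, values that differ only by floating-point noise may need rounding first.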
[R] Function to compute the multinomial beta function?
Dear R-users,

Is there an R function to compute the multinomial beta function? That is, the normalizing constant that arises in a Dirichlet distribution. For example, with three parameters the beta function is

    Beta(n1,n2,n3) = Gamma(n1)*Gamma(n2)*Gamma(n3)/Gamma(n1+n2+n3)

Thanks in advance for any assistance.

Regards,
Greg
Re: [R] nested for loops
On 05/07/10 23:06, Senay ASMA wrote:

> Dear Admin, I will appreciate if you advise me an effective way to write the following R code including nested for loops. I cannot do it by using expand.grid function because it results with memory allocation problems. Thanks for your time and consideration.
>
> for(d1 in 0:n){ for(d2 in 0:n){ for(d3 in 0:n){
> [snip]
> for(d20 in 0:n){
>   list=c(d1,d2,d3,d4,d5,d6,d7,d8,d9,d10,d11,d12,d13,d14,d15,d16,d17,d18,d19,d20)

Probably not what you want, but this should replicate the same effect as the code you posted:

    list <- rep( n, 20 )

Romain

-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/98Uf7u : Rcpp 0.8.1
|- http://bit.ly/c6YnCi : graph gallery collage
`- http://bit.ly/bZ7ltC : inline 0.3.5
Re: [R] How to determine if R is 64 bit compiled under Unix-alike?
On Mon, 2010-07-05 at 19:25 +0200, Przemek Grabowicz wrote:

> Under MacOS I had an R64 executable and it was clear. Under Ubuntu, where I do not have administrative rights, there is only an R executable. It seems that I can allocate more than 3GB of memory, however not everything seems to work the same/right as with R64 under MacOS.

If you can locate the R executable then the 'file' command will tell you right away. On my system (Gentoo), the R command is actually a shell script that sets a number of environment variables, etc. and then calls the actual R executable, which is /usr/lib64/R/bin/exec/R (I don't know if this is the same in Ubuntu). file then gives:

    file /usr/lib64/R/bin/exec/R
    /usr/lib64/R/bin/exec/R: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.9, stripped

-- 
Stuart Luppescu -*-*- slu at ccsr dot uchicago dot edu
CCSR in UEI at U of C
[R] to remove duplicate values
Dear R family,

Suppose I have two series.

    order value
        1  0.52
        2  0.23
        3  0.43
        4  0.21
        5  0.32
        6  0.32
        7  0.32
        8  0.32
        9  0.32
       10  0.12
       11  0.46
       12  0.09
       13  0.32
       14  0.25

For these two series, I figured out how to detect the locations of duplicate values. The next thing to do is to remove the repeated values, except for a repeat that is not adjacent to the others. In other words, while keeping the 13th value, I want to remove the 6th to 9th observations. That is my end goal. Could you help me reach it?

best
moohwan
Re: [R] How to determine if R is 64 bit compiled under Unix-alike?
On 07/05/2010 10:52 PM, Marcin Jaworski wrote:

> Try:
>
>     .Machine$sizeof.pointer
>
> If you get 8, you are running 64-bit R. If you get 4, your R is a 32-bit one.

I got 8, so it should be 64-bit. But I have problems with one package; could it be that the package is 32-bit? It was installed using:

    R CMD INSTALL foobar.tar.gz

On MacOS, using

    R64 CMD INSTALL foobar.tar.gz

gave the proper effect. But here on Ubuntu it seems that objects from that package are not able to load much data.
[R] Memory problem in multinomial logistic regression
Dear All,

I am trying to fit a multinomial logistic regression to a data set with a size of 94279 by 14 entries. The data frame has one sample column which is the categorical variable, and the number of different categories is 9. The size of the data set (as a csv file) is less than 10 MB.

I tried to fit a multinomial logistic regression, either using vglm() from the VGAM package or mlogit() from the mlogit package. In both cases the estimation crashes because I do not have enough memory, although the free memory before starting the regression is more than 2GB. The regression functions eat up all of my memory.

Does anyone know why this relatively small data set leads to memory problems, and how I could work around my problem?

thank you for your help,
Daniel
Re: [R] Function to compute the multinomial beta function?
How about this?

    mbeta <- function(...) {
      exp(sum(lgamma(c(...))) - lgamma(sum(c(...))))
    }

    gamma(5)*gamma(6)*gamma(7)/gamma(18)
    [1] 5.829838e-09
    mbeta(5,6,7)
    [1] 5.829838e-09

On Mon, 2010-07-05 at 17:10 -0400, Gregory Gentlemen wrote:

> Dear R-users, Is there an R function to compute the multinomial beta function? That is, the normalizing constant that arises in a Dirichlet distribution. For example, with three parameters the beta function is
>
>     Beta(n1,n2,n3) = Gamma(n1)*Gamma(n2)*Gamma(n3)/Gamma(n1+n2+n3)
>
> Thanks in advance for any assistance. Regards, Greg

-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com
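One way to gain some confidence in a function like this is to compare it with base R's two-parameter beta() and with the direct gamma() formula from the question. The sketch below re-implements the log-gamma idea; the helper name `lmbeta` (the log of the multinomial beta) is introduced here and is not from the thread. Staying on the log scale can help when the parameters are large enough for the plain value to underflow.

```r
# Log of the multinomial beta function, computed entirely with lgamma()
lmbeta <- function(...) {
  a <- c(...)
  sum(lgamma(a)) - lgamma(sum(a))
}

# The multinomial beta itself, recovered by exponentiating
mbeta <- function(...) exp(lmbeta(...))

# With two parameters the result should agree with base R's beta()
print(all.equal(mbeta(2, 3), beta(2, 3)))

# With three parameters it should agree with the direct gamma() formula
print(all.equal(mbeta(5, 6, 7), gamma(5) * gamma(6) * gamma(7) / gamma(18)))
```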
Re: [R] How to determine if R is 64 bit compiled under Unix-alike?
On Mon, 2010-07-05 at 23:05 +0200, Przemek Grabowicz wrote:

> On 07/05/2010 10:52 PM, Marcin Jaworski wrote:
> > Try: .Machine$sizeof.pointer If you get 8, you are running 64-bit R. If you get 4, your R is a 32-bit one.
>
> I got 8, so it should be 64-bit. But I have problems with one package; could it be that the package is 32-bit? It was installed using: R CMD INSTALL foobar.tar.gz On MacOS, using R64 CMD INSTALL foobar.tar.gz gave the proper effect. But here on Ubuntu it seems that objects from that package are not able to load much data.

I think when you install a package, the source files are compiled using the development tools on your system. (I don't know -- are there any binary packages for Linux?) Do you have the necessary C and Fortran compilers on your system? If you can find the object files, you can test them with file as before, for example:

    file /usr/lib64/R/library/MASS/libs/MASS.so
    /usr/lib64/R/library/MASS/libs/MASS.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, stripped

-- 
Stuart Luppescu -=- slu .at. ccsr.uchicago.edu
University of Chicago -=- CCSR
Father of 才文 and 智奈美 -=- Kernel 2.6.31-gentoo-r6

Andrew Thomas: ...and if something goes wrong here it is probably not WinBUGS since that has been running for more than 10 years...
Peter Green (from the back): ... and it still hasn't converged!
  -- Andrew Thomas and Peter Green (during the talk about 'BRugs') gR 2003, Aalborg (September 2003)
Re: [R] nested for loops
What do you want to do with the data being generated? The loop you have will just leave you with the last value generated. Let me ask my favorite question: what is the problem you are trying to solve?

If you get a memory problem with expand.grid, then you will have the same problem if you try to store the values from the 'for' loop. How big is 'n'? If it is 2 (so each index takes the three values 0:2), you will have this many combinations:

    3^20
    [1] 3486784401

What are you going to do with them?

On Mon, Jul 5, 2010 at 5:06 PM, Senay ASMA senaya...@gmail.com wrote:

> Dear Admin, I will appreciate if you advise me an effective way to write the following R code including nested for loops. I cannot do it by using expand.grid function because it results with memory allocation problems. Thanks for your time and consideration.
>
> for(d1 in 0:n){ for(d2 in 0:n){ for(d3 in 0:n){
> [snip]
> for(d20 in 0:n){
>   list=c(d1,d2,d3,d4,d5,d6,d7,d8,d9,d10,d11,d12,d13,d14,d15,d16,d17,d18,d19,d20)

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
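If the combinations only need to be visited one at a time rather than stored, one possible sketch is to replace the twenty nested loops with a single counter that is decoded into digits in base n + 1. The helper `digits_of` and the toy sizes below are assumptions made for illustration. This avoids the memory cost of expand.grid, but the number of iterations is still (n + 1)^20 for the real problem, so running time remains the limiting factor.

```r
# Decode a 0-based counter k into `len` digits in base (n + 1),
# so that every digit takes a value in 0:n
digits_of <- function(k, n, len) {
  out <- numeric(len)
  for (i in len:1) {
    out[i] <- k %% (n + 1)
    k <- k %/% (n + 1)
  }
  out
}

n <- 2
len <- 3                 # toy size; the thread's loops correspond to len = 20
total <- (n + 1)^len     # 27 combinations for these toy values

running_sum <- 0
k <- 0
while (k < total) {
  d <- digits_of(k, n, len)             # plays the role of c(d1, d2, d3)
  running_sum <- running_sum + sum(d)   # placeholder for the real per-combination work
  k <- k + 1
}

print(digits_of(5, n, len))
print(running_sum)
```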
Re: [R] to remove duplicate values
Some further tricks will (probably) lead you to your goal. I suppose you use duplicated() or something similar to get an array of locations of the duplicated values:

    pos.dup <- which(duplicated(value))

then do

    diff.pos.dup <- diff(pos.dup)

and you get the indices to delete:

    pos.delete <- order[diff.pos.dup[which(diff.pos.dup==1)]]

I leave some tweaking to you, as you may have to adjust some indices slightly by adding or subtracting 1 (I am never exactly sure how this diff() function turns out).

HTH
Jannis

Moohwan Kim wrote:

> Dear R family, Suppose I have two series.
> [snip]
> For these two series, I figured out the way to detect the locations of duplicate values. The next thing to do is remove the repeated values except for a value that would not be next to each other. In other words, while keeping the 13th value, I want to remove observations from 6th to 9th. That is my end goal. Could you help me reach the goal?
> best moohwan
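As a hedged alternative to adjusting the indices by hand, the stated goal (drop a value only when it repeats the value immediately before it) can be sketched by applying diff() to the values themselves rather than to the duplicate positions. The variable `keep` is a name introduced here; `order` and `value` follow the question.

```r
value <- c(0.52, 0.23, 0.43, 0.21, 0.32, 0.32, 0.32,
           0.32, 0.32, 0.12, 0.46, 0.09, 0.32, 0.25)
order <- seq_along(value)

# TRUE where an element differs from the one just before it;
# the first element has no predecessor and is always kept
keep <- c(TRUE, diff(value) != 0)

print(order[keep])
print(value[keep])
```

Observations 6 to 9 are dropped because each repeats its predecessor, while observation 13 is kept because the value before it is different. For measured data, an exact comparison with 0 may need to be replaced by a small tolerance.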
Re: [R] Data Labels in a barchart (Lattice or otherwise)
On Jul 5, 2010, at 1:14 PM, RaoulD wrote:

> Thank You David. Yes, I am using the lattice barchart and have managed to add data labels; however, they tend to be on the tip of each bar and are difficult to read as they are partially on the bar. Any help would be greatly appreciated.
>
> This is the code I am using:
>
> levels(PR_SUMMARY$Bucket)=c("0-3 months","3-9 months","9-15 months","15-18 months")
> barchart(PrimaryReason ~ cInteractions | Bucket + Type, data = PR_SUMMARY,
>   layout = c(4, 2), col="lightgreen", main="COMPARISON - PRIMARY REASON",
>   sub="L R", xlab="Number of Customers", ylab="Primary Reasons",
>   auto.key = list(title = "COMPARISON - PRIMARY REASON", columns=2,
>     points = FALSE, rectangles = TRUE, space= "right"),
>   scales = list(x = list(abbreviate=TRUE, minlength=5, rot=45)),
>   panel = function(x,y,subscripts,groups,...){
>     panel.barchart(x,y,...)
>     ltext(x,y,label=round(PR_SUMMARY$cInteractions,1), cex=.99,rot=45)

# if you add or subtract a small amount from y in the prior line it will move the labels up or down.

>     border="transparent"})
>
> I don't really understand the ltext part and found it with some other code, but it works.
> Thanks again, Raoul

-- 
David Winsemius, MD
West Hartford, CT
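A minimal sketch of moving the labels off the bar tips, assuming horizontal bars as in the formula above. The data frame `d` is toy data standing in for PR_SUMMARY, and its values are made up. The sketch labels each bar with the panel's own x values, so each panel shows only its own numbers, and uses the pos and offset arguments of lattice's panel.text() to place the text just past the end of the bar.

```r
library(lattice)

# Toy data standing in for PR_SUMMARY (hypothetical values)
d <- data.frame(
  reason = factor(rep(c("Price", "Service", "Other"), times = 2)),
  count  = c(12.4, 30.1, 7.8, 18.0, 22.5, 9.3),
  bucket = factor(rep(c("0-3 months", "3-9 months"), each = 3))
)

p <- barchart(reason ~ count | bucket, data = d,
              xlim = c(0, 1.2 * max(d$count)),   # leave room for the labels
              panel = function(x, y, ...) {
                panel.barchart(x, y, ...)
                # x and y are already this panel's subset of the data;
                # pos = 4 draws each label to the right of the bar end
                panel.text(x, y, labels = round(x, 1),
                           pos = 4, offset = 0.3, cex = 0.8)
              })
print(p)
```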
Re: [R] calculation on series with different time-steps
Your question does not seem precise enough for us to provide help. Do you need help with the if() syntax? If yes, I would advise you to read an introductory R tutorial (like "An Introduction to R", a PDF freely available on the net) or a decent textbook. One quick hint for the correct syntax:

    |   denotes OR
    &   denotes AND
    ==  is what you need to write for the logical equals

If you do not know how to write the results to a data frame, you should really invest the time to work through a basic tutorial and understand the basics of R.

By the way, if your time series is ordered in the way you describe, a more elegant way would be to create a series consisting of the pressure value belonging to each entry in the stream stage vector (by repeating each single value of pressure 12 times, see ?rep), and then just subtract the two.

HTH
Jannis

Jeana Lee wrote:

> Hello,
>
> I have two series, one with stream stage measurements every 5 minutes, and the other with barometric pressure measurements every hour. I want to subtract each barometric pressure measurement from the 12 stage measurements closest in time to it (6 stage measurements on either side of the hour).
>
> I want to do something like the following, but I don't know the syntax.
>
> If the Julian day of the stage measurement is equal to the Julian day of the pressure measurement, AND the absolute value of the difference between the time of the stage measurement and the hour of the pressure measurement is less than or equal to 30 minutes, then subtract the pressure measurement from the stage measurement (and put it in a new column in the stage data frame).
>
>     if ( stage$julian_day = baro$julian_day & |stage$time - baro$hour| <= 30 ) then (stage$stage.cm - baro$pressure)
>
> Can you help me?
>
> Thanks, JL
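The rep() suggestion above can be sketched as follows. The numbers are toy values, and the sketch assumes the stage series is already ordered so that its first 12 readings belong to the first pressure reading, the next 12 to the second, and so on; real data would need that alignment checked first. The column names stage.cm and pressure follow the question; `pressure_rep` and `corrected` are names introduced here.

```r
# Toy data: 3 hourly pressure readings and 36 five-minute stage readings
baro  <- data.frame(pressure = c(1010, 1012, 1011))
stage <- data.frame(stage.cm = seq(1100, by = 0.5, length.out = 36))

# Repeat each hourly value 12 times so both vectors have the same length
pressure_rep <- rep(baro$pressure, each = 12)
stopifnot(length(pressure_rep) == nrow(stage))

# Element-wise subtraction, stored as a new column in the stage data frame
stage$corrected <- stage$stage.cm - pressure_rep
print(head(stage, 3))
```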
Re: [R] timeseries
Please have a look at the posting guide of the list. How can we help you without any idea of what you have done? Please include reproducible code and sample data!

nuncio m wrote:

> Dear useRs,
>
> I am trying to construct a time series using the as.ts function. Surprisingly, when I plot the data the x axis does not show the time in years; however, if I use ts(data), the time in years is shown on the x axis. Why is there such a difference in the results of the two commands?
>
> Thanks
> nuncio
Re: [R] Profiler for R ? (HFWUtils package)
> Message: 21
> Date: Mon, 5 Jul 2010 02:26:29 -0400
> From: Ralf B ralf.bie...@gmail.com
> To: r-help@r-project.org r-help@r-project.org
> Subject: [R] Profiler for R ?
>
> Hi, is there such a thing as a profiler for R that informs about a) how much processing time is used by particular functions and commands and b) how much memory is used for creating how many objects (or types of data structures)?

I haven't tried it, but I stumbled across the Profiling() function in the HFWUtils package. Starting at the bottom of page 29-30 of the HFWUtils package user manual:

    profiling    plots tree of execution times

    Description
    determines how much time a function and its sub-functions (and sub-functions thereof etc) take to run ('profiling'). Also draws picture of this using the interrelations of functions.

HTH,
Jim Callahan
Orlando, FL

> In a way I am looking for something similar to the java profiler (which is started by command line and provides profiling information collected from the run of a particular program). Is there such a tool through the R command line or RGUI ? Are there profilers available for the Eclipse StatET or though another package or extension?
>
> Thanks, Ralf
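Separately from the package mentioned above (which this sketch does not use), base R ships a sampling profiler, Rprof(), with summaryRprof() to tabulate the time spent per function. A minimal sketch follows; the workload is a toy loop and the file name comes from tempfile(). Rprof() also has a memory.profiling argument, but whether it is usable depends on how R was built, so it is left out here.

```r
prof_file <- tempfile(fileext = ".out")

# Start the sampling profiler, taking a sample every 10 ms
Rprof(prof_file, interval = 0.01)

# Toy workload to be profiled
for (i in 1:200) x <- sort(runif(1e5))

# Stop profiling
Rprof(NULL)

# Tabulate the samples: time spent in each function itself (by.self)
prof <- summaryRprof(prof_file)
print(head(prof$by.self))
```

The by.total component of the same result lists time including called functions, which is closer to the tree view described in the manual excerpt above.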
Re: [R] data.frame: adding a column that is based on ranges of values in another column
Here is one way

    checkList <- data.frame(Day = c(f.n1, f.n2),
                            FN = rep(c("FN1", "FN2"),
                                     c(length(f.n1), length(f.n2))))
    m <- match(DF$Date, checkList$Day)
    DF <- cbind(DF, Fortnight = checkList$FN[m])
    DF

              X        Y       Date Fortnight
    1  114.5508 47.14094 2009-01-01       FN1
    2  114.6468 46.98874 2009-01-03       FN1
    3  114.6596 46.91235 2009-01-05       FN1
    4  114.6957 46.88265 2009-01-10       FN1
    5  114.6828 46.80584 2009-01-14       FN1
    6  114.8903 46.67022 2009-01-15       FN2
    7  114.9519 46.53264 2009-01-16       FN2
    8  114.8842 46.47727 2009-01-17       FN2
    9  114.8579 46.46457 2009-01-22       FN2
    10 114.8489 46.47032 2009-01-29       FN2

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Abdi, Abdulhakim
Sent: Tuesday, 6 July 2010 6:01 AM
To: r-help@r-project.org
Subject: [R] data.frame: adding a column that is based on ranges of values in another column

> Dear List, I've been looking tirelessly for a solution to this dilemma but without success. Perhaps someone has an idea that will guide me in the right direction.
> [snip]
[R] Help in the legend()
Hi R-users,

I was plotting the differences of the variances of the three estimators T^(1), T^(2), T^(3), of course taking two at a time. I was using expression() in the legend function in order to show which line corresponds to which of the differences, but the following code didn't give the desired result. I would be grateful if you could help me out.

    plot(n, pg, type="l", xlab="n", ylab="Differences of the variances",
         ylim=c(-0.0012,0.0023), xlim=c(0,60));
    lines(gs,lty = 2)
    lines(ps,lty=5)
    legend(30, 0.0021, expression( c ( var(t^(3))-var(t^(2)), var(t^(2))-var(t^(1))), var(t^(3))-var(t^(1)) ) ), lty=c(1,2,5))

Thanks.
Shant
Re: [R] Function to compute the multinomial beta function?
At 05:10 PM 7/5/2010, Gregory Gentlemen wrote:
> Dear R-users,
>
> Is there an R function to compute the multinomial beta function? That is, the normalizing constant that arises in a Dirichlet distribution. For example, with three parameters the beta function is
>
> Beta(n1,n2,n3) = Gamma(n1)*Gamma(n2)*Gamma(n3)/Gamma(n1+n2+n3)

beta3 <- function (n1, n2, n3) exp(lgamma(n1)+lgamma(n2)+lgamma(n3)-lgamma(n1+n2+n3))
beta3(5,3,8)
[1] 1.850002e-07

Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: r...@lcfltd.com
Least Cost Formulations, Ltd.            URL: http://lcfltd.com/
824 Timberlake Drive                     Tel: 757-467-0954
Virginia Beach, VA 23464-3239            Fax: 757-467-2947

"Vere scire est per causas scire"
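The three-parameter function in this reply extends naturally to any number of parameters by summing lgamma() over a vector. The following is a sketch along those lines, not part of the original reply; the name `mbeta` and its `log` argument are made up for illustration.

```r
# Multinomial (multivariate) beta function for a parameter vector `alpha`:
#   B(alpha) = prod(gamma(alpha)) / gamma(sum(alpha))
# Working on the log scale with lgamma() avoids overflow in gamma().
# `mbeta` is a hypothetical name, not a function from base R or a package.
mbeta <- function(alpha, log = FALSE) {
  stopifnot(is.numeric(alpha), all(alpha > 0))
  log_b <- sum(lgamma(alpha)) - lgamma(sum(alpha))
  if (log) log_b else exp(log_b)
}

mbeta(c(5, 3, 8))   # same quantity as beta3(5, 3, 8) in the reply
mbeta(c(2, 3))      # with two parameters this reduces to base R's beta(2, 3)
```

Returning the log value (`log = TRUE`) may be preferable when the result feeds into a Dirichlet log-density, since the constant itself can underflow for large parameters.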
Re: [R] Help in the legend()
On Jul 5, 2010, at 8:06 PM, Shant Ch wrote:
> Hi R-users,
>
> I was plotting the differences of the variances of three estimators, T^(1), T^(2) and T^(3), taking two at a time of course. I used expression() in the legend() call to show which line corresponds to which difference, but the following code did not give the desired result. I would be grateful if you could help me out.
>
> plot(n, pg, type="l", xlab="n", ylab="Differences of the variances", ylim=c(-0.0012,0.0023), xlim=c(0,60));
> lines(gs,lty = 2)
> lines(ps,lty=5)
> legend(30, 0.0021, expression( c ( var(t^(3))-var(t^(2)), var(t^(2))-var(t^(1))), var(t^(3))-var(t^(1)) ) ), lty=c(1,2,5))

Have you considered offering a toy set of objects that defines t, n, and pg?

> Thanks.
> Shant

David Winsemius, MD
West Hartford, CT
Re: [R] to remove duplicate values
On Mon, 5 Jul 2010, Moohwan Kim wrote:
> Dear R family,
>
> Suppose I have two series.
>
> order value
>     1  0.52
>     2  0.23
>     3  0.43
>     4  0.21
>     5  0.32
>     6  0.32
>     7  0.32
>     8  0.32
>     9  0.32
>    10  0.12
>    11  0.46
>    12  0.09
>    13  0.32
>    14  0.25
>
> For these two series, I figured out the way to detect the locations of duplicate values.

You _asked how_ to do it on R-help and got several answers showing how to do it. That doesn't count as 'figured out how to do it'. You should give credit where it is warranted.

> The next thing to do is remove the repeated values, except for a value that is not next to the others.

Well, that is what you should have asked in the first place. The answer is actually simpler and need not involve duplicated().

Use one each of these operations

    head  tail  !=  c  [

in that order and you have a neat one-liner that returns the original data.frame without the adjacent duplicates.

And since I did not say exactly how to do it, you will be able to claim that you figured out the way, albeit with assistance. ;-)

> In other words, while keeping the 13th value, I want to remove the observations from the 6th to the 9th. That is my end goal. Could you help me reach the goal?
>
> best
> moohwan

Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu               UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
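The reply deliberately leaves the one-liner as an exercise. One possible reading of the hint, built from head(), tail(), `!=`, c() and `[`, is sketched below; it is an interpretation rather than the replier's own code, and the names `dat`, `keep` and `res` are hypothetical.

```r
# The series from the question; `dat` is a hypothetical name.
dat <- data.frame(
  order = 1:14,
  value = c(0.52, 0.23, 0.43, 0.21, 0.32, 0.32, 0.32,
            0.32, 0.32, 0.12, 0.46, 0.09, 0.32, 0.25)
)

# head(x, -1) drops the last element and tail(x, -1) drops the first, so the
# comparison is TRUE wherever a value differs from the one just before it.
# c(TRUE, ...) always keeps the first row.
keep <- c(TRUE, head(dat$value, -1) != tail(dat$value, -1))
res <- dat[keep, ]

print(res)
```

Rows 6 to 9 are dropped because each repeats the value directly above it, while row 13 survives because its neighbour (row 12) holds a different value. For values produced by arithmetic rather than typed in, an exact `!=` comparison may be too strict and a tolerance could be needed.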
Re: [R] Memory problem in multinomial logistic regression
On Mon, 5 Jul 2010, Daniel Wiesmann wrote:
> Dear All
>
> I am trying to fit a multinomial logistic regression to a data set with a size of 94279 by 14 entries. The data frame has one sample column which is the categorical variable, and the number of different categories is 9. The size of the data set (as a csv file) is less than 10 MB.

First, do

    str( your.data.frame )

so we can be sure that you do not have a factor lurking among your regressors.

Then report the calls you used for vglm() and mlogit().

It might not hurt to construct the model.matrix() first and check on it with object.size().

Also try

    for (i in levels(your.data.frame$sample)){
        print( glm(I(sample==i) ~. , your.data.frame, family=binomial) )}

just to check on your data. If that loop fails all bets are off.

HTH,

Chuck

> I tried to fit a multinomial logistic regression, either using vglm() from the VGAM package or mlogit() from the mlogit package. In both cases the estimation crashes because I do not have enough memory, although the free memory before starting the regression is more than 2GB. The regression functions eat up all of my memory.
>
> Does anyone know why this relatively small data set leads to memory problems, and how I could work around my problem?
>
> thank you for your help,
>
> Daniel

Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu               UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
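The diagnostic steps suggested in this reply can be tried end to end on a small stand-in data set. The sketch below uses simulated data, since the poster's data are not available; `dat`, `x1`, `x2`, `mm` and `fits` are hypothetical names, and the response column is removed from the right-hand side explicitly as a precaution so that it cannot enter as a predictor.

```r
# Simulated stand-in for the poster's data frame (hypothetical names and sizes).
set.seed(42)
n_obs <- 300
dat <- data.frame(
  sample = factor(sample(c("a", "b", "c"), n_obs, replace = TRUE)),
  x1 = rnorm(n_obs),
  x2 = rnorm(n_obs)
)

# Step 1: check column types. A factor with many levels among the regressors
# expands into many dummy columns and can inflate memory use.
str(dat)

# Step 2: build the design matrix for the regressors and check its size.
mm <- model.matrix(~ . - sample, data = dat)
print(dim(mm))
print(object.size(mm), units = "Kb")

# Step 3: fit one binary logistic regression per category as a sanity check.
fits <- list()
for (i in levels(dat$sample)) {
  fits[[i]] <- glm(I(sample == i) ~ . - sample, data = dat, family = binomial)
  print(coef(fits[[i]]))
}
```

If the design matrix turns out to have far more columns than the data frame, an unintended factor regressor is a likely cause of the memory growth.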
Re: [R] Help in the legend()
Thanks Dr. Winsemius. Here's the toy data set. Basically pg = var(t^(3))-var(t^(2)), gs = var(t^(2))-var(t^(1)) and ps = var(t^(3))-var(t^(1)). The revised code and the data set are as follows:

n <- seq(4:13)
pg <- c(-1.241394e-03, -9.738079e-04, -7.158755e-04, -5.343962e-04, -4.088778e-04, -3.202068e-04, -2.558709e-04, -2.079914e-04, -1.715435e-04, -1.432430e-04)
gs <- c(0.0022520038, 0.0020060234, 0.0017601434, 0.0015519810, 0.0013810851, 0.0012407732, 0.0011245410, 0.0010271681, 0.0009446642, 0.0008740083)
ps <- c(0.0010106098, 0.0010322155, 0.0010442678, 0.0010175848, 0.0009722074, 0.0009205665, 0.0008686700, 0.0008191768, 0.0007731207, 0.0007307653)

plot(n, pg, type="l", xlab="n", ylab="Differences of the variances", ylim=c(-0.0012,0.0023) );
lines(gs,lty = 2)
lines(ps,lty=5)
legend(30, 0.0021, expression( c ( var(t^(3))-var(t^(2)), var(t^(2))-var(t^(1))), var(t^(3))-var(t^(1)) ) ), lty=c(1,2,5))

From: David Winsemius dwinsem...@comcast.net
Cc: r-help@r-project.org
Sent: Mon, July 5, 2010 9:43:19 PM
Subject: Re: [R] Help in the legend()

On Jul 5, 2010, at 8:06 PM, Shant Ch wrote:
> Hi R-users,
>
> I was plotting the differences of the variances of three estimators, T^(1), T^(2) and T^(3), taking two at a time of course. I used expression() in the legend() call to show which line corresponds to which difference, but the following code did not give the desired result. I would be grateful if you could help me out.
>
> plot(n, pg, type="l", xlab="n", ylab="Differences of the variances", ylim=c(-0.0012,0.0023), xlim=c(0,60));
> lines(gs,lty = 2)
> lines(ps,lty=5)
> legend(30, 0.0021, expression( c ( var(t^(3))-var(t^(2)), var(t^(2))-var(t^(1))), var(t^(3))-var(t^(1)) ) ), lty=c(1,2,5))

Have you considered offering a toy set of objects that defines t, n, and pg?

> Thanks.
> Shant

David Winsemius, MD
West Hartford, CT
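With the toy data from this message, one way the legend might be made to work is to pass the three labels as separate arguments to expression(), which yields an expression vector of length three, instead of wrapping them in c() inside a single expression. This is a sketch under assumptions, not a reply from the thread: the name `labs`, the "topright" placement (the toy x-range only runs from 1 to 10, so x = 30 would fall outside the plot) and the explicit `n` passed to lines() are choices made here.

```r
n  <- 1:10   # seq(4:13) in the post evaluates to 1:10
pg <- c(-1.241394e-03, -9.738079e-04, -7.158755e-04, -5.343962e-04, -4.088778e-04,
        -3.202068e-04, -2.558709e-04, -2.079914e-04, -1.715435e-04, -1.432430e-04)
gs <- c(0.0022520038, 0.0020060234, 0.0017601434, 0.0015519810, 0.0013810851,
        0.0012407732, 0.0011245410, 0.0010271681, 0.0009446642, 0.0008740083)
ps <- c(0.0010106098, 0.0010322155, 0.0010442678, 0.0010175848, 0.0009722074,
        0.0009205665, 0.0008686700, 0.0008191768, 0.0007731207, 0.0007307653)

# One legend label per line type, in the same order as the lty values below.
labs <- expression(var(t^(3)) - var(t^(2)),
                   var(t^(2)) - var(t^(1)),
                   var(t^(3)) - var(t^(1)))

# When run as a script, draw to a null device so that no file is written.
if (!interactive()) pdf(NULL)

plot(n, pg, type = "l", xlab = "n", ylab = "Differences of the variances",
     ylim = c(-0.0012, 0.0023))
lines(n, gs, lty = 2)
lines(n, ps, lty = 5)
legend("topright", legend = labs, lty = c(1, 2, 5))
```

Each element of `labs` is typeset with plotmath, so t^(3) is drawn as a superscript rather than as literal text.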
[R] nls + quasi-poisson distribution
Hello R-helpers,

I would like to fit a non-linear function to data (discrete X axis, over-dispersed Poisson values on the Y axis).

I found the function gnlr in the gnlm package from Jim Lindsey: this can handle nonlinear regression equations for the parameters of Poisson and negative binomial distributions, among others.

I also found the function nls2 in the software package accompanying the book "Statistical tools for nonlinear regression" by Huet et al.: this can handle nonlinear regression with Poisson-distributed Y-axis values.

I was wondering if there was any other option: specifically, any option that handles nonlinear fitting with quasi-Poisson distributions (to handle the overdispersion). This is a very new area for me, and I am still trying to figure out the best way to do this, so I would appreciate any and all pointers.

Thanks much,
Suresh
[R] How to Plot With Different Marker (‘x’ and ‘o’) Based on Condition in R
Dear Expert,

I have data that look like this:

for_y_axis <- c(0.49534,0.80796,0.93970,0.8)
for_x_axis <- c(1,2,3,4)
count <- c(0,33,0,4)

What I want to do is plot the graph using for_x_axis and for_y_axis, but mark each point with "o" if the count value is equal to 0 (zero) and with "x" if the count value is greater than zero.

Is there a simple way to achieve that in R?

Regards,
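A possible approach to the question above is to build a vector of plotting characters with ifelse() and hand it to the `pch` argument of plot(), which accepts one symbol per point. This is a minimal sketch rather than an answer from the list; the name `markers` is hypothetical.

```r
for_y_axis <- c(0.49534, 0.80796, 0.93970, 0.8)
for_x_axis <- c(1, 2, 3, 4)
count      <- c(0, 33, 0, 4)

# "x" where count is greater than zero, "o" otherwise.
markers <- ifelse(count > 0, "x", "o")

# When run as a script, draw to a null device so that no file is written.
if (!interactive()) pdf(NULL)

plot(for_x_axis, for_y_axis, pch = markers)
```

The numeric symbol codes could be used instead of letters, for example `pch = ifelse(count > 0, 4, 1)`, where 4 is a cross and 1 is an open circle.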