[R] comparing ARIMA model to data
hi,

i am trying to teach myself about ARIMA models. i have followed examples from a number of sources and have more or less got the hang of how it works. i would like to compare the output from the fitted model to the original data. is this possible? or even a meaningful thing to do?

to be clear, for example, having generated a fit to some data using

    fit <- arima(LakeHuron, order = c(1, 0, 1))

and then plotting the data with

    plot(LakeHuron)

is it possible to overlay the output of the model on the original data to compare how well it captures the variations in the data? i know that predict() can be used to extrapolate beyond the end of the data series, but i want to evaluate the model within (not beyond) the original data.

best regards,
andrew.

-- 
Andrew B. Collier
Physicist, Waves and Space Plasmas Group, Hermanus Magnetic Observatory
Honorary Senior Lecturer, Space Physics Research Institute, University of KwaZulu-Natal, Durban, South Africa
tel: +27 31 2601157  fax: +27 31 2607795  gsm: +27 83 3813655

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
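One way to look at the in-sample fit asked about above is sketched here. For a model fitted with arima(), the one-step-ahead fitted values can usually be recovered as the data minus the residuals. This is only a sketch under that assumption; the name fit.insample is illustrative and not part of any API.

```r
# Fit the ARMA(1,1) model from the question
fit <- arima(LakeHuron, order = c(1, 0, 1))

# In-sample (one-step-ahead) fitted values: observed series minus residuals
fit.insample <- LakeHuron - residuals(fit)

# Overlay the fitted values on the original series
plot(LakeHuron, ylab = "level")
lines(fit.insample, col = "red", lty = 2)
legend("topright", legend = c("data", "fitted"),
       col = c("black", "red"), lty = c(1, 2))
```

Each fitted point is a one-step-ahead prediction, so a close visual match is expected and is not by itself strong evidence of a good model.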
[R] asterisk in subscript
hi,

i am trying to label a plot axis with the equivalent of the latex $n_*$. i initially tried

    expression(paste(italic(n)["*"]))

but this made the * absolutely tiny and centred about midway wrt the n. then

    expression(paste(italic(n)[textstyle("*")]))

made the * about the right size but now it looks more like a superscript than a subscript.

does anyone have an idea of how to get the * to the right subscript position (ie. somewhere near the baseline of the n)?

thanks!

best regards,
andrew.
Re: [R] asterisk in subscript
thanks for the rapid response. yes, in x11 your suggestion works perfectly. i have never thought of the asterisk as being a superscript... to me it has always been the multiply sign which is centred. thanks for the education!

however, my plot is being sent to postscript, which i guess does not support unicode because i get a whole flurry of warnings and the text on the plot is not correct.

    Warning messages:
    1: In title(...) : font metrics unknown for Unicode character U+2217
    2: In title(...) : font metrics unknown for Unicode character U+2217
    3: In title(...) : font metrics unknown for Unicode character U+2217
    4: In title(...) : font metrics unknown for Unicode character U+2217
    5: In title(...) : conversion failure on '∗' in 'mbcsToSbcs': dot substituted for <e2>
    6: In title(...) : conversion failure on '∗' in 'mbcsToSbcs': dot substituted for <88>
    7: In title(...) : conversion failure on '∗' in 'mbcsToSbcs': dot substituted for <97>
    8: In title(...) : conversion failure on '∗' in 'mbcsToSbcs': dot substituted for <e2>
    9: In title(...) : conversion failure on '∗' in 'mbcsToSbcs': dot substituted for <88>
    10: In title(...) : conversion failure on '∗' in 'mbcsToSbcs': dot substituted for <97>

the plotting command is

    plot(NA, xlim = c(0, 10), ylim = c(0, 20),
         ylab = expression(paste(symbol("\341"), italic(n), symbol("\361"))),
         xlab = expression(paste(italic(n)["\u2217"])))

i am just using the plain vanilla font (no changes). it is being run on R version 2.11.1 (2010-05-31) under ubuntu. my locale is

    LC_CTYPE=en_ZA.utf8;LC_NUMERIC=C;LC_TIME=en_ZA.utf8;LC_COLLATE=en_ZA.utf8;LC_MONETARY=C;LC_MESSAGES=en_ZA.utf8;LC_PAPER=en_ZA.utf8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_ZA.utf8;LC_IDENTIFICATION=C

On Fri, 2011-01-28 at 09:48 +, Prof Brian Ripley wrote:
> On Fri, 28 Jan 2011, Andrew Collier wrote:
>
> > hi,
> >
> > i am trying to label a plot axis with the equivalent of the latex $n_*$. i initially tried
> >
> >     expression(paste(italic(n)["*"]))
> >
> > but this made the * absolutely tiny and centred about midway wrt the n. then
> >
> >     expression(paste(italic(n)[textstyle("*")]))
> >
> > made the * about the right size but now it looks more like a superscript than a subscript.
> >
> > does anyone have an idea of how to get the * to the right subscript position (ie. somewhere near the baseline of the n)?
> >
> > thanks!
>
> I think these *are* correct: remember that an asterisk is a superscript. However, what you see depends on the graphics device and font you used, and you have not told us (pace the posting guide).
>
> If your OS and device support Unicode, try \u2217:
>
>     expression(paste(italic(n)["\u2217"]))
>
> looks about right to me (X11() on Linux).
>
> > best regards,
> > andrew.
Re: [R] asterisk in subscript
magic! this does the trick:

    expression(paste(italic(n)[symbol("\052")]))

thanks for the hint, ted!
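For anyone wanting to try the symbol() approach on the postscript device, a minimal sketch follows. The temporary file name and the plot limits are placeholders; "\052" is simply the octal escape for the asterisk character, which symbol() then draws in the Symbol font.

```r
# Write a small test plot to a temporary PostScript file
out <- tempfile(fileext = ".ps")
postscript(out)

# Subscript asterisk drawn from the Symbol font, avoiding Unicode entirely
plot(NA, xlim = c(0, 10), ylim = c(0, 20), ylab = "",
     xlab = expression(italic(n)[symbol("\052")]))

dev.off()
```

Because no Unicode character is involved, the "font metrics unknown" warnings from the earlier attempt should not be triggered by this label.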
[R] lines and points without margin
hi,

i am sure that this is a trivial question but i have not been able to find an answer by searching the mailing lists. i want to plot points on a graph, joined by lines. the command that i am using is

    points(x, y, type = "b", pch = 21)

this plots nice open circles at the data points and draws lines between them. however, the lines do not come all the way up to the edge of the circles but stop some small distance away so that there is an empty margin around the circles.

is there a way to get rid of this margin? my first guess was that there would be an option to par() but i did not find anything there. any suggestions would be appreciated.

thanks!

best regards,
andrew.
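One common workaround, sketched here with made-up data, is to draw the lines first and then overplot filled circles. With pch = 21 the bg argument sets the fill colour, so a white fill hides the line inside each circle and the line appears to run right up to its edge.

```r
# Illustrative data only
x <- 1:10
y <- c(3, 5, 4, 6, 8, 7, 9, 8, 10, 12)

plot(x, y, type = "n")                 # set up axes without drawing anything
lines(x, y)                            # unbroken line through all points
points(x, y, pch = 21, bg = "white")   # open circles with a white fill on top
```

Using type = "o" also removes the gap, but the line then remains visible inside the open circles.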
[R] scale caption on levelplot
hi,

i am trying to figure out how to put a caption on the colour scale of a levelplot. there does not seem to be an option for this in levelplot(). i tried using mtext() but as soon as you put the text far out enough on the right of the plot, it goes beyond the plot boundary. so i tried to extend the margin on the right of the plot using par(mar) but this did not have any effect on the plot area.

i would really appreciate some help with this because having a caption on a colour scale is rather fundamental and certainly something that a journal referee is going to pick up on!

best regards,
andrew.
Re: [R] scale caption on levelplot
hi peter and david,

thanks for the excellent suggestions. here is something like what i am finally using (those fancy fonts were really tempting, but i chose something a little more mundane!):

    library(lattice)

    x <- sort(rnorm(100, 50, 10))
    y <- sort(runif(100, 0, 20))
    d <- expand.grid(x = x, y = y)
    # use the grid columns so that z varies over the whole grid
    d$z <- d$x + d$y

    plot.new()
    # extra padding on the right leaves room for the caption
    p = levelplot(z ~ x*y, d,
                  par.settings = list(layout.widths = list(right.padding = 4)),
                  colorkey = TRUE)
    print(p)
    # caption in the right-hand margin, next to the colour key
    mtext("CAPTION", 4, 1)

your help really appreciated!

best regards,
andrew.
[R] save() with 64 bit and 32 bit R
hi,

i have been using a 64 bit desktop machine to process a whole lot of data which i have then subsequently used save() to store. i am now wanting to use this data on my laptop machine, which is a 32 bit install. i suppose that i should not be surprised that the 64 bit data files do not open on my 32 bit machine!

does anyone have a smart idea as to how these data can be reformatted for 32 bits? unfortunately the data processing that i did on the 64 bit machine took just under 20 days to complete, so i am not very keen to just throw away this data and begin again on the 32 bit machine.

sorry, in retrospect this all seems rather idiotic, but i assumed that the data stored by save() would be compatible between 64 bit and 32 bit (there is no warning in the manual).

thanks,
andrew.
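As far as I know, the format written by save() is meant to be platform independent, so a 64 bit versus 32 bit difference on its own should not normally stop load() from working. A failure may instead come from something else, such as an object too large for 32 bit memory or a mismatch in R versions; that is a guess, not a diagnosis. The sketch below uses a small made-up object to show a round trip, plus the ascii = TRUE option of save() as a plain-text fallback.

```r
# Small stand-in for the real results (hypothetical data)
results <- list(values = c(1.5, 2.5, 3.5), label = "processed")

# Default binary save, then load into a separate environment
rda <- tempfile(fileext = ".RData")
save(results, file = rda)
env <- new.env()
loaded <- load(rda, envir = env)   # returns the names of the restored objects

# Text representation, sometimes handy when moving files between machines
txt <- tempfile(fileext = ".txt")
save(results, file = txt, ascii = TRUE)
env.txt <- new.env()
load(txt, envir = env.txt)
```

Loading into a fresh environment, as here, avoids overwriting objects of the same name in the workspace.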
[R] ts subscripting problem
hi,

i am having trouble getting a particular time series to plot. this is what i have:

    > class(irradiance)
    [1] "ts"
    > irradiance[1:30]
      197811   197812   197901   197902   197903   197904   197905   197906
    1366.679 1366.729 1367.476 1367.739 1368.339 1367.883 1367.916 1367.055
      197907   197908   197909   197910   197911   197912   198001   198002
    1367.484 1366.887 1366.935 1367.034 1366.997 1367.310 1367.041 1366.459
      198003   198004   198005   198006   198007   198008   198009   198010
    1367.143 1366.553 1366.597 1366.854 1366.814 1366.901 1366.622 1366.669
      198011   198012   198101   198102   198103   198104
    1365.874 1366.098 1367.141 1366.239 1366.323 1366.388
    > plot(irradiance[1:30])
    > plot(irradiance)
    Error in dn[[2]] : subscript out of bounds

so, if i plot a subset of the data it works fine. but if i try to plot the whole thing it breaks.

the ts object was created using:

    irradiance = ts(tapply(d$number, f, mean), freq = 12, start = c(1978, 11))

and other ts objects that i have defined using basically the same approach work fine.

any ideas greatly appreciated!

cheers,
andrew.
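One possible cause, offered as a guess rather than a confirmed diagnosis: tapply() returns a one-dimensional array with dimnames rather than a plain vector, and ts() can carry those attributes along, which may confuse plot(). Stripping them with as.vector() before calling ts() is a cheap thing to try. The data frame d and grouping vector f below are stand-ins invented for the sketch.

```r
set.seed(1)

# Hypothetical stand-ins for the poster's d$number and f
d <- data.frame(number = rnorm(60, mean = 1367, sd = 0.5))
f <- rep(c("197811", "197812", "197901", "197902", "197903"), each = 12)

means <- tapply(d$number, f, mean)   # 1-d array with dimnames
dim(means)

# Drop dim and dimnames so that ts() receives a plain numeric vector
irradiance <- ts(as.vector(means), frequency = 12, start = c(1978, 11))
plot(irradiance)
```

Subsetting with irradiance[1:30] also returns a plain named vector, which would be consistent with the subset plotting without error.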
[R] p values for polychor
hello,

i have been using cor.test() for calculating the correlation coefficient and p values for some data. however, since the data consist of two dichotomous sequences (actually just binary data), i understand that simply using the pearson correlation is not sufficient. having done a bit of research i found that the tetrachoric correlation is what i am after.

i found the polycor package and the polychor routine, which seem to do precisely what i want. however, i don't get p values out of polychor, just the standard deviation. so, in a rather naive way i have tried to write a function which will return a list with similar fields to what one gets from cor.test(). not being terribly strong with statistics though, i am not sure whether this is entirely correct. could someone tell me if i am on the right track... or point out where i am going wrong?

    library(polycor)

    tetrachoric.test <- function(x, y) {
        p <- polychor(x, y, std.err = TRUE)
        # test statistic: estimate divided by its standard error
        p$statistic <- p$rho / sqrt(c(p$var))
        # same field name as used by cor.test()
        p$estimate <- p$rho
        # two-sided p value from the standard normal distribution
        p$p.value = 2 * (1 - pnorm(abs(p$statistic)))
        p
    }

the assumption is that the p value is the integration of the two tails of the distribution?

    > x <- as.integer(runif(20) > 0.5)
    > y <- as.integer(runif(20) > 0.5)
    > p <- tetrachoric.test(x, y)
    > p$statistic
    [1] -0.2616866
    > p$p.value
    [1] 0.7935631
    > p$var
              [,1]
    [1,] 0.1452105

thanks for any help!

best regards,
andrew.
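The p value step in the function above is a two-sided test that treats estimate divided by standard error as standard normal. The sketch below isolates that step in base R, without polycor, and checks it against the numbers quoted in the post. The helper name wald.p.value is made up for illustration, and whether the normal approximation is adequate for only 20 observations is a separate question.

```r
# Two-sided p value for an estimate with a known standard error,
# assuming the ratio is approximately standard normal
wald.p.value <- function(estimate, std.error) {
    z <- estimate / std.error
    list(statistic = z, p.value = 2 * pnorm(-abs(z)))
}

# Reproduce the p value from the statistic quoted in the post
p.from.post <- 2 * pnorm(-abs(-0.2616866))
p.from.post

# Example call with arbitrary numbers
res <- wald.p.value(0.5, 0.25)
```

2 * pnorm(-abs(z)) is the same quantity as 2 * (1 - pnorm(abs(z))), but is numerically more stable when z is large.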
[R] cor.test() and binary sequences
peter,

thanks for your help with my questions regarding cor.test(). i have another question though: does this function make any assumptions about the underlying distribution of the two sequences? does it assume that they have a gaussian distribution?

i ask because the data that i am working with is two binary sequences. just series of 0 and 1. will the confidence intervals and p-values generated by cor.test() still apply?

cheers,
andrew.
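A related fact that may help here: for two 0/1 sequences the Pearson correlation is the phi coefficient, and n times its square equals the uncorrected chi-squared statistic for the 2x2 table. A chi-squared test on the table is therefore a natural alternative that does not rely on a gaussian assumption. The sketch below checks the identity on simulated data; the sequences and the 0.7 agreement rate are invented for the example.

```r
set.seed(42)
n <- 200

# Two simulated binary sequences; y agrees with x about 70% of the time
x <- rbinom(n, 1, 0.5)
y <- ifelse(runif(n) < 0.7, x, 1 - x)

# Pearson correlation of binary data (the phi coefficient)
r <- cor(x, y)

# Chi-squared test on the 2x2 table, without continuity correction
tab <- table(factor(x, levels = 0:1), factor(y, levels = 0:1))
chi <- chisq.test(tab, correct = FALSE)

# These two numbers should agree
c(n * r^2, unname(chi$statistic))
```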
[R] (no subject)
hello,

i am a bit of a statistical neophyte and currently trying to make some sense of confidence intervals for correlation coefficients. i am using the cor.test() function. the documentation is quite terse and i am having trouble tying up the output from this function with stuff that i have read in the literature.

so, for example, i make two sequences and calculate the correlation coefficient:

    > x <- runif(20)
    > y <- jitter(x, amount = 0.7)
    > cor(x, y)
    [1] 0.5198252

now i want to establish what confidence i can attach to this value. from the table i retrieved from the article "Understanding Correlation" by r. j. rummel [online] i get that the probability of a correlation coefficient of 0.5198252 arising by chance from two sequences of length 20 is less than 0.01. so this seems like i can attach some significance to the result.

i still don't understand where the table comes from and it only goes up as far as sequences of length 1000. the data i am wanting to analyse has length of more than 7, so i need to calculate these confidence levels myself. i assume that cor.test() is the way to do this.
so i tried:

    > cor.test(x, y, "greater", conf.level = 0.95)

            Pearson's product-moment correlation

    data:  x and y
    t = 2.5816, df = 18, p-value = 0.009405
    alternative hypothesis: true correlation is greater than 0
    95 percent confidence interval:
     0.1753340 1.000
    sample estimates:
          cor
    0.5198252

    > cor.test(x, y, "less", conf.level = 0.95)

            Pearson's product-moment correlation

    data:  x and y
    t = 2.5816, df = 18, p-value = 0.9906
    alternative hypothesis: true correlation is less than 0
    95 percent confidence interval:
     -1.000 0.7509089
    sample estimates:
          cor
    0.5198252

    > cor.test(x, y, "two.sided", conf.level = 0.95)

            Pearson's product-moment correlation

    data:  x and y
    t = 2.5816, df = 18, p-value = 0.01881
    alternative hypothesis: true correlation is not equal to 0
    95 percent confidence interval:
     0.1003997 0.7823738
    sample estimates:
          cor
    0.5198252

i reckon that the first invocation of the function is closest to what i am looking for. now the rest of the output from the function is a total mystery to me. could anyone please tell me:

o what is a p-value?
o how to interpret the quoted confidence interval?

i do see that as i increase the conf.level input parameter to cor.test() the lower bound of the confidence interval gets lower:

    0.95  ->  0.1753340   1.000
    0.975 ->  0.1003997   1.000
    0.995 -> -0.04859184  1.

does this mean that with 99.5% certainty the correlation coefficient should lie in the range -0.04859184 to 1.? hmmm. i am doubtful. plus this doesn't really answer my question, which is more about what confidence i can assign to the measured correlation coefficient (0.5198252).

an alternative question would be: given two sequences and a calculated correlation coefficient, with what probability could i assert that the underlying processes are indeed correlated and that the calculated correlation coefficient does not simply arise by chance.

please forgive my ignorance. any help will be vastly appreciated.

thanks!

best regards,
andrew.
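The numbers printed by cor.test() in this post can be reproduced by hand, which may make the output less of a mystery. The sketch below assumes the standard textbook formulas: a t statistic with n - 2 degrees of freedom for the p value, and Fisher's z transformation for the confidence interval. It uses only the correlation and sample size quoted in the post.

```r
r <- 0.5198252   # sample correlation from the post
n <- 20          # length of each sequence

# t statistic for the null hypothesis of zero correlation
t.stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p value: chance of a |t| at least this large if the true correlation is 0
p.two.sided <- 2 * pt(-abs(t.stat), df = n - 2)

# 95% two-sided confidence interval via Fisher's z transformation
z <- atanh(r)
half.width <- qnorm(0.975) / sqrt(n - 3)
ci <- tanh(c(z - half.width, z + half.width))

t.stat
p.two.sided
ci
```

Read this way, the p value is the probability of seeing a correlation at least this extreme when the true correlation is zero, and the interval is a range of true correlations compatible with the data at the stated level. Neither is the probability that the underlying processes are correlated.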