[R] comparing ARIMA model to data

2011-04-07 Thread Andrew Collier
hi,

i am trying to teach myself about ARIMA models. i have followed examples
from a number of sources and have more or less got the hang of how it
works. i would like to compare the output from the fitted model to the
original data. is this possible? or even a meaningful thing to do?

to be clear, for example, having generated a fit to some data using

 fit <- arima(LakeHuron, order = c(1, 0, 1))

and then plotting the data with

 plot(LakeHuron)

is it possible to overlay the output of the model on the original data
to compare how well it captures the variations in the data? i know that
predict can be used to extrapolate beyond the end of the data series,
but i want to evaluate the model within (not beyond) the original data.
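
One way to do this, sketched under the assumption that the model's one-step-ahead in-sample predictions are what should be compared with the data: for an arima() fit these can be recovered as the observed series minus the residuals. The name fitted_vals, the axis label and the plotting colours are illustrative choices, not part of the original question.

```r
# fit the ARMA(1,1) model from the question
fit <- arima(LakeHuron, order = c(1, 0, 1))

# one-step-ahead in-sample predictions: observed minus residuals
fitted_vals <- LakeHuron - residuals(fit)

# overlay the model on the original data
plot(LakeHuron, ylab = "level (feet)")
lines(fitted_vals, col = "red", lty = 2)
legend("topright", legend = c("data", "ARIMA(1,0,1) fit"),
       col = c("black", "red"), lty = c(1, 2))
```

Because each fitted value uses the observations up to the previous time step, the overlay tends to look flattering. Plotting residuals(fit) or calling tsdiag(fit) is a useful complement.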

best regards,
andrew.

-- 
Andrew B. Collier

Physicist
Waves and Space Plasmas Group
Hermanus Magnetic Observatory

Honorary Senior Lecturer tel: +27 31 2601157
Space Physics Research Institute fax: +27 31 2607795
University of KwaZulu-Natal, Durban, South Africa gsm: +27 83 3813655

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] asterisk in subscript

2011-01-28 Thread Andrew Collier
hi,

i am trying to label a plot axis with the equivalent of the latex $n_*$.
i initially tried

expression(paste(italic(n)["*"]))

but this made the * absolutely tiny and centred about midway wrt the n.
then

expression(paste(italic(n)[textstyle("*")]))

made the * about the right size but now it looks more like a superscript
than a subscript.

does anyone have an idea of how to get the * to the right subscript
position (ie. somewhere near the baseline of the n)? thanks!

best regards,
andrew.



Re: [R] asterisk in subscript

2011-01-28 Thread Andrew Collier
thanks for the rapid response.

yes, in x11 your suggestion works perfectly. i have never thought of the
asterisk as being a superscript... to me it has always been the
multiply sign which is centred. thanks for the education!

however, my plot is being sent to postscript, which i guess does not
support unicode because i get a whole flurry of warnings and the text on
the plot is not correct.

Warning messages:
1: In title(...) : font metrics unknown for Unicode character U+2217
2: In title(...) : font metrics unknown for Unicode character U+2217
3: In title(...) : font metrics unknown for Unicode character U+2217
4: In title(...) : font metrics unknown for Unicode character U+2217
5: In title(...) :
  conversion failure on '∗' in 'mbcsToSbcs': dot substituted for e2
6: In title(...) :
  conversion failure on '∗' in 'mbcsToSbcs': dot substituted for 88
7: In title(...) :
  conversion failure on '∗' in 'mbcsToSbcs': dot substituted for 97
8: In title(...) :
  conversion failure on '∗' in 'mbcsToSbcs': dot substituted for e2
9: In title(...) :
  conversion failure on '∗' in 'mbcsToSbcs': dot substituted for 88
10: In title(...) :
  conversion failure on '∗' in 'mbcsToSbcs': dot substituted for 97

the plotting command is

plot(NA, xlim = c(0,10), ylim = c(0, 20), ylab =
expression(paste(symbol("\341"), italic(n), symbol("\361"))), xlab =
expression(paste(italic(n)["\u2217"])))

i am just using the plain vanilla font (no changes).

it is being run on R version 2.11.1 (2010-05-31) under ubuntu.

my locale is

LC_CTYPE=en_ZA.utf8;LC_NUMERIC=C;LC_TIME=en_ZA.utf8;LC_COLLATE=en_ZA.utf8;LC_MONETARY=C;LC_MESSAGES=en_ZA.utf8;LC_PAPER=en_ZA.utf8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_ZA.utf8;LC_IDENTIFICATION=C





On Fri, 2011-01-28 at 09:48 +, Prof Brian Ripley wrote: 
> On Fri, 28 Jan 2011, Andrew Collier wrote:
>
> > hi,
> >
> > i am trying to label a plot axis with the equivalent of the latex $n_*$.
> > i initially tried
> >
> > expression(paste(italic(n)["*"]))
> >
> > but this made the * absolutely tiny and centred about midway wrt the n.
> > then
> >
> > expression(paste(italic(n)[textstyle("*")]))
> >
> > made the * about the right size but now it looks more like a superscript
> > than a subscript.
> >
> > does anyone have an idea of how to get the * to the right subscript
> > position (ie. somewhere near the baseline of the n)? thanks!
>
> I think these *are* correct: remember that an asterisk is a
> superscript.  However, what you see depends on the graphics device and
> font you used, and you have not told us (pace the posting guide).  If
> your OS and device support Unicode, try \u2217:
>
> expression(paste(italic(n)["\u2217"]))
>
> looks about right to me (X11() on Linux).
 


Re: [R] asterisk in subscript

2011-01-28 Thread Andrew Collier
magic! this does the trick:

expression(paste(italic(n)[symbol("\052")]))

thanks for the hint, ted!
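
For reference, a minimal sketch of a complete postscript call using this label. The output file name is a placeholder. symbol("\052") selects character code 052 (octal) from the Symbol font, so the device needs no Unicode support.

```r
# placeholder output file
outfile <- file.path(tempdir(), "n_star.ps")

postscript(outfile)
plot(NA, xlim = c(0, 10), ylim = c(0, 20),
     xlab = expression(italic(n)[symbol("\052")]),
     ylab = expression(paste(symbol("\341"), italic(n), symbol("\361"))))
dev.off()
```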



[R] lines and points without margin

2010-12-14 Thread Andrew Collier
hi,

i am sure that this is a trivial question but i have not been able to
find an answer by searching the mailing lists. i want to plot points on
a graph, joined by lines. the command that i am using is

points(x, y, type = "b", pch = 21)

this plots nice open circles at the data points and draws lines between
them. however, the lines do not come all the way up to the edge of the
circles but stop some small distance away so that there is an empty
margin around the circles. is there a way to get rid of this margin?
my first guess was that there would be an option to par() but i did not
find anything there. any suggestions would be appreciated.
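
A possible approach, offered as a sketch rather than a definitive answer: type = "b" deliberately leaves a gap around each symbol, whereas type = "o" draws the lines right through the points. Filling the circles with the background colour (bg = "white" assumes a white plot background) then hides the line inside each symbol, so the lines appear to stop exactly at the edge of the circles. The data below are made up for illustration.

```r
# made-up data for illustration
x <- 1:10
y <- c(3, 5, 4, 6, 8, 7, 9, 8, 10, 12)

# set up the axes only
plot(x, y, type = "n")

# lines first, then filled circles drawn over them
points(x, y, type = "o", pch = 21, bg = "white")
```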

thanks!

best regards,
andrew.



[R] scale caption on levelplot

2010-12-04 Thread Andrew Collier
hi,

i am trying to figure out how to put a caption on the colour scale of a
levelplot. there does not seem to be an option for this in levelplot().
i tried using mtext() but as soon as you put the text far out enough on
the right of the plot, it goes beyond the plot boundary. so i tried to
extend the margin on the right of the plot using par(mar) but this did
not have any effect on the plot area.

i would really appreciate some help with this because having a caption
on a colour scale is rather fundamental and certainly something that a
journal referee is going to pick up on!

best regards,
andrew.




Re: [R] scale caption on levelplot

2010-12-04 Thread Andrew Collier
hi peter and david,

thanks for the excellent suggestions. here is something like what i am
finally using (those fancy fonts were really tempting, but i chose
something a little more mundane!):

library(lattice)

x <- sort(rnorm(100,50,10))
y <- sort(runif(100,0,20))
d <- expand.grid(x=x, y=y)
d$z <- d$x + d$y   # use the grid columns so that z varies with both x and y
plot.new()
p = levelplot(z ~ x*y, d,
              par.settings=list(
                layout.widths=list(right.padding=4)),
              colorkey = TRUE)
print(p)

mtext("CAPTION", 4, 1)

your help really appreciated!

best regards,
andrew.



[R] save() with 64 bit and 32 bit R

2010-11-03 Thread Andrew Collier
hi,

i have been using a 64 bit desktop machine to process a whole lot of
data which i have then subsequently used save() to store. i am now
wanting to use this data on my laptop machine, which is a 32 bit
install. i suppose that i should not be surprised that the 64 bit data
files do not open on my 32 bit machine! does anyone have a smart idea as
to how these data can be reformatted for 32 bits? unfortunately the data
processing that i did on the 64 bit machine took just under 20 days to
complete, so i am not very keen to just throw away this data and begin
again on the 32 bit machine.

sorry, in retrospect this all seems rather idiotic, but i assumed that
the data stored by save() would be compatible between 64 bit and 32 bit
(there is no warning in the manual).
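
For what it is worth, save() writes a platform-independent (XDR) binary format by default, so the 64 bit versus 32 bit difference may not be the real cause; available memory or a difference in R versions are other possibilities. If the binary file still refuses to load, one thing to try is re-saving on the 64 bit machine with ascii = TRUE. The sketch below uses a small made-up object and a temporary file as stand-ins for the real data.

```r
# small made-up object standing in for the real results
results <- data.frame(id = 1:5, value = c(0.1, 0.25, 0.5, 0.75, 1))

# on the 64 bit machine: write an ASCII representation
f <- file.path(tempdir(), "results_ascii.RData")
save(results, file = f, ascii = TRUE)

# on the 32 bit machine: load into a fresh environment and check
e <- new.env()
load(f, envir = e)
stopifnot(isTRUE(all.equal(e$results, results)))
```

An object that is simply too large for the 32 bit address space will not load in any format, in which case saving smaller pieces separately is the more likely way forward.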

thanks,
andrew.



[R] ts subscripting problem

2008-11-26 Thread andrew collier
hi,

i am having trouble getting a particular time series to plot. this is what i
have:

> class(irradiance)
[1] "ts"
> irradiance[1:30]
  197811   197812   197901   197902   197903   197904   197905   197906
1366.679 1366.729 1367.476 1367.739 1368.339 1367.883 1367.916 1367.055
  197907   197908   197909   197910   197911   197912   198001   198002
1367.484 1366.887 1366.935 1367.034 1366.997 1367.310 1367.041 1366.459
  198003   198004   198005   198006   198007   198008   198009   198010
1367.143 1366.553 1366.597 1366.854 1366.814 1366.901 1366.622 1366.669
  198011   198012   198101   198102   198103   198104
1365.874 1366.098 1367.141 1366.239 1366.323 1366.388
> plot(irradiance[1:30])
> plot(irradiance)
Error in dn[[2]] : subscript out of bounds

so, if i plot a subset of the data it works fine. but if i try to plot the
whole thing it breaks. the ts object was created using:

irradiance = ts(tapply(d$number, f, mean), freq = 12, start = c(1978, 11))

and other ts objects that i have defined using basically the same approach
work fine.
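
A likely cause, assuming the data are as shown: tapply() returns a one-dimensional array with dimnames (which would explain why the printed values carry labels such as 197811), and that dim attribute can survive the call to ts(). Dropping it with as.vector() before calling ts() gives an ordinary univariate series. In this sketch the data frame d and the factor f are made-up stand-ins for the originals.

```r
set.seed(1)

# made-up stand-in for the original data: three readings per month
d <- data.frame(
  month  = rep(sprintf("%d%02d", rep(1979:1980, each = 12), 1:12), each = 3),
  number = rnorm(72, mean = 1367, sd = 0.5)
)
f <- d$month

# monthly means: a 1-d array with dimnames, not a plain vector
m <- tapply(d$number, f, mean)
length(dim(m))

# strip the dim and dimnames before building the time series
irradiance <- ts(as.vector(m), frequency = 12, start = c(1979, 1))
plot(irradiance)
```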

any ideas greatly appreciated!

cheers,
andrew.



[R] p values for polychor

2008-05-07 Thread andrew collier
hello,

i have been using cor.test() for calculating the correlation coefficient and p 
values for some data. however, since the data consist of two dichotomous 
sequences (actually just binary data), i understand that simply using the 
pearson correlation is not sufficient. having done a bit of research, i 
found that the tetrachoric correlation is what i am after. i found the polycor 
package and the polychor routine, which seem to do precisely what i want. 
however, i don't get p values out of polychor, just the standard deviation.

so, in a rather naive way i have tried to write a function which will return a 
list with similar fields as what one gets from cor.test(). not being terribly 
strong with statistics though, i am not sure whether this is entirely correct. 
could someone tell me if i am on the right track... or point out where i am 
going wrong?

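tetrachoric.test <- function(x, y) {
  # requires the polycor package: library(polycor)
  p <- polychor(x, y, std.err = TRUE)
  # z statistic: estimate divided by its standard error
  p$statistic <- p$rho / sqrt(c(p$var))
  # fields named as in the result of cor.test()
  p$estimate <- p$rho
  p$p.value <- 2 * (1 - pnorm(abs(p$statistic)))

  p
}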

the assumption is that the p value is the integration of the two tails of the 
distribution?

> x <- as.integer(runif(20) > 0.5)
> y <- as.integer(runif(20) > 0.5)
> p <- tetrachoric.test(x, y)
> p$statistic
[1] -0.2616866
> p$p.value
[1] 0.7935631
> p$var
          [,1]
[1,] 0.1452105
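
The arithmetic behind the p value above can be checked without polycor. This sketch assumes, as the function does, that the statistic is approximately standard normal under the null hypothesis of zero correlation. The variable names are illustrative.

```r
z <- -0.2616866                      # p$statistic from the session above

# two-sided p value: probability mass in both tails beyond |z|
p_two_sided <- 2 * pnorm(-abs(z))
p_two_sided                          # agrees with p$p.value above (0.7935631)

# equivalent to the form used in tetrachoric.test()
stopifnot(isTRUE(all.equal(p_two_sided, 2 * (1 - pnorm(abs(z))))))
```

The pnorm(-abs(z)) form is numerically safer for large |z|, where 1 - pnorm() loses precision.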

thanks for any help!

best regards,
andrew.



[R] cor.test() and binary sequences

2007-11-12 Thread andrew collier
peter, thanks for your help with my questions regarding cor.test(). i have 
another question though: does this function make any assumptions about the 
underlying distribution of the two sequences? does it assume that they have a 
gaussian distribution?

i ask because the data that i am working with is two binary sequences. just 
series of 0 and 1. will the confidence intervals and p-values generated by 
cor.test() still apply?

cheers,
andrew.



[R] (no subject)

2007-11-07 Thread andrew collier
hello,

i am a bit of a statistical neophyte and currently trying to make some sense of 
confidence intervals for correlation coefficients. i am using the cor.test() 
function. the documentation is quite terse and i am having trouble tying up 
the output from this function with stuff that i have read in the literature. 
so, for example, i make two sequences and calculate the correlation coefficient:

> x <- runif(20)
> y <- jitter(x, amount = 0.7)
> cor(x, y)
[1] 0.5198252

now i want to establish what confidence i can attach to this value. from the 
table i retrieved from the article "Understanding Correlation" by r. j. rummel 
[online] i get that the probability of a correlation coefficient of 0.5198252 
arising by chance from two sequences of length 20 is less than 0.01. so this 
seems like i can attach some significance to the result. i still don't 
understand where the table comes from and it only goes up as far as sequences 
of length 1000. the data i am wanting to analyse has length of more than 7, 
so i need to calculate these confidence levels myself. i assume that cor.test() 
is the way to do this. so i tried:

> cor.test(x, y, "greater", conf.level = 0.95)

Pearson's product-moment correlation

data:  x and y 
t = 2.5816, df = 18, p-value = 0.009405
alternative hypothesis: true correlation is greater than 0 
95 percent confidence interval:
 0.1753340 1.0000000 
sample estimates:
  cor 
0.5198252 

> cor.test(x, y, "less", conf.level = 0.95)

Pearson's product-moment correlation

data:  x and y 
t = 2.5816, df = 18, p-value = 0.9906
alternative hypothesis: true correlation is less than 0 
95 percent confidence interval:
 -1.0000000  0.7509089 
sample estimates:
  cor 
0.5198252 

> cor.test(x, y, "two.sided", conf.level = 0.95)

Pearson's product-moment correlation

data:  x and y 
t = 2.5816, df = 18, p-value = 0.01881
alternative hypothesis: true correlation is not equal to 0 
95 percent confidence interval:
 0.1003997 0.7823738 
sample estimates:
  cor 
0.5198252

i reckon that the first invocation of the function is closest to what i am 
looking for. now the rest of the output from the function is a total mystery to 
me. could anyone please tell me:

o what is a p-value?
o how to interpret the quoted confidence interval?

i do see that as i increase the conf.level input parameter to cor.test() the 
lower bound of the confidence interval gets lower:

0.95    ->   0.1753340   1.0000000
0.975   ->   0.1003997   1.0000000
0.995   ->  -0.04859184  1.00000000

does this mean that with 99.5% certainty the correlation coefficient should lie 
in the range -0.04859184 to 1.00000000? hmmm. i am doubtful. plus this doesn't 
really answer my question, which is more about what confidence i can assign to 
the measured correlation coefficient (0.5198252).

an alternative question would be: given two sequences and a calculated 
correlation coefficient, with what probability could i assert that the 
underlying processes are indeed correlated and that the calculated correlation 
coefficient does not simply arise by chance.
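
The numbers in the cor.test() output above can be reproduced from the sample correlation and the sequence length alone, which may help in reading them. This sketch assumes the Pearson test, where t = r sqrt(n - 2) / sqrt(1 - r^2) is referred to a t distribution with n - 2 degrees of freedom and the confidence interval comes from Fisher's z transform. The variable names are illustrative.

```r
r <- 0.5198252   # sample correlation from cor(x, y) above
n <- 20          # length of each sequence

# test statistic and degrees of freedom
df <- n - 2
t_stat <- r * sqrt(df) / sqrt(1 - r^2)          # about 2.5816, as reported

# p values: chance of a statistic at least this extreme
# if the true correlation were 0
p_greater   <- pt(t_stat, df, lower.tail = FALSE)            # one-sided
p_two_sided <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)   # two-sided

# 95% two-sided confidence interval via Fisher's z transform
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

print(c(t = t_stat, p.greater = p_greater, p.two.sided = p_two_sided))
print(ci)
```

Note that the p value is not the probability that the two processes are correlated. It is the probability of seeing a statistic at least this extreme if they were in fact uncorrelated.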

please forgive my ignorance. any help will be vastly appreciated. thanks!

best regards,
andrew.
