Hi all,

I am modifying a program I wrote earlier that implements the smallest canonical correlation (SCAN) method for identifying ARMA(p,q) orders in time series, but when I compared its output with SAS, I found some differences.

My SCAN R code can be downloaded from the following URL:

http://netstat.stat.tku.edu.tw/download/arma_scan_R.txt

I used Series R (LA ozone data) from Box and Jenkins (4th edition) as an example. A sample run can be done via

# ozone = scan("http://netstat.stat.tku.edu.tw/download/box_ozone.txt")
# source("http://netstat.stat.tku.edu.tw/download/arma_scan_R.txt")
# arma.scan(ozone)

First, the output for the Squared Canonical Correlation Estimates:

SAS:

Squared Canonical Correlation Estimates

Lags     MA 0    MA 1    MA 2    MA 3    MA 4    MA 5
AR 0   0.5352  0.2423  0.0696  0.0035  0.0112  0.0183
AR 1   0.0074  0.0199  0.0304  0.0399  0.0185  0.0052
AR 2   0.0173  0.0005  0.0003  0.0167  0.0123  0.0198
AR 3   0.0190  0.0003  0.0002  0.0230  0.0026  0.0287
AR 4   0.0130  0.0262  0.0214  0.0054  0.0206  0.0302
AR 5   0.0143  0.0068  0.0229  0.0230  0.0171  0.0187

My R-code:

         MA-0    MA-1    MA-2    MA-3    MA-4    MA-5
AR-0   0.5264  0.2342  0.0668  0.0033  0.0105  0.0000
AR-1   0.0080  0.0197  0.0299  0.0399  0.0183  0.0052
AR-2   0.0158  0.0005  0.0003  0.0167  0.0122  0.0198
AR-3   0.0153  0.0003  0.0002  0.0229  0.0025  0.0283
AR-4   0.0099  0.0262  0.0214  0.0054  0.0204  0.0302
AR-5   0.0116  0.0066  0.0225  0.0229  0.0174  0.0190

The results are similar. The main difference is in the Chi-Square p-values:

SAS:

SCAN Chi-Square[1] Probability Values

Lags     MA 0    MA 1    MA 2    MA 3    MA 4    MA 5
AR 0   <.0001  <.0001  0.0148  0.6003  0.3472  0.2307
AR 1   0.2073  0.0407  0.0164  0.0183  0.2326  0.3313
AR 2   0.0532  0.7927  0.8537  0.1190  0.1934  0.2555
AR 3   0.0435  0.8326  0.8736  0.1273  0.5318  0.0537
AR 4   0.0960  0.0356  0.1365  0.4074  0.1100  0.0910
AR 5   0.0812  0.3110  0.0288  0.0997  0.1517  0.1440

My R-code:

Chi-Square(1) Test p-value

         MA-0    MA-1    MA-2    MA-3    MA-4    MA-5
AR-0   0.0000  0.0004  0.0749  0.6971  0.4903  0.0000
AR-1   0.1880  0.2355  0.1496  0.1129  0.3625  0.5475
AR-2   0.0648  0.8515  0.9024  0.3151  0.3900  0.3592
AR-3   0.0696  0.8813  0.9112  0.2237  0.6875  0.1978
AR-4   0.1458  0.1738  0.2666  0.5628  0.2827  0.2174
AR-5   0.1168  0.4962  0.2103  0.2507  0.3148  0.3021

I checked the original paper by Tsay and Tiao:

Tsay, R.S. and Tiao, G.C. (1985).
Use of Canonical Analysis in Time Series Model Identification. Biometrika, 72, 299-315.

and compared the formula with the SAS/ETS manual, e.g.

http://support.sas.com/documentation/cdl/en/etsug/60372/HTML/default/etsug_arima_sect031.htm

I found that the formula for d(m,j) in the SAS manual is wrong. The correct formula for d(m,j) should be something like

d(m,j) = 1 + 2*(r_1^2 + r_2^2 + ... + r_j^2)

but in the SAS/ETS manual, it is

d(m,j) = 1 + 2*(r_1 + r_2 + ... + r_(j-1))

I plan to wrap my SCAN code and some other R code for time series into a package, but given the p-value differences from the SAS output, I am not sure whether my R code for SCAN is reliable enough for real applications.

Any suggestions?

Thank you in advance.

Steve Chen
Associate Professor, Department of Statistics
Tamkang University, Taiwan