[R] shapiro.test() output

2006-07-12 Thread Matthew.Findley
R Users:

My question is probably more about elementary statistics than the
mechanics of using R, but I've been dabbling in R (version 2.2.0) and
used it recently to test some data.

I have a relatively small set of observations (n = 12) of arsenic
concentrations in background groundwater and wanted to test my
assumption of normality.  I used the Shapiro-Wilk test (by calling
shapiro.test() in R) and I'm not sure how to interpret the output.
Here's the input/output from the R console:

> As = c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)
> shapiro.test(As)

  Shapiro-Wilk normality test

data:  As 
W = 0.9513, p-value = 0.6555

How do I interpret this?  I understand, from poking around the internet,
that the higher the W statistic the "more normal" the data.

What is the null hypothesis - that the data is normally distributed?  

What does the p-value tell me?  A 65.55% chance of what - getting a
W statistic greater than or equal to 0.9513?  (I picked this up from the
Dalgaard book, Introductory Statistics with R, but it's not really
sinking in with respect to how it applies to a Shapiro-Wilk test.)

The method description - retrieved using ?shapiro.test() - is a bit
light on details.
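
One rough way to probe what the p-value refers to is by simulation. The
sketch below is not the algorithm shapiro.test() itself uses (it relies on
an analytic approximation); it only assumes that the null hypothesis is
"the sample was drawn from a normal distribution" and that small values of
W count as evidence against it, so the p-value corresponds to the lower
tail of W. The names w_obs, w_null and p_sim are made up for illustration.

```r
set.seed(1)  # arbitrary seed, only to make the sketch repeatable

As <- c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)

# Observed W statistic for the arsenic data
w_obs <- unname(shapiro.test(As)$statistic)

# W for many samples of the same size (n = 12) that really are normal.
# W does not depend on the mean or sd, so standard normal draws suffice.
w_null <- replicate(10000, shapiro.test(rnorm(length(As)))$statistic)

# Share of truly normal samples whose W is at or below the observed W
p_sim <- mean(w_null <= w_obs)

print(w_obs)
print(p_sim)
```

If the assumptions hold, p_sim should land near the reported p-value: the
chance, when the data really are normal, of seeing a W this low or lower.
A large value means the data give no evidence against normality, which is
not the same as showing that the data are normal.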

Thanks much.

-
Matthew C. Findley, CPSSc
Environmental Scientist
 
CH2M HILL
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] zoo merge() method

2007-04-12 Thread Matthew.Findley
R users:

I'd like to get some insight on an error I encounter when attempting to
work with two moderately sized sets of time series data.  FYI - I'm
using the following versions of R and supporting packages on a Windows
2000 OS:

- R version 2.4.1 (2006-12-18) 
- zoo version 1.2-2
- chron version 2.3-10

The two time series I'm working with are from the summer of 2004 and
are:
1.) wet.bulb.air.temp: air temperatures recorded on an hourly basis, and
2.) creek.temperature: surface water body temperatures collected every
    12 minutes.

I would ultimately like to observe the difference in temperatures and
attempted to get at this by merging the two time series (by union),
interpolating the NAs, and finally, subtracting one vector from the
other.  The problem is that I cannot combine the two zoo time series
objects using the merge() or cbind() functions.  I get the following
error:

Error in z[match0(index(a), indexes), ] <- a[match0(indexes, index(a))] :
  number of items to replace is not a multiple of replacement length
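
For reference, the intended workflow (merge by union, interpolate the NAs,
subtract) can be sketched on toy data. This is a minimal sketch under
stated assumptions: the series air and creek, their values, and the
POSIXct index are hypothetical stand-ins for wet.bulb.air.temp and
creek.temperature, and it shows what the steps look like when merge()
succeeds rather than reproducing the error.

```r
library(zoo)

# Hypothetical start time; the real series appear to use chron times
t0 <- as.POSIXct("2004-07-21 00:00:00", tz = "UTC")

# Toy hourly series (3 readings) and toy 12-minute series (11 readings)
air   <- zoo(c(10, 12, 14), t0 + c(0, 60, 120) * 60)
creek <- zoo(seq(15, 20, length.out = 11), t0 + seq(0, 120, by = 12) * 60)

# merge() on zoo objects takes the union of the two indexes by default,
# leaving NA where a series has no reading at a given time
both <- merge(air, creek)

# Linear interpolation of the NAs, column by column, along the time index
filled <- na.approx(both, na.rm = FALSE)

# Difference between the two temperatures at every time stamp
diff_temp <- filled[, "creek"] - filled[, "air"]
print(diff_temp)
```

Both toy series cover the same two-hour span, so no leading or trailing
NAs remain after interpolation. With real data whose ranges differ, the
ends of the shorter series would stay NA.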

The input/output from a recent R Console session might help, so I've
included it as follows:

> summary(creek.temperature)

            Index               creek.temperature
 Min.   :(07/21/04 00:03:00)   Min.   :12.82
 1st Qu.:(08/11/04 03:00:00)   1st Qu.:16.28
 Median :(09/01/04 03:45:00)   Median :18.53
 Mean   :(09/01/04 04:45:13)   Mean   :18.87
 3rd Qu.:(09/22/04 06:37:00)   3rd Qu.:21.48
 Max.   :(10/13/04 09:22:00)   Max.   :27.72

> length(creek.temperature)

[1] 10140

> summary(wet.bulb.air.temp)

            Index               wet.bulb.air.temp
 Min.   :(07/01/04 00:00:00)   Min.   : 3.889
 1st Qu.:(07/31/04 12:00:00)   1st Qu.:12.778
 Median :(08/31/04 00:00:00)   Median :14.444
 Mean   :(08/31/04 00:00:00)   Mean   :14.469
 3rd Qu.:(09/30/04 12:00:00)   3rd Qu.:16.667
 Max.   :(10/31/04 00:00:00)   Max.   :22.222

> length(wet.bulb.air.temp)

[1] 2929

> class(creek.temperature)

[1] "zoo"

> class(wet.bulb.air.temp)

[1] "zoo"

> merge(wet.bulb.air.temp, creek.temperature)

Error in z[match0(index(a), indexes), ] <- a[match0(indexes, index(a))] :
  number of items to replace is not a multiple of replacement length

> cbind(wet.bulb.air.temp, creek.temperature)

Error in z[match0(index(a), indexes), ] <- a[match0(indexes, index(a))] :
  number of items to replace is not a multiple of replacement length

The really puzzling part about this error is that it does not occur when
I pare down the data sets and only look at a 24 hour window of data (for
brevity, the input/output from that exercise has not been included in
this e-mail).
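
One possibility that might be worth ruling out, and it is only an
assumption since nothing in the session above confirms it, is that one of
the full series contains repeated time stamps that happen not to fall in
the 24 hour window. A minimal sketch of how to check for that and collapse
the repeats, using a hypothetical toy series z with a plain numeric index:

```r
library(zoo)

# Toy series with a deliberately repeated index value (2 appears twice).
# zoo() warns about non-unique indexes, so the warning is silenced here.
z <- suppressWarnings(zoo(c(10, 20, 30, 40), c(1, 2, 2, 3)))

# anyDuplicated() returns 0 when every index value is unique, otherwise
# the position of the first repeat
print(anyDuplicated(index(z)))

# Collapse repeated time stamps by averaging the readings that share one
z_unique <- aggregate(z, index(z), mean)
print(z_unique)
```

Averaging is just one choice for combining readings that share a time
stamp; keeping the first or last reading would be equally defensible,
depending on how the repeats arose.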

My question to the R user community is as follows:  What is this error?
How do I get past it and get these two data sets to play nice with each
other?

Thanks,

Matt Findley

-
Matthew C. Findley, CPSSc
Environmental Scientist
 
CH2M HILL
2300 NW Walnut Blvd
Corvallis, OR 97330-3538
 
Tel: 541.768.3504
Fax: 541.752.0276
 
[EMAIL PROTECTED]
