Re: [R] Statistical Analysis of an Exchange Rate

2020-03-05 Thread Mark Leeds
or possibly even more appropriate is quant.stackexchange.com.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Statistical Analysis of an Exchange Rate

2020-03-05 Thread Eric Berger
Alternatively you might try posting to
r-sig-fina...@r-project.org





Re: [R] Statistical Analysis of an Exchange Rate

2020-03-04 Thread Bert Gunter
Your question is way off topic here -- this list is for R programming
questions, not statistical consulting. You might wish to try
stats.stackexchange.com for the latter.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Mar 4, 2020 at 10:58 AM spencer davis  wrote:

> I have been researching statistical analysis for a considerable amount of
> time and still haven't found what I've been looking for. I am hoping that
> help with this question will set me on the right path to answering the
> rest. Can someone tell me how, and with which package, I could do what I'm
> about to describe? I want to take historical price data from 1-minute
> currency pair charts and find the probabilities of price moves after a
> pullback that immediately follows an impulsive move. I would have to
> define an impulsive move as price moving a certain percentage in a certain
> amount of time, and set a threshold for what counts as a pullback. Then I'd
> like to know the probability of moves of different percentages at
> different pullback depths, and how those probabilities vary with the length
> of the impulsive move. Can anyone set me on the right path here? I'm
> swimming in information and am just so lost. Any help will be much
> appreciated.
> Thanks!
>
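
The event definitions in the question above can be sketched in base R as follows. This is only an illustration under stated assumptions: the prices are simulated, and every threshold and window length (impulse_len, impulse_min, pullback_len, pullback_min, horizon) is a hypothetical value chosen for the example, not something given in the original post.

```r
# Sketch only: thresholds, window lengths, and the simulated prices are
# illustrative assumptions, not values from the original post.
set.seed(1)
n <- 5000
# Simulated 1-minute closing prices (geometric random walk) standing in for real data
price <- 1.10 * exp(cumsum(rnorm(n, mean = 0, sd = 2e-4)))

impulse_len  <- 5      # minutes over which an "impulsive move" is measured
impulse_min  <- 8e-4   # minimum fractional rise to count as an impulse
pullback_len <- 3      # minutes allowed for the pullback
pullback_min <- 0.25   # pullback must retrace at least this share of the impulse
horizon      <- 10     # minutes over which the follow-up move is measured

# Candidate event times: the impulse ends at t, the pullback ends at t + pullback_len
t_end <- (impulse_len + 1):(n - pullback_len - horizon)

impulse  <- price[t_end] / price[t_end - impulse_len] - 1
pullback <- price[t_end + pullback_len] / price[t_end] - 1
follow   <- price[t_end + pullback_len + horizon] / price[t_end + pullback_len] - 1

# Upward impulse followed by a partial (not complete) retracement
is_event <- impulse >= impulse_min &
  pullback <= -pullback_min * impulse &
  pullback > -impulse

events <- data.frame(impulse = impulse, pullback = pullback, follow = follow)[is_event, ]

# Empirical probability that price rises over the horizon after such a setup
p_up <- mean(events$follow > 0)

# The same probability broken down by pullback depth (share of the impulse retraced)
depth <- cut(-events$pullback / events$impulse, breaks = c(0.25, 0.5, 0.75, 1))
tapply(events$follow > 0, depth, mean)
```

Because the windows overlap, neighbouring events are not independent, so these frequencies are descriptive rather than a basis for formal inference.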


Re: [R] Statistical analysis of olive dataset

2016-03-13 Thread Michael Friendly

On 3/12/2016 12:39 PM, Axel wrote:

The main goal of my analysis is to
determine which are the fatty acids that characterize the origin of an oil. As
a secondary goal, I would like to insert the results of the chemical analysis
of an oil that I analyzed (I am a Chemistry student) in order to determine its
region of production. I do not know if this last thing is possible.


There are already plenty of tools for this; don't bother trying to 
re-invent an already well-working wheel.


* PCA + a biplot will give you a good overview.  With groups, I 
recommend ggbiplot, with data ellipses for the groups.

This shows clear separation along PC1

data(olive, package="tourr")
library(ggbiplot)
olivenum <- olive[,c(3:10)]

olive.pca <- prcomp(olivenum, scale.=TRUE)
summary(olive.pca)

# region should be a factor (area has 9 levels, maybe too confusing)
olive$region <- factor(olive$region, labels=c("North", "Sardinia", "South"))

ggbiplot(olive.pca, obs.scale = 1, var.scale = 1,
 groups = olive$region, ellipse = TRUE, varname.size=4,
 circle = TRUE) +
 theme_bw() +
 theme(legend.direction = 'horizontal',
   legend.position = 'top')


* Discrimination among regions by chemical composition:
A canonical discriminant analysis will show you this in
a low-rank view.  The biggest difference is between the North
vs. the other 2.


# MLM
olive.mlm <- lm(as.matrix(olive[,c(3:10)]) ~ olive$region, data=olive)

# Canonical discriminant analysis

# (need devel. version for ellipses)
# install.packages("candisc", repos="http://R-Forge.R-project.org")
library(candisc)
olive.can <- candisc(olive.mlm)
olive.can
plot(olive.can, ellipse=TRUE)

* You can probably use the predict() method for MASS::lda() to predict
the class for new samples.
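
A minimal sketch of that last suggestion, assuming the built-in iris data as a stand-in for olive (so it runs without the tourr package) and a hypothetical random split into training rows and "new" samples:

```r
library(MASS)  # ships with standard R distributions

set.seed(42)
# iris stands in for the olive data: four numeric measurements and a known class
train_idx <- sample(nrow(iris), 100)
train <- iris[train_idx, ]
new_samples <- iris[-train_idx, ]

fit <- lda(Species ~ ., data = train)

# predict() returns the predicted class and posterior probabilities per class
pred <- predict(fit, newdata = new_samples[, 1:4])
head(pred$class)
head(round(pred$posterior, 3))

# Compare predictions with the known labels of the held-out rows
table(predicted = pred$class, actual = new_samples$Species)
```

For the olive data the same pattern would presumably use region as the response and the eight fatty-acid columns as predictors; a new oil sample would need the same column names as the training data.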

hope this helps,
-Michael



Re: [R] Statistical analysis of olive dataset

2016-03-13 Thread Michael Dewey

Dear Axel

Since you are using princomp (among other things) you might find the 
biplot function useful on the output of princomp.



I have not studied your code in detail but you do seem to be doing 
several things in multiple ways using functions from different sources. 
I wonder whether it might be better to stick to fewer functions.
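
As a rough illustration of the biplot suggestion, here is a sketch that stays with princomp throughout. It assumes the built-in USArrests data as a stand-in, since the olive data needs the tourr package:

```r
# USArrests is a built-in data set used here only as a stand-in
pc <- princomp(USArrests, cor = TRUE)

# Loadings show how strongly each original variable contributes to each component
print(loadings(pc)[, 1:2], digits = 2)

# Proportion of variance explained by each component
prop_var <- pc$sdev^2 / sum(pc$sdev^2)
round(prop_var, 3)

# Observations (scores) and variables (arrows) on the first two components
biplot(pc, cex = 0.6)
```

The loadings are what tell you which original variables a component represents; the biplot shows the same information graphically.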


On 12/03/2016 17:39, Axel wrote:

Hi to all the members of the list!

I am a novice with regard to statistical analysis and the use of the R
software, so I am experimenting with the dataset "olive" included in the
package "tourr". This dataset contains the results of the determination of
the fatty acids in 572 samples of olive oil from Italy (columns from 3 to 10)
along with the area and the region of origin of the oil (respectively,
column 1 and column 2).

The main goal of my analysis is to determine which are the fatty acids that
characterize the origin of an oil. As a secondary goal, I would like to
insert the results of the chemical analysis of an oil that I analyzed (I am
a Chemistry student) in order to determine its region of production. I do
not know if this last thing is possible.

I am using R 3.2.4 on MacOS X El Capitan with the packages "tourr" and
"psych" loaded.
Here are the commands I have used up to now:

olivenum <- olive[, 3:10]
mean <- colMeans(olivenum)
sd <- sapply(olivenum, sd)
describeBy(olivenum, olive[2])
pairs(olivenum)
R <- cor(olivenum)
eigen(R)
# Since the first three eigenvalues are greater than 1, these are the main
# components (column 1, 2 and 3). But I can also determine them using a
# scree diagram as follows. Right?

autoval <- eigen(R)$values
autovec <- eigen(R)$vectors
pvarsp <- autoval / ncol(olivenum)
plot(autoval, type = "b", main = "Scree diagram",
     xlab = "Number of components", ylab = "Eigenvalues")
abline(h = 1, lwd = 3, col = "red")

eigen(R)$vectors[, 1:3]
olive.scale <- scale(olivenum, TRUE, TRUE)
points <- olive.scale %*% autovec[, 1:3]

# Since I selected three main components (three columns), how should I plot
# the dispersion graph (scatter plot)? I do not think that what I have done
# is right:
plot(points, main = "Dispersion graph",
     xlab = "Component 1", ylab = "Component 2")
princomp(olivenum, cor = TRUE)
# With the following command I obtain a summary of the importance of
# components. For example, the variance of component 1 is about 0.465, of
# component 2 is 0.220 and of component 3 is 0.127, with a cumulative
# variance of 0.812. This means that the values in the first three columns
# of the matrix "olivenum" mostly characterize the differences between the
# observations. Right?
summary(princomp(olivenum, cor = TRUE))
screeplot(princomp(olivenum, cor = TRUE))

plot(princomp(olivenum, cor = TRUE)$scores, rownames(olivenum))
abline(h = 0, v = 0)

I determined that three components can explain a great part of the
variability but I don't know which these components are. How should I
continue?

Thank you for your attention,
Axel




--
Michael
http://www.dewey.myzen.co.uk/home.html



Re: [R] Statistical analysis of olive dataset

2016-03-12 Thread Jim Lemon
Hi Axel,
It seems to me that cluster analysis could be what you are seeking.
Identify the clusters of different combinations of fatty acids in the
oils. Do they correspond to location? If so, is there a method to
predict the cluster membership of a new set of measurements? Have a
look at the cluster package, which you should have.
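
A minimal sketch of that idea with the cluster package, assuming the built-in iris data as a stand-in for the olive measurements. The choice of pam() with k = 3 and the nearest-medoid rule for a new observation are assumptions made for this illustration, not the only way to do it:

```r
library(cluster)  # a recommended package, normally installed with R

# iris stands in for the olive data: numeric measurements plus a known grouping
x <- scale(iris[, 1:4])

# Partition into 3 clusters around medoids; k = 3 is an assumption for this sketch
fit <- pam(x, k = 3)

# Cross-tabulate clusters against the known groups to see whether they correspond
table(cluster = fit$clustering, group = iris$Species)

# Assign a new measurement to the nearest medoid (Euclidean distance)
new_obs <- x[1, ] + 0.1
d <- apply(fit$medoids, 1, function(m) sqrt(sum((m - new_obs)^2)))
which.min(d)
```

A real new sample would have to be scaled with the same centre and spread as the training data before computing the distances.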

Jim




Re: [R] Statistical analysis of olive dataset

2016-03-12 Thread Bert Gunter
Inline.

Cheers,
Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Mar 12, 2016 at 9:39 AM, Axel  wrote:
> Hi to all the members of the list!
>
> I am a novice with regard to statistical
> analysis and the use of the R software, so I am experimenting with the dataset
> "olive" included in the package "tourr".

Stop experimenting and spend time with an R tutorial or two? There are
many good ones on the Web. See also
https://www.rstudio.com/online-learning/#R  for some recommendations.






Re: [R] Statistical Analysis with R Beginner's Guide Book

2010-12-07 Thread John M. Quick
Hi Mike,

The book makes use of .csv files, which are provided, along with all R code and 
.RData files.

You have an interesting thought about people pulling data from diverse sources 
and making everyday use of R. For this, I would suggest using Excel or Google 
Docs Spreadsheets to compile and organize the data. Afterwards, the dataset 
could be exported as a .csv file for use in R.
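
A small sketch of that workflow in R. The table and its column names are made up for the example, and a temporary file stands in for the spreadsheet export:

```r
# A small made-up table standing in for data compiled in a spreadsheet
sales <- data.frame(month = c("Jan", "Feb", "Mar"), units = c(12, 15, 9),
                    stringsAsFactors = FALSE)

# Write it as a .csv file, as a spreadsheet export would
csv_path <- tempfile(fileext = ".csv")
write.csv(sales, csv_path, row.names = FALSE)

# Read the .csv file into R and inspect it
imported <- read.csv(csv_path, stringsAsFactors = FALSE)
str(imported)
summary(imported$units)
```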

John 

 
 Date: Tue, 7 Dec 2010 10:53:58 -0800
 From: j...@johnmquick.com
 To: r-help@r-project.org
 Subject: [R] Statistical Analysis with R Beginner's Guide Book
 
 
 Hi Everyone,
 
 I'm writing to announce my new R beginner's guide book and answer questions
 related to it.
 
 The primary focus of Statistical Analysis with R is helping new users become
 accustomed to R and empowering them to apply R to suit their own needs. It
 is a beginner's guide written for a broad audience and should be well
 received by businesspeople, IT professionals, researchers, and students
 alike. Statistical Analysis with R takes readers on a journey from their
 first installation and launch of R, to analyzing and assessing data, to
 communicating and visualizing results. You can learn more about the book
 on my R Tutorial Series blog:
 http://rtutorialseries.blogspot.com/2010/11/r-beginners-guide-book-update.html
 The book itself can be found on the Packt Publishing website:
 http://link.packtpub.com/or7f1u
 
 I guess I would just mention, not having looked at your links, that it
 may not be out of place to include information on scraping data from
 various sources that may be of interest to more casual amateur users.
 A number of people ask about data input from places like yahoo and
 the Forbes article someone posted suggests people do use R for home
 and personal usage. Often the data most interesting to this audience
 may not be known to them or getting it into R could be a challenge. 
 
 Personally I'd like to create a bigger audience to encourage various
 agencies, including the IRS for example, to make more open and free to
 use APIs. 
 
 
 If you have questions about the book, such as its content coverage,
 approach, audience, etc., please respond and I will do my best to clarify.
 
 Sincerely,
 John M. Quick
 
 
 -
 John M. Quick
 
 * http://rTutorialSeries.blogspot.com R Tutorial Series Blog
 * http://link.packtpub.com/or7f1u R Beginner's Guide
 * http://www.johnmquick.com www.johnmquick.com
 


Re: [R] Statistical analysis

2009-09-24 Thread Paul Hiemstra

Chris Li wrote:

Hi all,

I have got two datasets, one of them is rainfall data and the other one is
groundwater level data.

I would like to see whether there is a correlation between these two
datasets and if there is, to what extent they are correlated.

My stats background is limited, therefore any advice on which command I
should use in R would be greatly appreciated.

Thanks in advance.
Chris
  

Hi,

My advice would be to get an introductory statistics book and start with 
that. There is an introductory statistics book by Dalgaard that uses R, 
which kills two birds with one stone.


http://www.amazon.com/Introductory-Statistics-R-Peter-Dalgaard/dp/0387954759

cheers,
Paul

--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone:  +3130 274 3113 Mon-Tue
Phone:  +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul



Re: [R] Statistical analysis

2009-09-24 Thread Arien Lam

Hi Chris,

If I understand your question correctly, what you want is both easy and hard.
Easy:
# making a reproducible example, as asked in the posting guide
# two vectors
water <- rnorm(1000)
rain <- rgamma(1000,.5)
# the following does everything you mention and more
summary(lm(water~rain))
cor(water,rain)

Hard:
lm() and cor() assume independence of observations, linearity of the
relation, and normality of the residuals. Are these assumptions valid for
your problem?
Are your datasets time series? There will be autocorrelation (see
??autocorrelation) in both datasets. There may be a lag (see ?lag). Decide
whether to estimate and correct for those.

Are there multiple sample locations? There may be dependence.
Would you rather assume rain and change in groundwater level are related?
Etc.
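
One way to look at the autocorrelation and lag questions is sketched below with acf() and ccf() from the stats package. The data are simulated, and the two-day delay and the coefficients are assumptions built into the simulation purely for illustration:

```r
set.seed(7)
n <- 365
# Simulated daily rainfall (skewed, non-negative)
rain <- rgamma(n, shape = 0.5)
# Simulated groundwater level that responds to rainfall two days earlier
water <- 0.8 * c(0, 0, rain[1:(n - 2)]) + rnorm(n, sd = 0.3)

# Autocorrelation within each series
acf(rain, main = "Rainfall")
acf(water, main = "Groundwater level")

# Cross-correlation between the series at a range of lags
cc <- ccf(rain, water, lag.max = 10, plot = FALSE)
# ccf(x, y) estimates cor(x[t + k], y[t]); a peak at negative k means x leads y
best_lag <- cc$lag[which.max(abs(cc$acf))]
best_lag
```

With real data the peak is rarely this clean, and strong autocorrelation in either series can produce misleading cross-correlations.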

Cheers,

Arien




--
drs. H.A. (Arien) Lam (Ph.D. student)
Department of Physical Geography
Faculty of Geosciences
Utrecht University, The Netherlands



Re: [R] Statistical analysis

2009-09-24 Thread Arun.stat

Rainfall data is widely regarded as a random walk process and hence
non-stationary. Therefore, if a correlation or regression coefficient is
measured on the raw data, you may land in the world of spurious measures. I
suggest you first check whether there is a unit root in your data. If there
is, estimate the correlation or any other statistical measure on the
differenced data.
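
A sketch of that check, assuming the Phillips-Perron test (PP.test in the stats package) as the unit root test; other tests live in add-on packages. The two series are simulated independent random walks, chosen only to illustrate the effect of differencing:

```r
set.seed(3)
n <- 500
# Two independent random walks: unrelated by construction, both non-stationary
x <- cumsum(rnorm(n))
y <- cumsum(rnorm(n))

# Phillips-Perron unit root test from the stats package;
# a large p-value means a unit root cannot be rejected
PP.test(x)$p.value
PP.test(y)$p.value

# Correlation on the raw levels can be large even though the series are unrelated
cor_levels <- cor(x, y)

# Correlation on first differences reflects the (absent) relation between increments
cor_diff <- cor(diff(x), diff(y))
c(levels = cor_levels, differenced = cor_diff)
```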

Best,





Re: [R] Statistical analysis

2009-09-24 Thread Greg Snow
Since today's ground water may be influenced by yesterday's rainfall, you may 
want to look at the dynlm package and possibly lag.plot and the zoo package.
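
The lagged relationship can be sketched in base R without dynlm or zoo by shifting the rainfall series by hand. The data below are simulated, and the one-day lag and the coefficients are assumptions of the simulation:

```r
set.seed(11)
n <- 200
rain <- rgamma(n, shape = 0.5)
# Simulated groundwater level driven by the previous day's rainfall
water <- 2 + 0.6 * c(0, rain[-n]) + rnorm(n, sd = 0.2)

# Scatterplots of the groundwater series against its own lagged values
lag.plot(water, lags = 4)

# Align today's groundwater level with yesterday's rainfall
d <- data.frame(water = water[-1], rain_lag1 = rain[-n])
fit <- lm(water ~ rain_lag1, data = d)
coef(fit)
```

dynlm offers a formula interface for the same kind of model, which becomes convenient once several lags are involved.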

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111




Re: [R] Statistical analysis

2009-09-23 Thread cls59



 


Supposing you have two variables-- precipitation, p, and groundwater
potential, h-- a simple test for linear correlation is to produce a
scatterplot of h vs. p:

plot( h ~ p )

If it looks linear, then it may be worthwhile to have R estimate the
coefficient of correlation for the data:

cor( p, h )

If the correlation coefficient is close to +/- 1, then your data is
exhibiting a strong linear trend and a linear model may be appropriate:

linModel <- lm( h ~ p )

abline( linModel )


Good luck!

-Charlie


-
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University