Re: [R] hierarchical clustering with pearson's coefficient

2013-03-29 Thread Pierre Antoine DuBoDeNa
Does anyone have an answer to this question?





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] hierarchical clustering with pearson's coefficient

2013-03-28 Thread Pierre Antoine DuBoDeNa
Hello,

I want to use Pearson's correlation as the distance between observations
and then apply a centroid-based linkage (e.g. Ward's method).

When linkage distances are computed with the Lance-Williams recursive
formulation, they only require the initial distances between observations.
See here: http://en.wikipedia.org/wiki/Ward%27s_method

That page says you have to use Euclidean distance between the initial
observations. However, I have found this:

http://research.stowers-institute.org/efg/R/Visualization/cor-cluster/

where they use Pearson's correlation for hierarchical clustering.

Is anything violated if Pearson's correlation is used with Ward's linkage
function?

The dissimilarity based on Pearson's correlation can be defined as d =
sqrt(1 - pearsonsimilarity^2). Can that be considered a norm-1 distance,
and thus a norm-2 distance if we square it, so that Wikipedia's statement
"To apply a recursive algorithm under this objective function, the initial
distance between individual objects must be (proportional to) squared
Euclidean distance." holds?
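
To make the question concrete, here is a minimal R sketch under stated assumptions. The toy matrix `x` and all variable names are made up for illustration, and `method = "ward.D2"` assumes R >= 3.1.0 (older versions only offer `"ward"`). `hclust()` accepts any dissimilarity object, so the first part runs whether or not Ward's criterion is meaningful for it. The second part checks one related identity numerically: for rows centered and scaled to unit length, the Euclidean distance equals sqrt(2 * (1 - r)), which is a different transform from the sqrt(1 - r^2) above.

```r
set.seed(1)
# Toy data (assumption): 6 observations (rows) measured on 20 variables (columns)
x <- matrix(rnorm(6 * 20), nrow = 6)

# Pearson correlation between observations; cor() works on columns, hence t()
r <- cor(t(x))

# Dissimilarity as defined in the question
d_q <- as.dist(sqrt(1 - r^2))

# hclust() takes any "dist" object; it does not check that it is Euclidean
hc <- hclust(d_q, method = "ward.D2")

# Related identity: center each row and scale it to unit length ...
z <- t(scale(t(x))) / sqrt(ncol(x) - 1)
d_e <- dist(z)                          # plain Euclidean distance between rows of z

# ... then the Euclidean distance coincides with sqrt(2 * (1 - r))
d_r <- as.dist(sqrt(2 * (1 - r)))
all.equal(as.vector(d_e), as.vector(d_r))
```

So a correlation-based dissimilarity of the form sqrt(2 * (1 - r)) is a genuine Euclidean distance on standardized rows; the sketch does not settle whether sqrt(1 - r^2) has the same property.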

Best,
Pierre



[R] pearson's correlation and cross-correlation issue

2013-02-14 Thread Pierre Antoine DuBoDeNa
Hello,

I want to compute Pearson's correlation between signals that may be
shifted relative to each other.

For example, for two signals like:

1 1 2 1 1
and
1 2 1 1 1

the correlation is very low, but if we shift one of them by one position
we get a much better correlation.

I know that cross-correlation is used to find the best offset (the one
where the correlation is largest).

Is there a metric that does the whole job at once: find the best
alignment (shift) and compute Pearson's correlation there? Is that the
same as lagged correlation, or phase correlation? If it has a particular
name, please let me know, and also whether there is any code for it.
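
As a rough illustration of what is being asked, here is a minimal R sketch that tries every shift in a small window and keeps the one with the largest Pearson correlation on the overlapping part. The helper name `shifted_cor` and the lag window are assumptions made for this example, not an established function. Base R's `ccf()` also estimates lagged cross-correlation, but as far as I know it normalizes with the full-series mean and variance, so its values can differ from the per-overlap correlation computed here.

```r
# Hypothetical helper: Pearson correlation of the overlapping parts of x and y
# when y is shifted by `lag` positions (positive lag moves y to the right).
shifted_cor <- function(x, y, lag) {
  n <- length(x)
  stopifnot(length(y) == n, abs(lag) < n - 1)
  if (lag >= 0) {
    xs <- x[(1 + lag):n]
    ys <- y[1:(n - lag)]
  } else {
    xs <- x[1:(n + lag)]
    ys <- y[(1 - lag):n]
  }
  # A constant segment has zero variance, so the correlation is undefined
  if (sd(xs) == 0 || sd(ys) == 0) return(NA_real_)
  cor(xs, ys)
}

x <- c(1, 1, 2, 1, 1)
y <- c(1, 2, 1, 1, 1)

lags <- -2:2
r_by_lag <- sapply(lags, function(k) shifted_cor(x, y, k))

best_lag <- lags[which.max(r_by_lag)]   # which.max() skips NA entries
best_r <- max(r_by_lag, na.rm = TRUE)

# For these two signals: lag 0 gives -0.25, lag 1 gives 1
data.frame(lag = lags, r = r_by_lag)
```

With very short signals the overlap shrinks quickly, so the maximum over many lags is easy to overfit; restricting the lag window is a design choice, not a requirement.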

Best,
Pierre



[R] evaluation clusters

2013-02-04 Thread Pierre Antoine DuBoDeNa
Hello,

I was wondering if there is a package that offers several metrics for
estimating the number of clusters (k = 2...10), or for cluster evaluation.

Also, if I use some distance other than Euclidean in k-means, should the
evaluation metrics that take the (n x n) distance matrix into account use
the same distance I used for clustering, or should I always use Euclidean?
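
One way to picture the question is the following minimal R sketch. It assumes the recommended `cluster` package and swaps in `pam()` (k-medoids) for k-means, because `pam()` accepts an arbitrary dissimilarity matrix; the toy data, the Manhattan distance and the variable names are all assumptions for illustration. The average silhouette width is computed from the same dissimilarity that was used for clustering.

```r
library(cluster)  # recommended package shipped with standard R installations

set.seed(42)
# Toy data (assumption): three well-separated groups of 20 points in 2 dimensions
x <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2),
           matrix(rnorm(40, mean = 6, sd = 0.3), ncol = 2))

# A non-Euclidean dissimilarity, used for both clustering and evaluation
d <- dist(x, method = "manhattan")

ks <- 2:10
avg_sil <- sapply(ks, function(k) {
  fit <- pam(d, k = k)        # d is a "dist" object, so pam() treats it as dissimilarities
  fit$silinfo$avg.width       # average silhouette width for this k
})

best_k <- ks[which.max(avg_sil)]
data.frame(k = ks, avg_silhouette = avg_sil)
```

The silhouette is only one of many possible indices; it is used here because it needs nothing beyond the dissimilarity matrix and the cluster labels.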

Thanks



[R] time-series classification with k-nn

2012-06-17 Thread Pierre Antoine DuBoDeNa
Hello,

I am trying to apply k-NN classification to my time series. However,
Euclidean distance is not the best choice, because the features I use are
not all normalized (some have values from 0 to 1, others are negative,
etc.), and it neither evaluates the features nor gives them different
weights.

Is there any package or code that could help to learn a distance metric
for k-NN?

Also, is there any way to compare with the classic distance measures,
like DTW? I am interested in learning the thresholds of similarity.
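
The normalization part of the problem can be sketched as follows. This is a minimal example assuming the recommended `class` package and synthetic data invented for illustration; it only standardizes the features before k-NN and does not learn a metric or feature weights.

```r
library(class)  # recommended package; provides knn()

set.seed(7)
n <- 150
label <- factor(rep(c("A", "B"), length.out = n))

# Synthetic features (assumption): f1 is informative and lies roughly in 0-1,
# f2 is pure noise on a much larger scale
f1 <- rnorm(n, mean = ifelse(label == "A", 0.2, 0.8), sd = 0.05)
f2 <- rnorm(n, mean = 0, sd = 100)
X <- cbind(f1, f2)

train <- 1:100
test <- 101:150

# Standardize every feature with the training set's mean and SD only
mu <- colMeans(X[train, ])
s <- apply(X[train, ], 2, sd)
Xs <- scale(X, center = mu, scale = s)

# k-NN with Euclidean distance on raw versus standardized features
pred_raw <- knn(X[train, ], X[test, ], cl = label[train], k = 5)
pred_scaled <- knn(Xs[train, ], Xs[test, ], cl = label[train], k = 5)

acc_raw <- mean(pred_raw == label[test])
acc_scaled <- mean(pred_scaled == label[test])
c(raw = acc_raw, scaled = acc_scaled)
```

On the raw features the large-scale noise column dominates the distance, which is the effect described above; standardizing removes the scale problem but still weights all features equally.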

Best,
PA



[R] time-series statistics collection

2012-06-01 Thread Pierre Antoine DuBoDeNa

Hello,

I am trying to collect global measures or statistics for time series,
along with R packages that can compute them. I have found several in
papers and books, but the literature is so large that I am sure I am
missing some.

skewness
kurtosis
min
max
mean
SD
trend
seasonality
periodicity
chaos (Lyapunov exponent) / largest Lyapunov exponent (I think these are
the same statistic)
serial correlation / autocorrelation (if I am correct, this is the same
as the Box-Pierce autocorrelation sum)
higher-order autocorrelation
nonlinearity (Teräsvirta test)
self-similarity (Hurst exponent)
mutual information sum

Are there any other statistics that I am missing? Maybe other useful
tests?

Or books/papers where I could find more?

Also, are there any packages that can compute some or all of them?
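
A few of the measures listed above can already be computed with base R alone. The following is a minimal sketch, not a complete collection: the helper `moments` is a hypothetical name, the built-in `AirPassengers` series is used only as an example, and the lag of 10 for the Box-Pierce test is an arbitrary choice.

```r
x <- as.numeric(AirPassengers)  # monthly series from the built-in datasets package

# Hypothetical helper: moment-based skewness and (non-excess) kurtosis
moments <- function(v) {
  z <- (v - mean(v)) / sqrt(mean((v - mean(v))^2))
  c(skewness = mean(z^3), kurtosis = mean(z^4))
}

stats_summary <- c(
  min = min(x), max = max(x), mean = mean(x), sd = sd(x),
  moments(x),
  acf1 = acf(x, lag.max = 1, plot = FALSE)$acf[2]   # lag-1 autocorrelation
)

# Box-Pierce test on the first 10 autocorrelations
bp <- Box.test(x, lag = 10, type = "Box-Pierce")

# Trend and seasonal components via STL; needs a ts object with frequency > 1
dec <- stl(log(AirPassengers), s.window = "periodic")

stats_summary
bp$statistic
head(dec$time.series)   # columns: seasonal, trend, remainder
```

The remaining items (Lyapunov exponent, Hurst exponent, nonlinearity tests, mutual information) need contributed packages and are left out of this sketch.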

Best,
PA







Re: [R] time-series statistics collection

2012-06-01 Thread Pierre Antoine DuBoDeNa
Great, thanks.

If anyone else has ideas for statistics that capture different aspects of
time series, please let me know.


2012/6/1 Nicolas Iderhoff nicolasiderh...@googlemail.com

 One of the most important concepts is most certainly stationarity (see
 "unit root test").
 The most common R package will be tseries.

 See:
 Brockwell/Davis (2006): Time Series: Theory and Methods.
 Brockwell/Davis (2002): Introduction to Time Series and Forecasting.
 Cowpertwait/Metcalfe (2009): Introductory Time Series with R.
 Cryer/Chan (2008): Time Series Analysis: With Applications in R.

 for some general introductions to using time series in R.
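
The unit root idea mentioned above can be tried out with a minimal sketch. It uses `PP.test()` (Phillips-Perron) from base R's `stats` package rather than a function from tseries, purely to avoid an extra dependency; the simulated series and variable names are assumptions for illustration.

```r
set.seed(123)
white_noise <- rnorm(200)           # stationary by construction
random_walk <- cumsum(rnorm(200))   # has a unit root by construction

# Phillips-Perron test: the null hypothesis is that the series has a unit root
pp_noise <- PP.test(white_noise)
pp_walk <- PP.test(random_walk)

# A small p-value is evidence against a unit root
pp_noise$p.value
pp_walk$p.value
```

Failing to reject the null for a series does not prove non-stationarity; it only means the test found no evidence against a unit root.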







[R] time-series statistics collection

2012-05-31 Thread Pierre Antoine DuBoDeNa
Hello,

I am trying to collect global measures or statistics for time series,
along with R packages that can compute them. I have found several in
papers and books, but the literature is so large that I am sure I am
missing some.

skewness
kurtosis
min
max
mean
SD
trend
seasonality
periodicity
chaos (Lyapunov exponent) / largest Lyapunov exponent (I think these are
the same statistic)
serial correlation / autocorrelation (if I am correct, this is the same
as the Box-Pierce autocorrelation sum)
higher-order autocorrelation
nonlinearity (Teräsvirta test)
self-similarity (Hurst exponent)
mutual information sum

Are there any other statistics that I am missing? Maybe other useful
tests?

Or books/papers where I could find more?

Also, are there any packages that can compute some or all of them?

Best,
PA
