Re: [R] hierarchical clustering with pearson's coefficient
Anyone for that question?

2013/3/28 Pierre Antoine DuBoDeNa pad...@gmail.com

Hello, I want to use Pearson's correlation as distance between observations and then use any centroid-based linkage distance (e.g. Ward's distance). [...]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] hierarchical clustering with pearson's coefficient
Hello,

I want to use Pearson's correlation as the distance between observations and then apply a centroid-based linkage (e.g. Ward's method).

When linkage distances are computed with the Lance-Williams recursive formulation, they require only the initial distances between observations. See here: http://en.wikipedia.org/wiki/Ward%27s_method

That page says the initial distance between observations has to be Euclidean. However, I have found this: http://research.stowers-institute.org/efg/R/Visualization/cor-cluster/ where Pearson's correlation is used for hierarchical clustering.

Is anything violated if Pearson's correlation is used with Ward's linkage function? A dissimilarity based on Pearson's correlation can be defined as d = sqrt(1 - r^2), where r is the Pearson correlation. Can that be considered a norm-1 distance, and thus norm-2 if we square it, so that Wikipedia's statement

"To apply a recursive algorithm under this objective function, the initial distance between individual objects must be (proportional to) squared Euclidean distance."

is satisfied?

Best,
Pierre
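One related identity may help frame the question. The sketch below checks numerically that, for two observations that are each centered and scaled to unit length, the squared Euclidean distance equals 2(1 - r). Note that this concerns the dissimilarity 1 - r, not the sqrt(1 - r^2) proposed above, and it says nothing by itself about which dissimilarity is appropriate for Ward's method. The function names and the example vectors are illustrative assumptions, and the code is plain Python rather than R.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def unit_standardize(x):
    """Center x and scale it to unit Euclidean length."""
    m = sum(x) / len(x)
    norm = math.sqrt(sum((a - m) ** 2 for a in x))
    return [(a - m) / norm for a in x]

def squared_euclidean(x, y):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

# Hypothetical example observations (not from the original post).
x = [1.0, 3.0, 2.0, 5.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]

r = pearson(x, y)
d2 = squared_euclidean(unit_standardize(x), unit_standardize(y))

# After standardizing, squared Euclidean distance and 2 * (1 - r) agree.
print(r, d2, 2 * (1 - r))
```

Under this assumption (each observation standardized first), 1 - r is proportional to a squared Euclidean distance, which is the form the quoted Wikipedia statement asks for.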
[R] pearson's correlation and cross-correlation issue
Hello,

I want to compute Pearson's correlation, but in a way that also works for signals that are shifted relative to each other. For example, for two signals like

1 1 2 1 1
1 2 1 1 1

the correlation is very low, but if we shift one of them by one position we get a much better correlation.

I know that cross-correlation is used to find the best offset (the one where the correlation is largest). Is there a metric that does the whole job at once, i.e. finds the best alignment (shift) and then computes Pearson's correlation? Is that the same as lagged correlation, or phase correlation? If it has a particular name, please let me know, and also whether there is any code for it.

Best,
Pierre
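What the question describes can be sketched as a search over integer shifts, computing Pearson's correlation on the overlapping part at each shift and keeping the largest. This is only an illustration under stated assumptions: the function names, the `max_lag` parameter, and the minimum overlap of three points are choices made here, not an established routine, and the code is plain Python rather than R.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns None if either sequence is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

def best_shifted_correlation(x, y, max_lag, min_overlap=3):
    """Return (lag, r) maximizing Pearson's r over shifts in [-max_lag, max_lag].

    A positive lag compares x[lag:] with the start of y; a negative lag
    compares the start of x with y[-lag:]. Both inputs must have equal length.
    """
    best = None
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            xs, ys = x[lag:], y[:len(y) - lag]
        else:
            xs, ys = x[:len(x) + lag], y[-lag:]
        if len(xs) < min_overlap:
            continue  # too few overlapping points to be meaningful
        r = pearson(xs, ys)
        if r is not None and (best is None or r > best[1]):
            best = (lag, r)
    return best

# The two signals from the post.
x = [1, 1, 2, 1, 1]
y = [1, 2, 1, 1, 1]

r0 = pearson(x, y)                         # unshifted correlation
best = best_shifted_correlation(x, y, 2)   # best shift within +/- 2
print(r0, best)
```

Because the overlap shrinks as the shift grows, the maximum over shifts tends to be optimistic for short signals, so limiting `max_lag` is a reasonable precaution.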
[R] evaluation clusters
Hello,

I was wondering whether there is a package offering several metrics for estimating the number of clusters (k = 2...10), or for evaluating a clustering.

Also, if I use some distance other than Euclidean in k-means, should the evaluation metrics that take the n x n distance matrix into account be computed with the same distance I used for clustering? Or should I always use Euclidean?

Thanks
[R] time-series classification with k-nn
Hello,

I am trying to apply k-NN classification to my time series. However, Euclidean distance is not the best choice: the features I use are not all normalized (some have values from 0 to 1, others are negative, etc.), and it does not evaluate the features or give them different weights.

Is there a package or some code that could help with learning a distance metric for k-NN? Also, is there a way to compare the result with the classic distance measures, such as DTW? I am interested in learning the similarity thresholds.

Best,
PA
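The scaling problem described above can be illustrated with a small sketch: each feature is z-scored using the training data, and k-NN then uses a weighted Euclidean distance. This does not learn the weights, which is the harder part of the question; the `weights` argument, the function names, and the toy data are all hypothetical, and the code is plain Python rather than R.

```python
import math
from collections import Counter

def column_stats(rows):
    """Per-feature (mean, sd) from the training rows; sd of 0 is replaced by 1."""
    stats = []
    for col in zip(*rows):
        m = sum(col) / len(col)
        sd = math.sqrt(sum((v - m) ** 2 for v in col) / len(col))
        stats.append((m, sd if sd > 0 else 1.0))
    return stats

def zscore(row, stats):
    """Scale one row with the training means and standard deviations."""
    return [(v - m) / sd for v, (m, sd) in zip(row, stats)]

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance; weights are assumed to be given, not learned."""
    return math.sqrt(sum(w * (p - q) ** 2 for p, q, w in zip(a, b, weights)))

def knn_predict(train, labels, query, k, weights):
    """Majority vote among the k training rows closest to the query."""
    ranked = sorted(zip(train, labels),
                    key=lambda pair: weighted_distance(pair[0], query, weights))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy data: feature 1 lies in 0..1, feature 2 is negative and on a larger scale.
train_raw = [[0.1, -50.0], [0.2, -40.0], [0.9, -45.0], [0.8, -55.0]]
labels = ["a", "a", "b", "b"]
query_raw = [0.85, -41.0]

stats = column_stats(train_raw)
train = [zscore(row, stats) for row in train_raw]
query = zscore(query_raw, stats)

raw_label = knn_predict(train_raw, labels, query_raw, 1, [1.0, 1.0])
scaled_label = knn_predict(train, labels, query, 3, [1.0, 1.0])
print(raw_label, scaled_label)
```

On the raw values the large-scale feature dominates the distance; after z-scoring both features contribute comparably, and the prediction for this toy query changes.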
[R] time-series statistics collection
Hello,

I am trying to collect global measures or statistics for time series, as well as R packages that can compute them. I have found several in papers and books, but the literature is so big that I am sure I am missing some.

skewness
kurtosis
min
max
mean
SD
trend
seasonality
periodicity
chaos (Lyapunov exponent) / largest Lyapunov exponent (I think this is the same statistic)
serial correlation / autocorrelation (the same, if I am correct; Box-Pierce autocorrelation sum)
higher-order autocorrelation
nonlinearity (Terasvirta test)
self-similarity (Hurst exponent)
mutual information sum

Are there any other statistics that I am missing? Maybe other useful tests? Or books/papers where I could find more? Also, are there any packages that can compute some or all of them?

Best,
PA
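A few of the simpler entries in the list above can be written down directly. The sketch below computes moment-based skewness and kurtosis, the lag-k sample autocorrelation, and the Box-Pierce sum Q = n * (r_1^2 + ... + r_h^2). The function names are illustrative, population (divide-by-n) moments are an assumption made here, and the code is plain Python rather than R.

```python
import math

def central_moment(x, order):
    """Population central moment of the given order."""
    m = sum(x) / len(x)
    return sum((v - m) ** order for v in x) / len(x)

def skewness(x):
    """Moment-based skewness: m3 / m2^(3/2)."""
    return central_moment(x, 3) / central_moment(x, 2) ** 1.5

def kurtosis(x):
    """Moment-based kurtosis: m4 / m2^2 (3 for a normal distribution)."""
    return central_moment(x, 4) / central_moment(x, 2) ** 2

def autocorrelation(x, lag):
    """Sample autocorrelation at the given lag, using the overall mean."""
    m = sum(x) / len(x)
    num = sum((x[t] - m) * (x[t - lag] - m) for t in range(lag, len(x)))
    den = sum((v - m) ** 2 for v in x)
    return num / den

def box_pierce(x, max_lag):
    """Box-Pierce statistic: n times the sum of squared autocorrelations."""
    return len(x) * sum(autocorrelation(x, k) ** 2
                        for k in range(1, max_lag + 1))

# Small hypothetical series for illustration.
series = [1.0, 2.0, 3.0, 4.0, 5.0]

summary = {
    "min": min(series),
    "max": max(series),
    "mean": sum(series) / len(series),
    "sd": math.sqrt(central_moment(series, 2)),
    "skewness": skewness(series),
    "kurtosis": kurtosis(series),
    "acf1": autocorrelation(series, 1),
    "box_pierce_1": box_pierce(series, 1),
}
print(summary)
```

The remaining entries (Lyapunov exponent, Hurst exponent, Terasvirta test, mutual information) need considerably more machinery and are not attempted here.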
Re: [R] time-series statistics collection
Great, thanks. If anyone else has ideas for statistics that can represent different aspects of time series, please let me know.

--

2012/6/1 Nicolas Iderhoff nicolasiderh...@googlemail.com

One of the most important concepts is most certainly stationarity (see "unit root test"). The most common R package will be tseries.

For some general introductions to using time series in R, see:

Brockwell/Davis (2006): Time Series: Theory and Methods.
Brockwell/Davis (2002): Introduction to Time Series and Forecasting.
Cowpertwait/Metcalfe (2009): Introductory Time Series with R.
Cryer/Chan (2008): Time Series Analysis: With Applications in R.

On 01.06.2012 at 14:49, Pierre Antoine DuBoDeNa wrote:

Hello, I am trying to collect several global measures or statistics for time-series as well as packages of R that can compute them. [...]