>> DISTFUN now called METRIC, for clarity and compatibility. > >This "compatibility" is just in terms of the wording in the help text, >right? If so, then I don't think we should attempt compatibility at that >level, as that encourages copying help text from Matlab.
I perfectly agree, compatibility in the text is to be discouraged. I should have written: DISTFUN now called METRIC, for clarity and in spite of compatibility. >That being said, I have no objections to the change in wording. Yes, "metric" is better than "distance" here. >That is, add a @ at the end of the first line. This tells >texinfo that the second line is a continuation of the first. Thank you. I did not know that. The Matlab's result are the same as mine, apart from the cluster order which is arbitrary, as far as I can tell. So in fact my new version impreoves on the older. Here it is. I will upload as soon as I am ready. ===File ~/math/octavelib/statistics/linkage.m=============== ## Copyright (C) 2006, 2008 Bill Denney <[EMAIL PROTECTED]> ## Copyright (C) 2008 Francesco Potortì <[EMAIL PROTECTED]> ## ## This software is free software; you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation; either version 3, or (at your option) ## any later version. ## ## This software is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with this software; see the file COPYING. If not, see ## <http://www.gnu.org/licenses/>. ## -*- texinfo -*- ## @deftypefn {Function File} [EMAIL PROTECTED] =} linkage (@var{x}) ## @deftypefnx {Function File} [EMAIL PROTECTED] =} linkage (@var{x}, @var{method}) ## ## Return clusters generated from a distance vector created by the pdist ## function. ## ## Methods can be: ## ## @table @samp ## @item "single" (default) ## Shortest distance between two clusters (aka a minimum spanning tree) ## ## @item "complete" ## Furthest distance between two clusters ## ## @item "average" ## Unweighted average distance (aka group average) ## ## @item "weighted" ## Weighted average distance ## ## @item "centroid" ## Centroid distance (not implemented) ## ## @item "median" ## Weighted center of mass distance ## ## @item "ward" ## Inner squared distance (minimum variance) ## ## @end table ## ## @var{x} is the dissimilarity matrix relative to @var{n} observations, ## formatted as a @math{(n-1)*n/2}x1 vector as produced by @code{pdist}. ## @code{linkage} starts by putting each observation into a singleton ## cluster and numbering those from 1 to @var{n}. Then it merges two ## clusters to create a new cluster numbered @var{n+1}, and so on ## until all observations are grouped into a single cluster numbered ## @var{2*n-1}. Row @var{m} of the @math{m-1}x3 output matrix relates ## to cluster @math{n+m}: the first two columns are the numbers of the ## two component clusters and column 3 contains their distance. ## ## @seealso{cluster,pdist} ## @end deftypefn ## Author: Bill Denney <[EMAIL PROTECTED]> function y = linkage (x, method) ## check the input if (nargin < 1) || (nargin > 2) print_usage (); elseif (nargin < 2) method = "single"; endif if (isempty (x)) error ("linkage: x cannot be empty"); elseif (~ isvector (x)) error ("linkage: x must be a vector"); endif ## Function findfxn returns a scalar from a vector and a row from a ## matrix. method = lower (method); switch (method) case "single" ## this is just a minimal spanning tree findfxn = @min; case "complete" findfxn = @max; case { "average", "median", "weighted", "centroid", "ward" } error ("linkage: %s is not yet implemented", method); otherwise error ("linkage: %s: unknown method", method); endswitch dissim = squareform (x, "tomatrix"); startsize = size (dissim, 1); y = zeros (startsize - 1, 3); cnameidx = 1:startsize; for yidx = 1:startsize-1 ## Find the two nearest clusters. available = logical(tril (ones(size(dissim))) - eye(size(dissim))); [r, c] = find (min (dissim(available)) == dissim, 1); ## Here is the new cluster. y(yidx, :) = [cnameidx(r) cnameidx(c) dissim(r, c)]; ## Add it as a new cluster index and remove the old ones. cnameidx(r) = yidx + startsize; cnameidx(c) = []; ## Update the dissimilarity matrix; the diagonal element may be made ## inconsistent by this, but they are never used. newdissim = findfxn (dissim([r c], :)); dissim(r,:) = newdissim; dissim(:,r) = newdissim'; dissim(c,:) = []; dissim(:,c) = []; endfor endfunction %!shared xy, t %! xy = [3 1.7; 1 1; 2 3; 2 2.5; 1.2 1; 1.1 1.5; 3 1]; %! t = 1e-6; %!assert (cond (linkage (pdist (xy))), 66.534612, t); %!assert (cond (linkage (pdist (xy), "single")), 66.534612, t); %!assert (cond (linkage (pdist (xy), "complete")), 27.071750, t); ============================================================ -- Francesco Potortì (ricercatore) Voice: +39 050 315 3058 (op.2111) ISTI - Area della ricerca CNR Fax: +39 050 315 2040 via G. Moruzzi 1, I-56124 Pisa Email: [EMAIL PROTECTED] (entrance 20, 1st floor, room C71) Web: http://fly.isti.cnr.it/ ------------------------------------------------------------------------- This SF.Net email is sponsored by the Moblin Your Move Developer's challenge Build the coolest Linux based applications with Moblin SDK & win great prizes Grand prize is a trip for two to an Open Source event anywhere in the world http://moblin-contest.org/redirect.php?banner_id=100&url=/ _______________________________________________ Octave-dev mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/octave-dev
