Hi Ulrich,
I'm studying the principles of Affinity Propagation and I'm really glad to
use your package (apcluster) in order to cluster my data. I have just an
issue to solve.
If I apply the function apcluster(sim),
where sim is the matrix of dissimilarities, sometimes I encounter the
warning
abanero wrote:
Do you know something like “knn1” that works with categorical variables
too?
Do you have any suggestion?
There are surely plenty of clustering algorithms around that do not require
a vector space structure on the inputs (like KNN does). I think
agglomerative clustering would
Dear abanero,
In principle, k nearest neighbours classification can be computed on
any dissimilarity matrix. Unfortunately, knn and knn1 seem to assume
Euclidean vectors as input, which restricts their use.
I'd probably compute an appropriate dissimilarity between points (have a
look at
Hi,
thank you Joris and Ulrich for your answers.
Joris Meys wrote:
see the library randomForest for example
I'm trying to find an example of randomForest with categorical variables,
but I haven't found anything. Do you know of any example with both categorical
and numerical variables? Anyway I
Hi Abanero,
first, I have to correct myself: knn1 is a supervised learning algorithm, so
my comment wasn't completely correct. In any case, if you want to do a
clustering prior to a supervised classification, the function daisy() can
handle any kind of variable. The resulting distance matrix can
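To make the idea concrete, here is a minimal base-R sketch of the Gower-style dissimilarity that daisy(df, metric = "gower") from the cluster package computes for mixed-type data. The function and the small data frame below are invented for illustration, not part of daisy() itself:

```r
## Gower-style dissimilarity between rows i and j of a mixed-type data
## frame: scaled absolute difference for numeric variables, simple
## mismatch for factors, averaged over variables.
gower_dist <- function(df, i, j) {
  per_var <- sapply(names(df), function(v) {
    x <- df[[v]]
    if (is.numeric(x)) {
      abs(x[i] - x[j]) / diff(range(x))   # numeric: scaled |difference|
    } else {
      as.numeric(x[i] != x[j])            # factor: 0 if equal, 1 if not
    }
  })
  mean(per_var)                           # average over all variables
}

df <- data.frame(age    = c(20, 30, 40),
                 smoker = factor(c("yes", "no", "no")))
gower_dist(df, 1, 2)   # (|20-30|/20 + 1) / 2 = 0.75
gower_dist(df, 2, 3)   # (|30-40|/20 + 0) / 2 = 0.25
```

In practice you would just call daisy(df, metric = "gower") and get the full dissimilarity matrix in one step.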
I'm confusing myself :-)
randomForest cannot handle character vectors as predictors. (Which is why I,
to my surprise, found out that a categorical variable could not be used in
the function). It can handle categorical variables as predictors IF they are
put in as a factor.
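The factor requirement is easy to demonstrate in base R alone; the data frame and column names below are made up, and the randomForest() call is left as a comment since it needs the randomForest package:

```r
## randomForest() rejects character predictors but accepts factors.
df <- data.frame(y     = c(1.2, 3.4, 2.1, 4.0),
                 color = c("red", "blue", "red", "green"),
                 stringsAsFactors = FALSE)
class(df$color)                  # "character" -- randomForest() would fail

df$color <- as.factor(df$color)  # convert before fitting
class(df$color)                  # "factor" -- now usable as a predictor

## randomForest::randomForest(y ~ ., data = df)   # works once 'color' is a factor
```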
Obviously they handle
I had a look at the documentation of the package apcluster.
That's interesting but do you have any example using it with both
categorical
and numerical variables? I'd like to test it with a large dataset.
Your posting has opened my eyes: problems where both numerical and
categorical
Sorry, Joris, I overlooked that you already mentioned daisy() in your
posting. I should have credited your recommendation in my previous message.
Cheers, Ulrich
Ulrich wrote:
Affinity propagation produces quite a number of clusters.
I tried with q=0 and it produced 17 clusters. Anyway, that's a good idea,
thanks. I'm going to test it on my dataset.
So I'll probably use daisy() to compute an appropriate dissimilarity, then
apcluster() or another
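That pipeline can be sketched as follows. Since apcluster() expects similarities while daisy() returns dissimilarities, the usual trick is to negate. The small hand-made matrix stands in for as.matrix(daisy(df, metric = "gower")) so the snippet runs without any packages, and the apcluster() call itself is left as a comment:

```r
## A dissimilarity matrix (as daisy() would return, here hand-made)...
d <- matrix(c(0.0, 0.2, 0.9,
              0.2, 0.0, 0.8,
              0.9, 0.8, 0.0), nrow = 3, byrow = TRUE)

## ...negated to get the similarity matrix apcluster() works on:
sim <- -d                    # larger similarity = smaller dissimilarity

## res <- apcluster::apcluster(sim, q = 0)   # q = 0 tends to give fewer clusters

sim[1, 2] > sim[1, 3]        # TRUE: observations 1 and 2 are the closest pair
```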
Christian wrote:
and then implement
nearest neighbours classification myself if I needed it.
It should be pretty straightforward to implement.
Do you intend to modify the code of the knn1() function yourself?
No; if you understand what the nearest neighbours method does, it's not
very
What do you suggest in order to assign a new observation to a determined
cluster?
As I mentioned already, I would simply assign the new observation to the
cluster whose exemplar it is most similar to (in a knn1-like fashion). To
compute these similarities, you can use the
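This knn1-like assignment is straightforward to write in base R. The exemplars, the Euclidean distance, and the test point below are all placeholders; with mixed-type data you would plug in a Gower-style dissimilarity instead:

```r
## Assign a new observation to the cluster whose exemplar is nearest
## (least dissimilar) -- a one-nearest-neighbour rule over exemplars.
assign_to_cluster <- function(new_point, exemplars, dist_fun) {
  d <- sapply(seq_len(nrow(exemplars)),
              function(k) dist_fun(new_point, exemplars[k, ]))
  which.min(d)               # index of the nearest exemplar's cluster
}

exemplars <- rbind(c(0, 0), c(5, 5))          # two made-up cluster exemplars
euclid    <- function(a, b) sqrt(sum((a - b)^2))

assign_to_cluster(c(4, 6), exemplars, euclid)  # -> 2 (closer to (5, 5))
```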
Hi,
I have 1,000 observations with 10 attributes (of different types: numeric,
dichotomous, categorical, etc.) and a measure M.
I need to cluster these observations in order to assign a new observation
(with the same 10 attributes but not the measure) to a cluster.
I want to calculate for
Not a direct answer, but from your description it looks like you are better
off with a supervised classification algorithm instead of unsupervised
clustering; see the randomForest package, for example. Alternatively, you can
try a logistic regression or a multinomial regression approach, but these
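For the two-class case, a logistic regression along these lines needs only base R's glm(); the data frame, variable names, and values below are invented for illustration:

```r
## Logistic regression on a mix of numeric and factor predictors,
## then prediction for a new observation (all data simulated).
set.seed(1)
df <- data.frame(y  = factor(sample(c("A", "B"), 50, replace = TRUE)),
                 x1 = rnorm(50),
                 x2 = factor(sample(c("low", "high"), 50, replace = TRUE)))

fit  <- glm(y ~ x1 + x2, data = df, family = binomial)

newx <- data.frame(x1 = 0.3,
                   x2 = factor("low", levels = levels(df$x2)))
p    <- predict(fit, newdata = newx, type = "response")  # P(y == "B")
ifelse(p > 0.5, "B", "A")    # class with the higher predicted probability
```

For more than two classes, multinom() from the nnet package plays the analogous role.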