On Mon, Aug 25, 2008 at 1:36 PM, Weisi Duan <[EMAIL PROTECTED]> wrote:
>
> I just have another question about the option "--space"
> which is set to either vector or similarity. I understand
> the "similarity" refers to that we are calculating the
> similarity of the pairs of feature vectors. But I do not
> understand the "vector", what does "clustering directly in
> vector space mean"? does it refer to calculation of the
> distance between feature vectors? It seems so similarity is
> almost the same as distance. Thanks a lot!

The --space option in discriminate.pl lets you cluster in either vector space or similarity space. Similarity space is created from a vector space by converting feature vectors into pairwise similarity values. In vector space you have a "context (row) by feature (column)" representation, whereas in similarity space you have a "context (row) by context (column)" representation. The conversion is carried out by the simat.pl program (or bitsimat.pl if the input is binary).

There are some worked examples in the documentation of these programs that will hopefully make the distinction between vector and similarity space clear:

http://search.cpan.org/dist/Text-SenseClusters/Toolkit/matrix/simat.pl
http://search.cpan.org/dist/Text-SenseClusters/Toolkit/matrix/bitsimat.pl

Also, the following paper talks about similarity versus vector space and gives some details on both what they are and how they perform on the same task:

Word Sense Discrimination by Clustering Contexts in Vector and Similarity Spaces (Purandare and Pedersen) - Appears in the Proceedings of the Conference on Computational Natural Language Learning (CoNLL), pp. 41-48, May 6-7, 2004, Boston, MA
http://www.d.umn.edu/~tpederse/Pubs/conll04-purandarep.pdf

I hope this helps! Do let us know as questions arise.

Good luck,
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
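
To make the two representations concrete, here is a minimal Python sketch. It is not the SenseClusters code: the toy matrix and the function names are hypothetical, and cosine is used only as an illustrative similarity measure (see the simat.pl documentation for what that program actually computes). It starts from a context-by-feature matrix and derives a context-by-context matrix from it.

```python
import math

# Hypothetical toy data: 3 contexts (rows) described by 4 features (columns).
# This is the "vector space" representation: context by feature.
vector_space = [
    [1.0, 0.0, 2.0, 0.0],
    [2.0, 0.0, 4.0, 0.0],
    [0.0, 3.0, 0.0, 1.0],
]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def to_similarity_space(rows):
    """Turn a context-by-feature matrix into a context-by-context matrix."""
    return [[cosine(u, v) for v in rows] for u in rows]


# This is the "similarity space" representation: context by context.
similarity_space = to_similarity_space(vector_space)

print("vector space:    ", len(vector_space), "x", len(vector_space[0]))
print("similarity space:", len(similarity_space), "x", len(similarity_space[0]))
for row in similarity_space:
    print(" ".join(f"{value:.3f}" for value in row))
```

In this sketch, clustering in vector space would hand the rows of `vector_space` to the clustering algorithm as they are, while clustering in similarity space would hand it only the pairwise values in `similarity_space`, so the features themselves are no longer visible at that stage.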
