On Mon, Aug 25, 2008 at 1:36 PM, Weisi Duan <[EMAIL PROTECTED]> wrote:
>
> I just have another question about the option "--space",
> which is set to either vector or similarity. I understand
> that "similarity" means we are calculating the similarity
> between pairs of feature vectors. But I do not understand
> "vector": what does "clustering directly in vector space"
> mean? Does it refer to calculating the distance between
> feature vectors? It seems that similarity is almost the
> same as distance. Thanks a lot!

The --space option in discriminate.pl lets you cluster in either vector
space or similarity space. In vector space you have a "context (row) by
feature (column)" representation, and clustering directly in vector
space means the clustering algorithm works on those feature vectors
themselves. Similarity space is created from a vector space by
converting the feature vectors into pairwise similarity values, so you
get a "context (row) by context (column)" representation instead.

The conversion from vector space to similarity space is carried out by
the simat.pl program (or bitsimat.pl if the input is binary). The
documentation of these programs includes some worked examples that
should make the distinction between vector and similarity space clear:

http://search.cpan.org/dist/Text-SenseClusters/Toolkit/matrix/simat.pl

http://search.cpan.org/dist/Text-SenseClusters/Toolkit/matrix/bitsimat.pl
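
To make the two representations concrete, here is a small Python sketch
(not SenseClusters code) that turns a toy context-by-feature matrix
into a context-by-context similarity matrix. The function names, the
toy values, and the choice of cosine as the similarity measure are all
assumptions made for illustration; see the simat.pl and bitsimat.pl
documentation for what those programs actually compute.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)

def to_similarity_space(vectors):
    """Hypothetical helper: context-by-feature -> context-by-context."""
    return [[cosine(u, v) for v in vectors] for u in vectors]

# Vector space: 3 contexts (rows) by 4 features (columns), toy values.
contexts = [
    [1, 0, 2, 0],
    [2, 0, 4, 0],
    [0, 3, 0, 1],
]

# Similarity space: 3 contexts (rows) by 3 contexts (columns).
similarity = to_similarity_space(contexts)

for row in similarity:
    print(" ".join(f"{value:.3f}" for value in row))
```

In this toy example the first two contexts point in the same direction,
so their pairwise similarity is 1 (up to rounding), while the third
shares no features with them and gets 0. Note that the similarity
matrix is square and symmetric, and its size depends on the number of
contexts rather than the number of features.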

Also, the following paper talks about similarity versus vector space
and gives some details on both what they are and how they perform on
the same task:

Word Sense Discrimination by Clustering Contexts in Vector and
Similarity Spaces  (Purandare and Pedersen) - Appears in the
Proceedings of the Conference on Computational Natural Language
Learning (CoNLL), pp. 41-48, May 6-7, 2004, Boston, MA
http://www.d.umn.edu/~tpederse/Pubs/conll04-purandarep.pdf

I hope this helps! Do let us know as questions arise.

Good luck,
Ted
-- 
Ted Pedersen
http://www.d.umn.edu/~tpederse
