Hi Prashant,

Yes, you need to create the key file manually, which requires that you
know the correct sense of your head word. So for your examples below
you might have a key file that looks something like this:

<instance id="0"/> <sense id="tagforsense"/>
<instance id="1"/> <sense id="tagforsense"/>

where tagforsense can be any label that you wish to indicate the
appropriate sense.
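
If you have many instances, the key file can be generated rather than typed by hand. What follows is only a minimal sketch in Python, not part of SenseClusters: the function names and the sense labels are hypothetical, and it assumes one key line per context, in the same order as the contexts in your plain text file, with instance ids numbered from 0 as in your example. Please check the text2sval.pl perldoc before relying on those assumptions.

```python
def key_line(instance_id, sense_tag):
    """Return one key-file line in the format shown above."""
    return f'<instance id="{instance_id}"/> <sense id="{sense_tag}"/>'


def build_key_file(sense_tags):
    """Build key-file text from a list of sense labels.

    sense_tags holds one label per context, in the order the contexts
    appear in the plain text file (an assumption of this sketch).
    Instance ids are simply numbered from 0.
    """
    lines = [key_line(i, tag) for i, tag in enumerate(sense_tags)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical labels: you decide these by reading each context.
    my_tags = ["tagforsense", "tagforsense"]
    print(build_key_file(my_tags), end="")
```

Redirecting the printed text to a file gives you something to pass to text2sval.pl as the key file. The sense labels themselves still have to come from you, since the script cannot know the correct sense of each instance.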

Providing a key file presumes that you already know the answers to the
discrimination task you are carrying out. This lets you evaluate the
results of SenseClusters against what you know to be the "ground truth".
To run that evaluation, use the --eval option on the command line or
check the evaluate box in the web interface.

The perldoc for text2sval.pl shows the above keyfile format, and that
doc can also be found here:

http://senseclusters.sourceforge.net/Toolkit_Docs/preprocess/plain/text2sval.html#key_keyfile

I hope this helps. Please let us know if you have any further questions!
Ted

On Dec 6, 2007 12:47 AM, Prashant More <[EMAIL PROTECTED]> wrote:
> Respected Sir,
>
> I've experimented with Sense Clusters using the datasets provided on the
> site.
> Now I want to use my own data with Sense Clusters.
> I have the data in plain text files and need to convert it to Senseval2
> format, as SenseClusters requires it in that format.
> The script "text2sval.pl" converts plain text files into Senseval2 format.
> For that, it asks for a KeyFile which is supposed to contain instance ids
> and optional sense tags of the instances in the text file.
> Though the keyfile is an optional argument to "text2sval.pl", the output is
> not very clear without a key file.
>     So, I want to know whether it's created manually (if so, is there any
> standard procedure?) or whether a tool is used to create it.
>
>
>     To make the point clear i'm giving a snapshot of both the input and
> output below.
>
>
> The Sample input to "text2sval.pl" is,
> -------------------------------------------------------------------------------------
>  us all natives of this region as soon we heard about the catastrophe
> Saturday morning said one of the volunteers Bajaj Zanji a 20 year old
> <head>idlypuri</head> restaurant worker in Tehran My job consists of digging
> out the dead with a shovel because we have no other means at our disposal he
>  called us his picture wouldn't be spotted in this ad The advertisement
> notes that Atta lived among us attending classes shopping at the mall earing
> <head>idlypuri</head> going out now and then with friends But it also calls
> attention to signs that should have drawn attention to the Egyptian student
> like the
> -------------------------------------------------------------------------------------
>
> The Output displayed is like this,
> -------------------------------------------------------------------------------------
> <corpus lang="english">
> <lexelt item="LEXELT">
> <instance id="0">
> <answer instance="0" senseid="NOTAG"/>
> <context>
>  us all natives of this region as soon we heard about the catastrophe
> Saturday morning said one of the volunteers Bajaj Zanji a 20 year old
> <head>idlypuri</head> restaurant worker in Tehran My job consists of digging
> out the dead with a shovel because we have no other means at our disposal he
> </context>
> </instance>
> <instance id="1">
> <answer instance="1" senseid="NOTAG"/>
> <context>
>  called us his picture wouldn't be spotted in this ad The advertisement
> notes that Atta lived among us attending classes shopping at the mall earing
> <head>idlypuri</head> going out now and then with friends But it also calls
> attention to signs that should have drawn attention to the Egyptian student
> like the
> </context>
> -------------------------------------------------------------------------------------
>
>            Since I have not given any KeyFile argument here, it's using the
> default "senseid", "instance id" and "lexelt item".
>             I want to know how these 3 things are given in the "keyfile":
> manually or with a tool, and whether there is any standard procedure.
>
>           I hope, the point is much clear now.
>           Sorry for the lengthy mail.
>
>          Expecting your positive reply.
>           Thanking you.
>
> --
> Cheers!!
>
> More Prashant J.
> C-DAC (Erstwhile NCST),
> Mumbai.
>



-- 
Ted Pedersen
http://www.d.umn.edu/~tpederse

_______________________________________________
senseclusters-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/senseclusters-users