Hi Prashant,

Yes, you need to create the key file manually, which requires that you know the correct sense of the head word in each instance. So for your examples below you might have a key file that looks something like this:

<instance id="0"/> <sense id="tagforsense"/>
<instance id="1"/> <sense id="tagforsense"/>

where tagforsense can be any label that you wish to use to indicate the appropriate sense.

Providing a key file presumes that you know the answers to the discrimination task you are carrying out, and this allows you to evaluate the results of SenseClusters against what you know to be the "ground truth". You can use the --eval option on the command line or check the evaluate box in the web interface.

The perldoc for text2sval.pl shows the above keyfile format, and that doc can also be found here:

http://senseclusters.sourceforge.net/Toolkit_Docs/preprocess/plain/text2sval.html#key_keyfile

I hope this helps. Please let us know if you have any further questions!

Ted

On Dec 6, 2007 12:47 AM, Prashant More <[EMAIL PROTECTED]> wrote:
> Respected Sir,
>
> I've experimented with SenseClusters using the datasets provided on the
> site.
> Now I want to use my own data with SenseClusters.
> I have the data in plain text files and I need to convert it to Senseval2
> format, as SenseClusters requires it in that format.
> The script "text2sval.pl" converts plain text files into Senseval2 format.
> For that, it asks for a KeyFile which is supposed to contain instance ids
> and optional sense tags of the instances in the text file.
> Though the keyfile is an optional argument to "text2sval.pl", it's not giving
> very clear output without a key file.
> So, I want to know whether it's created manually (if so, is there any
> standard procedure?) or whether a tool is used to create it.
>
>
> To make the point clear I'm giving a snapshot of both the input and
> output below.
>
>
> The sample input to "text2sval.pl" is,
> -------------------------------------------------------------------------------------
> us all natives of this region as soon we heard about the catastrophe
> Saturday morning said one of the volunteers Bajaj Zanji a 20 year old
> <head>idlypuri</head> restaurant worker in Tehran My job consists of digging
> out the dead with a shovel because we have no other means at our disposal he
> called us his picture wouldn't be spotted in this ad The advertisement
> notes that Atta lived among us attending classes shopping at the mall earing
> <head>idlypuri</head> going out now and then with friends But it also calls
> attention to signs that should have drawn attention to the Egyptian student
> like the
> -------------------------------------------------------------------------------------
>
> The output displayed is like this,
> -------------------------------------------------------------------------------------
> <corpus lang="english">
> <lexelt item="LEXELT">
> <instance id="0">
> <answer instance="0" senseid="NOTAG"/>
> <context>
> us all natives of this region as soon we heard about the catastrophe
> Saturday morning said one of the volunteers Bajaj Zanji a 20 year old
> <head>idlypuri</head> restaurant worker in Tehran My job consists of digging
> out the dead with a shovel because we have no other means at our disposal he
> </context>
> </instance>
> <instance id="1">
> <answer instance="1" senseid="NOTAG"/>
> <context>
> called us his picture wouldn't be spotted in this ad The advertisement
> notes that Atta lived among us attending classes shopping at the mall earing
> <head>idlypuri</head> going out now and then with friends But it also calls
> attention to signs that should have drawn attention to the Egyptian student
> like the
> </context>
> -------------------------------------------------------------------------------------
>
> Since I've not given any KeyFile argument here, it's using the
> default "senseid", "instance id" and "lexelt item".
> I want to know how these 3 things are given in the "keyfile",
> whether manually or with some tool, and whether there is any standard
> procedure.
>
> I hope the point is clearer now.
> Sorry for the lengthy mail.
>
> Expecting your positive reply.
> Thanking you.
>
> --
> Cheers!!
>
> More Prashant J.
> C-DAC (Erstwhile NCST),
> Mumbai.
>

--
Ted Pedersen
http://www.d.umn.edu/~tpederse

_______________________________________________
senseclusters-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/senseclusters-users

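Since the key file described in the reply is written by hand, a small script can take some of the tedium out of it. The following is only a rough sketch, not part of SenseClusters: it assumes the key file holds one line per instance, in the same order as the contexts in the plain text file, with instance ids counting up from 0 as in the example output above. The function names and the sense label "food" are hypothetical and chosen purely for illustration; the sense labels themselves still have to come from your own knowledge of the data.

```python
def build_key_lines(sense_labels, start_id=0):
    """Return one key-file line per instance.

    Assumption: the labels are listed in the same order as the
    contexts in the plain text file, and ids count up from start_id.
    """
    lines = []
    for offset, label in enumerate(sense_labels):
        instance_id = start_id + offset
        lines.append(f'<instance id="{instance_id}"/> <sense id="{label}"/>')
    return lines


def write_key_file(path, sense_labels):
    """Write the key lines to `path`, one instance per line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(build_key_lines(sense_labels)) + "\n")


if __name__ == "__main__":
    # Hypothetical, manually assigned sense labels for the two
    # <head>idlypuri</head> instances in the sample input.
    labels = ["food", "food"]
    for line in build_key_lines(labels):
        print(line)
```

The resulting file would then be passed to text2sval.pl as the key file; see the perldoc linked in the reply for how that argument is given.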