Respected Sir,
I've experimented with SenseClusters using the datasets provided on the
site.
Now I want to use my own data with SenseClusters.
My data is in plain text files, and I need to convert it to Senseval2
format, as SenseClusters requires that format.
The script "text2sval.pl" converts plain text files into Senseval2 format.
For that, it asks for a KeyFile which is supposed to contain instance ids
and optional sense tags of the instances in the text file.
Though the keyfile is an optional argument to "text2sval.pl", the output is
not very informative without a key file.
So I want to know whether the key file is created manually (if so, is there a
standard procedure?) or whether a tool is used to create it.
To make the point clear, I'm giving a snapshot of both the input and
output below.
The sample input to "text2sval.pl" is:
-------------------------------------------------------------------------------------
us all natives of this region as soon we heard about the catastrophe
Saturday morning said one of the volunteers Bajaj Zanji a 20 year old
<head>idlypuri</head> restaurant worker in Tehran My job consists of digging
out the dead with a shovel because we have no other means at our disposal
he
called us his picture wouldn't be spotted in this ad The advertisement
notes that Atta lived among us attending classes shopping at the mall earing
<head>idlypuri</head> going out now and then with friends But it also calls
attention to signs that should have drawn attention to the Egyptian student
like the
-------------------------------------------------------------------------------------
The output displayed is:
-------------------------------------------------------------------------------------
<corpus lang="english">
<lexelt item="LEXELT">
<instance id="0">
<answer instance="0" senseid="NOTAG"/>
<context>
us all natives of this region as soon we heard about the catastrophe
Saturday morning said one of the volunteers Bajaj Zanji a 20 year old
<head>idlypuri</head> restaurant worker in Tehran My job consists of digging
out the dead with a shovel because we have no other means at our disposal
he
</context>
</instance>
<instance id="1">
<answer instance="1" senseid="NOTAG"/>
<context>
called us his picture wouldn't be spotted in this ad The advertisement
notes that Atta lived among us attending classes shopping at the mall earing
<head>idlypuri</head> going out now and then with friends But it also calls
attention to signs that should have drawn attention to the Egyptian student
like the
</context>
-------------------------------------------------------------------------------------
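A quick way to list which instance ids and sense tags ended up in output like the above is a small script. The following is only a sketch: the function name `list_answers` is hypothetical, and it uses a regular expression rather than an XML parser so that it also works on a partial snapshot such as the one pasted here.

```python
import re

# Matches answer tags of the form shown in the output above:
#   <answer instance="0" senseid="NOTAG"/>
ANSWER_RE = re.compile(r'<answer\s+instance="([^"]*)"\s+senseid="([^"]*)"\s*/>')


def list_answers(sval_text):
    """Return (instance id, sense id) pairs found in Senseval2-style text.

    Hypothetical helper, not part of SenseClusters.
    """
    return ANSWER_RE.findall(sval_text)


if __name__ == "__main__":
    snippet = (
        '<instance id="0">\n'
        '<answer instance="0" senseid="NOTAG"/>\n'
        '<instance id="1">\n'
        '<answer instance="1" senseid="NOTAG"/>\n'
    )
    for instance_id, sense_id in list_answers(snippet):
        print(instance_id, sense_id)
```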
*Since I have not given any KeyFile argument here, it is using the
default "senseid", "instance id" and "lexelt item".
I want to know how these 3 things are given in the "keyfile":
manually or with a tool, and whether there is a standard procedure.*
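To make the question concrete, here is a minimal sketch of what I imagine generating the key file by script would look like. It assumes the text file holds one context per line and that each key line has the form `<instance id="ID"/> <sense id="SENSE"/>`, with the sense part optional. That line format, the id scheme, the sense labels and the function name `make_key_lines` are all my assumptions, not something I have confirmed against "text2sval.pl".

```python
def make_key_lines(text_lines, prefix="inst", senses=None):
    """Build one key line per context line, in the same order.

    ASSUMPTION: a key line looks like
        <instance id="ID"/> <sense id="SENSE"/>
    where the sense tag is optional. This is not verified against
    text2sval.pl; adjust to whatever the script actually expects.

    senses maps a 0-based line index to a sense label (hypothetical labels).
    """
    senses = senses or {}
    key_lines = []
    for index, _context in enumerate(text_lines):
        line = f'<instance id="{prefix}.{index}"/>'
        sense = senses.get(index)
        if sense is not None:
            line += f' <sense id="{sense}"/>'
        key_lines.append(line)
    return key_lines


if __name__ == "__main__":
    sample = [
        "a 20 year old <head>idlypuri</head> restaurant worker in Tehran",
        "at the mall earing <head>idlypuri</head> going out now and then",
    ]
    # "S1" is a made-up sense label, used only for illustration.
    for key_line in make_key_lines(sample, prefix="idlypuri", senses={0: "S1"}):
        print(key_line)
```

If this is roughly right, the printed lines could be saved to a file and passed as the key file, one key line for each line of the text file.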
I hope the point is clearer now.
Sorry for the lengthy mail.
Expecting your positive reply.
Thanking you.
--
Cheers!!
More Prashant J.
C-DAC (Erstwhile NCST),
Mumbai.
_______________________________________________
senseclusters-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/senseclusters-users