Dear all,

Sorry if you received multiple copies of this email (the links were
embedded). Here are the actual links:

*Figure:*
https://drive.google.com/file/d/0B7ON7bq1zRm3Sm1YYktJTVctLWs/view?usp=sharing
*Semeval/Senseval results summary:*
https://docs.google.com/spreadsheets/d/1NCiwXBQs0rxUwtZ3tiwx9FZ4WELIfNCkMKp8rlnKObY/edit?usp=sharing
*Literature survey of WSD techniques:*
https://docs.google.com/spreadsheets/d/1WQbJNeaKjoT48iS_7oR8ifZlrd4CfhU1Tay_LLPtlCM/edit?usp=sharing

Yours faithfully

On Mon, May 18, 2015 at 10:17 PM, Anthony Beylerian <
anthonybeyler...@hotmail.com> wrote:

> Please excuse the duplicate email, we could not attach the mentioned
> figure.
> Kindly find it here.
> Thank you.
>
> From: anthonybeyler...@hotmail.com
> To: dev@opennlp.apache.org
> Subject: GSoC 2015 - WSD Module
> Date: Mon, 18 May 2015 22:14:43 +0900
>
> Dear all,
> In the context of building a Word Sense Disambiguation (WSD) module, after
> doing a survey of WSD techniques, we noted the following points:
> - WSD techniques can be split into three sets (supervised,
>   unsupervised/knowledge-based, hybrid).
> - WSD is used for different, directly related objectives such as all-words
>   disambiguation, lexical sample disambiguation, multi/cross-lingual
>   approaches, etc.
> - Senseval/Semeval seem to be good references for comparing different WSD
>   techniques, since many of them were tested on the same data (though each
>   event used a different data set).
> - For the sake of making a first solution, we propose to start by
>   supporting the "lexical sample" type of disambiguation, meaning
>   disambiguating a single word or a limited set of words from an input text.
> Therefore, we have decided to collect information about the different
> techniques in the literature (such as references, performance, parameters,
> etc.) in this spreadsheet here. We have also collected the results
> of all the Senseval/Semeval exercises here. (Note that each document has
> many sheets.) The collected results could help decide which techniques to
> start with as the main models for each set of techniques
> (supervised/unsupervised).
> We also propose a general approach for the package in the attached figure.
> The main components are as follows:
> 1- The different publicly available resources: WordNet, BabelNet,
> Wikipedia, etc. However, we would also like to allow users to use their
> own local resources, perhaps by defining a type of connector to the
> resource interface.
> 2- The resource interface, whose role is to provide both a sense
> inventory that the user can query and a knowledge base (such as semantic or
> syntactic information) that might be used depending on the technique. We
> might even later consider building a local cache for remote services.
> 3- The WSD algorithms/techniques themselves, which will make use of the
> resource interface to access the required resources. These techniques will
> be split into two main packages, as on the left side of the figure:
> Supervised/Unsupervised. The utils package includes common tools used in
> both types of techniques. The details mentioned in each package should be
> common to all implementations of these abstract models.
> 4- I/O, which could be processed in different formats (XML/JSON, etc.) or a
> simpler structure, following your recommendations.
> If you have any suggestions or recommendations, we would really appreciate
> discussing them and would like your guidance to iterate on this tool-set.
> Best regards,
>
> Anthony Beylerian, Mondher Bouazizi
>
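
As a rough illustration of components 1 to 3 in the proposal quoted above,
here is a minimal Java sketch of how a resource interface, a local-resource
connector, and one knowledge-based technique could fit together for the
"lexical sample" case. Every name in it (WsdSketch, Sense, SenseInventory,
LocalInventory, Disambiguator, GlossOverlap) is hypothetical and chosen only
for this example; none of it is existing OpenNLP API or the design shown in
the figure. The gloss-overlap scoring is a simplified Lesk-style stand-in,
not a technique the authors have selected.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class WsdSketch {

    /** Hypothetical sense entry: an identifier plus a gloss (definition text). */
    static final class Sense {
        final String id;
        final String gloss;

        Sense(String id, String gloss) {
            this.id = id;
            this.gloss = gloss;
        }
    }

    /** Hypothetical resource interface: a sense inventory that techniques can query. */
    interface SenseInventory {
        List<Sense> sensesOf(String lemma);
    }

    /** Hypothetical connector for a user's local resource, backed by an in-memory map. */
    static final class LocalInventory implements SenseInventory {
        private final Map<String, List<Sense>> entries = new HashMap<>();

        LocalInventory add(String lemma, Sense... senses) {
            entries.put(lemma, Arrays.asList(senses));
            return this;
        }

        @Override
        public List<Sense> sensesOf(String lemma) {
            List<Sense> found = entries.get(lemma);
            return found != null ? found : Collections.<Sense>emptyList();
        }
    }

    /** Assumed common contract for lexical-sample techniques: one target word in a context. */
    interface Disambiguator {
        Sense disambiguate(String target, String context);
    }

    /** Knowledge-based example: pick the sense whose gloss shares the most words with the context. */
    static final class GlossOverlap implements Disambiguator {
        private final SenseInventory inventory;

        GlossOverlap(SenseInventory inventory) {
            this.inventory = inventory;
        }

        // Lowercase the text and split on runs of non-word characters.
        private static Set<String> tokens(String text) {
            return new HashSet<>(Arrays.asList(text.toLowerCase().split("\\W+")));
        }

        @Override
        public Sense disambiguate(String target, String context) {
            Set<String> contextTokens = tokens(context);
            Sense best = null;
            int bestOverlap = -1;
            for (Sense sense : inventory.sensesOf(target)) {
                Set<String> shared = tokens(sense.gloss);
                shared.retainAll(contextTokens);
                if (shared.size() > bestOverlap) { // ties keep the earlier sense
                    bestOverlap = shared.size();
                    best = sense;
                }
            }
            return best; // null when the inventory has no entry for the target
        }
    }

    public static void main(String[] args) {
        SenseInventory inventory = new LocalInventory().add("bank",
                new Sense("bank#1", "a financial institution that accepts deposits and lends money"),
                new Sense("bank#2", "sloping land beside a river or lake"));
        Disambiguator wsd = new GlossOverlap(inventory);
        System.out.println(wsd.disambiguate("bank", "he sat on the bank of the river").id);
    }
}
```

Keeping the techniques behind a single Disambiguator-style contract, and the
resources behind a single inventory interface, is one way to let supervised
and unsupervised implementations share the same resources and I/O handling.
A WordNet- or BabelNet-backed inventory would implement the same interface as
the in-memory one used here.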
