Robert Stojnic <rainmansr <at> gmail.com> writes:

> 
> 
> Hi Gautham,
> 
> I think mining wiktionary is an interesting project. However, about the 
> more practical Lucene part: at some point I tried using wordnet to 
> expand queries, but I found that it introduces too many false 
> positives. The most challenging part, I think, is *context-based* 
> expansion, i.e. a simple synonym-based expansion is of no use because it 
> introduces too many meanings that the user didn't quite have in mind. 
> However, if we could somehow use the words in the query to pick the 
> intended meaning from the set of possible meanings, that could be really 
> helpful.
> 
> You can look into existing lucene-search source to see how I used 
> wordnet. I think in the end I ended up using it only for very obvious 
> stuff (e.g. 11 = eleven, UK = United Kingdom, etc.).
> 
> Cheers, r.
> 
> On 06/04/12 01:58, Gautham Shankar wrote:
> > Hello,
> >
> > Based on the feedback I received, I have updated my proposal page.
> >
> > https://www.mediawiki.org/wiki/User:Gautham_shankar/Gsoc
> >
> > There are about 20 hours left until the deadline, and any final feedback
> > would be useful.
> > I have also submitted the proposal on the GSoC page.
> >
> > Regards,
> > Gautham Shankar
> > _______________________________________________
> > Wikitech-l mailing list
> > Wikitech-l <at> lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >
> 

Hi Robert,

Thank you for your feedback.
As you pointed out, expanding queries directly from the wordnet data reduces 
the quality of the search.

I found this research paper very interesting:
www.sftw.umac.mo/~fstzgg/dexa2005.pdf
They build a TSN (Term Semantic Network) for the given query based on the 
usage of words in the documents. The expansion words obtained from the wordnet 
are then filtered using the TSN data.
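
To make the idea concrete, here is a rough sketch of that kind of filtering. 
It is not the paper's actual TSN construction: as a stand-in it uses plain 
document-level co-occurrence counts, and the function names, the sample 
documents and the synonym list are all made up for illustration. A candidate 
is kept only if it appears in some document together with another word of 
the query.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(docs):
    """For each unordered term pair, count the documents containing both terms."""
    counts = defaultdict(int)
    for doc in docs:
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            counts[(a, b)] += 1
    return counts


def cooccurs(counts, a, b):
    """Look up the pair count regardless of argument order."""
    key = (a, b) if a < b else (b, a)
    return counts.get(key, 0)


def filter_expansions(query, candidates, counts, min_count=1):
    """Keep a candidate only if it co-occurs with some *other* query term."""
    query_terms = query.lower().split()
    kept = {}
    for term in query_terms:
        context = [t for t in query_terms if t != term]
        kept[term] = [
            c for c in candidates.get(term, [])
            if any(cooccurs(counts, c, t) >= min_count for t in context)
        ]
    return kept


# Hypothetical corpus and synonym list, for illustration only.
docs = [
    "the river bank flooded after the storm",
    "the shore of the river was muddy",
    "the bank approved the loan",
    "the lender approved the loan quickly",
]
candidates = {"bank": ["shore", "lender"]}

counts = build_cooccurrence(docs)
result = filter_expansions("river bank", candidates, counts)
print(result)
```

With these sample documents, "river bank" should keep only "shore" for 
"bank", while "bank loan" should keep only "lender". A real implementation 
would need proper tokenization, stop-word handling and a weighting scheme 
instead of raw counts.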

I did not add this detail to my proposal since I thought the proposal deals 
more with the creation of the wordnet. I would love to implement the TSN 
concept once the wordnet is complete.

Regards,
Gautham Shankar



