I am currently writing something about text retrieval using EM clustering. The approach represents documents as high-dimensional vectors, but it is not related to Lucene (yet?). How would you add clustering to Lucene? I think it could be a very interesting technique for improving search results, if it works. In my experience so far, it scales rather badly to larger document collections.
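For illustration only, here is a minimal sketch of EM clustering over bag-of-words document vectors. It is not the implementation mentioned above and has nothing to do with Lucene's API: it assumes a mixture-of-multinomials model with add-alpha smoothing, and the function name `em_cluster`, its parameters, and the deterministic seeding are all hypothetical choices made so the example is reproducible.

```python
import math
from collections import Counter


def em_cluster(docs, k, iterations=20, alpha=1.0):
    """Soft-cluster tokenised documents with EM on a mixture of multinomials.

    Returns one list of k responsibilities per document (each sums to 1).
    """
    vocab = sorted({w for d in docs for w in d})
    counts = [Counter(d) for d in docs]  # sparse term-frequency vectors
    n = len(docs)

    # Deterministic start (an assumption, chosen for reproducibility):
    # k evenly spaced seed documents are hard-assigned, the rest are uniform.
    resp = [[1.0 / k] * k for _ in range(n)]
    for c in range(k):
        seed = (c * n) // k
        resp[seed] = [1.0 if j == c else 0.0 for j in range(k)]

    for _ in range(iterations):
        # M-step: re-estimate cluster priors and smoothed word distributions.
        log_prior, log_word = [], []
        for c in range(k):
            weight = sum(resp[i][c] for i in range(n))
            log_prior.append(math.log((weight + alpha) / (n + k * alpha)))
            totals = {w: alpha for w in vocab}
            for i in range(n):
                for w, tf in counts[i].items():
                    totals[w] += resp[i][c] * tf
            norm = sum(totals.values())
            log_word.append({w: math.log(t / norm) for w, t in totals.items()})

        # E-step: posterior probability of each cluster for each document,
        # computed in log space and normalised with the log-sum-exp trick.
        for i in range(n):
            scores = [
                log_prior[c]
                + sum(tf * log_word[c][w] for w, tf in counts[i].items())
                for c in range(k)
            ]
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            z = sum(exps)
            resp[i] = [e / z for e in exps]
    return resp


docs = [
    ["lucene", "index", "search"],
    ["lucene", "search", "query"],
    ["cluster", "em", "gaussian"],
    ["cluster", "em", "mixture"],
]
resp = em_cluster(docs, k=2)
labels = [max(range(2), key=lambda c: r[c]) for r in resp]
```

Every iteration visits each document once per cluster, so the work per iteration grows with the number of documents, the number of clusters, and the document lengths.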
I don't think I will take part in Google's SoC, as I have my own "summer of code" right now. But I would certainly like to take part in discussions on the topic, or at least read them and throw in my 2 cents now and then.

cheers
Daniel

Lorenzo wrote:
> Some people just replied, but I forgot the most important thing...
> I'm thinking of this project as part of Google's Summer of Code program,
> so I'm looking for other students.
> I've sent an email to Erik and he told me that we can propose this as part
> of Google's SoC if we find some other people interested in it.
> Lorenzo
>
> On 6/7/05, Lorenzo <[EMAIL PROTECTED]> wrote:
>> I'm writing this message trying to find some people interested in creating
>> a 'general purpose' Lucene search results clustering extension.
>> I wrote a simple implementation of clustering, and I would like to
>> contribute to Lucene development by releasing an open source clustering
>> implementation. I know that each project may need a different
>> implementation, but it would be a useful basis for everyone to develop
>> their own project.
>> Is anyone interested in it?
>> Lorenzo