-----Original Message-----
From: Joaquin Delgado 
Sent: Friday, January 28, 2005 4:41 PM
To: 'Lucene Developers List'; [EMAIL PROTECTED]
Subject: RE: -> Grouping Search Results by Clustering Snippets:

This is a very interesting thread. Below is a link to a paper I published many 
years ago (1998) about RAAP, a bookmark recommender system:

DELGADO, J., ISHII, N. and URA, T., "Content-based Collaborative Information 
Filtering: Actively Learning to Classify and Recommend Documents" in M. Klusch, 
G. Weiß (Eds.): (1998) Cooperative Information Agents II. Learning, Mobility 
and Electronic Commerce for Information Discovery on the Internet. 
Springer-Verlag, Lecture Notes in Artificial Intelligence Series No. 1435.

http://www.triplehop.com/pdf/cia-final.pdf

Regarding the clustering technique, I'd like your opinion on the topic 
clustering you can find at http://www.find.com

This one uses title and snippets from the external engines and "concepts" 
extracted from documents at indexing time.
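
As a rough illustration of clustering on titles and snippets alone (not how
find.com or Carrot2 actually do it), here is a minimal Python sketch. It
assumes each result is a (title, snippet) pair; the stopword list, the 0.5
threshold, and all function names are made up for the example. It uses greedy
single-pass clustering with cosine similarity over raw term counts, a
deliberately simple stand-in for a real clustering algorithm.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use larger ones).
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "to", "with", "on"}


def term_vector(text):
    """Lowercase, keep alphabetic tokens, drop stopwords, count terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_results(results, threshold=0.5):
    """Greedy single-pass clustering of (title, snippet) pairs.

    Each result joins the most similar existing cluster if the similarity
    to that cluster's summed term counts reaches the threshold; otherwise
    it starts a new cluster. Returns lists of result indices.
    """
    clusters = []  # each entry: {"centroid": Counter, "members": [int, ...]}
    for i, (title, snippet) in enumerate(results):
        vec = term_vector(title + " " + snippet)
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [i]})
        else:
            best["centroid"].update(vec)
            best["members"].append(i)
    return [c["members"] for c in clusters]
```

The threshold matters a lot here: the query term itself tends to appear in
every snippet, so a low threshold lumps everything into one group.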

And for those interested (and willing to read a big chunk of good old stuff 
about information filtering and recommender systems :-) you can also access my 
2000 PhD thesis at: http://www.triplehop.com/pdf/Doctoral_Thesis.pdf


Cheers,

-- Joaquin


-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] 
Sent: Friday, January 28, 2005 2:47 PM
To: Lucene Developers List; [EMAIL PROTECTED]
Subject: RE: -> Grouping Search Results by Clustering Snippets:

This is very much of interest to me.  Although it's not in the UI, I
did integrate Lucene and Carrot2 in Simpy ( http://www.simpy.com ). 
Clustering is currently triggered only by a search.  Although you may
not be able to tell (again, sucky UI) Simpy is designed in a way that
will let me hook in a recommender system, much like you describe it. 
Users store links into their Simpy accounts, they tag them, perform
searches, find other users, add them to their Topics (Simpy-specific
thing), and so on, so there is a lot of knowledge about a user that can
be derived from all that.  Currently, the only quasi-smart thing that
goes beyond a simple search is 'More users like this', and even that
has a small bug that I need to fix for the next release, but what you
are describing sounds very much like one of the directions in which I
want to take Simpy and its users. :)
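
To make the "users like this" idea concrete, here is a minimal Python sketch.
It is not Simpy's actual code. It assumes a user profile is just a bag of
weighted tags accumulated from events such as saved links or clicks; the
function names, the event format, and the choice of weighted Jaccard
similarity are all illustrative assumptions.

```python
from collections import Counter


def build_profile(events):
    """Accumulate a user's tag weights from (tag, weight) events."""
    profile = Counter()
    for tag, weight in events:
        profile[tag] += weight
    return profile


def weighted_jaccard(a, b):
    """Sum of per-tag minimum weights divided by sum of maximum weights."""
    keys = a.keys() | b.keys()
    num = sum(min(a[k], b[k]) for k in keys)
    den = sum(max(a[k], b[k]) for k in keys)
    return num / den if den else 0.0


def more_users_like(target, profiles, top_n=3):
    """Rank the other users by similarity to the target user's profile."""
    scores = [
        (other, weighted_jaccard(profiles[target], profile))
        for other, profile in profiles.items()
        if other != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]
```

A recommender along these lines could then suggest links stored by the
top-ranked users that the target user has not saved yet.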

Otis


--- Adam Saltiel <[EMAIL PROTECTED]> wrote:

> This has been implemented in open source, but not with Lucene?
> http://www.cs.put.poznan.pl/dweiss/carrot/
> and
> http://carrot2.sourceforge.net/
> Dawid Weiss is a Polish academic at Poznan University, Poland. He and
> others have implemented a servlet-based web app that uses pipelined
> components that communicate over HTTP and implement a couple of
> clustering algorithms.
> Clustering, of course, can go way beyond search result presentation,
> and there are some very suggestive examples at
> http://www.sics.se/humle/socialcomputing/
> where the encore project (Martin Svennson) is based on orthogonal
> transformations of a large sparse matrix (a possible method for
> matrix dimension reduction). I think it would be interesting to hook
> a recommender system into Lucene, so that clustering would take place
> on the basis of a user profile, which may be built up automatically
> by accumulating clicks and comparing them to those of other visitors,
> with some intelligent weighting of the node inputs.
> This calls into question what a search really is: does it have to be
> instigated by the user, or might their context and history suggest
> enough to pull in additional material? So this would sit on top of
> snippets and also influence which snippets are returned as well as
> their presentation.
> Cooler still would be to recognise the user without a login.
> This might be implemented with cookies, but the idea is to deal with
> the user in terms of types of interests, a series of faceted
> profiles, so that portals could become fluidly dynamic. It sounds
> far-fetched, but I actually think it is just round the corner.
> Let me know if this is of interest.
> 
> Adam
> 
> > -----Original Message-----
> > From: integer [daniel prawdzik] [mailto:[EMAIL PROTECTED]
> > Sent: Wednesday, January 26, 2005 5:17 PM
> > To: lucene-dev@jakarta.apache.org
> > Subject: -> Grouping Search Results by Clustering Snippets:
> >
> > Grouping Search Results by Clustering Snippets:
> >
> > Search engines typically present their results as long, unsorted
> > lists. Finding the page you're looking for is often time-consuming
> > and unsatisfying.
> > Showing the results grouped by similar topics is a much more
> > suitable way to give a user a quick overview of the results.
> > This can be done with a technique called cluster analysis. I'm
> > currently working on my diploma (master's) thesis on this topic. In
> > my view, it's too nice to end up in the archive, so I want to
> > implement this feature in open source software. The coding of this
> > program has already gone pretty far; I've got some tests done and
> > the results are impressive and might still get better [you can see
> > some results at http://www.trist.de/CV/Text-Mining/ -> sorry, only
> > in German]
> >
> > To make a long story short:
> > I'm wondering whether this would be an attractive feature for the
> > Lucene community?
> >
> > regards,
> > integer
> >
> >
> >
> ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 
> 

