Chris Anderson wrote:

> Sam Joseph wrote:
>
> > Where the ranks indicate the number of times a user has bookmarked
> > something after searching for it with that keyword, the number of times
> > it was clicked through after it was searched for using that keyword and
> > the number of times it was returned as a search result for that keyword.
> > You can then use some prob maths to compare the ranks. See:
> >
> > http://www.neurogrid.net/WhitePaper0_3.html
> >
> > For more details of the maths that can be used for this.
>
> Gotcha. Any thoughts on profiling users bookmarks to estimate keyword
> rankings of new data?

Well, that's kind of what I'm working on with NeuroGrid now. It's not set up yet, but my approach is to get a person's bookmark file, extract all of the URLs from it, download each of those pages, chew them up, spit out all the tags, and then use some basic information retrieval statistics (like TFIDF, term frequency inverse document frequency) to work out which subset of keywords is relevant, and use those as the basis for a user's NeuroGrid profile.

One could go so far as to create ranks based on the TFIDF scores and then translate them into usage ranks, like the ones I described, but I think they are a very different kind of thing. The idea with NG is that users should be able to edit all the associations between keywords and their bookmarks; it should all be personalised. So I would imagine using the bookmark file as a way to get some URLs into the system, a little TFIDF to provide base associations, and then letting the searching do its work.

NG searching allows URLs to become associated with other keywords through multiple keyword searches and so on, so I'm putting my trust in that, rather than in some information-theoretical scheme that allegedly works out the *best* representation for the data. I think that data should be represented in a way that reflects the way it gets used.

CHEERS> SAM

p.s. Any tips on how I can get my mails to follow the threading in these lists? I'm beginning to think my only option is to re-subscribe and receive individual messages.
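For anyone who wants to try the TFIDF step described above, here is a minimal sketch in Python. It is not NeuroGrid code: the names tokenize and tfidf_keywords are hypothetical, the pages are assumed to have been downloaded and stripped of tags already, and the scoring is the plain textbook form (term frequency times the log of inverse document frequency), which may differ from whatever weighting NG ends up using.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(pages, top_n=3):
    """Return the top_n highest-scoring TFIDF terms for each page.

    pages maps a URL to the plain text of that page (assumption: the
    caller has already fetched the page and removed the HTML tags).
    """
    tokens = {url: tokenize(text) for url, text in pages.items()}
    n_pages = len(tokens)

    # Document frequency: the number of pages each term appears in.
    doc_freq = Counter()
    for words in tokens.values():
        doc_freq.update(set(words))

    profile = {}
    for url, words in tokens.items():
        counts = Counter(words)
        # Textbook TFIDF: relative term frequency times log(N / df).
        scores = {
            term: (count / len(words)) * math.log(n_pages / doc_freq[term])
            for term, count in counts.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        profile[url] = ranked[:top_n]
    return profile
```

Calling tfidf_keywords with a dict of URL to page text gives back a dict of URL to keyword list, which could serve as the base associations for a profile before any searching or user editing changes them. Terms that occur on every page score zero, so they drop to the bottom of the ranking.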