Hi,

I wanted to pick people's brains a little bit on the subject of determining 
importance.  This isn't necessarily Mahout related, although I think we have 
some tools that help in the area.

One of the emerging trends these days, given all our connectivity and 
content, seems to be a notion of importance/priority.  Some examples: 
1. Google now has "Priority Inbox", for instance, and I think most would agree 
that for things like Twitter and Facebook it would be really nice if you could 
separate out the important updates/people from the less important.  
2. Identifying important phrases, etc. in text across a corpus.  
3. One of the things I think most researchers do when exploring a new topic is 
to identify the one or two seminal papers in the field, read them, and then 
read the ones that cite those papers and so on.
4. Take in all the day's news and figure out what the key articles are to read 
(in some sense it's picking the most representative document in a cluster), or 
decide that the article talking about raising Federal income taxes is likely 
more important than the one talking about raising local sales tax (or vice 
versa!).
5. PageRank, TextRank, and other approaches to calculating authority.
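
To make items 3 and 5 a bit more concrete, here is a minimal sketch of 
PageRank-style scoring by power iteration over a tiny citation graph, in 
Python.  The paper names, the graph itself, the damping factor of 0.85 and the 
iteration count are all made up for illustration, and this is not Mahout code, 
just one way the "find the seminal paper" idea could be expressed.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score nodes by power iteration.

    links maps each node to the list of nodes it points to
    (here: a paper to the papers it cites).
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly
        # over all nodes so the scores keep summing to 1.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new_rank = {
            v: (1.0 - damping) / n + damping * dangling / n for v in nodes
        }
        for source, targets in links.items():
            if not targets:
                continue
            # Each node passes an equal share of its rank to what it cites.
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical citation graph: paper -> papers it cites.
citations = {
    "survey": ["seminal", "followup"],
    "followup": ["seminal"],
    "workshop": ["seminal", "followup"],
    "seminal": [],
}

scores = pagerank(citations)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy graph the paper that everything else cites comes out on top, 
which is the intuition behind reading the seminal papers first.  Whether a 
graph measure like this is the right notion of "importance" for the other 
examples is exactly the open question.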

What I'm looking for is help in researching this area.  Is there a name for 
this (sub-)field (importance theory? prioritization theory?), particularly 
within machine learning and NLP?  I realize some (most) of 
these problems can be solved with classifiers amongst other things like graph 
algorithms (particularly ones that use the social graph), but it also seems 
like the area is bigger than a particular implementation, so I wanted to hear 
what others thought.  How would you go about solving these problems?  Do you 
have any pointers to useful references on the subject (theoretical or 
practical)?  What other examples have you run up against?

Thanks,
Grant
