On 10/24/2010 8:42 PM, Aryeh Gregor wrote:
> My first thought was to write a GPU program to crack MediaWiki
> password hashes as quickly as possible, then use what we've studied in
> class about GPU architecture to design a hash function that would be
> as slow as possible to crack on a GPU relative to its PHP execution
> speed, as Tim suggested a while back. However, maybe there's
> something more interesting I could do.
Boring. I want Wikipedia converted into facts in a representation system that supports modal, temporal, and "microtheory" reasoning. You know, in the "real" world, :James_T_Kirk is a :Fictional_Character, but in the Star Trek universe, he's a :Person. Of course, you'd have to pick some chunk of that big task that's doable.

One thing I'd like is something that extracts the "meaning" of hyperlinks. For instance, if we look at

http://en.wikipedia.org/wiki/Bruce_Lee

we see a link to :Wong_Jack_Man, and in DBpedia right now this is represented as a unidirectional hyperlink without semantics. A smarter system could say :Bruce_Lee :Had_A_Fight_With :Wong_Jack_Man.

Although Wikipedia is relatively difficult text to work with using typical BOW and NLP methods, it has enough semantic structure that hybrid semantic-BOW/NLP methods ought to be able to work miracles. I think the way hyperlinks are used in text could be used to learn templates for detecting named entity references.

It also ought to be possible to build linguistic models for classification. For instance, if you're having trouble telling your Jaguars apart,

http://en.wikipedia.org/wiki/Jaguar_(disambiguation)

and related documents might help you make a filter that can tell the difference between "jaguar the cat" and "jaguar the car".

To make these three ideas concrete, I've appended some toy Python sketches below.
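First, the microtheory idea. A minimal sketch that abuses RDF named graphs as a crude stand-in for contexts; the example.org namespace and context URIs are made up, and real modal/temporal reasoning would obviously need far more machinery than this:

    from rdflib import ConjunctiveGraph, Namespace, URIRef
    from rdflib.namespace import RDF

    EX = Namespace("http://example.org/")  # made-up namespace

    store = ConjunctiveGraph()

    # One named graph per "microtheory": the real world vs. Star Trek.
    real_world = store.get_context(URIRef("http://example.org/ctx/RealWorld"))
    star_trek = store.get_context(URIRef("http://example.org/ctx/StarTrek"))

    real_world.add((EX.James_T_Kirk, RDF.type, EX.Fictional_Character))
    star_trek.add((EX.James_T_Kirk, RDF.type, EX.Person))

    # Same subject, different answer depending on which theory you ask.
    print(list(real_world.objects(EX.James_T_Kirk, RDF.type)))
    print(list(star_trek.objects(EX.James_T_Kirk, RDF.type)))

The point is just that "what is Kirk?" is only answerable relative to a context, which is exactly what a flat triple store doesn't give you.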
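Second, extracting link semantics. A toy sketch of the hyperlink-to-triple idea: the wikitext snippet is paraphrased, and the cue-to-predicate patterns are hand-written stand-ins for templates a real system would have to learn from many sentences:

    import re

    # Paraphrased miniature of the Bruce Lee article source.
    WIKITEXT = ("In 1964, Lee fought a controversial private match "
                "with [[Wong Jack Man]] in Oakland.")

    LINK_RE = re.compile(r"\[\[([^|\]]+)(?:\|[^\]]*)?\]\]")

    # Hand-written cue -> predicate templates; a real system would
    # learn these from many sentences instead of hard-coding them.
    PATTERNS = [
        (re.compile(r"\b(match|fight|fought)\b", re.I), ":Had_A_Fight_With"),
    ]

    def candidate_triples(subject, wikitext):
        """Yield (s, p, o) guesses: one per link whose surrounding
        sentence matches a cue pattern."""
        for sentence in re.split(r"(?<=[.!?])\s+", wikitext):
            for m in LINK_RE.finditer(sentence):
                obj = ":" + m.group(1).strip().replace(" ", "_")
                for cue, predicate in PATTERNS:
                    if cue.search(sentence):
                        yield (subject, predicate, obj)

    for triple in candidate_triples(":Bruce_Lee", WIKITEXT):
        print(triple)  # (':Bruce_Lee', ':Had_A_Fight_With', ':Wong_Jack_Man')

The interesting research problem is the part I hard-coded: inducing those cue patterns from the link structure itself.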
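Third, the Jaguar filter. A toy bag-of-words Naive Bayes classifier over two made-up training snippets; a real filter would train on the full text of the articles the disambiguation page points to:

    import math
    import re
    from collections import Counter

    def bag(text):
        return Counter(re.findall(r"[a-z]+", text.lower()))

    # Made-up training snippets; a real filter would train on the full
    # text of the articles the disambiguation page points to.
    TRAIN = {
        "cat": ["The jaguar is a large felid native to the Americas, "
                "a solitary predator that stalks prey in the rainforest."],
        "car": ["Jaguar is a British manufacturer of luxury cars and "
                "sports saloons, founded as the Swallow Sidecar Company."],
    }

    # Per-sense word counts, pooled over each sense's training texts.
    counts = {sense: sum((bag(t) for t in texts), Counter())
              for sense, texts in TRAIN.items()}
    vocab = set().union(*counts.values())

    def classify(text):
        """Naive Bayes with add-one smoothing over the tiny vocabulary."""
        scores = {}
        for sense, c in counts.items():
            total = sum(c.values()) + len(vocab)
            scores[sense] = sum(n * math.log((c[w] + 1.0) / total)
                                for w, n in bag(text).items())
        return max(scores, key=scores.get)

    print(classify("spotted jaguar prowling the rainforest"))           # cat
    print(classify("jaguar sports saloon with a supercharged engine"))  # car

Nothing deep here; the point is that the disambiguation page hands you labeled training data for free, which is the "semantic structure" I'm talking about exploiting.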