Hi all,

I'm in the process of writing a formal proposal for GSoC, but first I
want to test the waters on my idea and on what to focus my application.

At my college we use The Elements of Statistical Learning heavily, and
we sometimes delve into (even) more advanced techniques when they are
needed.

I have seen on JIRA that there is interest in using kernels (based on
this ticket: https://issues.apache.org/jira/browse/MAHOUT-597), though
there they are not used for localization as they are in my proposal.

My work would consist of two parts. The first is to add a kernel
smoothing implementation to the current implementations of k-NN. This
is useful for assigning weights to the different points in the
neighborhood (depending on each point's features), which makes a k-NN
classification much less prone to wiggling from one class to the other.
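
To make the kernel smoothing idea concrete, here is a minimal sketch in
plain Python (not Mahout code). It assumes a Gaussian kernel over
Euclidean distance; the function names and the bandwidth parameter are
only illustrative, and the real implementation could use a different
kernel.

```python
import math
from collections import defaultdict


def gaussian_kernel(distance, bandwidth):
    """Weight that decays smoothly as the distance grows (assumed kernel)."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def kernel_knn_classify(query, points, labels, k=3, bandwidth=1.0):
    """Classify `query` by a kernel-weighted vote among its k nearest points."""
    # Pair every training point's distance to the query with its label.
    scored = [(math.dist(query, p), label) for p, label in zip(points, labels)]
    # Keep only the k closest points.
    nearest = sorted(scored, key=lambda pair: pair[0])[:k]

    # Each neighbor votes with its kernel weight instead of a flat 1.
    votes = defaultdict(float)
    for distance, label in nearest:
        votes[label] += gaussian_kernel(distance, bandwidth)
    return max(votes, key=votes.get)


points = [(0.4, 0.0), (2.0, 0.0), (2.1, 0.0)]
labels = ["A", "B", "B"]
print(kernel_knn_classify((0.5, 0.0), points, labels, k=3, bandwidth=0.5))
```

With a small bandwidth the single very close "A" point outweighs the two
distant "B" points, whereas an unweighted 3-NN vote would pick "B".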

The second part, also aimed at expanding k-NN this summer, is
Locality-Sensitive Hashing (LSH), an algorithm for solving the
(approximate/exact) Near Neighbor Search problem in high-dimensional
spaces. LSH also works well as a form of dimension reduction, so it is
a good fit for situations where you have many dimensions and still want
accurate results.
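
As a rough illustration of the LSH idea, here is a small plain-Python
sketch (again, not Mahout code). It assumes the random-hyperplane
variant of LSH, which targets cosine similarity; the proposal itself
does not commit to a particular hash family, and all names below are
hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw `num_bits` random Gaussian hyperplanes in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def build_index(vectors, hyperplanes):
    """Group vector indices into buckets keyed by their bit signature."""
    buckets = defaultdict(list)
    for idx, vector in enumerate(vectors):
        buckets[signature(vector, hyperplanes)].append(idx)
    return buckets


def candidates(query, buckets, hyperplanes):
    """Return indices sharing the query's bucket (the near-neighbor candidates)."""
    return buckets.get(signature(query, hyperplanes), [])


planes = make_hyperplanes(num_bits=8, dim=4)
data = [[1.0, 2.0, 0.5, -1.0], [-1.0, -2.0, -0.5, 1.0], [0.9, 2.1, 0.4, -1.1]]
index = build_index(data, planes)
print(candidates([2.0, 4.0, 1.0, -2.0], index, planes))
```

Only the vectors in the query's bucket need an exact distance check,
which is where the savings over a full scan would come from. A real job
would typically use several hash tables to reduce missed neighbors.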

I will also have to integrate these small improvements to k-NN into
Hadoop jobs.

Thanks, and I hope to hear from you all.

Federico

-- 
Federico Brubacher
@fbru02
