[ https://issues.apache.org/jira/browse/MADLIB-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173113#comment-15173113 ]
ANISH SINGH commented on MADLIB-927:
------------------------------------

Hello Rahul Sir,

I'm Anish, a sophomore CSE student. Last winter, I decided to develop a share price prediction program and started work on it. I decided to use the Apache Spark ML libraries, but they did not contain a default implementation of the k-NN algorithm, and one has not been developed as of now. I have studied papers about the algorithm extensively and find myself in a suitable position to work on this project for the entire summer. I would like to request further guidance on the issue so that I can study more about it and draw up my proposal. Completing the project would also help my earlier attempts at the share price prediction program.

Thank you.

> Initial implementation of k-NN
> ------------------------------
>
>                 Key: MADLIB-927
>                 URL: https://issues.apache.org/jira/browse/MADLIB-927
>             Project: Apache MADlib
>          Issue Type: New Feature
>            Reporter: Rahul Iyer
>              Labels: gsoc2016, starter
>
> k-Nearest Neighbors is a very simple algorithm that is based on finding
> nearest neighbors of data points in a metric feature space according to a
> specified distance function. It is considered one of the canonical algorithms
> of data science. It is a nonparametric method, which makes it applicable to a
> lot of real-world problems, where the data doesn’t satisfy particular
> distribution assumptions. Also, it can be implemented as a lazy algorithm,
> which means there is no training phase where information in the data is
> condensed into coefficients, but there is a costly testing phase where all
> data is used to make predictions.
> This JIRA involves implementing the naïve approach, i.e. compute the k
> nearest neighbors by going through all points.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
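
For reference, the naïve approach named in the issue description (scan every point, keep the k closest) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not MADlib code or its API: the function names `euclidean` and `knn_predict` are hypothetical, Euclidean distance stands in for the "specified distance function", and a majority vote over the neighbors' labels is assumed because the issue does not say how the neighbors are to be used.

```python
import heapq
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k, distance=euclidean):
    """Brute-force k-NN: measure the distance from `query` to every training
    point, keep the k smallest, and return the most common label among them.

    Hypothetical helper for illustration only; not part of MADlib.
    """
    if len(train_points) != len(train_labels):
        raise ValueError("train_points and train_labels must have the same length")
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")

    # Lazy algorithm: there is no training step, all work happens at query time.
    scored = (
        (distance(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Select the k pairs with the smallest distance after scanning all points.
    nearest = heapq.nsmallest(k, scored, key=lambda pair: pair[0])

    # Majority vote over the labels of the k nearest neighbors (assumed use).
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    points = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5), (5, 6)]
    labels = ["a", "a", "a", "b", "b", "b"]
    print(knn_predict(points, labels, (0.5, 0.5), k=3))
```

Each query costs one distance computation per stored point, which is the "costly testing phase" the description mentions.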