[ 
https://issues.apache.org/jira/browse/MAHOUT-332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844392#action_12844392
 ] 

Kay Kay commented on MAHOUT-332:
--------------------------------

{quote}
must explore SQL and NOSQL implementations, and design a framework with which 
data from them could be fetched and converted to mahout format or used directly 
as a matrix transparently
{quote}

It would be useful to use Thrift as the protocol for talking to the NoSQL 
systems, rather than each system's native API. That way a single abstraction 
could cover NoSQL systems in general, with store-specific Thrift client 
implementations added underneath to maximize code reuse. Even if someone ports 
only one NoSQL client at first, having that demarcation would make it easier 
for others to pick up and port the rest. 
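
To make the demarcation concrete, here is a rough sketch of how the split 
could look. Everything in it is hypothetical: RowStoreClient, 
InMemoryRowStoreClient and RowVectors are illustrative names, not existing 
Mahout or Thrift classes. The in-memory client only stands in for a 
store-specific client that would wrap a Thrift-generated stub, and the plain 
double[] stands in for whatever vector or matrix type Mahout would use.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical store-neutral contract: one sparse row as column index -> value. */
interface RowStoreClient {
    /** Returns the non-zero cells of a row, or an empty map if the row is absent. */
    Map<Integer, Double> fetchRow(String table, long rowId);
}

/**
 * In-memory stand-in for a store-specific client. A real HBase or Cassandra
 * client would implement the same interface on top of its Thrift stub
 * (assumption: no actual Thrift calls are shown here).
 */
class InMemoryRowStoreClient implements RowStoreClient {
    private final Map<String, Map<Long, Map<Integer, Double>>> tables = new HashMap<>();

    void put(String table, long rowId, int column, double value) {
        tables.computeIfAbsent(table, t -> new HashMap<>())
              .computeIfAbsent(rowId, r -> new TreeMap<>())
              .put(column, value);
    }

    @Override
    public Map<Integer, Double> fetchRow(String table, long rowId) {
        Map<Long, Map<Integer, Double>> rows = tables.get(table);
        if (rows == null) {
            return Collections.<Integer, Double>emptyMap();
        }
        Map<Integer, Double> row = rows.get(rowId);
        if (row == null) {
            return Collections.<Integer, Double>emptyMap();
        }
        return Collections.unmodifiableMap(row);
    }
}

/** Store-independent code: depends only on the interface, never on a concrete store. */
final class RowVectors {
    private RowVectors() {}

    /** Copies a fetched row into a dense array; columns outside [0, cardinality) are ignored. */
    static double[] toDense(RowStoreClient client, String table, long rowId, int cardinality) {
        double[] dense = new double[cardinality];
        for (Map.Entry<Integer, Double> cell : client.fetchRow(table, rowId).entrySet()) {
            int column = cell.getKey();
            if (column >= 0 && column < cardinality) {
                dense[column] = cell.getValue();
            }
        }
        return dense;
    }
}
```

With this shape, porting to another NoSQL system would only mean writing one 
more RowStoreClient implementation; code such as RowVectors.toDense would not 
need to change.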

> Create adapters for  MYSQL and NOSQL(hbase, cassandra) to access data for all 
> the algorithms to use
> ---------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-332
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-332
>             Project: Mahout
>          Issue Type: New Feature
>            Reporter: Robin Anil
>
> A student with a good proposal 
> - should be free to work for Mahout in the summer and should be thrilled to 
> work in this area :)
> - should be able to program in Java and be comfortable with data structures 
> and algorithms
> - must explore SQL and NOSQL implementations, and design a framework with 
> which data from them could be fetched and converted to mahout format or used 
> directly as a matrix transparently
> - should have a plan to make it high performance with ample caching 
> strategies or the ability to use it on a map/reduce job
> - should focus more on getting a working version than on implementing every 
> feature, so it's recommended that you divide features into milestones
> - must have clear deadlines and pace it evenly across the span of 3 months.
> If you can do something extra it counts, but make sure the plan is reasonable 
> within the specified time frame.
