[ 
https://issues.apache.org/jira/browse/HAMA-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272228#comment-13272228
 ] 

Mikalai Parafeniuk commented on HAMA-524:
-----------------------------------------

Hello.
Thanks for choosing me for this project. I am glad to announce that the work 
has started. First of all, you can find my implementation timeline here: 
https://google-melange.appspot.com/gsoc/project/google/gsoc2012/mikalaj/18001

Some ideas and questions about the project as a whole:
1) I propose creating a separate package, 
org.apache.hama.examples.matrix.sparse, for this project and other 
matrix-computation code. I think this will be useful because otherwise the 
other examples will be hard to find among my classes.
2) Should I use the Mahout package for its existing sparse matrix format 
implementations, or write my own implementation based on their approach? I 
think creating the formats and writables won't take much time, and it will 
bring fewer dependencies into the Hama project.

I am now working on abstract interfaces for the different matrix formats. 
Here are the basic concepts:
1) Create a class MatrixCell with the following fields: row, column, value.
2) Create an interface MatrixFormat. It will return an iterator over 
MatrixCell objects and will also support adding a MatrixCell.
3) Custom formats such as CRS and CCS should implement MatrixFormat.
4) Create an interface Converter for conversion between formats.
5) Create a class BaseConverter that converts between any two formats without 
exploiting their internal data structures: it iterates over the MatrixCells of 
one matrix and adds each of them to the other.
6) Other converters will implement Converter and will exploit the internal 
data structures of the formats. To register a custom converter between two 
formats I will use configuration: each entry will consist of the class name of 
the input format, the class name of the output format, and the class name of 
the converter.
7) Use reflection at conversion time. If a custom converter is registered for 
the two formats, it will be used. Otherwise BaseConverter will be used.
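
As an illustration of items 1-5, here is a minimal Java sketch. It is only one 
possible reading of the list above, not the final design: the method names 
(add, convert), the field types, and the toy ListMatrix format are assumptions 
made for the example; real CRS/CCS classes would take the place of ListMatrix.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class SparseFormatSketch {

  /** Item 1: one stored entry of a matrix. Field types are assumed. */
  static final class MatrixCell {
    final int row;
    final int column;
    final double value;

    MatrixCell(int row, int column, double value) {
      this.row = row;
      this.column = column;
      this.value = value;
    }
  }

  /** Item 2: every format can iterate over its cells and accept new ones. */
  interface MatrixFormat extends Iterable<MatrixCell> {
    void add(MatrixCell cell);
  }

  /** Item 4: conversion from one format into another. */
  interface Converter {
    void convert(MatrixFormat source, MatrixFormat target);
  }

  /** Item 5: format-agnostic fallback that copies cell by cell. */
  static final class BaseConverter implements Converter {
    @Override
    public void convert(MatrixFormat source, MatrixFormat target) {
      for (MatrixCell cell : source) {
        target.add(cell);
      }
    }
  }

  /** Hypothetical toy format (item 3 stand-in): cells kept in a list. */
  static final class ListMatrix implements MatrixFormat {
    private final List<MatrixCell> cells = new ArrayList<>();

    @Override
    public void add(MatrixCell cell) {
      cells.add(cell);
    }

    @Override
    public Iterator<MatrixCell> iterator() {
      return cells.iterator();
    }

    int size() {
      return cells.size();
    }
  }

  public static void main(String[] args) {
    ListMatrix source = new ListMatrix();
    source.add(new MatrixCell(0, 2, 1.5));
    source.add(new MatrixCell(3, 1, -4.0));

    ListMatrix target = new ListMatrix();
    new BaseConverter().convert(source, target);
    System.out.println(target.size() + " cells copied");
  }
}
```

Because BaseConverter relies only on iteration and add, it works for any pair 
of formats, at the cost of ignoring whatever layout the formats use internally.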

I think this approach keeps the code independent of the internal data format 
and makes it quick to add a new format. Any questions or suggestions about 
this idea?
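
To make the registration and lookup idea from items 6 and 7 concrete, here is 
a hedged Java sketch. Everything in it is hypothetical: a plain Map stands in 
for the real configuration, the "input->output" key layout is an assumption, 
and the empty CrsMatrix, CcsMatrix, DenseMatrix and converter classes exist 
only so the example is self-contained.

```java
import java.util.HashMap;
import java.util.Map;

final class ConverterLookupSketch {

  /** Marker interface standing in for the proposed Converter. */
  interface Converter { }

  /** Fallback used when nothing is registered for a pair of formats. */
  static class BaseConverter implements Converter { }

  /** Hypothetical custom converter that would exploit CRS/CCS internals. */
  static class CrsToCcsConverter implements Converter { }

  // Hypothetical placeholder format classes.
  static class CrsMatrix { }
  static class CcsMatrix { }
  static class DenseMatrix { }

  // Stand-in for configuration: "input->output" mapped to converter class name.
  private final Map<String, String> registry = new HashMap<>();

  /** Item 6: register a converter by the three class names. */
  void register(String inputFormat, String outputFormat, String converter) {
    registry.put(inputFormat + "->" + outputFormat, converter);
  }

  /** Item 7: instantiate the registered converter reflectively, else fall back. */
  Converter lookup(Class<?> input, Class<?> output) {
    String name = registry.get(input.getName() + "->" + output.getName());
    if (name == null) {
      return new BaseConverter();
    }
    try {
      return (Converter) Class.forName(name)
          .getDeclaredConstructor()
          .newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate " + name, e);
    }
  }

  public static void main(String[] args) {
    ConverterLookupSketch sketch = new ConverterLookupSketch();
    sketch.register(CrsMatrix.class.getName(), CcsMatrix.class.getName(),
        CrsToCcsConverter.class.getName());

    // A registered pair resolves to the custom converter.
    System.out.println(
        sketch.lookup(CrsMatrix.class, CcsMatrix.class).getClass().getSimpleName());
    // An unregistered pair falls back to BaseConverter.
    System.out.println(
        sketch.lookup(CrsMatrix.class, DenseMatrix.class).getClass().getSimpleName());
  }
}
```

Storing class names rather than instances means a new converter can be added 
through configuration alone, and it is only loaded when that pair of formats 
is actually converted.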

Some general questions:
1) I want to share snapshots of my code about once a week, but release 
patches only when a piece of work is complete. Is it a good idea to use git 
for this purpose?
                
> [GSoC 2012] Sparse Matrix-Vector multiplication (SpMV) on Hama
> --------------------------------------------------------------
>
>                 Key: HAMA-524
>                 URL: https://issues.apache.org/jira/browse/HAMA-524
>             Project: Hama
>          Issue Type: New Feature
>          Components: bsp core, examples, math
>            Reporter: Edward J. Yoon
>            Assignee: Mikalai Parafeniuk
>              Labels: gsoc, gsoc2012, newbie
>
> Implement Efficient and Fast SpMV algorithm which can be widely used in 
> scientific computing, financial modeling, information retrieval, and others, 
> using Hama Bulk Synchronous Parallel framework.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

