[
https://issues.apache.org/jira/browse/HAMA-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272228#comment-13272228
]
Mikalai Parafeniuk commented on HAMA-524:
-----------------------------------------
Hello.
Thanks for choosing me for this project. I am glad to announce that the work
has started. First of all, you can find some information about my
implementation timeline here:
https://google-melange.appspot.com/gsoc/project/google/gsoc2012/mikalaj/18001
Some ideas and questions about the entire project:
1) I propose creating a separate package,
org.apache.hama.examples.matrix.sparse, for this project and other matrix
computing code. I think it will be useful because otherwise it will be hard to
find the other examples among my classes.
2) Should I use the Mahout package for its existing implementations of sparse
matrix formats, or create my own implementation based on their approach? I
think it won't take much time to create the formats and writables, and fewer
dependencies will be brought into the Hama project.
Now I am working on creating abstract interfaces for the different matrix
formats. Here are some basic concepts:
1) Create a class MatrixCell. It will have the following fields: row, column,
value.
2) Create an interface MatrixFormat. It will provide an iterator over
MatrixCell objects and will also support adding a MatrixCell.
3) Custom formats such as CRS and CCS should implement MatrixFormat.
4) Create an interface Converter for conversion between different formats.
5) Create a class BaseConverter which converts between any two formats without
exploiting internal data structures: it iterates over the MatrixCell objects of
one matrix and adds them to another.
6) Other converters will implement Converter and will exploit the internal data
structures of the formats. To register a custom converter between two formats I
will use configuration: each entry will consist of the class name of the input
format, the class name of the output format, and the class name of the
converter.
7) Use reflection at conversion time. If a custom converter for the two formats
can be found, it will be used. Otherwise BaseConverter will be used.
I think this approach does not depend on the internal data format and makes it
possible to add a new data format quickly. Any questions or suggestions about
this idea?
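To make the proposal concrete, here is a minimal sketch of points 1 to 7 in
Java. It is only an illustration under assumptions, not the final design:
CoordinateMatrix, ConverterRegistry and all method names (add, convert,
register, lookup) are hypothetical, and an in-memory map stands in for the
configuration entries mentioned in point 6.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** One entry of a sparse matrix: row, column, value (point 1). */
final class MatrixCell {
  final int row;
  final int column;
  final double value;

  MatrixCell(int row, int column, double value) {
    this.row = row;
    this.column = column;
    this.value = value;
  }
}

/** Common view of every storage format: iterate cells, add cells (point 2). */
interface MatrixFormat extends Iterable<MatrixCell> {
  void add(MatrixCell cell);
}

/** Hypothetical trivial format (a plain list of cells), used only for the demo. */
class CoordinateMatrix implements MatrixFormat {
  private final List<MatrixCell> cells = new ArrayList<>();

  @Override public void add(MatrixCell cell) { cells.add(cell); }
  @Override public Iterator<MatrixCell> iterator() { return cells.iterator(); }
}

/** Conversion between two formats (point 4). */
interface Converter {
  void convert(MatrixFormat source, MatrixFormat target);
}

/** Format-agnostic fallback: copies cell by cell through the interface (point 5). */
class BaseConverter implements Converter {
  @Override public void convert(MatrixFormat source, MatrixFormat target) {
    for (MatrixCell cell : source) {
      target.add(cell);
    }
  }
}

/** Assumed registry: (input class name, output class name) -> converter class name (point 6). */
class ConverterRegistry {
  private final Map<String, String> converters = new HashMap<>();

  void register(String inputClass, String outputClass, String converterClass) {
    converters.put(inputClass + "->" + outputClass, converterClass);
  }

  /** Instantiates a registered converter by reflection, else falls back (point 7). */
  Converter lookup(MatrixFormat source, MatrixFormat target) {
    String key = source.getClass().getName() + "->" + target.getClass().getName();
    String name = converters.get(key);
    if (name == null) {
      return new BaseConverter();
    }
    try {
      return (Converter) Class.forName(name).getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate converter " + name, e);
    }
  }
}
```

With this sketch, registry.lookup(source, target).convert(source, target)
copies every cell through the generic interface unless a specialised converter
has been registered for that pair of format classes.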
Some general questions:
1) I want to share snapshots of my code about once a week, but I plan to
release patches only when a piece of work is complete. Is it a good idea to
use git for this purpose?
> [GSoC 2012] Sparse Matrix-Vector multiplication (SpMV) on Hama
> --------------------------------------------------------------
>
> Key: HAMA-524
> URL: https://issues.apache.org/jira/browse/HAMA-524
> Project: Hama
> Issue Type: New Feature
> Components: bsp core, examples, math
> Reporter: Edward J. Yoon
> Assignee: Mikalai Parafeniuk
> Labels: gsoc, gsoc2012, newbie
>
> Implement Efficient and Fast SpMV algorithm which can be widely used in
> scientific computing, financial modeling, information retrieval, and others,
> using Hama Bulk Synchronous Parallel framework.