Hello 张亚

On 03/05/13 17:42, 张亚 wrote:
> I want to implement a Hadoop-based back-end for SIS during the GSoC 2013
> period, and have submitted a proposal.
> Any suggestion will be welcome.

I would like to help, but it is not yet clear to me which part of SIS could be the subject of Hadoop work... Hadoop is a framework that allows for the distributed processing of large data sets across clusters of computers. The problem is that there is not yet (to my knowledge) any process in SIS that is numerically intensive enough for experimenting with Hadoop. Those processes will exist later, but are not there yet.

Actually we experimented with Hadoop on our side about 2 or 3 years ago. We had a student who worked on that subject. His work was to rotate a tiled image, where each tile was processed by a different node of a cluster. Of course an image rotation is a relatively simple process, but the goal was to experiment with distributed processing rather than to do "real" work. The experiment was not very conclusive, in part because we tried to perform the rotation with Java Advanced Imaging (JAI) and it was pretty hard to use JAI with Hadoop (JAI was not designed for that), and in part because the time needed for transferring large tiles between the nodes (even with ultra-fast transfers) outweighed the gain of using many nodes for such a "simple" task as image rotation. So we gained just enough experience to conclude that this is a challenging topic.

But on the "SIS + Hadoop" topic, it seems to me that before trying Hadoop we first need to have some SIS processes. Would creating those processes be part of the Google Summer of Code? If so, it seems to me that this work alone could keep someone busy for the whole summer...

Regards,

Martin
