[5] is another paper that I just went through.

[5] -
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.140.3264&rep=rep1&type=pdf

Thanks,
Danushka

On Tue, Sep 25, 2012 at 5:40 AM, Danushka Menikkumbura <
[email protected]> wrote:

> Hi all,
>
> I am a student of the 2012 M.Sc. (CS) batch of the University of Moratuwa,
> Sri Lanka. Big data is one of my research areas, and I am currently
> looking into possibilities and challenges in bringing in big data
> capabilities to science gateways under the supervision of Dr. Shahani
> Weerawarana. From what I have gathered so far, I understand that Airavata
> is currently weak in this area.
>
> Basically, support for big data in Airavata could take several shapes.
>
> 1. Simply make big data techniques available during workflow execution.
> This could be in the form of MapReduce (Hadoop), BigTable data models
> (Cassandra), etc. The idea is to handle huge data volumes as mentioned in
> [1]. (e.g. the 700 TB/sec data flood off the SKA [2] in the near future).
>
> 2. Using a big-data-ready distributed filesystem as the core filesystem of
> Airavata (e.g. HDFS) and making it available across the framework.
>
> 3. Addressing challenges related to data provenance [3], [4].
>
> I believe these perspectives give a clearer view of Airavata, and perhaps
> you have already put thought into these aspects.
>
> Please share your thoughts and help me understand what I should actually
> look into.
>
> [1] - http://www.slideshare.net/Hadoop_Summit/big-data-challenges-at-nasa
> [2] - http://en.wikipedia.org/wiki/Square_Kilometre_Array
> [3] - http://rac.uits.iu.edu/sites/default/files/SimmhanICWS06.pdf
> [4] - http://bit.ly/PC2Eq4
>
> Thanks,
> Danushka
>
>
>