[ https://issues.apache.org/jira/browse/AIRAVATA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385595#comment-14385595 ]
pankaj saha commented on AIRAVATA-1646:
---------------------------------------

Hi Chris,
I haven't considered it yet, but I would definitely be interested in interacting with the mentors during the community bonding phase regarding the best way to incorporate OODT.

> [GSoC] Brainstorm Airavata Data Management Needs
> ------------------------------------------------
>
>                 Key: AIRAVATA-1646
>                 URL: https://issues.apache.org/jira/browse/AIRAVATA-1646
>             Project: Airavata
>          Issue Type: Brainstorming
>            Reporter: Suresh Marru
>              Labels: gsoc, gsoc2015, mentor
>
> Currently Airavata focuses on Execution Management, and the Registry
> Sub-System (with app, resource and experiment catalogs) captures metadata
> about applications and executions. There were a few efforts (primarily from
> student projects) to explore the gap this leaves in data management. It will
> be good to concretely propose data management solutions for input data
> registration, input and generated data retrieval, data transfers and
> replication management.
>
> Metadata Catalog: In addition, current metadata management is based on
> shredding Thrift data models into MySQL/Derby schemas. This is described in
> [1]. We have discussed using object store databases extensively, and
> concluded that the requirements need to be understood more systematically. A
> good standalone task would be to understand current metadata management and
> propose alternative solutions with proof-of-concept implementations. Once the
> community is convinced, we can then plan on implementing them in production.
>
> Provenance: Airavata could be enhanced to capture provenance to organize the
> data for reuse, discovery, comparison and sharing. This is a well-explored
> field, and there might be compelling third-party solutions. In particular, it
> will be good to explore the big data space and identify what can be leveraged
> (either concepts or, even better, implementations).
>
> Auditing and Traceability: As Airavata mediates executions on behalf of
> gateways, it has to strike a balance between abstracting the compute resource
> interactions and, at the same time, providing a transparent execution trace.
> This will bloat the amount of data to be catalogued. A good effort will be to
> understand the current extent of Airavata audits and provide suggestions.
>
> BigData Leverage: Airavata needs to leverage the influx of tools in this
> space. Any suggestions on relevant tools which will enhance the Airavata
> experience will be a good fit.
>
> [1] - https://cwiki.apache.org/confluence/display/AIRAVATA/Airavata+Data+Models+0.12
> [2] - http://markmail.org/thread/4lguliiktjohjmsd

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)