[
https://issues.apache.org/jira/browse/AIRAVATA-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17941654#comment-17941654
]
Vaibhav Sharma commented on AIRAVATA-3965:
------------------------------------------
Hello, I would like to set up a call to discuss some questions about the
project statement, as well as any relevant papers or prior work that could help
improve the proposal. Please let me know a time that works best for everyone.
Thank you.
> Facilitating computational experiment generation in AIRAVATA
> ------------------------------------------------------------
>
> Key: AIRAVATA-3965
> URL: https://issues.apache.org/jira/browse/AIRAVATA-3965
> Project: Airavata
> Issue Type: New Feature
> Reporter: Giri Krishnan
> Priority: Major
> Labels: gsoc, gsoc2025, mentor
>
> Computational sciences involve extensive experimentation, which often means
> searching over a space of parameters, variables, functions, and workflows.
> Individual researchers and groups often perform a large number of such
> searches to identify critical functional forms and workflows for any
> particular study. The goal of this work is to provide a tool that facilitates
> this search process. The tool will enable visualizing past experiments,
> identifying or learning templates, and generating potential experiments based
> on past experiments using LLM and neurosymbolic methods.
>
> This task has the following specific goals:
> # Provide visualization of past computational experiments: Tracking
> computational experiments and their many variations is often a challenging
> problem for individuals and groups of researchers. Various ad hoc
> approaches (directories, git, etc.) are used to track these changes, but
> it is often very difficult to get an overview of all past experiments. The
> goal of this work is to develop a visualization approach that allows users to
> examine all past experiments. This will require dimensionality reduction
> on embeddings from LLMs that have been tested on their code generation
> abilities (e.g. codellama, Llama 4 Maverick) to generate the visualizations.
> A comparison of performance against standard code clone detection and
> similarity measures will also be required.
> # Identify templates based on the past experiment database: It is common for
> several computational experiments to share a common structure. In such cases,
> identifying the 'template' makes it possible to recognize common approaches in
> past experiments and to generate new ones. This work will need software
> engineering approaches and AI-based approaches to identify such templates. The
> templates will also be integrated with the visualization (in addition to the
> embedding-based visualization), allowing users to examine the collections of
> experiments that belong to each template.
> # Generate new suggested experiments using templates and visualization-guided
> search: Generation of new experiments is a key component of
> computational science work. Facilitating this process will require a visual,
> interactive way to generate experiments both within the regions of previous
> experiments and in regions of the space that have not yet been explored. In
> addition, this will require generation of new experiments based on the
> templates identified in the previous step. Template-based generation could
> also provide a verifiable way to generate experiments.
>
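To make goals 2 and 3 in the description above concrete, here is a minimal sketch, assuming past experiments are stored as short text scripts. It uses a plain token-level diff from Python's standard library in place of the LLM and neurosymbolic methods the issue proposes, so it only illustrates the idea. The function names (tokenize, extract_template, generate_unexplored) and the two example scripts are hypothetical and are not part of Airavata.

```python
import difflib
import itertools
import re
from string import Template


def tokenize(src):
    # Split into whitespace runs, identifiers, numbers, and single characters.
    # Joining the tokens reproduces the original text exactly.
    return re.findall(r"\s+|[A-Za-z_]\w*|\d+(?:\.\d+)?|.", src)


def extract_template(a, b):
    """Return (template, slots) for two experiment scripts (hypothetical helper).

    Tokens shared by both scripts become fixed template text; each differing
    region becomes a placeholder ${p0}, ${p1}, ... whose observed values are
    recorded in `slots` as (value_in_a, value_in_b).
    """
    ta, tb = tokenize(a), tokenize(b)
    matcher = difflib.SequenceMatcher(None, ta, tb, autojunk=False)
    parts, slots = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Escape "$" so string.Template treats it as literal text.
            parts.append("".join(ta[i1:i2]).replace("$", "$$"))
        else:
            parts.append("${p%d}" % len(slots))
            slots.append(("".join(ta[i1:i2]), "".join(tb[j1:j2])))
    return "".join(parts), slots


def generate_unexplored(template, slots, past):
    """Fill the template with every combination of observed slot values and
    keep only the combinations that do not appear in `past`."""
    names = ["p%d" % i for i in range(len(slots))]
    new = []
    for combo in itertools.product(*slots):
        text = Template(template).substitute(dict(zip(names, combo)))
        if text not in past:
            new.append(text)
    return new


# Two made-up past experiments that differ only in their parameters.
exp_a = "model = train(lr=0.01, layers=3)\n"
exp_b = "model = train(lr=0.1, layers=5)\n"

template, slots = extract_template(exp_a, exp_b)
new_experiments = generate_unexplored(template, slots, {exp_a, exp_b})

print(template, end="")
for exp in new_experiments:
    print(exp, end="")
```

For the two example scripts, the recovered template is "model = train(lr=${p0}, layers=${p1})", and the generated experiments are the two parameter combinations that neither past script used. Because every generated experiment is an instantiation of a known template, it can be checked against that template, which is one simple reading of the "verifiable" generation mentioned in goal 3.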
--
This message was sent by Atlassian Jira
(v8.20.10#820010)