Hi Abhinav,

Can you post your proposal to this JIRA and also start drafting it on the GSoC 
portal?

https://issues.apache.org/jira/browse/AIRAVATA-3608

Suresh

On Apr 15, 2022, at 12:41 PM, Abhinav Sinha 
<[email protected]<mailto:[email protected]>> wrote:

Hello Dev!

I’ve attached the first draft of my project proposal for GSoC 2022. I’d love 
for you to take a look and suggest any improvements/changes.

Thanks,
Abhinav

From: Abhinav Sinha <[email protected]<mailto:[email protected]>>
Date: Sunday, April 3, 2022 at 7:48 AM
To: Airavata Dev <[email protected]<mailto:[email protected]>>
Cc: Ranawaka, Isuru Janith <[email protected]<mailto:[email protected]>>, Marru, 
Suresh <[email protected]<mailto:[email protected]>>
Subject: Apache Custos for GSoC 2022
Hi,

I had fruitful discussions with Isuru last week.


  1.  We started off with an overview of the Custos Portal. Isuru gave me a 
demo run of a sample use case. We went over the authentication process -> 
authorization tiers -> tenant creation -> role-based configurations. As we 
went through the demo, Isuru explained the different features of the 
application.

(Following our discussion, I went through the tutorial here to recap: 
https://cwiki.apache.org/confluence/display/CUSTOS/Tutorial+Steps+for+Reference+Portal+and+Custos+Portal)


  2.  Isuru gave me a brief summary of the Custos architecture as well. To 
build on my understanding, I had earlier read the following papers suggested 
by Suresh:

https://dl.acm.org/doi/pdf/10.1145/3311790.3396635
https://arxiv.org/pdf/2107.04172.pdf


  3.  Then we went over the current deployment architecture. Here, Isuru 
demoed the Kubernetes cluster deployment, and we looked at the various 
components and their config files. We also went over the Keycloak IDP and 
HashiCorp secret storage.

(I went through the guide here to recap our discussion: 
https://cwiki.apache.org/confluence/display/CUSTOS/Custos+Deployment+Architecture+and+Installation+Guide)

After the demos, we discussed the following open items that I could possibly 
work on as part of GSoC.


  1.  An important missing piece in Custos today is external data backups (as 
mentioned in the documentation here: 
https://cwiki.apache.org/confluence/display/CUSTOS/Custos+Deployment+Architecture+and+Installation+Guide).
Isuru suggested using Velero (an open source tool for backing up Kubernetes 
resources) to create database backups.

I am going over the documentation here to understand how to use Velero and 
come up with a plan to implement the data backup feature: 
https://velero.io/docs/v1.8/how-velero-works/


  2.  Another open item is creating a Custos Operator 
(https://github.com/apache/airavata-custos/issues/149). The goal here is to 
automate deployments. Isuru briefly went over the Kubernetes Operator pattern 
(https://kubernetes.io/docs/concepts/extend-kubernetes/operator/), which could 
provide an abstract resource to manage deployments for all of the Custos 
microservices. (Current deployments are done using a Maven task.)

As per Isuru’s suggestion, I am going over the Keycloak operator 
(https://github.com/keycloak/keycloak-operator) as a guide to implementing 
something similar for Custos.


  3.  Finally, Isuru highlighted the need for an intelligent way to divide 
microservices across deployment configs in order to boost performance and 
reduce memory consumption.

Currently, in Custos, this composition is based on the functions served by the 
microservices, so we have 2 major units in the deployments: Core services and 
Integration services. This division approach isn’t very scalable. A better 
approach could be based on resource utilization (and other carefully designed 
heuristics, such as affinity, as discussed in this paper: 
https://ieeexplore.ieee.org/document/6531761). I am also going over this paper, 
which discusses the same problem: 
https://jisajournal.springeropen.com/articles/10.1186/s13174-019-0104-0

I would love to work on any of the 3 open items, but I would appreciate your 
input on which of these would make a good GSoC project.

#1: Isuru pointed out that data backup is a high-priority item for Custos at 
the moment, but it may or may not be an ideal choice for a full-fledged GSoC 
project.

#2: Building a Custos operator, unlike the data backup feature, is a 
full-fledged project in itself. It needs expert understanding of the Kubernetes 
Operator pattern, which I am starting to explore.

#3: I’d love to explore the open research problem of microservice placement, 
but I’d like to hear your opinion on whether this could be a good project. It 
closely aligns with my goal of exploring a research problem as part of my 
Master’s thesis project.
Thanks,
Abhinav
<GSoC_Proposal.pdf>
