Hi Jayan,

Thanks for sharing. One question: the airavata-data-catalog already has a DATA_PRODUCT table and a way to store a data product's metadata. Could that be used instead of adding a new table?
Or, more generally, how does this replica catalog API relate to the data catalog API/data model?

Thanks,

Marcus

> On Mar 31, 2023, at 4:11 PM, Jayan Vidanapathirana <jcvidanapathir...@gmail.com> wrote:
>
> Hi All,
>
> I have implemented the basic flow (simple create and retrieve) of the replica catalog and drafted a pull request [1] to the Airavata data catalog as a new module. Based on that implementation, I have arrived at the following database structure for the replica catalog, and I would greatly appreciate your thoughts and feedback on the designs [2]. At this stage, only the S3 storage type has been considered, as a sample.
>
> <Replica Catalog V2.drawio (1).png>
>
> Also, please refer to the following Google doc [3] to review the implemented APIs.
>
> [1] https://github.com/apache/airavata-data-catalog/pull/28
> [2] https://drive.google.com/file/d/1KP-8IWdvpPvjSWUG2t41K7WQXW2f9_qN/view?usp=sharing
> [3] https://docs.google.com/document/d/1U-ok1ICt_EmjjxR9UuACV6g6YYkgoADfm0ECJ9ZDI3k/edit?usp=sharing
>
> Thank you.
>
> On Mon, Mar 20, 2023 at 2:20 AM Suresh Marru <sma...@apache.org> wrote:
> Hi Jayan,
>
> Can you contribute a PR to the data catalog repo so we can keep the feedback on that issue?
>
> Thanks for your contribution,
> Suresh
>
>> On Mar 19, 2023, at 12:55 PM, Jayan Vidanapathirana <jcvidanapathir...@gmail.com> wrote:
>>
>> Hi All,
>>
>> I have updated the draft code base [1] with a simple workflow for adding data to the replica catalog. The services are not yet finalized and will be enhanced along with the workflow.
>>
>> [1] https://github.com/Jayancv/airavata-replica-catalog
>>
>> Thanks.
>>
>> On Sat, Feb 25, 2023 at 4:21 PM Jayan Vidanapathirana <jcvidanapathir...@gmail.com> wrote:
>> Hi Dimuthu and Marcus,
>>
>> Thank you both for checking my PoC and providing valuable feedback.
>>
>> Dimuthu,
>> • I agree with you regarding replica location categories. It should be a data catalog level attribute.
>> • To manage replica data access permissions, don't we need user information at the replica catalog level? I'm a bit confused about the permission management side of this catalog.
>> • ReplicaListEntry - Added to expose the list of DataReplicaLocations with basic details in AllDataReplicaGetResponse, which provides all the replica items for the given product_id. However, here I was not considering the hierarchical structure. ReplicaGroupEntry is actually a single product replica that holds the file structure of the replica data. Following your suggestion, we can model AllDataReplicaGetResponse as follows:
>>
>> message AllDataReplicaGetResponse {
>>     string data_product_id = 1;
>>     repeated ReplicaGroupEntry replica_list = 2;
>> }
>>
>> message ReplicaGroupEntry {
>>     string replica_group_id = 1;
>>     repeated ReplicaGroupEntry directories = 2;
>>     repeated DataReplicaLocation files = 3;
>> }
>>
>> Marcus,
>> • Yes, I will remove replica_id from the data catalog diagram.
>> • I added parent_data_product_id to the replica data by considering the full context of the relation between the replica catalog and the data catalog. But within the replica catalog context there is no such parent product relationship. Therefore we can rename it to data_product_id. Thanks for pointing this out.
>>
>> Thanks.
>>
>> On Thu, Feb 23, 2023 at 2:48 AM Christie, Marcus Aaron <machr...@iu.edu> wrote:
>> Hi Jayan,
>>
>> I would like to echo Dimuthu and say that this looks great, and I appreciate your effort in pulling this all together. I have some feedback to share.
>>
>> The high-level architecture diagram shows the replica id being stored in the data catalog. That was an initial idea that we had, but we decided that the replica catalog would store the data product id. That seems reflected in your API design so I think you already know this, but I wanted to point it out since the diagram might be a little confusing for others.
>>
>> In ReplicaCatalogAPI.proto, the name of the data product id field is "parent_data_product_id". I would suggest calling it "data_product_id" instead. "parent_data_product_id" means "the id of the parent data product of this data product" in the data catalog. It might be confusing to use the same name in ReplicaCatalogAPI.proto.
>>
>>
>> Thanks,
>>
>> Marcus
>>
>> > On Feb 18, 2023, at 3:09 PM, Jayan Vidanapathirana <jcvidanapathir...@gmail.com> wrote:
>> >
>> > Hi All,
>> >
>> > As a new contributor to the Cybershuttle project, I have been actively involved in implementing the Data Replica Catalog. This new catalog is designed to interface with both the Apache Airavata Data Catalog [1] and Airavata MFT [2]. The replica catalog should be able to store the storage details of each replica resource and the secret/credential details specific to the storage type. The proposed high-level architecture is as follows:
>> >
>> >
>> >
>> > I will mainly work on the highlighted area (red box) and, as an initial step, have started defining the APIs that communicate with the Replica Catalog.
>> > These API calls will be gRPC APIs, and the following methods will be implemented:
>> >
>> > Replica Registration
>> >
>> > • registerReplicaLocation(DataReplicaCreateRequest createRequest)
>> > • updateReplicaLocation(DataReplicaCreateRequest updateRequest)
>> > • DataReplicaLocationModel getReplicaLocation(DataReplicaGetRequest getReplicaRequest)
>> > • removeReplicaLocation(DataReplicaDeleteRequest deleteReplicaRequest)
>> > • getAllReplicaLocations(AllDataReplicaGetRequest allDataGetRequest)
>> > • removeAllReplicaLocations(AllDataReplicaDeleteRequest allDataDeleteRequest)
>> >
>> > Storage Registration
>> >
>> > registerSecretForStorage(SecretForStorage request)
>> > deleteSecretsForStorage(SecretForStorageDeleteRequest request)
>> > getSecretForStorage(SecretForStorageGetRequest request)
>> > searchStorages(StorageSearchRequest request)
>> > listStorages(StorageListRequest request)
>> > resolveStorageType(StorageTypeResolveRequest request)
>> >
>> > Storage - Internal APIs
>> >
>> > S3StorageListResponse listS3Storage(S3StorageListRequest request)
>> > Optional<S3Storage> getS3Storage(S3StorageGetRequest request)
>> > S3Storage createS3Storage(S3StorageCreateRequest request)
>> > boolean updateS3Storage(S3StorageUpdateRequest request)
>> > boolean deleteS3Storage(S3StorageDeleteRequest request)
>> >
>> > AzureStorageListResponse listAzureStorage(AzureStorageListRequest request)
>> > Optional<AzureStorage> getAzureStorage(AzureStorageGetRequest request)
>> > AzureStorage createAzureStorage(AzureStorageCreateRequest request)
>> > boolean updateAzureStorage(AzureStorageUpdateRequest request)
>> > boolean deleteAzureStorage(AzureStorageDeleteRequest request)
>> >
>> > GCSStorageListResponse listGCSStorage(GCSStorageListRequest request)
>> > Optional<GCSStorage> getGCSStorage(GCSStorageGetRequest request)
>> > GCSStorage createGCSStorage(GCSStorageCreateRequest request)
>> > boolean updateGCSStorage(GCSStorageUpdateRequest request)
>> > boolean deleteGCSStorage(GCSStorageDeleteRequest request)
>> >
>> > Secret Registration
>> >
>> > registerSecret(SecretRegistrationRequest request)
>> > deleteSecret(SecretDeleteRequest request)
>> > resolveStorageType(StorageTypeResolveRequest request)
>> >
>> > Secret - Internal APIs
>> >
>> > Optional<S3Secret> getS3Secret(S3SecretGetRequest request)
>> > S3Secret createS3Secret(S3SecretCreateRequest request)
>> > boolean updateS3Secret(S3SecretUpdateRequest request)
>> > boolean deleteS3Secret(S3SecretDeleteRequest request)
>> >
>> > Optional<AzureSecret> getAzureSecret(AzureSecretGetRequest request)
>> > AzureSecret createAzureSecret(AzureSecretCreateRequest request)
>> > boolean updateAzureSecret(AzureSecretUpdateRequest request)
>> > boolean deleteAzureSecret(AzureSecretDeleteRequest request)
>> >
>> > Optional<GCSSecret> getGCSSecret(GCSSecretGetRequest request)
>> > GCSSecret createGCSSecret(GCSSecretCreateRequest request)
>> > boolean updateGCSSecret(GCSSecretUpdateRequest request)
>> > boolean deleteGCSSecret(GCSSecretDeleteRequest request)
>> >
>> >
>> > PoC [3]: https://github.com/Jayancv/airavata-replica-catalog (defining API calls)
>> > Draft APIs: refer to the attached replicaCatalogAPIsDocumentation.html [4], which was generated from the PoC [3]
>> >
>> > I would greatly appreciate your thoughts and feedback on the designs [5], as they can help us improve and adopt a more generalized approach. Additionally, I would like to identify any other factors that we should take into account to minimize potential issues in the future. Are there any other considerations that we should keep in mind?
>> >
>> >
>> > [1] - https://github.com/apache/airavata-data-catalog
>> > [2] - https://github.com/apache/airavata-mft
>> > [3] - https://github.com/Jayancv/airavata-replica-catalog
>> > [4] - https://drive.google.com/file/d/1C4_H_Y5fZ4-5fmIHBNZyh3lXbV7vL5Ah/view?usp=sharing
>> > [5] - https://docs.google.com/document/d/1dQUpHVkccx-O9mbYuAo-wtcLQWJ1LaKUzBpaBMCgSac/edit?usp=sharing
>> >
>> > Thanks.
>> > --
>> > Best Regards
>> >
>> > Jayan Vidanapathirana
>> >
>> > <replicaCatalogAPIsDocumentation.html>
>>
>>
>>
>> --
>> Best Regards
>>
>> Jayan Vidanapathirana
>>
>>
>>
>> --
>> Best Regards
>>
>> Jayan Vidanapathirana
>
>
>
> --
> Best Regards
>
> Jayan Vidanapathirana
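
For readers following the thread, the hierarchical AllDataReplicaGetResponse discussed in the Feb 25 message can be pictured with a minimal Java sketch. It is only an illustration, not code from the PoC or the pull request: the classes are plain-Java stand-ins for the proposed protobuf messages, the field names follow the message definitions quoted in the thread, and the DataReplicaLocation fields (replicaId, uri), the collectUris helper, and all sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Plain-Java stand-ins for the proposed protobuf messages (not generated code). */
final class ReplicaTreeSketch {

    /** One stored copy of a file. Both fields are hypothetical placeholders. */
    static final class DataReplicaLocation {
        final String replicaId;
        final String uri;

        DataReplicaLocation(String replicaId, String uri) {
            this.replicaId = replicaId;
            this.uri = uri;
        }
    }

    /** Mirrors ReplicaGroupEntry: a directory holding sub-directories and files. */
    static final class ReplicaGroupEntry {
        final String replicaGroupId;
        final List<ReplicaGroupEntry> directories;
        final List<DataReplicaLocation> files;

        ReplicaGroupEntry(String replicaGroupId,
                          List<ReplicaGroupEntry> directories,
                          List<DataReplicaLocation> files) {
            this.replicaGroupId = replicaGroupId;
            this.directories = directories;
            this.files = files;
        }
    }

    /** Mirrors AllDataReplicaGetResponse: every replica group of one data product. */
    static final class AllDataReplicaGetResponse {
        final String dataProductId;
        final List<ReplicaGroupEntry> replicaList;

        AllDataReplicaGetResponse(String dataProductId, List<ReplicaGroupEntry> replicaList) {
            this.dataProductId = dataProductId;
            this.replicaList = replicaList;
        }
    }

    /** Hypothetical helper: depth-first walk collecting every file URI under a group. */
    static List<String> collectUris(ReplicaGroupEntry group) {
        List<String> uris = new ArrayList<>();
        for (DataReplicaLocation file : group.files) {
            uris.add(file.uri);                 // files of this directory first
        }
        for (ReplicaGroupEntry dir : group.directories) {
            uris.addAll(collectUris(dir));      // then recurse into sub-directories
        }
        return uris;
    }

    public static void main(String[] args) {
        ReplicaGroupEntry sub = new ReplicaGroupEntry(
                "group-1/sub",
                Collections.<ReplicaGroupEntry>emptyList(),
                Arrays.asList(new DataReplicaLocation("r2", "s3://bucket/sub/b.dat")));
        ReplicaGroupEntry root = new ReplicaGroupEntry(
                "group-1",
                Arrays.asList(sub),
                Arrays.asList(new DataReplicaLocation("r1", "s3://bucket/a.dat")));
        AllDataReplicaGetResponse response =
                new AllDataReplicaGetResponse("product-42", Arrays.asList(root));

        for (ReplicaGroupEntry group : response.replicaList) {
            System.out.println(response.dataProductId + " -> " + collectUris(group));
        }
    }
}
```

The point of the sketch is that each ReplicaGroupEntry is self-similar, so a client can walk a replica's directory tree without a separate flat ReplicaListEntry type. In this shape the only link back to the data catalog is the data product id on the response.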