Hi Jayan,

Thanks for sharing. One question: the airavata-data-catalog already has a 
DATA_PRODUCT table and a way to store a data product's metadata. Could that be 
used instead of adding a new table?

Or, more generally, how does this replica catalog API relate to the data 
catalog API/data model?



> On Mar 31, 2023, at 4:11 PM, Jayan Vidanapathirana 
> <jcvidanapathir...@gmail.com> wrote:
> Hi All,
> I have implemented a basic flow (simple create and retrieve) of the replica 
> catalog and drafted a pull request [1] to the Airavata data catalog as a new 
> module. Based on that implementation, I have arrived at the following 
> database structure for the replica catalog, and I would greatly appreciate 
> your thoughts and feedback on the design [2]. At this stage, only the S3 
> storage type is considered, as a sample. 
> <Replica Catalog V2.drawio (1).png>
> Also, please refer to the following Google doc [3] to review the implemented 
> APIs.
> [1] https://github.com/apache/airavata-data-catalog/pull/28
> [2] 
> https://drive.google.com/file/d/1KP-8IWdvpPvjSWUG2t41K7WQXW2f9_qN/view?usp=sharing
> [3] 
> https://docs.google.com/document/d/1U-ok1ICt_EmjjxR9UuACV6g6YYkgoADfm0ECJ9ZDI3k/edit?usp=sharing
> Thank you.
> On Mon, Mar 20, 2023 at 2:20 AM Suresh Marru <sma...@apache.org> wrote:
> Hi Jayan,
> Can you contribute a PR to the data catalog repo so we can keep the feedback 
> on that issue?
> Thanks for your contribution,
> Suresh
>> On Mar 19, 2023, at 12:55 PM, Jayan Vidanapathirana 
>> <jcvidanapathir...@gmail.com> wrote:
>> Hi All,
>> I have updated the draft code base [1] with a simple workflow for adding data 
>> to the replica catalog. The services are not yet finalized and will be 
>> enhanced along with the workflow. 
>> [1] https://github.com/Jayancv/airavata-replica-catalog
>> Thanks.
>> On Sat, Feb 25, 2023 at 4:21 PM Jayan Vidanapathirana 
>> <jcvidanapathir...@gmail.com> wrote:
>> Hi Dimuthu and Marcus,
>> Thank you both for checking my PoC and providing valuable feedback.
>> Dimuthu,
>>      • I agree with you regarding replica location categories. They should 
>> be a data catalog level attribute. 
>>      • To manage replica data access permissions, don't we need user 
>> information at the replica catalog level? I'm a bit confused about the 
>> permission management side of this catalog. 
>>      • ReplicaListEntry - Added to expose the list of DataReplicaLocation 
>> entries with basic details in AllDataReplicaGetResponse, which provides all 
>> the replica items for the given product_id. However, I was not considering 
>> the hierarchical structure here. ReplicaGroupEntry is actually a single 
>> product replica that holds the file structure of the replica data. Following 
>> your suggestion, we can model AllDataReplicaGetResponse as follows:
>> message AllDataReplicaGetResponse {
>>   string data_product_id = 1;
>>   repeated ReplicaGroupEntry replica_list = 2;
>> }
>> message ReplicaGroupEntry {
>>   string replica_group_id = 1;
>>   repeated ReplicaGroupEntry directories = 2;
>>   repeated DataReplicaLocation files = 3;
>> }
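
To illustrate the recursive shape of the message above, here is a minimal Java sketch, not the actual generated gRPC classes. The class names mirror the message names in the thread; the fields of DataReplicaLocation (replicaId, path) and the countFiles helper are assumptions made only for this example.

```java
import java.util.List;

public class ReplicaTreeSketch {

    // Hypothetical stand-in for the DataReplicaLocation message (fields assumed).
    static final class DataReplicaLocation {
        final String replicaId;
        final String path;

        DataReplicaLocation(String replicaId, String path) {
            this.replicaId = replicaId;
            this.path = path;
        }
    }

    // Mirrors the recursive ReplicaGroupEntry message: nested directories plus files.
    static final class ReplicaGroupEntry {
        final String replicaGroupId;
        final List<ReplicaGroupEntry> directories;
        final List<DataReplicaLocation> files;

        ReplicaGroupEntry(String replicaGroupId,
                          List<ReplicaGroupEntry> directories,
                          List<DataReplicaLocation> files) {
            this.replicaGroupId = replicaGroupId;
            this.directories = directories;
            this.files = files;
        }
    }

    // Counts the files in this group and in every nested directory.
    static int countFiles(ReplicaGroupEntry group) {
        int total = group.files.size();
        for (ReplicaGroupEntry dir : group.directories) {
            total += countFiles(dir);
        }
        return total;
    }

    public static void main(String[] args) {
        ReplicaGroupEntry sub = new ReplicaGroupEntry("group-1/sub", List.of(),
                List.of(new DataReplicaLocation("r2", "s3://bucket/sub/b.dat")));
        ReplicaGroupEntry root = new ReplicaGroupEntry("group-1", List.of(sub),
                List.of(new DataReplicaLocation("r1", "s3://bucket/a.dat")));
        System.out.println(countFiles(root)); // prints 2
    }
}
```

Because directories holds further ReplicaGroupEntry values, a client can walk an arbitrarily deep replica layout with a single recursive function.
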
>> Marcus, 
>>      • Yes, I will remove replica_id  from the data catalog diagram. 
>>      • I added parent_data_product_id to the replica data with the full 
>> context of the replica catalog and data catalog relationship in mind. But 
>> within the replica catalog context there is no such parent product 
>> relationship, so we can rename it to data_product_id. Thanks for pointing this out. 
>> Thanks.
>> On Thu, Feb 23, 2023 at 2:48 AM Christie, Marcus Aaron <machr...@iu.edu> 
>> wrote:
>> Hi Jayan,
>> I would like to echo Dimuthu and say that this looks great and I appreciate 
>> the effort in your pulling this all together.  I have some feedback to share.
>> The high-level architecture diagram shows the replica id being stored in the 
>> data catalog. That was an initial idea that we had, but we decided that the 
>> replica catalog would store the data product id. That seems reflected in 
>> your API design so I think you already know this, but I wanted to point it 
>> out since the diagram might be a little confusing for others.
>> In the ReplicaCatalogAPI.proto the name of the data product id field is 
>> "parent_data_product_id". I would suggest calling it "data_product_id" 
>> instead. "parent_data_product_id" means "the id of the parent data product 
>> of this data product" in the data catalog. It might be confusing to use the 
>> same name in ReplicaCatalogAPI.proto.
>> Thanks,
>> Marcus
>> > On Feb 18, 2023, at 3:09 PM, Jayan Vidanapathirana 
>> > <jcvidanapathir...@gmail.com> wrote:
>> > 
>> > Hi All, 
>> > 
>> > As a new contributor to the Cybershuttle project, I have been actively 
>> > involved in implementing the Data Replica Catalog. This new catalog is 
>> > designed to interface with both the Apache Airavata Data Catalog [1] and 
>> > Airavata MFT[2]. This replica catalog should be able to store each replica 
>> > resource storage details and secret/credential details specific to the 
>> > storage type. The proposed high-level architecture will be as follows:
>> > 
>> > 
>> > 
>> > I will mainly work on the highlighted area (red box). As an initial 
>> > step, I have started defining the APIs that communicate with the Replica 
>> > Catalog. These will be gRPC APIs, and the following methods will be implemented:
>> > 
>> > Replica Registration
>> > 
>> >       • registerReplicaLocation(DataReplicaCreateRequest createRequest)
>> >       • updateReplicaLocation(DataReplicaCreateRequest updateRequest)
>> >       • DataReplicaLocationModel getReplicaLocation(DataReplicaGetRequest 
>> > getReplicaRequest)
>> >       • removeReplicaLocation(DataReplicaDeleteRequest 
>> > deleteReplicaRequest)
>> >       • getAllReplicaLocations(AllDataReplicaGetRequest allDataGetRequest)
>> >       • removeAllReplicaLocations(AllDataReplicaDeleteRequest 
>> > allDataDeleteRequest)
>> > 
>> > Storage Registration
>> > 
>> > registerSecretForStorage(SecretForStorage request)
>> > deleteSecretsForStorage(SecretForStorageDeleteRequest request)
>> > getSecretForStorage(SecretForStorageGetRequest request)
>> > searchStorages(StorageSearchRequest request)
>> > listStorages(StorageListRequest request)
>> > resolveStorageType (StorageTypeResolveRequest request)
>> > 
>> > Storage - Internal APIs
>> > 
>> > S3StorageListResponse listS3Storage(S3StorageListRequest request) 
>> > Optional<S3Storage> getS3Storage(S3StorageGetRequest request) 
>> > S3Storage createS3Storage(S3StorageCreateRequest request) 
>> > boolean updateS3Storage(S3StorageUpdateRequest request) 
>> > boolean deleteS3Storage(S3StorageDeleteRequest request) 
>> > 
>> > AzureStorageListResponse listAzureStorage(AzureStorageListRequest request) 
>> > Optional<AzureStorage> getAzureStorage(AzureStorageGetRequest request) 
>> > AzureStorage createAzureStorage(AzureStorageCreateRequest request) 
>> > boolean updateAzureStorage(AzureStorageUpdateRequest request) 
>> > boolean deleteAzureStorage(AzureStorageDeleteRequest request) 
>> > 
>> > GCSStorageListResponse listGCSStorage(GCSStorageListRequest request) 
>> > Optional<GCSStorage> getGCSStorage(GCSStorageGetRequest request) 
>> > GCSStorage createGCSStorage(GCSStorageCreateRequest request) 
>> > boolean updateGCSStorage(GCSStorageUpdateRequest request) 
>> > boolean deleteGCSStorage(GCSStorageDeleteRequest request) 
>> > 
>> > Secret Registration
>> > 
>> > registerSecret(SecretRegistrationRequest request)
>> > deleteSecret(SecretDeleteRequest request)
>> > resolveStorageType (StorageTypeResolveRequest request)
>> > 
>> > Secret  - Internal APIs
>> > 
>> > Optional<S3Secret> getS3Secret(S3SecretGetRequest request) 
>> > S3Secret createS3Secret(S3SecretCreateRequest request) 
>> > boolean updateS3Secret(S3SecretUpdateRequest request) 
>> > boolean deleteS3Secret(S3SecretDeleteRequest request) 
>> > 
>> > Optional<AzureSecret> getAzureSecret(AzureSecretGetRequest request) 
>> > AzureSecret createAzureSecret(AzureSecretCreateRequest request) 
>> > boolean updateAzureSecret(AzureSecretUpdateRequest request) 
>> > boolean deleteAzureSecret(AzureSecretDeleteRequest request) 
>> > 
>> > Optional<GCSSecret> getGCSSecret(GCSSecretGetRequest request) 
>> > GCSSecret createGCSSecret(GCSSecretCreateRequest request) 
>> > boolean updateGCSSecret(GCSSecretUpdateRequest request) 
>> > boolean deleteGCSSecret(GCSSecretDeleteRequest request) 
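
As a rough illustration of the return-type contract in the internal storage APIs listed above (Optional for lookups, boolean for update and delete), here is a minimal in-memory Java sketch. It is not the PoC implementation: the S3Storage fields (storageId, bucketName, region), the plain String parameters used in place of the request objects, and the map-backed store are all assumptions for this example.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class InMemoryS3StorageSketch {

    // Hypothetical S3 storage record; the real model's fields may differ.
    static final class S3Storage {
        final String storageId;
        final String bucketName;
        final String region;

        S3Storage(String storageId, String bucketName, String region) {
            this.storageId = storageId;
            this.bucketName = bucketName;
            this.region = region;
        }
    }

    // Map-backed store standing in for the database (assumption).
    private final Map<String, S3Storage> storages = new ConcurrentHashMap<>();

    // Creates a storage entry with a generated id and returns it.
    S3Storage createS3Storage(String bucketName, String region) {
        S3Storage storage = new S3Storage(UUID.randomUUID().toString(), bucketName, region);
        storages.put(storage.storageId, storage);
        return storage;
    }

    // Returns an empty Optional when the id is unknown.
    Optional<S3Storage> getS3Storage(String storageId) {
        return Optional.ofNullable(storages.get(storageId));
    }

    // Returns true only if an entry with the same id already existed.
    boolean updateS3Storage(S3Storage updated) {
        return storages.replace(updated.storageId, updated) != null;
    }

    // Returns true only if an entry was actually removed.
    boolean deleteS3Storage(String storageId) {
        return storages.remove(storageId) != null;
    }

    public static void main(String[] args) {
        InMemoryS3StorageSketch api = new InMemoryS3StorageSketch();
        S3Storage created = api.createS3Storage("example-bucket", "us-east-1");
        System.out.println(api.getS3Storage(created.storageId).isPresent());
        System.out.println(api.deleteS3Storage(created.storageId));
    }
}
```

The same shape would repeat for the Azure and GCS variants, which is one reason a generic entry point such as resolveStorageType is useful in front of the type-specific calls.
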
>> > 
>> > 
>> > PoC [3]: https://github.com/Jayancv/airavata-replica-catalog (defining 
>> > API calls)
>> > Draft APIs: refer to the attachment replicaCatalogAPIsDocumentation.html [4], 
>> > which was generated using the PoC [3]
>> > 
>> > I greatly appreciate your thoughts and feedback on the designs[5], as they 
>> > can help us improve and adopt a more generalized approach. Additionally, I 
>> > would like to identify any other factors that we should take into account 
>> > to minimize potential issues in the future. Are there any other 
>> > considerations that we should keep in mind? 
>> > 
>> > 
>> > [1] - https://github.com/apache/airavata-data-catalog
>> > [2] - https://github.com/apache/airavata-mft
>> > [3] - https://github.com/Jayancv/airavata-replica-catalog 
>> > [4] - 
>> > https://drive.google.com/file/d/1C4_H_Y5fZ4-5fmIHBNZyh3lXbV7vL5Ah/view?usp=sharing
>> > [5] - 
>> > https://docs.google.com/document/d/1dQUpHVkccx-O9mbYuAo-wtcLQWJ1LaKUzBpaBMCgSac/edit?usp=sharing
>> > 
>> > Thanks.
>> > -- 
>> > Best Regards
>> > 
>> > Jayan Vidanapathirana
>> > 
>> > <replicaCatalogAPIsDocumentation.html>
>> -- 
>> Best Regards
>> Jayan Vidanapathirana
>> -- 
>> Best Regards
>> Jayan Vidanapathirana
> -- 
> Best Regards
> Jayan Vidanapathirana
