Hi Dimuthu,

Thank you for the review. We will look into the changes as soon as possible.

Thank you
Aravind Ramalingam

> On Apr 20, 2020, at 22:42, DImuthu Upeksha <[email protected]> wrote:
> 
> 
> Hi Aravind,
> 
> I reviewed the PR and submitted my review comments. Please have a look at 
> them. I didn't go through the optimizations in the code thoroughly, as some 
> templating fixes and cleanup are required first. Once you fix those, I will 
> do a thorough review. Also make sure to rebase the PR next time, as there 
> are conflicts from other commits. Thanks for your contributions.
> 
> Dimuthu
> 
>> On Sun, Apr 19, 2020 at 10:13 PM Aravind Ramalingam <[email protected]> 
>> wrote:
>> Hello,
>> 
>> We have raised a Pull Request [12]. 
>> 
>> We look forward to your feedback. 
>> 
>> [12] https://github.com/apache/airavata-mft/pull/6
>> 
>> Thank you
>> Aravind Ramalingam
>> 
>>> On Sun, Apr 19, 2020 at 8:32 PM DImuthu Upeksha 
>>> <[email protected]> wrote:
>>> Sounds good. Please send a PR once it is done.
>>> 
>>> Dimuthu
>>> 
>>>> On Sun, Apr 19, 2020 at 7:23 PM Aravind Ramalingam <[email protected]> 
>>>> wrote:
>>>> Hello,
>>>> 
>>>> Thank you Sudhakar and Dimuthu. We figured it out.
>>>> 
>>>> As Sudhakar pointed out in the linked issue, GCS returns a Base64-encoded 
>>>> Md5Hash; once we converted it to hex, it matched the S3 hash.
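>>>> 
>>>> For the archives, that conversion is a few lines of JDK code. A minimal 
>>>> sketch (class and method names are illustrative, not from the MFT 
>>>> codebase):

```java
import java.util.Base64;

public class Md5Util {
    // Convert the Base64-encoded MD5 that GCS reports into the lowercase
    // hex form that S3 ETags (and `md5sum`) use, so the two can be compared.
    public static String base64ToHex(String base64Md5) {
        byte[] bytes = Base64.getDecoder().decode(base64Md5);
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```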
>>>> 
>>>> We have successfully tested transfers from S3 to GCS and back. We are yet 
>>>> to test with other protocols.
>>>> 
>>>> Thank you
>>>> Aravind Ramalingam  
>>>> 
>>>>> On Sun, Apr 19, 2020 at 4:58 PM Pamidighantam, Sudhakar <[email protected]> 
>>>>> wrote:
>>>>> https://github.com/googleapis/google-cloud-java/issues/4117 Does this 
>>>>> help?
>>>>> 
>>>>>  
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Sudhakar.
>>>>> 
>>>>>  
>>>>> 
>>>>> From: DImuthu Upeksha <[email protected]>
>>>>> Reply-To: "[email protected]" <[email protected]>
>>>>> Date: Sunday, April 19, 2020 at 4:46 PM
>>>>> To: Airavata Dev <[email protected]>
>>>>> Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>>>>> 
>>>>>  
>>>>> 
>>>>> Aravind,
>>>>> 
>>>>>  
>>>>> 
>>>>> Can you send a PR for what you have done so far so that I can provide 
>>>>> feedback? One thing you have to make sure of is that the GCS metadata 
>>>>> collector returns the correct md5 for the file. You can download the 
>>>>> file and run "md5sum <file name>" locally to get the actual md5 value 
>>>>> for that file and compare it with what you see in the GCS implementation.
>>>>> 
>>>>>  
>>>>> 
>>>>> In S3, the ETag is the right property from which to fetch the md5 of the 
>>>>> target resource. I'm not sure what the right method is for GCS; you will 
>>>>> have to try it locally and verify.
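>>>>> 
>>>>> For that comparison, the local md5 can also be computed in Java 
>>>>> (equivalent to running "md5sum <file name>"); a minimal sketch, with 
>>>>> illustrative names:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class LocalMd5 {
    // Hex MD5 of raw bytes, directly comparable to `md5sum` output and to
    // whatever hex digest the storage provider's metadata reports.
    public static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is guaranteed to be present in every JDK.
            throw new IllegalStateException(e);
        }
    }
}
```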
>>>>> 
>>>>>  
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Dimuthu
>>>>> 
>>>>>  
>>>>> 
>>>>> On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <[email protected]> 
>>>>> wrote:
>>>>> 
>>>>> Hi Dimuthu,
>>>>> 
>>>>>  
>>>>> 
>>>>> We are working on GCS and have certain parts working, but after a 
>>>>> transfer is complete we are facing errors in the metadata checks.
>>>>> 
>>>>>  
>>>>> 
>>>>> <image001.png>
>>>>> 
>>>>>  
>>>>> 
>>>>> We are currently testing S3 to GCS. We noticed that the S3 implementation 
>>>>> sets the ETag as the Md5sum. In our case we tried using both the Etag and 
>>>>> the Md5Hash, but both threw the above error.
>>>>> 
>>>>>  
>>>>> 
>>>>> //S3 implementation
>>>>> 
>>>>> metadata.setMd5sum(s3Metadata.getETag());
>>>>> //GCS implementation
>>>>> metadata.setMd5sum(gcsMetadata.getEtag());
>>>>> or
>>>>> metadata.setMd5sum(gcsMetadata.getMd5Hash());
>>>>>  
>>>>> We are confused at this point, could you please guide us?
>>>>>  
>>>>> Thank you
>>>>> Aravind Ramalingam
>>>>>  
>>>>> 
>>>>> On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha 
>>>>> <[email protected]> wrote:
>>>>> 
>>>>> Hi Aravind,
>>>>> 
>>>>>  
>>>>> 
>>>>> You don't need the file to be present in the GCS example I sent; it needs 
>>>>> an InputStream to read the content. You can use the same approach I used 
>>>>> in the S3 transport [9]. It's straightforward: replace the file input 
>>>>> stream with context.getStreamBuffer().getInputStream().
>>>>> 
>>>>>  
>>>>> 
>>>>> Akshay,
>>>>> 
>>>>>  
>>>>> 
>>>>> You can't assume that the file is on the machine. It should be provided 
>>>>> by the secret service. I found this example in [10]:
>>>>> 
>>>>> Storage storage = StorageOptions.newBuilder()
>>>>>     .setCredentials(ServiceAccountCredentials.fromStream(new 
>>>>> FileInputStream("/path/to/my/key.json")))
>>>>>     .build()
>>>>>     .getService();
>>>>>  
>>>>> 
>>>>> It accepts an InputStream of JSON. You can programmatically load the 
>>>>> content of that JSON into a Java String through the secret service and 
>>>>> convert that string to an InputStream as shown in [11].
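>>>>> 
>>>>> A minimal sketch of that String-to-InputStream step (the secret-service 
>>>>> lookup itself is out of scope here; class and method names are 
>>>>> illustrative):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class CredentialStreamExample {
    // Turn the credentials JSON (fetched as a String, e.g. from the secret
    // service) into the InputStream the GCS credentials builder expects.
    public static InputStream toStream(String credentialsJson) {
        return new ByteArrayInputStream(credentialsJson.getBytes(StandardCharsets.UTF_8));
    }

    // Read a stream back into a String (used here to show the round trip).
    public static String readAll(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

>>>>> The resulting stream can then be passed to 
>>>>> ServiceAccountCredentials.fromStream(...) in place of the FileInputStream 
>>>>> in the snippet above.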
>>>>> 
>>>>>  
>>>>> 
>>>>> [9] 
>>>>> https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
>>>>> 
>>>>> [10] https://github.com/googleapis/google-cloud-java
>>>>> 
>>>>> [11] https://www.baeldung.com/convert-string-to-input-stream
>>>>> 
>>>>>  
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Dimuthu
>>>>> 
>>>>>  
>>>>> 
>>>>> On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <[email protected]> wrote:
>>>>> 
>>>>> Hello,
>>>>> 
>>>>>  
>>>>> 
>>>>> We were looking into how to use the Google APIs to send files, and the 
>>>>> first step is authentication. For that, the GCP API requires a 
>>>>> credentials.json file to be present on the system.
>>>>> 
>>>>>  
>>>>> 
>>>>> Is it fine if we design the GCS transport feature, for now, such that the 
>>>>> file is already present on the system?
>>>>> 
>>>>>  
>>>>> 
>>>>> Kind Regards
>>>>> 
>>>>> Akshay
>>>>> 
>>>>>  
>>>>> 
>>>>> From: Aravind Ramalingam <[email protected]>
>>>>> Reply-To: "[email protected]" <[email protected]>
>>>>> Date: Friday, April 17, 2020 at 00:30
>>>>> To: "[email protected]" <[email protected]>
>>>>> Subject: [External] Re: Apache Airavata MFT - AWS/GCS support
>>>>> 
>>>>>  
>>>>> 
>>>>> This message was sent from a non-IU address. Please exercise caution when 
>>>>> clicking links or opening attachments from external sources.
>>>>> 
>>>>> 
>>>>> Hello,
>>>>> 
>>>>>  
>>>>> 
>>>>> Wouldn't the whole file have to be present in this example, so that it 
>>>>> can be converted into a single stream and uploaded at once?
>>>>> 
>>>>> We had understood that MFT expects a chunk-by-chunk upload, without the 
>>>>> entire file having to be present.
>>>>> 
>>>>>  
>>>>> 
>>>>> Thank you
>>>>> 
>>>>> Aravind Ramalingam
>>>>> 
>>>>>  
>>>>> 
>>>>> On Apr 17, 2020, at 00:07, DImuthu Upeksha <[email protected]> 
>>>>> wrote:
>>>>> 
>>>>> Aravind,
>>>>> 
>>>>>  
>>>>> 
>>>>> Streaming is supported in GCS java client. Have a look at here [8]
>>>>> 
>>>>>  
>>>>> 
>>>>> [8] 
>>>>> https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
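>>>>> 
>>>>> The chunked loop in that sample can be sketched against any 
>>>>> WritableByteChannel; assuming (per the linked sample) that 
>>>>> storage.writer(blobInfo) returns such a channel, the same loop applies 
>>>>> without the whole file ever being in memory:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

public class StreamingCopy {
    // Copy an InputStream to a channel in fixed-size chunks; returns the
    // number of bytes copied. The destination can be any WritableByteChannel,
    // e.g. the WriteChannel a GCS client opens for an upload.
    public static long copy(InputStream in, WritableByteChannel out) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(ByteBuffer.wrap(buffer, 0, read));
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```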
>>>>> 
>>>>>  
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Dimuthu
>>>>> 
>>>>>  
>>>>> 
>>>>> On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <[email protected]> 
>>>>> wrote:
>>>>> 
>>>>> Hello Dimuthu,
>>>>> 
>>>>>  
>>>>> 
>>>>> As a follow-up, we explored GCS in detail and are facing a small dilemma. 
>>>>> We found that although GCS has a Java client, its functionality does not 
>>>>> seem to extend to stream-based upload and download.
>>>>> 
>>>>> The documentation says streaming is currently done with the gsutil 
>>>>> command-line tool [7], so we are unsure whether we can proceed with the 
>>>>> GCS integration.
>>>>> 
>>>>>  
>>>>> 
>>>>> Could you please give us suggestions? We were also wondering if we could 
>>>>> take up Box integration or another provider if GCS proves infeasible for 
>>>>> now.
>>>>> 
>>>>>  
>>>>> 
>>>>> [7] https://cloud.google.com/storage/docs/streaming 
>>>>> 
>>>>>  
>>>>> 
>>>>> Thank you
>>>>> 
>>>>> Aravind Ramalingam
>>>>> 
>>>>>  
>>>>> 
>>>>> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <[email protected]> 
>>>>> wrote:
>>>>> 
>>>>> Hello Dimuthu,
>>>>> 
>>>>>  
>>>>> 
>>>>> We had just started looking into Azure and GCS. Since Azure is done we 
>>>>> will take up and explore GCS.
>>>>> 
>>>>>  
>>>>> 
>>>>> Thank you for the update.
>>>>> 
>>>>> Thank you
>>>>> 
>>>>> Aravind Ramalingam
>>>>> 
>>>>>  
>>>>> 
>>>>> On Apr 16, 2020, at 00:30, DImuthu Upeksha <[email protected]> 
>>>>> wrote:
>>>>> 
>>>>> Aravind,
>>>>> 
>>>>>  
>>>>> 
>>>>> I'm not sure whether you have made any progress on Azure transport yet. I 
>>>>> got a chance to look into that [6]. Let me know if you are working on GCS 
>>>>> or any other so that I can plan ahead. Next I will be focusing on Box 
>>>>> transport.
>>>>> 
>>>>>  
>>>>> 
>>>>> [6] 
>>>>> https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>>>>> 
>>>>>  
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Dimuthu
>>>>> 
>>>>>  
>>>>> 
>>>>> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <[email protected]> 
>>>>> wrote:
>>>>> 
>>>>> Hi  Dimuthu,
>>>>> 
>>>>>  
>>>>> 
>>>>> Thank you for the update. We will look into it and get an idea of how 
>>>>> the system works.
>>>>> 
>>>>> We were hoping to try an implementation for GCS; we will also look into 
>>>>> Azure.
>>>>> 
>>>>>  
>>>>> 
>>>>> Thank you
>>>>> 
>>>>> Aravind Ramalingam
>>>>> 
>>>>>  
>>>>> 
>>>>> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha 
>>>>> <[email protected]> wrote:
>>>>> 
>>>>> Aravind,
>>>>> 
>>>>>  
>>>>> 
>>>>> Here [2] is the complete commit for S3 transport implementation but don't 
>>>>> get confused by the amount of changes as this includes both transport 
>>>>> implementation and the service backend implementations. If you need to 
>>>>> implement a new transport, you need to implement a Receiver, Sender and a 
>>>>> MetadataCollector like this [3]. Then you need to add that resource 
>>>>> support to Resource service and Secret service [4] [5]. You can similarly 
>>>>> do that for Azure. A sample SCP -> S3 transfer request is like below. 
>>>>> Hope that helps.
>>>>> 
>>>>>  
>>>>> 
>>>>> String sourceId = "remote-ssh-resource";
>>>>> String sourceToken = "local-ssh-cred";
>>>>> String sourceType = "SCP";
>>>>> String destId = "s3-file";
>>>>> String destToken = "s3-cred";
>>>>> String destType = "S3";
>>>>> 
>>>>> TransferApiRequest request = TransferApiRequest.newBuilder()
>>>>>         .setSourceId(sourceId)
>>>>>         .setSourceToken(sourceToken)
>>>>>         .setSourceType(sourceType)
>>>>>         .setDestinationId(destId)
>>>>>         .setDestinationToken(destToken)
>>>>>         .setDestinationType(destType)
>>>>>         .setAffinityTransfer(false).build();
>>>>>  
>>>>> 
>>>>> [2] 
>>>>> https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>>>>> 
>>>>> [3] 
>>>>> https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>>>>> 
>>>>> [4] 
>>>>> https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>>>>> 
>>>>> [5] 
>>>>> https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>>>>> 
>>>>>  
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Dimuthu
>>>>> 
>>>>>  
>>>>> 
>>>>>  
>>>>> 
>>>>> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha 
>>>>> <[email protected]> wrote:
>>>>> 
>>>>> There is a working S3 transport in my local copy; I will commit it once 
>>>>> I have tested it properly. You can follow the same pattern for any cloud 
>>>>> provider whose client supports streaming IO. Streaming among different 
>>>>> transfer protocols inside an Agent is discussed in the last part of this 
>>>>> [1] document. Try to get the conceptual idea from that and reverse 
>>>>> engineer the SCP transport.
>>>>> 
>>>>>  
>>>>> 
>>>>> [1] 
>>>>> https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>>>>> 
>>>>>  
>>>>> 
>>>>> Dimuthu
>>>>> 
>>>>>  
>>>>> 
>>>>> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <[email protected]> 
>>>>> wrote:
>>>>> 
>>>>> Hello, 
>>>>> 
>>>>> We were looking at the existing code in the project and could find 
>>>>> implementations only for local copy and SCP. We are unsure how to go 
>>>>> about supporting an external provider like S3 or Azure, since that would 
>>>>> require integrating with their respective client libraries.
>>>>> 
>>>>> Thank you
>>>>> Aravind Ramalingam
>>>>> 
>>>>> > On Apr 4, 2020, at 21:15, Suresh Marru <[email protected]> wrote:
>>>>> > 
>>>>> > Hi Aravind,
>>>>> > 
>>>>> > I have to catch up with the code, but you may want to look at the S3 
>>>>> > implementation and extend it to Azure, GCP or other cloud services like 
>>>>> > Box, Dropbox and so on. 
>>>>> > 
>>>>> > There could be many use cases, here is an idea:
>>>>> > 
>>>>> > * Compute a job on a supercomputer with SCP access and push the outputs 
>>>>> > to a Cloud storage. 
>>>>> > 
>>>>> > Suresh
>>>>> > 
>>>>> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <[email protected]> 
>>>>> >> wrote:
>>>>> >> 
>>>>> >> Hello,
>>>>> >> 
>>>>> >> We set up the MFT project on a local system and tested SCP transfer 
>>>>> >> between JetStream VMs. We were wondering how the support can be 
>>>>> >> extended to AWS/GCS.
>>>>> >> 
>>>>> >> As per our understanding, the current implementation supports two 
>>>>> >> protocols, local-transport and scp-transport. Would we have to 
>>>>> >> modify/add to the code base to extend support for AWS/GCS clients?
>>>>> >> 
>>>>> >> Could you please provide suggestions for this use case?
>>>>> >> 
>>>>> >> Thank you
>>>>> >> Aravind Ramalingam
>>>>> >
