Hello,

Following up on Aravind's previous thread regarding the error: we tested the 
implementation directly from the Apache repository, without any of our own 
changes, and also tested with other protocols. We ran into the same problem.

Kind Regards
Akshay Rajvanshi

From: Aravind Ramalingam <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, April 21, 2020 at 20:58
To: "[email protected]" <[email protected]>
Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Hello,

While testing, we noticed an error in the SecretServiceApplication. It seems to 
be a problem with the gRPC calls to the service.

I have attached the screenshot for your reference.

Could you please help us with this?

Thank you
Aravind Ramalingam



On Mon, Apr 20, 2020 at 10:59 PM Aravind Ramalingam 
<[email protected]<mailto:[email protected]>> wrote:
Hi Dimuthu,

Thank you for the review. We will look into the changes asap.

Thank you
Aravind Ramalingam


On Apr 20, 2020, at 22:42, DImuthu Upeksha 
<[email protected]<mailto:[email protected]>> wrote:
Hi Aravind,

I reviewed the PR and submitted my reviews. Please have a look at them. I 
didn't thoroughly go through optimizations in the code as there are some 
templating fixes and cleaning up required. Once you fix them, I will do a 
thorough review. Make sure to do a rebase of the PR next time as there are 
conflicts from other commits. Thanks for your contributions.

Dimuthu

On Sun, Apr 19, 2020 at 10:13 PM Aravind Ramalingam 
<[email protected]<mailto:[email protected]>> wrote:
Hello,

We have raised a Pull Request [12].

We look forward to your feedback.

[12] https://github.com/apache/airavata-mft/pull/6

Thank you
Aravind Ramalingam

On Sun, Apr 19, 2020 at 8:32 PM DImuthu Upeksha 
<[email protected]<mailto:[email protected]>> wrote:
Sounds good. Please send a PR once it is done.

Dimuthu

On Sun, Apr 19, 2020 at 7:23 PM Aravind Ramalingam 
<[email protected]<mailto:[email protected]>> wrote:
Hello,

Thank you Sudhakar and Dimuthu. We figured it out.

Like Sudhakar had pointed out with the issue link, GCS had returned a BASE64 
Md5Hash, we had to convert it to HEX and it matched with the S3 hash.
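
The conversion described above can be sketched roughly as follows. This is a 
minimal standalone sketch, not the code from the PR: the class and method names 
(Md5Formats, base64ToHex) are hypothetical, and it assumes the GCS Md5Hash is a 
standard Base64 encoding of the 16 raw MD5 bytes. The sample value is the MD5 
of empty content.

```java
import java.util.Base64;

// Hypothetical helper, for illustration only; not part of MFT.
final class Md5Formats {

    // Decodes a Base64 MD5 (as returned by GCS) and re-encodes it as lowercase hex.
    static String base64ToHex(String base64Md5) {
        byte[] raw = Base64.getDecoder().decode(base64Md5);
        StringBuilder hex = new StringBuilder(raw.length * 2);
        for (byte b : raw) {
            // Mask to 0..255 so negative bytes still print as two hex digits.
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // Base64 MD5 of empty content.
        System.out.println(base64ToHex("1B2M2Y8AsgTpgAmY7PhCfg=="));
    }
}
```

Note that an S3 ETag is only the plain MD5 of the object for non-multipart 
uploads, so the comparison may not hold for objects uploaded in parts.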

Currently we successfully tested from S3 to GCS and back. We are yet to test 
with other protocols.

Thank you
Aravind Ramalingam

On Sun, Apr 19, 2020 at 4:58 PM Pamidighantam, Sudhakar 
<[email protected]<mailto:[email protected]>> wrote:
https://github.com/googleapis/google-cloud-java/issues/4117 Does this help?

Thanks,
Sudhakar.

From: DImuthu Upeksha 
<[email protected]<mailto:[email protected]>>
Reply-To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Date: Sunday, April 19, 2020 at 4:46 PM
To: Airavata Dev <[email protected]<mailto:[email protected]>>
Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Aravind,

Can you send a PR for what you have done so far so that I can provide 
feedback? One thing you have to make sure of is that the GCS Metadata collector 
returns the correct md5 for that file. You can download the file and run 
"md5sum <file name>" locally to get the actual md5 value for that file and 
compare it with what you see in the GCS implementation.
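
If md5sum is not at hand, the same local check can be sketched in Java. This is 
a minimal sketch using only the JDK; LocalMd5 and md5Hex are hypothetical names 
and not part of MFT. It returns lowercase hex, the same format md5sum prints, 
and the in-memory stream in main stands in for the downloaded file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, for illustration only; not part of MFT.
final class LocalMd5 {

    // Reads the stream in fixed-size chunks and returns its MD5 as lowercase hex.
    static String md5Hex(InputStream in) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] sample = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(md5Hex(new ByteArrayInputStream(sample)));
    }
}
```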

In S3, etag is the right property to fetch the md5 for the target resource. I'm 
not sure what the right method is for GCS. You have to try it locally and verify.

Thanks
Dimuthu

On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam 
<[email protected]<mailto:[email protected]>> wrote:
Hi Dimuthu,

We are working on GCS and we got certain parts working, but after a transfer is 
complete we are facing errors with the metadata checks.

<image001.png>

We are currently testing S3 to GCS. We noticed in the S3 implementation that 
Etag was set as the Md5sum. In our case we tried using both Etag and Md5Hash, 
but both threw the above error.

//S3 implementation

metadata.setMd5sum(s3Metadata.getETag());

//GCS implementation

metadata.setMd5sum(gcsMetadata.getEtag());

or

metadata.setMd5sum(gcsMetadata.getMd5Hash());



We are confused at this point, could you please guide us?



Thank you

Aravind Ramalingam

On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha 
<[email protected]<mailto:[email protected]>> wrote:
Hi Aravind,

You don't need the file to be present in the gcs example I sent. It needs an 
Input Stream to read the content. You can use the same approach I have done in 
S3 [9] transport to do that. It's straightforward. Replace file input stream 
with context.getStreamBuffer().getInputStream().
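
The point above is that the upload side only needs something it can read from, 
so nothing has to exist as a complete file. Below is a minimal sketch of 
chunk-by-chunk copying between two streams using only java.io. ChunkedCopy is a 
hypothetical name, and the in-memory streams stand in for the MFT stream buffer 
and the GCS upload side, whose actual APIs are not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, for illustration only; not part of MFT.
final class ChunkedCopy {

    // Copies from in to out one chunk at a time and returns the bytes copied.
    // At most chunkSize bytes are held in memory at any moment.
    static long copy(InputStream in, OutputStream out, int chunkSize) {
        try {
            byte[] chunk = new byte[chunkSize];
            long total = 0;
            int read;
            while ((read = in.read(chunk)) != -1) {
                out.write(chunk, 0, read);
                total += read;
            }
            out.flush();
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] source = new byte[10_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(source), sink, 1024);
        System.out.println("copied " + copied + " bytes");
    }
}
```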

Akshay,

You can't assume that file is on the machine. It should be provided from the 
secret service. I found this example in [10]

Storage storage = StorageOptions.newBuilder()
    .setCredentials(ServiceAccountCredentials.fromStream(
        new FileInputStream("/path/to/my/key.json")))
    .build()
    .getService();

It accepts an InputStream of JSON. You can programmatically load the content of 
that JSON into a Java String through the secret service and convert that String 
to an InputStream as shown in [11].
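
A minimal sketch of that last step, assuming the JSON key has already been 
fetched from the secret service as a String (the fetch itself is not shown, and 
CredentialStreams and toStream are hypothetical names). The returned stream 
could then be passed to ServiceAccountCredentials.fromStream(...) in place of 
the FileInputStream above.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only; not part of MFT.
final class CredentialStreams {

    // Wraps the credential JSON held in memory as an InputStream,
    // so no key file has to exist on the machine.
    static InputStream toStream(String credentialJson) {
        return new ByteArrayInputStream(credentialJson.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        // Placeholder JSON; a real service account key has more fields.
        String json = "{\"type\":\"service_account\"}";
        try (InputStream in = toStream(json)) {
            System.out.println(in.available() + " bytes ready to read");
        }
    }
}
```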

[9] 
https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
[10] https://github.com/googleapis/google-cloud-java
[11] https://www.baeldung.com/convert-string-to-input-stream

Thanks
Dimuthu

On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay 
<[email protected]<mailto:[email protected]>> wrote:
Hello,

We were looking into how to use the Google APIs to send files, but the first 
step they require is authentication. For that, the GCP API requires a 
credentials.json file to be present on the system.

Is it fine if we design the GCS transport feature, for now, on the assumption 
that the file is already present on the system?

Kind Regards
Akshay

From: Aravind Ramalingam <[email protected]<mailto:[email protected]>>
Reply-To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Date: Friday, April 17, 2020 at 00:30
To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Subject: [External] Re: Apache Airavata MFT - AWS/GCS support

Hello,

Wouldn't it be that in this example the whole file has to be present and 
converted into a single stream and uploaded at once?
We had understood that MFT expects it to be chunk by chunk upload without 
having to have the entire file present.

Thank you
Aravind Ramalingam

On Apr 17, 2020, at 00:07, DImuthu Upeksha 
<[email protected]<mailto:[email protected]>> wrote:
Aravind,

Streaming is supported in the GCS Java client. Have a look here [8].

[8] 
https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104

Thanks
Dimuthu

On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam 
<[email protected]<mailto:[email protected]>> wrote:
Hello Dimuthu,

As a follow-up, we explored GCS in detail and are faced with a small dilemma. 
We found that although GCS has Java support, the functionality does not seem to 
extend to stream-based upload and download.
The documentation says streaming is currently done with the gsutil command-line 
tool [7], so we are unsure whether we can proceed with the GCS integration.

Could you please give us any suggestions? Also we were wondering if we could 
maybe take up Box integration or some other provider if GCS proves not possible 
currently.

[7] https://cloud.google.com/storage/docs/streaming

Thank you
Aravind Ramalingam

On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam 
<[email protected]<mailto:[email protected]>> wrote:
Hello Dimuthu,

We had just started looking into Azure and GCS. Since Azure is done we will 
take up and explore GCS.

Thank you for the update.
Thank you
Aravind Ramalingam

On Apr 16, 2020, at 00:30, DImuthu Upeksha 
<[email protected]<mailto:[email protected]>> wrote:
Aravind,

I'm not sure whether you have made any progress on Azure transport yet. I got a 
chance to look into that [6]. Let me know if you are working on GCS or any 
other so that I can plan ahead. Next I will be focusing on Box transport.

[6] 
https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd

Thanks
Dimuthu

On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam 
<[email protected]<mailto:[email protected]>> wrote:
Hi  Dimuthu,

Thank you for the update. We will look into it and get an idea of how the 
system works.
We were hoping to try an implementation for GCS, we will also look into Azure.

Thank you
Aravind Ramalingam

On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha 
<[email protected]<mailto:[email protected]>> wrote:
Aravind,

Here [2] is the complete commit for S3 transport implementation but don't get 
confused by the amount of changes as this includes both transport 
implementation and the service backend implementations. If you need to 
implement a new transport, you need to implement a Receiver, Sender and a 
MetadataCollector like this [3]. Then you need to add that resource support to 
Resource service and Secret service [4] [5]. You can similarly do that for 
Azure. A sample SCP -> S3 transfer request is like below. Hope that helps.


String sourceId = "remote-ssh-resource";
String sourceToken = "local-ssh-cred";
String sourceType = "SCP";
String destId = "s3-file";
String destToken = "s3-cred";
String destType = "S3";

TransferApiRequest request = TransferApiRequest.newBuilder()
        .setSourceId(sourceId)
        .setSourceToken(sourceToken)
        .setSourceType(sourceType)
        .setDestinationId(destId)
        .setDestinationToken(destToken)
        .setDestinationType(destType)
        .setAffinityTransfer(false).build();

[2] 
https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
[3] 
https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
[4] 
https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
[5] 
https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45

Thanks
Dimuthu


On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha 
<[email protected]<mailto:[email protected]>> wrote:
There is a working S3 transport in my local copy. Will commit it once I test 
it out properly. You can follow the same pattern for any cloud provider which 
has clients with streaming IO. Streaming among different transfer protocols 
inside an Agent has been discussed in the last part of this [1] document. Try 
to get the conceptual idea from that and reverse engineer SCP transport.

[1] 
https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo

Dimuthu

On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam 
<[email protected]<mailto:[email protected]>> wrote:
Hello,

We were looking at the existing code in the project. We could find 
implementations only for local copy and SCP.
We were unsure how to go about supporting an external provider like S3 or 
Azure, since it would require integrating with their respective clients.

Thank you
Aravind Ramalingam

> On Apr 4, 2020, at 21:15, Suresh Marru 
> <[email protected]<mailto:[email protected]>> wrote:
>
> Hi Aravind,
>
> I have to catch up with the code, but you may want to look at the S3 
> implementation and extend it to Azure, GCP or other cloud services like Box, 
> Dropbox and so on.
>
> There could be many use cases, here is an idea:
>
> * Compute a job on a supercomputer with SCP access and push the outputs to a 
> Cloud storage.
>
> Suresh
>
>> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam 
>> <[email protected]<mailto:[email protected]>> wrote:
>>
>> Hello,
>>
>> We set up the MFT project on a local system and tested out SCP transfer 
>> between JetStream VMs. We were wondering how the support can be extended for 
>> AWS/GCS.
>>
>> As per our understanding, the current implementation has support for two 
>> protocols i.e. local-transport and scp-transport. Would we have to 
>> modify/add to the code base to extend support for AWS/GCS clients?
>>
>> Could you please provide suggestions for this use case.
>>
>> Thank you
>> Aravind Ramalingam
>
