Hi Dimuthu,

Thank you for the review. We will look into the changes ASAP.
Thank you
Aravind Ramalingam

> On Apr 20, 2020, at 22:42, DImuthu Upeksha <[email protected]> wrote:
>
> Hi Aravind,
>
> I reviewed the PR and submitted my review comments. Please have a look at
> them. I didn't go through the optimizations in the code thoroughly, as
> there are some templating fixes and clean-up required first. Once you fix
> those, I will do a thorough review. Make sure to rebase the PR next time,
> as there are conflicts from other commits. Thanks for your contributions.
>
> Dimuthu
>
>> On Sun, Apr 19, 2020 at 10:13 PM Aravind Ramalingam <[email protected]> wrote:
>>
>> Hello,
>>
>> We have raised a Pull Request [12].
>>
>> We look forward to your feedback.
>>
>> [12] https://github.com/apache/airavata-mft/pull/6
>>
>> Thank you
>> Aravind Ramalingam
>>
>>> On Sun, Apr 19, 2020 at 8:32 PM DImuthu Upeksha <[email protected]> wrote:
>>>
>>> Sounds good. Please send a PR once it is done.
>>>
>>> Dimuthu
>>>
>>>> On Sun, Apr 19, 2020 at 7:23 PM Aravind Ramalingam <[email protected]> wrote:
>>>>
>>>> Hello,
>>>>
>>>> Thank you Sudhakar and Dimuthu. We figured it out.
>>>>
>>>> As Sudhakar pointed out with the issue link, GCS returns a
>>>> Base64-encoded Md5Hash; once we converted it to hex, it matched the S3
>>>> hash.
>>>>
>>>> We have successfully tested from S3 to GCS and back. We are yet to
>>>> test with other protocols.
>>>>
>>>> Thank you
>>>> Aravind Ramalingam
>>>>
>>>>> On Sun, Apr 19, 2020 at 4:58 PM Pamidighantam, Sudhakar <[email protected]> wrote:
>>>>>
>>>>> https://github.com/googleapis/google-cloud-java/issues/4117 Does this help?
>>>>>
>>>>> Thanks,
>>>>> Sudhakar.
>>>>>
>>>>> From: DImuthu Upeksha <[email protected]>
>>>>> Reply-To: "[email protected]" <[email protected]>
>>>>> Date: Sunday, April 19, 2020 at 4:46 PM
>>>>> To: Airavata Dev <[email protected]>
>>>>> Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>>>>>
>>>>> Aravind,
>>>>>
>>>>> Can you send a PR for what you have done so far so that I can provide
>>>>> feedback? One thing you have to make sure of is that the GCS metadata
>>>>> collector returns the correct MD5 for the file. You can download the
>>>>> file and run "md5sum <file name>" locally to get the actual MD5 value
>>>>> and compare it with what you see in the GCS implementation.
>>>>>
>>>>> In S3, the ETag is the right property for fetching the MD5 of the
>>>>> target resource. I'm not sure what the right method is for GCS. You
>>>>> have to try it locally and verify.
>>>>>
>>>>> Thanks
>>>>> Dimuthu
>>>>>
>>>>> On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <[email protected]> wrote:
>>>>>
>>>>> Hi Dimuthu,
>>>>>
>>>>> We are working on GCS and got certain parts working, but after a
>>>>> transfer is complete we are facing errors in the metadata checks.
>>>>>
>>>>> <image001.png: screenshot of the metadata-check error>
>>>>>
>>>>> We are currently testing S3 to GCS. We noticed that the S3
>>>>> implementation sets the ETag as the Md5sum. In our case we tried using
>>>>> both Etag and Md5Hash, but both threw the above error.
>>>>>
>>>>> // S3 implementation
>>>>> metadata.setMd5sum(s3Metadata.getETag());
>>>>>
>>>>> // GCS implementation
>>>>> metadata.setMd5sum(gcsMetadata.getEtag());
>>>>> // or
>>>>> metadata.setMd5sum(gcsMetadata.getMd5Hash());
>>>>>
>>>>> We are confused at this point, could you please guide us?
>>>>>
>>>>> Thank you
>>>>> Aravind Ramalingam
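For reference, a minimal sketch of the conversion described upthread, assuming
only the JDK: decode the Base64 Md5Hash that GCS returns, then hex-encode it so
it can be compared with the hex ETag that S3 reports (note the S3 ETag equals
the object's MD5 only for non-multipart uploads).

    import java.util.Base64;

    public class Md5Compat {

        // Decodes a Base64-encoded MD5 (as returned by GCS) into lowercase hex.
        public static String base64Md5ToHex(String base64Md5) {
            byte[] raw = Base64.getDecoder().decode(base64Md5);
            StringBuilder hex = new StringBuilder(raw.length * 2);
            for (byte b : raw) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        }

        public static void main(String[] args) {
            // MD5 of an empty object: GCS reports "1B2M2Y8AsgTpgAmY7PhCfg==",
            // S3 reports "d41d8cd98f00b204e9800998ecf8427e".
            System.out.println(base64Md5ToHex("1B2M2Y8AsgTpgAmY7PhCfg=="));
        }
    }

A GCS metadata collector could then set
metadata.setMd5sum(base64Md5ToHex(gcsMetadata.getMd5Hash())) so both transports
report the same hex form; the accessor names here follow the snippets in this
thread.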
>>>>> On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <[email protected]> wrote:
>>>>>
>>>>> Hi Aravind,
>>>>>
>>>>> You don't need the file to be present in the gcs example I sent. It
>>>>> needs an InputStream to read the content. You can use the same approach
>>>>> I have used in the S3 [9] transport to do that. It's straightforward:
>>>>> replace the file input stream with context.getStreamBuffer().getInputStream().
>>>>>
>>>>> Akshay,
>>>>>
>>>>> You can't assume that the file is on the machine. It should be provided
>>>>> by the secret service. I found this example in [10]:
>>>>>
>>>>> Storage storage = StorageOptions.newBuilder()
>>>>>     .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))
>>>>>     .build()
>>>>>     .getService();
>>>>>
>>>>> It accepts an InputStream of the JSON. You can programmatically load the
>>>>> content of that JSON into a Java String through the secret service and
>>>>> convert that string to an InputStream as shown in [11].
>>>>>
>>>>> [9] https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
>>>>>
>>>>> [10] https://github.com/googleapis/google-cloud-java
>>>>>
>>>>> [11] https://www.baeldung.com/convert-string-to-input-stream
>>>>>
>>>>> Thanks
>>>>> Dimuthu
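Combining the two pointers above, a sketch of how the GCS client might be
constructed when the service-account key JSON arrives as a String from the
secret service instead of a file on disk. GCSClientFactory and fromJsonString
are illustrative names, not MFT or Google API code; the builder chain itself is
the one quoted in the message above.

    import com.google.auth.oauth2.ServiceAccountCredentials;
    import com.google.cloud.storage.Storage;
    import com.google.cloud.storage.StorageOptions;

    import java.io.ByteArrayInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.charset.StandardCharsets;

    public class GCSClientFactory {

        // Builds a GCS client from a service-account key held as a JSON string,
        // e.g. fetched from the MFT secret service, with no key file on disk.
        public static Storage fromJsonString(String credentialJson) throws IOException {
            InputStream jsonStream =
                    new ByteArrayInputStream(credentialJson.getBytes(StandardCharsets.UTF_8));
            return StorageOptions.newBuilder()
                    .setCredentials(ServiceAccountCredentials.fromStream(jsonStream))
                    .build()
                    .getService();
        }
    }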
>>>>> On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <[email protected]> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> We were looking into how to use the Google APIs to send files, but the
>>>>> first step they require is authentication. For that, the GCP API
>>>>> requires a credentials.json file to be present on the system.
>>>>>
>>>>> Is it fine if, for now, we design the GCS transport feature such that
>>>>> the file is already present on the system?
>>>>>
>>>>> Kind Regards
>>>>> Akshay
>>>>>
>>>>> From: Aravind Ramalingam <[email protected]>
>>>>> Reply-To: "[email protected]" <[email protected]>
>>>>> Date: Friday, April 17, 2020 at 00:30
>>>>> To: "[email protected]" <[email protected]>
>>>>> Subject: [External] Re: Apache Airavata MFT - AWS/GCS support
>>>>>
>>>>> Hello,
>>>>>
>>>>> Wouldn't the whole file have to be present in this example, converted
>>>>> into a single stream, and uploaded at once?
>>>>> Our understanding was that MFT expects a chunk-by-chunk upload, without
>>>>> having to have the entire file present.
>>>>>
>>>>> Thank you
>>>>> Aravind Ramalingam
>>>>>
>>>>> On Apr 17, 2020, at 00:07, DImuthu Upeksha <[email protected]> wrote:
>>>>>
>>>>> Aravind,
>>>>>
>>>>> Streaming is supported in the GCS Java client. Have a look here [8].
>>>>>
>>>>> [8] https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
>>>>>
>>>>> Thanks
>>>>> Dimuthu
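A sketch of the streaming pattern [8] demonstrates, assuming the
google-cloud-storage client: the WriteChannel returned by storage.writer()
accepts bounded chunks, so the upload can be fed from an InputStream (in MFT,
context.getStreamBuffer().getInputStream(), per the earlier reply) without the
whole file ever being materialized. This is an illustration, not the final
transport code.

    import com.google.cloud.WriteChannel;
    import com.google.cloud.storage.BlobId;
    import com.google.cloud.storage.BlobInfo;
    import com.google.cloud.storage.Storage;

    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.ByteBuffer;

    public class GCSStreamingUpload {

        // Streams 'in' to gs://bucket/objectName in 1 MB chunks.
        public static void upload(Storage storage, String bucket, String objectName,
                                  InputStream in) throws IOException {
            BlobInfo blobInfo = BlobInfo.newBuilder(BlobId.of(bucket, objectName)).build();
            byte[] buffer = new byte[1024 * 1024];
            try (WriteChannel writer = storage.writer(blobInfo)) {
                int read;
                while ((read = in.read(buffer)) != -1) {
                    writer.write(ByteBuffer.wrap(buffer, 0, read));
                }
            }
        }
    }

This directly addresses the chunk-by-chunk concern above: memory use is bounded
by the buffer, not the file size.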
>>>>> On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <[email protected]> wrote:
>>>>>
>>>>> Hello Dimuthu,
>>>>>
>>>>> As a follow-up, we explored GCS in detail and are faced with a small
>>>>> dilemma. We found that although GCS has Java support, the functionality
>>>>> does not seem to extend to stream-based upload and download.
>>>>> The documentation says streaming is currently done with the gsutil
>>>>> command-line tool [7], so we are unsure whether we can proceed with the
>>>>> GCS integration.
>>>>>
>>>>> Could you please give us any suggestions? We were also wondering whether
>>>>> we could take up Box integration or some other provider if GCS proves
>>>>> not possible at the moment.
>>>>>
>>>>> [7] https://cloud.google.com/storage/docs/streaming
>>>>>
>>>>> Thank you
>>>>> Aravind Ramalingam
>>>>>
>>>>> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <[email protected]> wrote:
>>>>>
>>>>> Hello Dimuthu,
>>>>>
>>>>> We had just started looking into Azure and GCS. Since Azure is done, we
>>>>> will take up and explore GCS.
>>>>>
>>>>> Thank you for the update.
>>>>>
>>>>> Thank you
>>>>> Aravind Ramalingam
>>>>>
>>>>> On Apr 16, 2020, at 00:30, DImuthu Upeksha <[email protected]> wrote:
>>>>>
>>>>> Aravind,
>>>>>
>>>>> I'm not sure whether you have made any progress on the Azure transport
>>>>> yet. I got a chance to look into that [6]. Let me know if you are
>>>>> working on GCS or any other transport so that I can plan ahead. Next I
>>>>> will be focusing on the Box transport.
>>>>>
>>>>> [6] https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>>>>>
>>>>> Thanks
>>>>> Dimuthu
>>>>>
>>>>> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <[email protected]> wrote:
>>>>>
>>>>> Hi Dimuthu,
>>>>>
>>>>> Thank you for the update. We will look into it and get an idea of how
>>>>> the system works.
>>>>> We were hoping to try an implementation for GCS; we will also look into
>>>>> Azure.
>>>>>
>>>>> Thank you
>>>>> Aravind Ramalingam
>>>>>
>>>>> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <[email protected]> wrote:
>>>>>
>>>>> Aravind,
>>>>>
>>>>> Here [2] is the complete commit for the S3 transport implementation, but
>>>>> don't be confused by the amount of changes: it includes both the
>>>>> transport implementation and the service backend implementations. If you
>>>>> need to implement a new transport, you need to implement a Receiver, a
>>>>> Sender and a MetadataCollector like this [3]. Then you need to add that
>>>>> resource support to the Resource service and the Secret service [4] [5].
>>>>> You can similarly do that for Azure. A sample SCP -> S3 transfer request
>>>>> is shown below. Hope that helps.
>>>>>
>>>>> String sourceId = "remote-ssh-resource";
>>>>> String sourceToken = "local-ssh-cred";
>>>>> String sourceType = "SCP";
>>>>> String destId = "s3-file";
>>>>> String destToken = "s3-cred";
>>>>> String destType = "S3";
>>>>>
>>>>> TransferApiRequest request = TransferApiRequest.newBuilder()
>>>>>         .setSourceId(sourceId)
>>>>>         .setSourceToken(sourceToken)
>>>>>         .setSourceType(sourceType)
>>>>>         .setDestinationId(destId)
>>>>>         .setDestinationToken(destToken)
>>>>>         .setDestinationType(destType)
>>>>>         .setAffinityTransfer(false).build();
>>>>>
>>>>> [2] https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>>>>>
>>>>> [3] https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>>>>>
>>>>> [4] https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>>>>>
>>>>> [5] https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>>>>>
>>>>> Thanks
>>>>> Dimuthu
>>>>>
>>>>> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <[email protected]> wrote:
>>>>>
>>>>> There is a working S3 transport in my local copy. I will commit it once
>>>>> I have tested it properly. You can follow the same pattern for any cloud
>>>>> provider whose client supports streaming IO. Streaming among different
>>>>> transfer protocols inside an Agent is discussed in the last part of this
>>>>> [1] document. Try to get the conceptual idea from that and
>>>>> reverse-engineer the SCP transport.
>>>>>
>>>>> [1] https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>>>>>
>>>>> Dimuthu
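The "streaming among different transfer protocols inside an Agent" idea in [1]
can be illustrated generically with java.io pipes: one thread drains the source
protocol into a pipe while the current thread feeds the pipe's content to the
destination protocol, so the whole file is never held in memory. Source, Sink
and the pipe size below are hypothetical stand-ins, not MFT's actual
abstractions; see [1] and the SCP transport for the real mechanism. Requires
Java 9+ for try-with-resources on an existing variable.

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.io.PipedInputStream;
    import java.io.PipedOutputStream;

    public class ProtocolBridge {

        interface Source { void readInto(OutputStream out) throws IOException; } // e.g. SCP download
        interface Sink { void writeFrom(InputStream in) throws IOException; }    // e.g. S3/GCS upload

        public static void transfer(Source source, Sink sink)
                throws IOException, InterruptedException {
            PipedOutputStream out = new PipedOutputStream();
            PipedInputStream in = new PipedInputStream(out, 1024 * 1024); // 1 MB pipe buffer

            // Producer thread: pull bytes from the source protocol into the pipe.
            Thread producer = new Thread(() -> {
                try (out) {
                    source.readInto(out);
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            producer.start();

            // Consumer (this thread): push the pipe's bytes to the destination.
            try (in) {
                sink.writeFrom(in);
            } finally {
                producer.join();
            }
        }
    }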
>>>>> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <[email protected]> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> We were looking at the existing code in the project and could find
>>>>> implementations only for local copy and SCP.
>>>>> We are unsure how to go about an external provider like S3 or Azure,
>>>>> since it would require integrating with their respective clients.
>>>>>
>>>>> Thank you
>>>>> Aravind Ramalingam
>>>>>
>>>>> > On Apr 4, 2020, at 21:15, Suresh Marru <[email protected]> wrote:
>>>>> >
>>>>> > Hi Aravind,
>>>>> >
>>>>> > I have to catch up with the code, but you may want to look at the S3
>>>>> > implementation and extend it to Azure, GCP, or other cloud services
>>>>> > like Box, Dropbox and so on.
>>>>> >
>>>>> > There could be many use cases; here is one idea:
>>>>> >
>>>>> > * Compute a job on a supercomputer with SCP access and push the
>>>>> > outputs to a cloud storage.
>>>>> >
>>>>> > Suresh
>>>>> >
>>>>> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <[email protected]> wrote:
>>>>> >>
>>>>> >> Hello,
>>>>> >>
>>>>> >> We set up the MFT project on a local system and tested SCP transfer
>>>>> >> between JetStream VMs; we were wondering how the support can be
>>>>> >> extended to AWS/GCS.
>>>>> >>
>>>>> >> As per our understanding, the current implementation has support for
>>>>> >> two protocols, i.e. local-transport and scp-transport. Would we have
>>>>> >> to modify/add to the code base to extend support to AWS/GCS clients?
>>>>> >>
>>>>> >> Could you please provide suggestions for this use case?
>>>>> >>
>>>>> >> Thank you
>>>>> >> Aravind Ramalingam
>>>>> >