Hi Praneeth,

SCP transport is now available in MFT [1] and working fine with S3. You can
update your load testing scripts to measure the performance.

[1]
https://github.com/apache/airavata-mft/commit/a61fa4a34129cf0e7e807ee23461237a8a72af22

Thanks
Dimuthu

On Mon, Mar 20, 2023 at 3:43 PM Chityala, Praneeth <pkchi...@iu.edu> wrote:

> Hi Marcus,
>
> Thank you. The Local plugin for downloading data has been implemented, but
> I have not yet tested its speeds. I will add this to my future performance
> testing.
>
> Best,
> Praneeth
>
> On 3/20/23, 2:43 PM, "Christie, Marcus Aaron" <machr...@iu.edu> wrote:
>
>
> Hi Praneeth,
>
>
> This looks like a great contribution to MFT and I appreciate your write-up
> here.
>
>
> One question: these numbers are for uploading from the local EC2 instance
> to S3, correct? Did you do any analysis on the opposite, downloading from
> S3 to a local EC2 instance?
>
>
> Thanks,
>
>
> Marcus
>
>
> > On Mar 17, 2023, at 12:22 AM, Chityala, Praneeth <pkchi...@iu.edu>
> > wrote:
> >
> > Dear All,
> >
> > This is Praneeth Chityala. I am currently pursuing a master's in Computer
> > Science at Indiana University Bloomington. As part of my independent
> > study, I took up MFT as my research area and started by understanding its
> > architecture.
> >
> > As many of you know, MFT uses agents to transfer data from one cloud
> > storage to another. These agents can be deployed on any compute machine.
> > The machine on which an agent is deployed may itself hold data files that
> > need to be uploaded to cloud storage, and that is where my involvement in
> > the project came in. I worked on implementing the following extensions:
> > • Implemented the Local transport extension to allow an agent to
> > transfer data from its host machine to a given storage – Local transport
> > extension
> > • The transport has three variations – streaming, chunked file transfer
> > and chunked streaming
> > • Implemented the CLI for configuring the local agent – Local agent CLI
> >
> > Performance testing results:
> >
> > After successfully testing from my local machine to AWS S3 storage, I
> > deployed the agent on an AWS EC2 machine and ran multiple tests to
> > compare its performance with rclone and the AWS CLI.
> > The chart below shows the average transfer speeds from our analysis.
> >
> > <image001.png>
> >
> >
> > For files from 100MB to 1GB, MFT is more than 60% faster than rclone and
> > more than 150% faster than the AWS CLI.
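
A minimal sketch of how "X% faster" figures like the ones above can be derived from averaged throughput. It assumes the percentage is the ratio of average transfer speeds minus one; the thread does not state the exact formula. The function name and all numbers are hypothetical placeholders, not measurements from these tests.

```python
import statistics


def percent_faster(candidate_mbps: float, baseline_mbps: float) -> float:
    """Return how much faster the candidate is than the baseline, in percent."""
    if baseline_mbps <= 0:
        raise ValueError("baseline throughput must be positive")
    return (candidate_mbps / baseline_mbps - 1.0) * 100.0


# Hypothetical per-trial throughput in MB/s (5 trials each), for illustration only.
mft_trials = [800.0, 820.0, 780.0, 790.0, 810.0]
rclone_trials = [500.0, 500.0, 500.0, 500.0, 500.0]

# Average the trials first, then compare the averages.
gain = percent_faster(statistics.mean(mft_trials), statistics.mean(rclone_trials))
print(f"candidate is {gain:.1f}% faster than baseline")
```

With this definition, "150% faster" corresponds to 2.5 times the baseline throughput.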
> >
> > Configurations of the testing:
> >
> > • Local Machine: An Ubuntu EC2 VM on AWS (instance type c5.9xlarge)
> > with 18 cores, 10Gbps dedicated network speed and 1GBps read/write speed
> > to disk.
> >
> > • Cloud Storage: AWS S3 bucket in the same region as above VM.
> >
> > • Test sets: In the x-axis labels of the graph, 10m_1000 means a test
> > set of 1000 10MB files. All other test sets follow the same naming
> > convention.
> >
> > • Testing trials: Each test is run 5 times with each transfer method.
> >
> > • Testing presets: Before each test, the VM's cache is cleared so that
> > no test benefits from the higher read speeds of page caching. This
> > simulates the worst possible conditions for reading data.
> >
> > • MFT configuration: I used chunked streaming with
> > • 20MB as chunk size
> > • 32 concurrent transfers
> > • 32 concurrent chunked threads
> >
> > • rclone configuration: After exploring many possible optimizations
> > available for rclone, I used the following settings:
> > • --s3-chunk-size 128000
> > • --buffer-size 128000
> > • --s3-upload-cutoff 0
> > • --s3-upload-concurrency 32
> > • --multi-thread-streams 32
> > • --multi-thread-cutoff 0
> > • --s3-disable-http2
> > • --no-check-dest
> > • --transfers 32
> > • --fast-list
> >
> > • AWS CLI configuration: I used the native AWS CLI for transfers, since
> > we did not find many dedicated optimizations for it.
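
For anyone wanting to reproduce the rclone side of the comparison, the test-set naming and flag list above could be wired together roughly as follows. This is a sketch under stated assumptions: the flags are copied verbatim from the list above, but the helper names, the `g` size suffix, the source directory, and the remote name are hypothetical and not taken from the actual test scripts.

```python
import re
import shlex

# rclone flags exactly as listed in the configuration above.
RCLONE_FLAGS = [
    "--s3-chunk-size", "128000",
    "--buffer-size", "128000",
    "--s3-upload-cutoff", "0",
    "--s3-upload-concurrency", "32",
    "--multi-thread-streams", "32",
    "--multi-thread-cutoff", "0",
    "--s3-disable-http2",
    "--no-check-dest",
    "--transfers", "32",
    "--fast-list",
]

# Assumption: only the "m" suffix appears in the thread ("10m_1000");
# "g" for gigabyte-sized files is a guess at the same convention.
_UNIT_TO_MB = {"m": 1, "g": 1000}


def parse_test_set(label: str) -> tuple[int, int]:
    """Split a label such as '10m_1000' into (file_size_mb, file_count)."""
    match = re.fullmatch(r"(\d+)([mg])_(\d+)", label)
    if match is None:
        raise ValueError(f"unrecognized test set label: {label!r}")
    size, unit, count = match.groups()
    return int(size) * _UNIT_TO_MB[unit], int(count)


def rclone_copy_command(source_dir: str, remote_path: str) -> list[str]:
    """Build the argv for an 'rclone copy' run with the flags above."""
    return ["rclone", "copy", source_dir, remote_path, *RCLONE_FLAGS]


# Hypothetical paths: a local directory per test set and an rclone remote "s3remote".
size_mb, count = parse_test_set("10m_1000")
command = rclone_copy_command("/data/10m_1000", "s3remote:test-bucket/10m_1000")
print(f"{count} files of {size_mb}MB each")
print(shlex.join(command))
```

The argv list can be passed to `subprocess.run` and timed to obtain a per-trial transfer speed.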
> >
> > Observations:
> > • For the local transport I used BufferedStreaming, which helped MFT
> > reach the maximum read speed from local disk without hitting the maximum
> > IOPS.
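
The idea behind buffered reads can be illustrated with a short Python analogue. This is not MFT's implementation (MFT is written in Java, and the thread does not show the BufferedStreaming code); the function name and the buffer and piece sizes are assumptions chosen for illustration. A large in-memory buffer lets many small reads by the consumer be served by fewer, larger reads from disk, which is one way to stay under an IOPS limit.

```python
from typing import Iterator


def read_in_pieces(
    path: str,
    piece_size: int = 64 * 1024,          # size of each read issued by the consumer
    buffer_size: int = 8 * 1024 * 1024,   # size of the in-memory read buffer
) -> Iterator[bytes]:
    """Yield the file's contents in pieces, reading through a large buffer."""
    # buffering=buffer_size makes open() return a BufferedReader that refills
    # its buffer in large blocks, so small read() calls rarely touch the disk.
    with open(path, "rb", buffering=buffer_size) as reader:
        while True:
            piece = reader.read(piece_size)
            if not piece:
                break
            yield piece
```

A caller would iterate over `read_in_pieces(path)` and hand each piece to the upload stream.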
> >
> > Future plans for testing:
> > • Jetstream2: Replace AWS EC2 with a Jetstream2 virtual machine and
> > perform similar tests.
> > • Emulab: Run the same tests on Emulab VMs with custom configurations,
> > with the help of Dimuthu.
> > • Azure: Perform local-to-Azure cloud storage testing with MFT, rclone
> > and the Azure CLI.
> > • GCP: Perform local-to-GCS testing with MFT, rclone and the GCP CLI.
> > • I have a different implementation of the MFT local transport for
> > systems that support DMA (Direct Memory Access). We also plan to test on
> > such systems; the present EC2 system doesn't support DMA.
> >
> > Further Improvements of MFT:
> > • We noticed that MFT is slower than rclone for files of 1MB or less,
> > so we plan to analyze the whole system under stress and improve speeds
> > for smaller files.
> >
> > Acknowledgement: I thank Dimuthu Wannipurage for clearing many doubts
> about MFT and providing guidance when needed.
> >
> > Thank you and please let us know your comments or thoughts.
> >
> > Best,
> > Praneeth Chityala
