Hi Praneeth,

This looks like a great contribution to MFT and I appreciate your write-up here.

One question: these numbers are for uploading from the local EC2 instance to S3, correct? Did you do any analysis on the opposite, downloading from S3 to a local EC2 instance?

Thanks,
Marcus

> On Mar 17, 2023, at 12:22 AM, Chityala, Praneeth <pkchi...@iu.edu> wrote:
>
> Dear All,
>
> This is Praneeth Chityala, currently pursuing a master's in Computer Science at Indiana University Bloomington. As part of my independent study I took up MFT as the research area and started understanding the architecture.
>
> As many of you know, MFT uses agents to transfer data from one cloud storage to another. These agents can be deployed on any compute machine. The machine on which an agent is deployed might itself hold data files that need to be uploaded to cloud storage; that is where my involvement in the project came in. I worked on implementing the extensions below:
> • Implemented the Local transport extension to allow an agent to transfer data from its host machine to a given storage – Local transport extension
>     • Transport has three variations – streaming, chunked file transfer and chunked streaming
> • Implemented the CLI for configuring the local agent – Local agent CLI
>
> Performance testing results:
>
> After successfully testing from my local machine to AWS S3 storage, I deployed the agent on an AWS EC2 machine and performed multiple tests to compare its performance with rclone and the AWS CLI.
> The chart below shows the average transfer speeds from our analysis.
>
> <image001.png>
>
> For files from 100MB to 1GB, MFT is more than 60% faster than rclone and more than 150% faster than the AWS CLI.
>
> Configurations of the testing:
>
> • Local Machine: An Ubuntu EC2 VM on AWS (instance type c5.9xlarge) with 18 cores, 10Gbps dedicated network speed and 1GBps read/write speed to disk.
>
> • Cloud Storage: AWS S3 bucket in the same region as the above VM.
>
> • Test sets: In the x-axis labels of the graph, 10m_1000 means a test set of 1000 10MB files. All other test sets follow a similar naming convention.
>
> • Testing trials: Each test was run 5 times for each transfer method.
>
> • Testing presets: Before each test the VM's cache is cleared so that none of the tests gains an advantage from higher read speeds through page caching. This is done to simulate the worst possible conditions while reading data.
>
> • MFT configuration: I used chunked streaming with
>     • 20MB as chunk size
>     • 32 concurrent transfers
>     • 32 concurrent chunked threads
>
> • rclone configuration: After exploring many possible optimizations available for rclone, I used the following settings:
>     • --s3-chunk-size 128000
>     • --buffer-size 128000
>     • --s3-upload-cutoff 0
>     • --s3-upload-concurrency 32
>     • --multi-thread-streams 32
>     • --multi-thread-cutoff 0
>     • --s3-disable-http2
>     • --no-check-dest
>     • --transfers 32
>     • --fast-list
>
> • AWS CLI configuration: I used the native AWS CLI for the transfers, as we found few dedicated optimizations for it.
>
> Observations:
> • For local transport I used BufferedStreaming, which helped MFT get the maximum read speed from local disk without hitting the max IOPS.
>
> Future plans for testing:
> • Jetstream2: Planning to replace AWS EC2 with a Jetstream2 virtual machine and perform similar tests
> • Emulab: Simulate the same testing using Emulab VMs and custom configurations with the help of Dimuthu
> • Azure: Perform local-to-Azure cloud storage testing with MFT, rclone and the Azure CLI
> • GCP: Perform local-to-GCS testing with MFT, rclone and the GCP CLI
> • I have a different implementation of the MFT local transport for systems that support DMA (Direct Memory Access). We also plan to test on such systems; the present EC2 system doesn't support DMA.
>
> Further Improvements of MFT:
> • As we noticed that MFT lags behind rclone for files of 1MB or smaller, we plan to stress-analyze the whole system and improve speeds for smaller files
>
> Acknowledgement: I thank Dimuthu Wannipurage for clearing up many doubts about MFT and providing guidance when needed.
>
> Thank you, and please let us know your comments or thoughts.
>
> Best,
> Praneeth Chityala
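
For anyone trying to reproduce the numbers, the test-set naming convention and the 20MB chunk size quoted above lend themselves to a quick back-of-the-envelope check. The sketch below is not part of MFT. The function names are hypothetical, the `k` and `g` suffixes are guesses (only `m` appears in the message), and sizes are assumed to be decimal (1MB = 10**6 bytes).

```python
import re

# Assumption: decimal units. Only the "m" suffix is confirmed by the message;
# "k" and "g" are included as guesses for the other test sets.
UNIT_BYTES = {"k": 10**3, "m": 10**6, "g": 10**9}


def parse_test_set(label):
    """Split a label such as '10m_1000' into (file_size_bytes, file_count)."""
    match = re.fullmatch(r"(\d+)([kmg])_(\d+)", label.lower())
    if match is None:
        raise ValueError(f"unrecognised test set label: {label!r}")
    size, unit, count = match.groups()
    return int(size) * UNIT_BYTES[unit], int(count)


def chunks_per_file(file_size_bytes, chunk_size_bytes=20 * 10**6):
    """Chunks needed for one file at the quoted 20MB chunk size (at least 1)."""
    # Integer ceiling division avoids floating-point rounding.
    return max(1, -(-file_size_bytes // chunk_size_bytes))


def min_transfer_seconds(total_bytes, link_gbps=10):
    """Lower bound on transfer time if the network link were the only limit."""
    return total_bytes * 8 / (link_gbps * 10**9)


size, count = parse_test_set("10m_1000")
total = size * count
print(f"{count} files x {size} bytes = {total} bytes")
print(f"chunks per file: {chunks_per_file(size)}")
print(f"network-bound minimum: {min_transfer_seconds(total)} s")
```

Under these assumptions the 10m_1000 set is 10GB in total, each file fits in a single 20MB chunk, and the 10Gbps link alone would need at least 8 seconds, which gives a rough ceiling against which the measured speeds can be compared.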