Package: wnpp
Severity: wishlist
Owner: Sascha Steinbiss <sa...@debian.org>
* Package name    : s4cmd
  Version         : 2.0.1
  Upstream Author : BloomReach Inc.
* URL             : https://github.com/bloomreach/s4cmd
* License         : Apache
  Programming Lang: Python
  Description     : Super Amazon S3 command line tool

The s4cmd tool is intended as an alternative to s3cmd, offering enhanced
performance, better handling of large files, and a number of additional
features and fixes. It strives to be compatible with the most common
usage scenarios for s3cmd. It does not offer exact drop-in
compatibility, due to a number of corner cases where different behavior
seems preferable, or for bugfixes.

The main features that distinguish s4cmd are:

 - Simple (less than 1500 lines of code) and implemented in pure Python,
   based on the widely used Boto3 library.
 - Multi-threaded/multi-connection implementation for enhanced
   performance on all commands. As with many network-intensive
   applications (like web browsers), accessing S3 in a single-threaded
   way is often significantly less efficient than having multiple
   connections actively transferring data at once. In general, this
   yields a 2X boost to upload/download speeds.
 - Path handling: S3 is not a traditional filesystem with built-in
   support for directory structure: internally, there are only objects,
   not directories or folders. However, most people use S3 in a
   hierarchical structure, with paths separated by slashes, to emulate
   traditional filesystems. S4cmd follows conventions to more closely
   replicate the behavior of traditional filesystems in certain corner
   cases. For example, "ls" and "cp" work much like in Unix shells, to
   avoid odd surprises.
 - Wildcard support: Wildcards, including multiple levels of wildcards,
   like in Unix shells, are handled. For example:
   s3://my-bucket/my-folder/20120512/*/*chunk00?1?
 - Automatic retry: Failed tasks are retried after a delay.
 - Multi-part upload support for files larger than 5 GB.
 - Proper handling of MD5s with respect to multi-part uploads.
 - Miscellaneous enhancements and bugfixes:
   - Partial file creation: Avoid creating empty target files if the
     source does not exist. Avoid creating partial output files when
     commands are interrupted.
   - General thread safety: The tool can be interrupted or killed at
     any time without being blocked by child threads or leaving
     incomplete or corrupt files in place.
   - Ensure the exit code is nonzero in all failure scenarios.
   - Expected handling of symlinks (they are followed).
   - Support for both s3:// and s3n:// prefixes (the latter is common
     with Amazon Elastic MapReduce).
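To illustrate the multi-level wildcard matching described above, here is
a minimal, self-contained Python sketch. It is not s4cmd's actual
implementation; it simply shows how shell-style patterns (via the
standard fnmatch module) can match S3 object keys across several
slash-separated "directory" levels. The keys and helper name are made up
for the example.

```python
import fnmatch

def match_keys(keys, pattern):
    """Return the object keys matching a shell-style wildcard pattern.

    fnmatch lets '*' cross '/' boundaries, so a pattern such as
    '20120512/*/*chunk00?1?' can match keys several levels deep,
    mirroring the multi-level wildcard behavior described above.
    """
    return [k for k in keys if fnmatch.fnmatch(k, pattern)]

keys = [
    "my-folder/20120512/run-a/chunk00015",
    "my-folder/20120512/run-b/chunk00919",
    "my-folder/20120513/run-a/chunk00015",
]

# Only the two keys under 20120512 match the pattern.
print(match_keys(keys, "my-folder/20120512/*/chunk00?1?"))
```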
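The multi-threaded transfers and automatic retry mentioned above can be
sketched in a few lines of standard-library Python. This is only an
illustration of the general technique, not s4cmd's code: the transfer
function is a stand-in (a real tool would issue a Boto3 request there),
and all names, worker counts, and delays are invented for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def with_retry(fn, attempts=3, delay=0.01):
    """Re-run a failed task after a short delay ('automatic retry')."""
    def wrapped(*args):
        for i in range(attempts):
            try:
                return fn(*args)
            except Exception:
                if i == attempts - 1:
                    raise
                time.sleep(delay)
    return wrapped

def transfer(key):
    # Stand-in for a real S3 transfer; returns a pretend byte count
    # so the sketch stays self-contained and runnable.
    return (key, len(key))

def transfer_all(keys, workers=4):
    """Run transfers on a thread pool, several connections at once."""
    task = with_retry(transfer)
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(task, k) for k in keys]
        for fut in as_completed(futures):
            key, nbytes = fut.result()
            results[key] = nbytes
    return results

print(transfer_all(["a/1", "a/22", "b/1"]))
```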