[ https://issues.apache.org/jira/browse/HADOOP-18543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642064#comment-17642064 ]
ASF GitHub Bot commented on HADOOP-18543:
-----------------------------------------

steveloughran commented on PR #5172:
URL: https://github.com/apache/hadoop/pull/5172#issuecomment-1334108437

   Sorry, but I'm going to say -1 to using the normal IO buffer size as the GET range.

   The default value of 4k is way too small even for Parquet/ORC reads. It will break all existing apps in performance terms: distcp, the Parquet library, Avro, ORC, everything, as they all use the default value.

   1. There is a configuration option for multipart download size, which is filesystem-wide. Not as flexible, but something everyone will expect to work.
   2. If you want better control of read policy, buffer sizes etc., then this connector needs to implement openFile(), as s3a and abfs do. That will let you add a new option to specify the range for GET calls.

> AliyunOSS: AliyunOSSFileSystem#open(Path path, int bufferSize) should use
> buffer size as its downloadPartSize
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-18543
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18543
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/oss
>            Reporter: Hangxiang Yu
>            Priority: Major
>              Labels: pull-request-available
>
> In our application, different components have their own suitable buffer size
> to download.
> But currently, AliyunOSSFileSystem#open(Path path, int bufferSize) just gets
> downloadPartSize from the configuration.
> We cannot use different values for different components in our programs.
> I think the method should use the buffer size from the parameter.
> AliyunOSSFileSystem#open(Path path) could have the current default
> downloadPartSize as its default value.
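
To put the "4k is way too small" objection in numbers, here is a minimal sketch of the arithmetic, not code from the connector. It assumes a purely sequential read in which every GET fetches exactly one range. The 128 MiB file size and the 512 KiB part size are illustrative assumptions (the message above does not state the configured multipart download size), and the class and method names are hypothetical.

```java
public class RangeRequestCount {

    /**
     * Number of ranged GET calls needed to read fileSize bytes
     * sequentially when each call fetches at most rangeSize bytes.
     */
    static long requestCount(long fileSize, long rangeSize) {
        if (fileSize < 0 || rangeSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and rangeSize > 0");
        }
        // Ceiling division: a trailing partial range still costs one request.
        return (fileSize + rangeSize - 1) / rangeSize;
    }

    public static void main(String[] args) {
        long fileSize = 128L * 1024 * 1024; // assumption: a 128 MiB object
        long ioBuffer = 4L * 1024;          // the 4k default buffer size named in the comment
        long partSize = 512L * 1024;        // assumption: a 512 KiB filesystem-wide part size

        System.out.println("GETs with 4 KiB ranges:   " + requestCount(fileSize, ioBuffer));
        System.out.println("GETs with 512 KiB ranges: " + requestCount(fileSize, partSize));
    }
}
```

Under these assumptions the same file costs 32768 requests with 4 KiB ranges versus 256 with 512 KiB ranges, which is why callers that pass the default buffer size to open(Path, int) would see a large regression.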