[ 
https://issues.apache.org/jira/browse/HADOOP-18543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642064#comment-17642064
 ] 

ASF GitHub Bot commented on HADOOP-18543:
-----------------------------------------

steveloughran commented on PR #5172:
URL: https://github.com/apache/hadoop/pull/5172#issuecomment-1334108437

   Sorry, but I'm going to say -1 to using the normal IO buffer size as the GET 
range. The default value of 4k is far too small even for Parquet/ORC reads; it 
would wreck the performance of all existing apps (distcp, the Parquet library, 
Avro, ORC, everything), as they all use the default value.
   
   1. There is a configuration option for multipart download size, which is 
filesystem-wide. Not as flexible, but something everyone will expect to work.
   2. If you want better control of read policy, buffer sizes etc., then this 
connector needs to implement openFile(), as s3a and abfs do. That will let you 
add a new option to specify the range for GET calls.
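
   To put the 4k concern in rough numbers, here is a minimal sketch that counts 
how many ranged GET requests a full sequential read would need if each request 
fetched exactly one range. The class and method names are hypothetical, the 
128 MiB object size is made up for illustration, and the 512 KiB part size is 
an assumption about the filesystem-wide setting, not a value taken from this 
thread. It ignores read-ahead and any request coalescing the connector may do.

```java
/** Rough count of ranged GET requests needed to read a whole object sequentially. */
final class GetRangeCost {

    /** Ceiling division: one GET per full range, plus one for a trailing partial range. */
    static long requestCount(long objectSizeBytes, long rangeSizeBytes) {
        if (objectSizeBytes < 0 || rangeSizeBytes <= 0) {
            throw new IllegalArgumentException(
                "object size must be >= 0 and range size must be > 0");
        }
        return (objectSizeBytes + rangeSizeBytes - 1) / rangeSizeBytes;
    }

    public static void main(String[] args) {
        long objectSize = 128L * 1024 * 1024; // hypothetical 128 MiB object
        long ioBufferSize = 4L * 1024;        // the 4k default IO buffer size
        long partSize = 512L * 1024;          // assumed filesystem-wide part size

        System.out.println("GETs with 4 KiB ranges:   "
            + requestCount(objectSize, ioBufferSize));
        System.out.println("GETs with 512 KiB ranges: "
            + requestCount(objectSize, partSize));
    }
}
```

   Under these assumptions the 4 KiB range needs 32768 requests against 256 for 
the 512 KiB range, which is the kind of regression apps passing the default 
buffer size would see.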




> AliyunOSS: AliyunOSSFileSystem#open(Path path, int bufferSize) should use 
> buffer size as its downloadPartSize
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-18543
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18543
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/oss
>            Reporter: Hangxiang Yu
>            Priority: Major
>              Labels: pull-request-available
>
> In our application, different components each have their own suitable buffer 
> size for downloads.
> But currently, AliyunOSSFileSystem#open(Path path, int bufferSize) just gets 
> downloadPartSize from the configuration.
> We cannot use different values for different components in our programs.
> I think the method should use the buffer size from the parameter.
> AliyunOSSFileSystem#open(Path path) could keep the current default 
> downloadPartSize as its default value.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
