[jira] [Commented] (HADOOP-18371) s3a FS init logs at warn if fs.s3a.create.storage.class is unset

2022-08-11 Thread Monthon Klongklaew (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578331#comment-17578331
 ] 

Monthon Klongklaew commented on HADOOP-18371:
---------------------------------------------

Hi [~vjasani], no problem. Thanks for taking a look.

> s3a FS init logs at warn if fs.s3a.create.storage.class is unset
> ----------------------------------------------------------------
>
> Key: HADOOP-18371
> URL: https://issues.apache.org/jira/browse/HADOOP-18371
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.9
>Reporter: Steve Loughran
>Assignee: Monthon Klongklaew
>Priority: Blocker
>  Labels: pull-request-available
>
> If you don't have an s3a storage class set in
> {{fs.s3a.create.storage.class}}, then whenever you create an S3A FS
> instance, it logs at warn:
> {code}
> bin/hadoop s3guard bucket-info $BUCKET
> 2022-07-27 11:53:11,239 [main] INFO  Configuration.deprecation 
> (Configuration.java:logDeprecation(1459)) - fs.s3a.server-side-encryption.key 
> is deprecated. Instead, use fs.s3a.encryption.key
> 2022-07-27 11:53:11,240 [main] INFO  Configuration.deprecation 
> (Configuration.java:logDeprecation(1459)) - 
> fs.s3a.server-side-encryption-algorithm is deprecated. Instead, use 
> fs.s3a.encryption.algorithm
> 2022-07-27 11:53:11,396 [main] WARN  s3a.S3AFileSystem 
> (S3AFileSystem.java:createRequestFactory(1004)) - Unknown storage class 
> property fs.s3a.create.storage.class: ; falling back to default storage class
> 2022-07-27 11:53:11,839 [main] INFO  impl.DirectoryPolicyImpl 
> (DirectoryPolicyImpl.java:getDirectoryPolicy(189)) - Directory markers will 
> be kept
> Filesystem s3a://stevel-london
> Location: eu-west-2
> {code}
> Note: this is why part of qualifying an SDK update involves looking at the
> logs and running the CLI commands by hand... you see if new messages have
> crept in.
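
For reference, a minimal sketch (not the actual S3AFileSystem code) of the
kind of guard that avoids the spurious warning: only warn when the property
was explicitly set to a value the SDK does not recognize, and fall through
silently when it is unset or empty. Assumes an SLF4J LOG and a Hadoop
Configuration named conf.

{code}
String storageClassConf = conf.getTrimmed("fs.s3a.create.storage.class", "");
StorageClass storageClass = null;
if (!storageClassConf.isEmpty()) {
  try {
    // AWS SDK enum lookup; throws IllegalArgumentException for unknown
    // values (real code would likely normalise case first)
    storageClass = StorageClass.fromValue(storageClassConf);
  } catch (IllegalArgumentException e) {
    LOG.warn("Unknown storage class property fs.s3a.create.storage.class: {};"
        + " falling back to default storage class", storageClassConf);
  }
}
// unset/empty: no warning; the SDK default storage class is used
{code}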






[jira] [Commented] (HADOOP-12020) Support configuration of different S3 storage classes

2022-08-01 Thread Monthon Klongklaew (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-12020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573825#comment-17573825
 ] 

Monthon Klongklaew commented on HADOOP-12020:
---------------------------------------------

Oh, I've just noticed the issue with the log at warn; I'll be taking a look
at it soon.

> Support configuration of different S3 storage classes
> ------------------------------------------------------
>
> Key: HADOOP-12020
> URL: https://issues.apache.org/jira/browse/HADOOP-12020
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.7.0
> Environment: Hadoop on AWS
>Reporter: Yann Landrin-Schweitzer
>Assignee: Monthon Klongklaew
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Amazon S3 uses, by default, the NORMAL_STORAGE class for s3 objects.
> This offers, according to Amazon's material, 99.999999999% reliability.
> For many applications, however, the 99.99% reliability offered by the
> REDUCED_REDUNDANCY storage class is amply sufficient, and comes with a
> significant cost saving.
> HDFS, when using the legacy s3n protocol, or the new s3a scheme, should
> support overriding the default storage class of created s3 objects so that
> users can take advantage of this cost benefit.
> This would require minor changes to the s3n and s3a drivers, using
> a configuration property fs.s3n.storage.class to override the default
> storage class when desirable.
> This override could be implemented in Jets3tNativeFileSystemStore with:
>   S3Object object = new S3Object(key);
>   ...
>   if (storageClass != null) {
>     object.setStorageClass(storageClass);
>   }
> It would take a more complex form in s3a, e.g. setting:
>   InitiateMultipartUploadRequest initiateMPURequest =
>       new InitiateMultipartUploadRequest(bucket, key, om);
>   if (storageClass != null) {
>     initiateMPURequest = initiateMPURequest.withStorageClass(storageClass);
>   }
> and similar statements in various places.
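
After the change, usage is a single configuration option; a hedged example
(the value names follow the s3a documentation, e.g. "reduced_redundancy" or
"glacier"):

{code}
<property>
  <name>fs.s3a.create.storage.class</name>
  <value>reduced_redundancy</value>
</property>
{code}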






[jira] [Commented] (HADOOP-12020) Support configuration of different S3 storage classes

2022-08-01 Thread Monthon Klongklaew (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-12020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573824#comment-17573824
 ] 

Monthon Klongklaew commented on HADOOP-12020:
---------------------------------------------

I have added new tests for byte/heap buffer writes and addressed the issue
in this PR: [#4669|https://github.com/apache/hadoop/pull/4669]

Also, the PR which fixes the S3 Select tests is ready for review here:
[#4489|https://github.com/apache/hadoop/pull/4489]

> Support configuration of different S3 storage classes
> ------------------------------------------------------
>
> Key: HADOOP-12020
> URL: https://issues.apache.org/jira/browse/HADOOP-12020
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.7.0
> Environment: Hadoop on AWS
>Reporter: Yann Landrin-Schweitzer
>Assignee: Monthon Klongklaew
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Amazon S3 uses, by default, the NORMAL_STORAGE class for s3 objects.
> This offers, according to Amazon's material, 99.999999999% reliability.
> For many applications, however, the 99.99% reliability offered by the
> REDUCED_REDUNDANCY storage class is amply sufficient, and comes with a
> significant cost saving.
> HDFS, when using the legacy s3n protocol, or the new s3a scheme, should
> support overriding the default storage class of created s3 objects so that
> users can take advantage of this cost benefit.
> This would require minor changes to the s3n and s3a drivers, using
> a configuration property fs.s3n.storage.class to override the default
> storage class when desirable.
> This override could be implemented in Jets3tNativeFileSystemStore with:
>   S3Object object = new S3Object(key);
>   ...
>   if (storageClass != null) {
>     object.setStorageClass(storageClass);
>   }
> It would take a more complex form in s3a, e.g. setting:
>   InitiateMultipartUploadRequest initiateMPURequest =
>       new InitiateMultipartUploadRequest(bucket, key, om);
>   if (storageClass != null) {
>     initiateMPURequest = initiateMPURequest.withStorageClass(storageClass);
>   }
> and similar statements in various places.






[jira] [Assigned] (HADOOP-18339) S3A storage class option only picked up when buffering writes to disk

2022-07-25 Thread Monthon Klongklaew (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Monthon Klongklaew reassigned HADOOP-18339:
-------------------------------------------

Assignee: Monthon Klongklaew

> S3A storage class option only picked up when buffering writes to disk
> ----------------------------------------------------------------------
>
> Key: HADOOP-18339
> URL: https://issues.apache.org/jira/browse/HADOOP-18339
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.9
>Reporter: Steve Loughran
>Assignee: Monthon Klongklaew
>Priority: Major
>
> When you switch s3a output stream buffering to heap or byte buffer, the
> storage class option isn't added to the PUT request:
> {code}
> <property>
>   <name>fs.s3a.fast.upload.buffer</name>
>   <value>bytebuffer</value>
> </property>
> {code}
> and the ITestS3AStorageClass tests fail.
> {code}
> java.lang.AssertionError: [Storage class of object 
> s3a://stevel-london/test/testCreateAndCopyObjectWithStorageClassGlacier/file1]
>  
> Expecting:
>  
> to be equal to:
>  <"glacier">
> ignoring case considerations
>   at 
> org.apache.hadoop.fs.s3a.ITestS3AStorageClass.assertObjectHasStorageClass(ITestS3AStorageClass.java:215)
>   at 
> org.apache.hadoop.fs.s3a.ITestS3AStorageClass.testCreateAndCopyObjectWithStorageClassGlacier(ITestS3AStorageClass.java:129)
> {code}
> We noticed this in a code review; the request factory only sets the option
> when the source is a file, not memory.
> Proposed: parameterize the test suite on disk/byte buffer, then fix.
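
A rough sketch of the proposed parameterization (illustrative only; the
real ITestS3AStorageClass and its AbstractS3ATestBase base class may
differ). Each test case then runs once per buffering mode, covering both
upload paths:

{code}
import java.util.Arrays;
import java.util.Collection;
import org.apache.hadoop.conf.Configuration;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;

@RunWith(Parameterized.class)
public class ITestS3AStorageClass extends AbstractS3ATestBase {

  @Parameterized.Parameters(name = "buffer-{0}")
  public static Collection<Object[]> params() {
    return Arrays.asList(new Object[][]{{"disk"}, {"bytebuffer"}});
  }

  private final String fastUploadBuffer;

  public ITestS3AStorageClass(String fastUploadBuffer) {
    this.fastUploadBuffer = fastUploadBuffer;
  }

  @Override
  protected Configuration createConfiguration() {
    Configuration conf = super.createConfiguration();
    // pin the buffering mode so the memory paths are tested as well as disk
    conf.set("fs.s3a.fast.upload.buffer", fastUploadBuffer);
    return conf;
  }
}
{code}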






[jira] [Commented] (HADOOP-12020) Support AWS S3 reduced redundancy storage class

2022-06-15 Thread Monthon Klongklaew (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-12020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17554643#comment-17554643
 ] 

Monthon Klongklaew commented on HADOOP-12020:
---------------------------------------------

Thank you for reporting this; I will take a look.

> Support AWS S3 reduced redundancy storage class
> -----------------------------------------------
>
> Key: HADOOP-12020
> URL: https://issues.apache.org/jira/browse/HADOOP-12020
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.7.0
> Environment: Hadoop on AWS
>Reporter: Yann Landrin-Schweitzer
>Assignee: Monthon Klongklaew
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.4
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Amazon S3 uses, by default, the NORMAL_STORAGE class for s3 objects.
> This offers, according to Amazon's material, 99.999999999% reliability.
> For many applications, however, the 99.99% reliability offered by the
> REDUCED_REDUNDANCY storage class is amply sufficient, and comes with a
> significant cost saving.
> HDFS, when using the legacy s3n protocol, or the new s3a scheme, should
> support overriding the default storage class of created s3 objects so that
> users can take advantage of this cost benefit.
> This would require minor changes to the s3n and s3a drivers, using
> a configuration property fs.s3n.storage.class to override the default
> storage class when desirable.
> This override could be implemented in Jets3tNativeFileSystemStore with:
>   S3Object object = new S3Object(key);
>   ...
>   if (storageClass != null) {
>     object.setStorageClass(storageClass);
>   }
> It would take a more complex form in s3a, e.g. setting:
>   InitiateMultipartUploadRequest initiateMPURequest =
>       new InitiateMultipartUploadRequest(bucket, key, om);
>   if (storageClass != null) {
>     initiateMPURequest = initiateMPURequest.withStorageClass(storageClass);
>   }
> and similar statements in various places.






[jira] [Commented] (HADOOP-12020) Support AWS S3 reduced redundancy storage class

2022-05-04 Thread Monthon Klongklaew (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-12020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17531634#comment-17531634
 ] 

Monthon Klongklaew commented on HADOOP-12020:
---------------------------------------------

I have opened a PR, but somehow it doesn't link to the issue. Here is the
link: https://github.com/apache/hadoop/pull/3877

> Support AWS S3 reduced redundancy storage class
> -----------------------------------------------
>
> Key: HADOOP-12020
> URL: https://issues.apache.org/jira/browse/HADOOP-12020
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.7.0
> Environment: Hadoop on AWS
>Reporter: Yann Landrin-Schweitzer
>Assignee: Monthon Klongklaew
>Priority: Major
>
> Amazon S3 uses, by default, the NORMAL_STORAGE class for s3 objects.
> This offers, according to Amazon's material, 99.999999999% reliability.
> For many applications, however, the 99.99% reliability offered by the
> REDUCED_REDUNDANCY storage class is amply sufficient, and comes with a
> significant cost saving.
> HDFS, when using the legacy s3n protocol, or the new s3a scheme, should
> support overriding the default storage class of created s3 objects so that
> users can take advantage of this cost benefit.
> This would require minor changes to the s3n and s3a drivers, using
> a configuration property fs.s3n.storage.class to override the default
> storage class when desirable.
> This override could be implemented in Jets3tNativeFileSystemStore with:
>   S3Object object = new S3Object(key);
>   ...
>   if (storageClass != null) {
>     object.setStorageClass(storageClass);
>   }
> It would take a more complex form in s3a, e.g. setting:
>   InitiateMultipartUploadRequest initiateMPURequest =
>       new InitiateMultipartUploadRequest(bucket, key, om);
>   if (storageClass != null) {
>     initiateMPURequest = initiateMPURequest.withStorageClass(storageClass);
>   }
> and similar statements in various places.






[jira] [Commented] (HADOOP-18175) test failures with prefetching s3a input stream

2022-04-21 Thread Monthon Klongklaew (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17525618#comment-17525618
 ] 

Monthon Klongklaew commented on HADOOP-18175:
---------------------------------------------

Assigning this to myself, as I've been looking at these tests. Will open a
PR for it soon.

> test failures with prefetching s3a input stream
> -----------------------------------------------
>
> Key: HADOOP-18175
> URL: https://issues.apache.org/jira/browse/HADOOP-18175
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Major
>
> identify and fix all test regressions from the prefetching s3a input stream






[jira] [Assigned] (HADOOP-18175) test failures with prefetching s3a input stream

2022-04-21 Thread Monthon Klongklaew (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Monthon Klongklaew reassigned HADOOP-18175:
-------------------------------------------

Assignee: Monthon Klongklaew

> test failures with prefetching s3a input stream
> -----------------------------------------------
>
> Key: HADOOP-18175
> URL: https://issues.apache.org/jira/browse/HADOOP-18175
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Monthon Klongklaew
>Priority: Major
>
> identify and fix all test regressions from the prefetching s3a input stream






[jira] [Commented] (HADOOP-13294) Test hadoop fs shell against s3a; fix problems

2022-04-12 Thread Monthon Klongklaew (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-13294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17521296#comment-17521296
 ] 

Monthon Klongklaew commented on HADOOP-13294:
---------------------------------------------

Got it, that might be the best way. I will update my PR for the new option
and make the change to the s3a files only.

> Test hadoop fs shell against s3a; fix problems
> ----------------------------------------------
>
> Key: HADOOP-13294
> URL: https://issues.apache.org/jira/browse/HADOOP-13294
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> There are no tests of {{hadoop -fs}} commands against s3a; add some.
> Ideally, generic to all object stores.






[jira] [Commented] (HADOOP-13294) Test hadoop fs shell against s3a; fix problems

2022-03-09 Thread Monthon Klongklaew (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-13294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17503481#comment-17503481
 ] 

Monthon Klongklaew commented on HADOOP-13294:
---------------------------------------------

I've been thinking about this; adding an option like
`fs.contract.supports-root-delete` makes more sense to me. We can set it to
false and expect an exception from a store that does not support root
deletion. The downside is that we have to modify the contract XML for every
store that enables the root tests. What do you think?
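
A hedged sketch of what the proposed option could look like in a store's
contract XML (the name is as suggested above; the semantics are assumed,
not committed):

{code}
<property>
  <name>fs.contract.supports-root-delete</name>
  <value>false</value>
</property>
{code}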

> Test hadoop fs shell against s3a; fix problems
> ----------------------------------------------
>
> Key: HADOOP-13294
> URL: https://issues.apache.org/jira/browse/HADOOP-13294
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> There are no tests of {{hadoop -fs}} commands against s3a; add some.
> Ideally, generic to all object stores.






[jira] [Commented] (HADOOP-13294) Test hadoop fs shell against s3a; fix problems

2022-02-16 Thread Monthon Klongklaew (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-13294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17493116#comment-17493116
 ] 

Monthon Klongklaew commented on HADOOP-13294:
---------------------------------------------

I would like to add these tests.

I tried to change the delete-bucket error message by throwing an exception
instead of returning false here
[DeleteOperation|https://github.com/apache/hadoop/blob/bddc9bf63c3adb3d7445547bd1f8272e53b40bf7/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/DeleteOperation.java#L271].

The console prints it out nicely, but it breaks the ContractRootDir test. I
wonder if it's OK to make this change, because the result seems to be the
same: we will not delete it.
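
Roughly, the change being discussed, as a sketch rather than the actual
DeleteOperation patch:

{code}
// instead of: return false;   // silently refuse to delete the bucket root
if (path.isRoot()) {
  throw new PathIOException(path.toString(),
      "Cannot delete the root of a filesystem");
}
{code}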

> Test hadoop fs shell against s3a; fix problems
> ----------------------------------------------
>
> Key: HADOOP-13294
> URL: https://issues.apache.org/jira/browse/HADOOP-13294
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Priority: Major
>
> There are no tests of {{hadoop -fs}} commands against s3a; add some.
> Ideally, generic to all object stores.






[jira] [Commented] (HADOOP-16101) Use lighter-weight alternatives to innerGetFileStatus where possible

2022-02-08 Thread Monthon Klongklaew (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488919#comment-17488919
 ] 

Monthon Klongklaew commented on HADOOP-16101:
---------------------------------------------

After some investigation, I think it is light-weight enough at this point.
innerGetFileStatus became a lot simpler once S3Guard was removed. We have
https://issues.apache.org/jira/browse/HADOOP-17415, which covers reading a
file without the initial HEAD request. Should we consider closing this one
and creating a new task for a rename builder with a FileStatus param?
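
For context, a hedged sketch of the openFile() path mentioned above, which
already lets a caller skip the HEAD probe by handing s3a a FileStatus it
holds from an earlier listing:

{code}
// build() returns a CompletableFuture<FSDataInputStream>
FSDataInputStream in = fs.openFile(path)
    .withFileStatus(status)   // s3a can then skip its own HEAD request
    .build()
    .get();
{code}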

> Use lighter-weight alternatives to innerGetFileStatus where possible
> ---------------------------------------------------------------------
>
> Key: HADOOP-16101
> URL: https://issues.apache.org/jira/browse/HADOOP-16101
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Sean Mackrory
>Priority: Major
>
> Discussion in HADOOP-15999 highlighted the heaviness of a full 
> innerGetFileStatus call, where many usages of it may need a lighter weight 
> fileExists, etc. Let's investigate usage of innerGetFileStatus and slim it 
> down where possible.






[jira] [Assigned] (HADOOP-17415) Use S3 content-range header to update length of an object during reads

2022-01-27 Thread Monthon Klongklaew (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Monthon Klongklaew reassigned HADOOP-17415:
-------------------------------------------

Assignee: Monthon Klongklaew

> Use S3 content-range header to update length of an object during reads
> -----------------------------------------------------------------------
>
> Key: HADOOP-17415
> URL: https://issues.apache.org/jira/browse/HADOOP-17415
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Monthon Klongklaew
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As part of all the openFile work, knowing the full length of an object
> allows a HEAD to be skipped. But: code knowing only the splits doesn't
> know the final length of the file.
> If the content-range header is used, then as soon as a single GET is
> initiated against an object, if the field is returned then we can update
> the length of the S3A stream to its real/final length.
> Also: when any input stream fails with an EOF exception, we can
> distinguish stream-interrupted from "no, too far".
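
A hedged sketch of the core idea (the Content-Range format is
"bytes first-last/complete-length" per RFC 7233; the response accessor
below is illustrative, not a real S3A method):

{code}
// e.g. "Content-Range: bytes 0-1023/4096" -> total length 4096
String contentRange = response.getHeader("Content-Range"); // illustrative
if (contentRange != null) {
  String total = contentRange.substring(contentRange.lastIndexOf('/') + 1);
  if (!"*".equals(total)) {              // "*" means total length unknown
    contentLength = Long.parseLong(total);   // update the stream's length
  }
}
{code}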






[jira] [Comment Edited] (HADOOP-17415) Use S3 content-range header to update length of an object during reads

2022-01-19 Thread Monthon Klongklaew (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477858#comment-17477858
 ] 

Monthon Klongklaew edited comment on HADOOP-17415 at 1/19/22, 10:52 AM:
------------------------------------------------------------------------

Need more information about this one.

We can get the object length and update it on first read, but how can we
determine if it is a file or a directory, or doesn't exist, without a probe?

From what I see, an exception is expected when using openFile against a
directory or a nonexistent file.


was (Author: JIRAUSER281330):
Need more information about this one.

We can get the object length and update it on first read, but how can we 
determine if it is a file or directory without a probe?

From what I see, an exception is expected when using openFile against a
directory.

> Use S3 content-range header to update length of an object during reads
> -----------------------------------------------------------------------
>
> Key: HADOOP-17415
> URL: https://issues.apache.org/jira/browse/HADOOP-17415
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Priority: Major
>
> As part of all the openFile work, knowing the full length of an object
> allows a HEAD to be skipped. But: code knowing only the splits doesn't
> know the final length of the file.
> If the content-range header is used, then as soon as a single GET is
> initiated against an object, if the field is returned then we can update
> the length of the S3A stream to its real/final length.
> Also: when any input stream fails with an EOF exception, we can
> distinguish stream-interrupted from "no, too far".






[jira] [Commented] (HADOOP-17415) Use S3 content-range header to update length of an object during reads

2022-01-18 Thread Monthon Klongklaew (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477858#comment-17477858
 ] 

Monthon Klongklaew commented on HADOOP-17415:
---------------------------------------------

Need more information about this one.

We can get the object length and update it on first read, but how can we
determine if it is a file or directory without a probe?

From what I see, an exception is expected when using openFile against a
directory.

> Use S3 content-range header to update length of an object during reads
> -----------------------------------------------------------------------
>
> Key: HADOOP-17415
> URL: https://issues.apache.org/jira/browse/HADOOP-17415
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Priority: Major
>
> As part of all the openFile work, knowing the full length of an object
> allows a HEAD to be skipped. But: code knowing only the splits doesn't
> know the final length of the file.
> If the content-range header is used, then as soon as a single GET is
> initiated against an object, if the field is returned then we can update
> the length of the S3A stream to its real/final length.
> Also: when any input stream fails with an EOF exception, we can
> distinguish stream-interrupted from "no, too far".






[jira] [Commented] (HADOOP-12020) Support AWS S3 reduced redundancy storage class

2022-01-10 Thread Monthon Klongklaew (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-12020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472073#comment-17472073
 ] 

Monthon Klongklaew commented on HADOOP-12020:
---------------------------------------------

I'm implementing this feature. I think the reference about the 405 response
is out of date; I don't see it in the latest documentation.

Does this mean we don't have to handle it anymore?

> Support AWS S3 reduced redundancy storage class
> -----------------------------------------------
>
> Key: HADOOP-12020
> URL: https://issues.apache.org/jira/browse/HADOOP-12020
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.7.0
> Environment: Hadoop on AWS
>Reporter: Yann Landrin-Schweitzer
>Priority: Major
>
> Amazon S3 uses, by default, the NORMAL_STORAGE class for s3 objects.
> This offers, according to Amazon's material, 99.999999999% reliability.
> For many applications, however, the 99.99% reliability offered by the
> REDUCED_REDUNDANCY storage class is amply sufficient, and comes with a
> significant cost saving.
> HDFS, when using the legacy s3n protocol, or the new s3a scheme, should
> support overriding the default storage class of created s3 objects so that
> users can take advantage of this cost benefit.
> This would require minor changes to the s3n and s3a drivers, using
> a configuration property fs.s3n.storage.class to override the default
> storage class when desirable.
> This override could be implemented in Jets3tNativeFileSystemStore with:
>   S3Object object = new S3Object(key);
>   ...
>   if (storageClass != null) {
>     object.setStorageClass(storageClass);
>   }
> It would take a more complex form in s3a, e.g. setting:
>   InitiateMultipartUploadRequest initiateMPURequest =
>       new InitiateMultipartUploadRequest(bucket, key, om);
>   if (storageClass != null) {
>     initiateMPURequest = initiateMPURequest.withStorageClass(storageClass);
>   }
> and similar statements in various places.






[jira] [Comment Edited] (HADOOP-12020) Support AWS S3 reduced redundancy storage class

2022-01-06 Thread Monthon Klongklaew (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-12020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17469810#comment-17469810
 ] 

Monthon Klongklaew edited comment on HADOOP-12020 at 1/6/22, 10:07 AM:
-----------------------------------------------------------------------

This also enables applications to write to the archive storage classes.

Such objects can't be read directly, and the same applies to the copy
operation; a 403 InvalidObjectState error would be thrown.

How should we handle it?


was (Author: JIRAUSER281330):
This is also enable applications to write in archive storage classes.

It's can't be read directly and so the copy operation, status 403 
InvalidObjectState would be thrown.

How should we handle it?

> Support AWS S3 reduced redundancy storage class
> -----------------------------------------------
>
> Key: HADOOP-12020
> URL: https://issues.apache.org/jira/browse/HADOOP-12020
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.7.0
> Environment: Hadoop on AWS
>Reporter: Yann Landrin-Schweitzer
>Priority: Major
>
> Amazon S3 uses, by default, the NORMAL_STORAGE class for s3 objects.
> This offers, according to Amazon's material, 99.999999999% reliability.
> For many applications, however, the 99.99% reliability offered by the
> REDUCED_REDUNDANCY storage class is amply sufficient, and comes with a
> significant cost saving.
> HDFS, when using the legacy s3n protocol, or the new s3a scheme, should
> support overriding the default storage class of created s3 objects so that
> users can take advantage of this cost benefit.
> This would require minor changes to the s3n and s3a drivers, using
> a configuration property fs.s3n.storage.class to override the default
> storage class when desirable.
> This override could be implemented in Jets3tNativeFileSystemStore with:
>   S3Object object = new S3Object(key);
>   ...
>   if (storageClass != null) {
>     object.setStorageClass(storageClass);
>   }
> It would take a more complex form in s3a, e.g. setting:
>   InitiateMultipartUploadRequest initiateMPURequest =
>       new InitiateMultipartUploadRequest(bucket, key, om);
>   if (storageClass != null) {
>     initiateMPURequest = initiateMPURequest.withStorageClass(storageClass);
>   }
> and similar statements in various places.






[jira] [Commented] (HADOOP-12020) Support AWS S3 reduced redundancy storage class

2022-01-06 Thread Monthon Klongklaew (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-12020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17469810#comment-17469810
 ] 

Monthon Klongklaew commented on HADOOP-12020:
---------------------------------------------

This also enables applications to write to the archive storage classes.

Such objects can't be read directly, and the same applies to the copy
operation; a 403 InvalidObjectState error would be thrown.

How should we handle it?
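
One possible way to surface that failure, sketched against the AWS SDK (an
assumption, not a committed design):

{code}
try {
  S3Object object = s3.getObject(request);  // read/copy of an archived object
} catch (AmazonS3Exception e) {
  if ("InvalidObjectState".equals(e.getErrorCode())) {
    // archived (e.g. Glacier) objects must be restored before access
    throw new IOException("Object is archived and not readable: " + key, e);
  }
  throw e;
}
{code}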

> Support AWS S3 reduced redundancy storage class
> -----------------------------------------------
>
> Key: HADOOP-12020
> URL: https://issues.apache.org/jira/browse/HADOOP-12020
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.7.0
> Environment: Hadoop on AWS
>Reporter: Yann Landrin-Schweitzer
>Priority: Major
>
> Amazon S3 uses, by default, the NORMAL_STORAGE class for s3 objects.
> This offers, according to Amazon's material, 99.999999999% reliability.
> For many applications, however, the 99.99% reliability offered by the
> REDUCED_REDUNDANCY storage class is amply sufficient, and comes with a
> significant cost saving.
> HDFS, when using the legacy s3n protocol, or the new s3a scheme, should
> support overriding the default storage class of created s3 objects so that
> users can take advantage of this cost benefit.
> This would require minor changes to the s3n and s3a drivers, using
> a configuration property fs.s3n.storage.class to override the default
> storage class when desirable.
> This override could be implemented in Jets3tNativeFileSystemStore with:
>   S3Object object = new S3Object(key);
>   ...
>   if (storageClass != null) {
>     object.setStorageClass(storageClass);
>   }
> It would take a more complex form in s3a, e.g. setting:
>   InitiateMultipartUploadRequest initiateMPURequest =
>       new InitiateMultipartUploadRequest(bucket, key, om);
>   if (storageClass != null) {
>     initiateMPURequest = initiateMPURequest.withStorageClass(storageClass);
>   }
> and similar statements in various places.


