[jira] [Assigned] (HADOOP-19148) Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298

2024-09-14 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-19148:
-

Assignee: (was: Viraj Jasani)

> Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298
> ---
>
> Key: HADOOP-19148
> URL: https://issues.apache.org/jira/browse/HADOOP-19148
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Brahma Reddy Battula
>Priority: Major
>  Labels: pull-request-available
>
> Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19072) S3A: expand optimisations on stores with "fs.s3a.performance.flags" for mkdir

2024-08-23 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HADOOP-19072.
---
Hadoop Flags: Reviewed
  Resolution: Fixed

> S3A: expand optimisations on stores with "fs.s3a.performance.flags" for mkdir
> -
>
> Key: HADOOP-19072
> URL: https://issues.apache.org/jira/browse/HADOOP-19072
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> on an s3a store with fs.s3a.create.performance set, speed up other operations
> *  mkdir to skip parent directory check: just do a HEAD to see if there's a 
> file at the target location
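
As an illustration of the behaviour described above, a minimal sketch of enabling the flag from a client (the bucket name is hypothetical, and the exact set of values accepted by "fs.s3a.performance.flags" is an assumption based on this issue):
{code:java}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MkdirPerformanceFlagExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Hypothetical value: enable the optimised mkdir path alongside the
    // existing create optimisation (exact flag names are an assumption).
    conf.set("fs.s3a.performance.flags", "create,mkdir");
    try (FileSystem fs = FileSystem.get(new URI("s3a://example-bucket/"), conf)) {
      // With the mkdir flag set, the parent directory walk is skipped and only
      // a probe for a file at the target path is performed.
      fs.mkdirs(new Path("s3a://example-bucket/data/landing/2024-08-23"));
    }
  }
}
{code}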



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19256) Support S3 Conditional Writes

2024-08-23 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17876367#comment-17876367
 ] 

Viraj Jasani commented on HADOOP-19256:
---

Ahmar, regarding the new SDK, IIUC only PutObjectRequest and 
CompleteMultipartUploadRequest need the new input param "ifNoneMatch()"; 
GetObjectRequest, HeadObjectRequest and CopyObjectRequest already have the 
required inputs in the SDK, right?
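
For illustration, a minimal sketch of the put-if-absent call with the v2 SDK (bucket and key are hypothetical; assumes an SDK release that exposes ifNoneMatch() on PutObjectRequest.Builder):
{code:java}
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

public class PutIfAbsentExample {
  public static void main(String[] args) {
    try (S3Client s3 = S3Client.create()) {
      // "If-None-Match: *" asks S3 to reject the PUT if the key already exists.
      // Bucket and key are hypothetical.
      PutObjectRequest request = PutObjectRequest.builder()
          .bucket("example-bucket")
          .key("data/part-0000")
          .ifNoneMatch("*")
          .build();
      s3.putObject(request, RequestBody.fromString("payload"));
    }
  }
}
{code}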

> Support S3 Conditional Writes
> -
>
> Key: HADOOP-19256
> URL: https://issues.apache.org/jira/browse/HADOOP-19256
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Ahmar Suhail
>Priority: Major
>
> S3 Conditional Write (Put-if-absent) capability is now generally available - 
> [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
>  
> S3A should allow passing in this put-if-absent header to prevent over writing 
> of files. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-19256) Support S3 Conditional Writes

2024-08-22 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17876056#comment-17876056
 ] 

Viraj Jasani edited comment on HADOOP-19256 at 8/22/24 7:12 PM:


{quote}we have *exactly* this for openFile() and createFile()
{quote}
Ah, you mean the openFileWithOptions() category of APIs, right? I missed this and 
never got to explore this API.
{quote}is a new SDK needed here?
{quote}
I skimmed through the docs yesterday and they do not seem to mention anything 
about a new SDK. Also, the options to provide these params in the header are 
already available in the SDK we use, e.g. we use ifMatch() for the default 
ChangeDetectionPolicy.


was (Author: vjasani):
{quote}we have *exactly* this for openFile() and createFile()
{quote}
Ah, you mean the openFileWithOptions() category of APIs, right? I missed this and 
never got to explore this API.
{quote}is a new SDK needed here?
{quote}
I skimmed through the docs yesterday and they do not seem to mention anything 
about a new SDK. Also, the options to provide these params in the header are 
already available in the SDK we use.

> Support S3 Conditional Writes
> -
>
> Key: HADOOP-19256
> URL: https://issues.apache.org/jira/browse/HADOOP-19256
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Ahmar Suhail
>Priority: Major
>
> S3 Conditional Write (Put-if-absent) capability is now generally available - 
> [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
>  
> S3A should allow passing in this put-if-absent header to prevent over writing 
> of files. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19256) Support S3 Conditional Writes

2024-08-22 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17876076#comment-17876076
 ] 

Viraj Jasani commented on HADOOP-19256:
---

{quote}we already use conditional headers in read operations, using version or 
etag of a file to ensure that every GET in an input stream either always picks 
up the same file version (versioned option) or just etag validation (default)
{quote}
Steve, I agree that we have the changeDetectionPolicy 
(fs.s3a.change.detection.source), but it is still a generic config and the config 
name does not say that it is used by getObject only, correct? I was thinking 
about having an API-level header as an s3afs config, but now I think whatever 
header value we want to use for conditional writes, we want to use it for all 
APIs: getObject, headObject and copyObject, rather than only for select ones.

Even in that case, does it still make sense to have a config that can take 
multiple key-value pairs like I mentioned above? Because, as per the docs, 
multiple headers can also be provided, e.g. If-Match and If-Unmodified-Since.
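
To make the multi-header case concrete, a hedged sketch of a conditional read with the v2 SDK (bucket, key, etag and timestamp are placeholders; the builder methods shown are the ones already available in the SDK we use):
{code:java}
import java.time.Instant;

import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;

public class ConditionalGetExample {
  public static void main(String[] args) {
    try (S3Client s3 = S3Client.create()) {
      // Etag and timestamp are placeholders; If-Match and If-Unmodified-Since
      // can be combined on the same GET request.
      GetObjectRequest request = GetObjectRequest.builder()
          .bucket("example-bucket")
          .key("data/part-0000")
          .ifMatch("\"9b2cf535f27731c974343645a3985328\"")
          .ifUnmodifiedSince(Instant.parse("2024-02-03T10:15:30.00Z"))
          .build();
      s3.getObjectAsBytes(request);
    }
  }
}
{code}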

> Support S3 Conditional Writes
> -
>
> Key: HADOOP-19256
> URL: https://issues.apache.org/jira/browse/HADOOP-19256
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Ahmar Suhail
>Priority: Major
>
> S3 Conditional Write (Put-if-absent) capability is now generally available - 
> [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
>  
> S3A should allow passing in this put-if-absent header to prevent over writing 
> of files. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19256) Support S3 Conditional Writes

2024-08-22 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17876056#comment-17876056
 ] 

Viraj Jasani commented on HADOOP-19256:
---

{quote}we have *exactly* this for openFile() and createFile()
{quote}
Ah, you mean the openFileWithOptions() category of APIs, right? I missed this and 
never got to explore this API.
{quote}is a new SDK needed here?
{quote}
I skimmed through the docs yesterday and they do not seem to mention anything 
about a new SDK. Also, the options to provide these params in the header are 
already available in the SDK we use.

> Support S3 Conditional Writes
> -
>
> Key: HADOOP-19256
> URL: https://issues.apache.org/jira/browse/HADOOP-19256
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Ahmar Suhail
>Priority: Major
>
> S3 Conditional Write (Put-if-absent) capability is now generally available - 
> [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
>  
> S3A should allow passing in this put-if-absent header to prevent over writing 
> of files. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-19256) Support S3 Conditional Writes

2024-08-21 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17875670#comment-17875670
 ] 

Viraj Jasani edited comment on HADOOP-19256 at 8/21/24 10:15 PM:
-

Does this mean we need S3A configs for each of the getObject, headObject and 
copyObject headers? Probably we can introduce something similar to 
"fs.s3a.aws.credentials.provider.mapping"?

 

e.g.
{code:java}
<property>
  <name>fs.s3a.getobject.headers</name>
  <value>
    If-Match=,
    If-Modified-Since=2024-02-03T10:15:30.00Z,
    If-None-Match=,
    If-Unmodified-Since=2024-02-03T10:15:30.00Z
  </value>
</property>
{code}
Both "If-Modified-Since" and "If-Unmodified-Since" can be made relative values 
too, from s3a viewpoint.


was (Author: vjasani):
Does this mean we need S3A configs for each of the getObject, headObject and 
copyObject headers? Probably we can introduce something similar to 
"fs.s3a.aws.credentials.provider.mapping"?

 

e.g.
{code:java}
<property>
  <name>fs.s3a.getobject.headers</name>
  <value>
    If-Match=,
    If-Modified-Since=2024-02-03T10:15:30.00Z,
    If-None-Match=,
    If-Unmodified-Since=2024-02-03T10:15:30.00Z
  </value>
</property>
{code}

> Support S3 Conditional Writes
> -
>
> Key: HADOOP-19256
> URL: https://issues.apache.org/jira/browse/HADOOP-19256
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Ahmar Suhail
>Priority: Major
>
> S3 Conditional Write (Put-if-absent) capability is now generally available - 
> [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
>  
> S3A should allow passing in this put-if-absent header to prevent over writing 
> of files. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19256) Support S3 Conditional Writes

2024-08-21 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17875673#comment-17875673
 ] 

Viraj Jasani commented on HADOOP-19256:
---

FileSystem APIs do not have a "Map" type input for file operation 
metadata, otherwise S3A could leverage it.

On the other hand, having a config means it will be applicable to all file 
operations performed on the given s3afs instance.

 

> Support S3 Conditional Writes
> -
>
> Key: HADOOP-19256
> URL: https://issues.apache.org/jira/browse/HADOOP-19256
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Ahmar Suhail
>Priority: Major
>
> S3 Conditional Write (Put-if-absent) capability is now generally available - 
> [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
>  
> S3A should allow passing in this put-if-absent header to prevent over writing 
> of files. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19256) Support S3 Conditional Writes

2024-08-21 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17875670#comment-17875670
 ] 

Viraj Jasani commented on HADOOP-19256:
---

Does this mean we need S3A configs for each of the getObject, headObject and 
copyObject headers? Probably we can introduce something similar to 
"fs.s3a.aws.credentials.provider.mapping"?

 

e.g.
{code:java}
<property>
  <name>fs.s3a.getobject.headers</name>
  <value>
    If-Match=,
    If-Modified-Since=2024-02-03T10:15:30.00Z,
    If-None-Match=,
    If-Unmodified-Since=2024-02-03T10:15:30.00Z
  </value>
</property>
{code}

> Support S3 Conditional Writes
> -
>
> Key: HADOOP-19256
> URL: https://issues.apache.org/jira/browse/HADOOP-19256
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Ahmar Suhail
>Priority: Major
>
> S3 Conditional Write (Put-if-absent) capability is now generally available - 
> [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
>  
> S3A should allow passing in this put-if-absent header to prevent over writing 
> of files. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19072) S3A: expand optimisations on stores with "fs.s3a.performance.flags" for mkdir

2024-07-30 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-19072:
--
Summary: S3A: expand optimisations on stores with 
"fs.s3a.performance.flags" for mkdir  (was: S3A: expand optimisations on stores 
with "fs.s3a.create.performance")

> S3A: expand optimisations on stores with "fs.s3a.performance.flags" for mkdir
> -
>
> Key: HADOOP-19072
> URL: https://issues.apache.org/jira/browse/HADOOP-19072
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> on an s3a store with fs.s3a.create.performance set, speed up other operations
> *  mkdir to skip parent directory check: just do a HEAD to see if there's a 
> file at the target location



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object

2024-07-19 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867426#comment-17867426
 ] 

Viraj Jasani commented on HADOOP-19218:
---

Please review [https://github.com/apache/hadoop/pull/6951]

> Avoid DNS lookup while creating IPC Connection object
> -
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> Been running HADOOP-18628 in production for quite sometime, everything works 
> fine as long as DNS servers in HA are available. Upgrading single NS server 
> at a time is also a common case, not problematic. Every DNS lookup takes 1ms 
> in general.
> However, recently we encountered a case where 2 out of 4 NS servers went down 
> (temporarily but it's a rare case). With small duration DNS cache and 2s of 
> NS fallback timeout configured in resolv.conf, now any client performing DNS 
> lookup can encounter 4s+ delay. This caused namenode outage as listener 
> thread is single threaded and it was not able to keep up with large num of 
> unique clients (in direct proportion with num of DNS resolutions every few 
> seconds) initiating connection on listener port.
> While having 2 out of 4 DNS servers offline is rare case and NS fallback 
> settings could also be improved, it is important to note that we don't need 
> to perform DNS resolution for every new connection if the intention is to 
> improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the 
> error for incompatible header or version mismatch. This would also help with 
> ~1ms extra time spent even for healthy DNS lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object

2024-07-19 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867232#comment-17867232
 ] 

Viraj Jasani edited comment on HADOOP-19218 at 7/19/24 8:10 AM:


Yeah, I am also a bit confused; I can create an addendum PR based on your decision 
[~hexiaoqiao] [~ayushtkn] 
{quote}Is this issue only in 3.4.0 and trunk?
{quote}
That is correct, because the test and the improvement to log the longest lock 
holder (HDFS-15217) are available since 3.4.0 only, whereas HADOOP-18628 is 
present since 3.3.6/3.4.0.


was (Author: vjasani):
Yeah, I am also a bit confused; I can create an addendum PR based on your decision 
[~hexiaoqiao] [~ayushtkn] 
{quote}Is this issue only in 3.4.0 and trunk?
{quote}
That is correct, because the test and the improvement to log the longest lock 
holder (HDFS-15217) are available since 3.4.0 only.

> Avoid DNS lookup while creating IPC Connection object
> -
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> Been running HADOOP-18628 in production for quite sometime, everything works 
> fine as long as DNS servers in HA are available. Upgrading single NS server 
> at a time is also a common case, not problematic. Every DNS lookup takes 1ms 
> in general.
> However, recently we encountered a case where 2 out of 4 NS servers went down 
> (temporarily but it's a rare case). With small duration DNS cache and 2s of 
> NS fallback timeout configured in resolv.conf, now any client performing DNS 
> lookup can encounter 4s+ delay. This caused namenode outage as listener 
> thread is single threaded and it was not able to keep up with large num of 
> unique clients (in direct proportion with num of DNS resolutions every few 
> seconds) initiating connection on listener port.
> While having 2 out of 4 DNS servers offline is rare case and NS fallback 
> settings could also be improved, it is important to note that we don't need 
> to perform DNS resolution for every new connection if the intention is to 
> improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the 
> error for incompatible header or version mismatch. This would also help with 
> ~1ms extra time spent even for healthy DNS lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object

2024-07-19 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867232#comment-17867232
 ] 

Viraj Jasani edited comment on HADOOP-19218 at 7/19/24 8:08 AM:


Yeah, I am also a bit confused; I can create an addendum PR based on your decision 
[~hexiaoqiao] [~ayushtkn] 
{quote}Is this issue only in 3.4.0 and trunk?
{quote}
That is correct, because the test and the improvement to log the longest lock 
holder (HDFS-15217) are available since 3.4.0 only.


was (Author: vjasani):
Yeah, I am also a bit confused; I can create an addendum PR based on your decision 
[~hexiaoqiao] [~ayushtkn] 

> Avoid DNS lookup while creating IPC Connection object
> -
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> Been running HADOOP-18628 in production for quite sometime, everything works 
> fine as long as DNS servers in HA are available. Upgrading single NS server 
> at a time is also a common case, not problematic. Every DNS lookup takes 1ms 
> in general.
> However, recently we encountered a case where 2 out of 4 NS servers went down 
> (temporarily but it's a rare case). With small duration DNS cache and 2s of 
> NS fallback timeout configured in resolv.conf, now any client performing DNS 
> lookup can encounter 4s+ delay. This caused namenode outage as listener 
> thread is single threaded and it was not able to keep up with large num of 
> unique clients (in direct proportion with num of DNS resolutions every few 
> seconds) initiating connection on listener port.
> While having 2 out of 4 DNS servers offline is rare case and NS fallback 
> settings could also be improved, it is important to note that we don't need 
> to perform DNS resolution for every new connection if the intention is to 
> improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the 
> error for incompatible header or version mismatch. This would also help with 
> ~1ms extra time spent even for healthy DNS lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object

2024-07-19 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867232#comment-17867232
 ] 

Viraj Jasani commented on HADOOP-19218:
---

Yeah, I am also a bit confused; I can create an addendum PR based on your decision 
[~hexiaoqiao] [~ayushtkn] 

> Avoid DNS lookup while creating IPC Connection object
> -
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> Been running HADOOP-18628 in production for quite sometime, everything works 
> fine as long as DNS servers in HA are available. Upgrading single NS server 
> at a time is also a common case, not problematic. Every DNS lookup takes 1ms 
> in general.
> However, recently we encountered a case where 2 out of 4 NS servers went down 
> (temporarily but it's a rare case). With small duration DNS cache and 2s of 
> NS fallback timeout configured in resolv.conf, now any client performing DNS 
> lookup can encounter 4s+ delay. This caused namenode outage as listener 
> thread is single threaded and it was not able to keep up with large num of 
> unique clients (in direct proportion with num of DNS resolutions every few 
> seconds) initiating connection on listener port.
> While having 2 out of 4 DNS servers offline is rare case and NS fallback 
> settings could also be improved, it is important to note that we don't need 
> to perform DNS resolution for every new connection if the intention is to 
> improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the 
> error for incompatible header or version mismatch. This would also help with 
> ~1ms extra time spent even for healthy DNS lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object

2024-07-19 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867231#comment-17867231
 ] 

Viraj Jasani commented on HADOOP-19218:
---

Anyway, if we want to keep the (host + ip) format (available since 3.4.0) for the 
longest lock holder (HDFS-15217), we can still make it happen with a simple patch:
{code:java}
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
index 2cb29dfef8e..4a308bce9cc 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
@@ -8838,6 +8838,9 @@ private Supplier<String> getLockReportInfoSupplier(String src, String dst,
       UserGroupInformation ugi = Server.getRemoteUser();
       String userName = ugi != null ? ugi.toString() : null;
       InetAddress addr = Server.getRemoteIp();
+      if (addr != null) {
+        addr.getHostName();
+      }
       StringBuilder sb = new StringBuilder();
       String s = escapeJava(src);
       String d = escapeJava(dst); {code}
Otherwise, if we decide to follow the same format (ip only) for all types of audit 
logs including the longest lock holder (HDFS-15217), then we will need to update 
the test.

Though, given that we already rolled out 3.4.0 with HDFS-15217, we can go with the 
above simple fix. Having the host name is always useful for k8s environments; it's 
just that we can optimize by not performing the DNS lookup while creating the IPC 
Connection object, which was the main purpose of this Jira.
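
As a rough illustration of that optimisation (a sketch of the idea only, not the actual IPC Server change):
{code:java}
import java.net.InetAddress;

/**
 * A sketch only: hold on to the InetAddress and resolve the hostname lazily, so
 * the reverse DNS lookup happens only when the server actually needs the name,
 * e.g. to report a version-mismatch error, instead of on every new connection.
 */
class LazyHostName {
  private final InetAddress addr;
  private String hostName;

  LazyHostName(InetAddress addr) {
    this.addr = addr;
  }

  synchronized String get() {
    if (hostName == null) {
      // Triggers the reverse DNS lookup on first use only.
      hostName = addr.getHostName();
    }
    return hostName;
  }
}
{code}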

> Avoid DNS lookup while creating IPC Connection object
> -
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> Been running HADOOP-18628 in production for quite sometime, everything works 
> fine as long as DNS servers in HA are available. Upgrading single NS server 
> at a time is also a common case, not problematic. Every DNS lookup takes 1ms 
> in general.
> However, recently we encountered a case where 2 out of 4 NS servers went down 
> (temporarily but it's a rare case). With small duration DNS cache and 2s of 
> NS fallback timeout configured in resolv.conf, now any client performing DNS 
> lookup can encounter 4s+ delay. This caused namenode outage as listener 
> thread is single threaded and it was not able to keep up with large num of 
> unique clients (in direct proportion with num of DNS resolutions every few 
> seconds) initiating connection on listener port.
> While having 2 out of 4 DNS servers offline is rare case and NS fallback 
> settings could also be improved, it is important to note that we don't need 
> to perform DNS resolution for every new connection if the intention is to 
> improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the 
> error for incompatible header or version mismatch. This would also help with 
> ~1ms extra time spent even for healthy DNS lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object

2024-07-19 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867222#comment-17867222
 ] 

Viraj Jasani commented on HADOOP-19218:
---

For TestFSNamesystemLockReport, yes, it is broken by this patch. The 
TestFSNamesystemLockReport audit log comparison in the test was updated with 
[https://github.com/apache/hadoop/pull/5407]

The pattern "[a-zA-Z0-9.]+" was added to the test as per the feedback (and the fact 
that HDFS-15217 was meant for 3.4.0 only, which was not released at that time).

As far as the FSNamesystem audit log is concerned, no compatibility is broken. As 
far as HDFS-15217 is concerned, 3.4.0 is released with hostname + ip address 
(whereas the FSNamesystem audit log always had the ip address only). If we were to 
align with the FSNamesystem-style audit log, we could say that HDFS-15217 should 
also follow the same pattern, but the only concern is that 3.4.0 is already released.

> Avoid DNS lookup while creating IPC Connection object
> -
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> Been running HADOOP-18628 in production for quite sometime, everything works 
> fine as long as DNS servers in HA are available. Upgrading single NS server 
> at a time is also a common case, not problematic. Every DNS lookup takes 1ms 
> in general.
> However, recently we encountered a case where 2 out of 4 NS servers went down 
> (temporarily but it's a rare case). With small duration DNS cache and 2s of 
> NS fallback timeout configured in resolv.conf, now any client performing DNS 
> lookup can encounter 4s+ delay. This caused namenode outage as listener 
> thread is single threaded and it was not able to keep up with large num of 
> unique clients (in direct proportion with num of DNS resolutions every few 
> seconds) initiating connection on listener port.
> While having 2 out of 4 DNS servers offline is rare case and NS fallback 
> settings could also be improved, it is important to note that we don't need 
> to perform DNS resolution for every new connection if the intention is to 
> improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the 
> error for incompatible header or version mismatch. This would also help with 
> ~1ms extra time spent even for healthy DNS lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object

2024-07-18 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867077#comment-17867077
 ] 

Viraj Jasani commented on HADOOP-19218:
---

Thank you once again!

> Avoid DNS lookup while creating IPC Connection object
> -
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> Been running HADOOP-18628 in production for quite sometime, everything works 
> fine as long as DNS servers in HA are available. Upgrading single NS server 
> at a time is also a common case, not problematic. Every DNS lookup takes 1ms 
> in general.
> However, recently we encountered a case where 2 out of 4 NS servers went down 
> (temporarily but it's a rare case). With small duration DNS cache and 2s of 
> NS fallback timeout configured in resolv.conf, now any client performing DNS 
> lookup can encounter 4s+ delay. This caused namenode outage as listener 
> thread is single threaded and it was not able to keep up with large num of 
> unique clients (in direct proportion with num of DNS resolutions every few 
> seconds) initiating connection on listener port.
> While having 2 out of 4 DNS servers offline is rare case and NS fallback 
> settings could also be improved, it is important to note that we don't need 
> to perform DNS resolution for every new connection if the intention is to 
> improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the 
> error for incompatible header or version mismatch. This would also help with 
> ~1ms extra time spent even for healthy DNS lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object

2024-07-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866882#comment-17866882
 ] 

Viraj Jasani edited comment on HADOOP-19218 at 7/17/24 11:50 PM:
-

Thanks for the reviews [~shahrs87] [~dmanning] [~hexiaoqiao] and thanks for 
merging the PR [~hexiaoqiao]! Could you please also help backport the commit to 
3.4 and 3.3 branches? This will be a clean backport. Or do you want me to create 
PRs for both 3.4 and 3.3 branches?


was (Author: vjasani):
Thanks for the reviews [~shahrs87] [~dmanning] [~hexiaoqiao] and thanks for 
merging the PR [~hexiaoqiao]! Could you please also help backport the commit to 
3.4 and 3.3 branches? This will be a clean backport.

> Avoid DNS lookup while creating IPC Connection object
> -
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Been running HADOOP-18628 in production for quite sometime, everything works 
> fine as long as DNS servers in HA are available. Upgrading single NS server 
> at a time is also a common case, not problematic. Every DNS lookup takes 1ms 
> in general.
> However, recently we encountered a case where 2 out of 4 NS servers went down 
> (temporarily but it's a rare case). With small duration DNS cache and 2s of 
> NS fallback timeout configured in resolv.conf, now any client performing DNS 
> lookup can encounter 4s+ delay. This caused namenode outage as listener 
> thread is single threaded and it was not able to keep up with large num of 
> unique clients (in direct proportion with num of DNS resolutions every few 
> seconds) initiating connection on listener port.
> While having 2 out of 4 DNS servers offline is rare case and NS fallback 
> settings could also be improved, it is important to note that we don't need 
> to perform DNS resolution for every new connection if the intention is to 
> improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the 
> error for incompatible header or version mismatch. This would also help with 
> ~1ms extra time spent even for healthy DNS lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object

2024-07-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866882#comment-17866882
 ] 

Viraj Jasani commented on HADOOP-19218:
---

Thanks for the reviews [~shahrs87] [~dmanning] [~hexiaoqiao] and thanks for 
merging the PR [~hexiaoqiao]! Could you please also help backport the commit to 
3.4 and 3.3 branches? This will be a clean backport.

> Avoid DNS lookup while creating IPC Connection object
> -
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Been running HADOOP-18628 in production for quite sometime, everything works 
> fine as long as DNS servers in HA are available. Upgrading single NS server 
> at a time is also a common case, not problematic. Every DNS lookup takes 1ms 
> in general.
> However, recently we encountered a case where 2 out of 4 NS servers went down 
> (temporarily but it's a rare case). With small duration DNS cache and 2s of 
> NS fallback timeout configured in resolv.conf, now any client performing DNS 
> lookup can encounter 4s+ delay. This caused namenode outage as listener 
> thread is single threaded and it was not able to keep up with large num of 
> unique clients (in direct proportion with num of DNS resolutions every few 
> seconds) initiating connection on listener port.
> While having 2 out of 4 DNS servers offline is rare case and NS fallback 
> settings could also be improved, it is important to note that we don't need 
> to perform DNS resolution for every new connection if the intention is to 
> improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the 
> error for incompatible header or version mismatch. This would also help with 
> ~1ms extra time spent even for healthy DNS lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object

2024-07-02 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-19218:
--
Component/s: ipc

> Avoid DNS lookup while creating IPC Connection object
> -
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> Been running HADOOP-18628 in production for quite sometime, everything works 
> fine as long as DNS servers in HA are available. Upgrading single NS server 
> at a time is also a common case, not problematic. Every DNS lookup takes 1ms 
> in general.
> However, recently we encountered a case where 2 out of 4 NS servers went down 
> (temporarily but it's a rare case). With small duration DNS cache and 2s of 
> NS fallback timeout configured in resolv.conf, now any client performing DNS 
> lookup can encounter 4s+ delay. This caused namenode outage as listener 
> thread is single threaded and it was not able to keep up with large num of 
> unique clients (in direct proportion with num of DNS resolutions every few 
> seconds) initiating connection on listener port.
> While having 2 out of 4 DNS servers offline is rare case and NS fallback 
> settings could also be improved, it is important to note that we don't need 
> to perform DNS resolution for every new connection if the intention is to 
> improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the 
> error for incompatible header or version mismatch. This would also help with 
> ~1ms extra time spent even for healthy DNS lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object

2024-07-02 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17862591#comment-17862591
 ] 

Viraj Jasani commented on HADOOP-19218:
---

FYI [~UselessCoder] [~dmanning] [~shahrs87] [~apurtell]

> Avoid DNS lookup while creating IPC Connection object
> -
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> Been running HADOOP-18628 in production for quite sometime, everything works 
> fine as long as DNS servers in HA are available. Upgrading single NS server 
> at a time is also a common case, not problematic. Every DNS lookup takes 1ms 
> in general.
> However, recently we encountered a case where 2 out of 4 NS servers went down 
> (temporarily but it's a rare case). With small duration DNS cache and 2s of 
> NS fallback timeout configured in resolv.conf, now any client performing DNS 
> lookup can encounter 4s+ delay. This caused namenode outage as listener 
> thread is single threaded and it was not able to keep up with large num of 
> unique clients (in direct proportion with num of DNS resolutions every few 
> seconds) initiating connection on listener port.
> While having 2 out of 4 DNS servers offline is rare case and NS fallback 
> settings could also be improved, it is important to note that we don't need 
> to perform DNS resolution for every new connection if the intention is to 
> improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the 
> error for incompatible header or version mismatch. This would also help with 
> ~1ms extra time spent even for healthy DNS lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object

2024-07-02 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17862579#comment-17862579
 ] 

Viraj Jasani commented on HADOOP-19218:
---

Thread dump ref:
{code:java}
"IPC Server listener on 8020" #92 daemon prio=5 os_prio=0 
tid=0x7f23a9592800 nid=0x81744 runnable [0x7f23ad38a000]
   java.lang.Thread.State: RUNNABLE
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:867)
at 
java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1302)
at java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:815)
- locked <0x7f2bc29c6a10> (a 
java.net.InetAddress$NameServiceAddresses)
at java.net.InetAddress.getAllByName0(InetAddress.java:1291)
at java.net.InetAddress.getAllByName0(InetAddress.java:1211)
at java.net.InetAddress.getHostFromNameService(InetAddress.java:637)
at java.net.InetAddress.getHostName(InetAddress.java:562)
at java.net.InetAddress.getHostName(InetAddress.java:534)
at org.apache.hadoop.ipc.Server$Connection.(Server.java:1916)
at 
org.apache.hadoop.ipc.Server$ConnectionManager.register(Server.java:3841)
at org.apache.hadoop.ipc.Server$Listener.doAccept(Server.java:1448)
at org.apache.hadoop.ipc.Server$Listener.run(Server.java:1389) {code}

> Avoid DNS lookup while creating IPC Connection object
> -
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> Been running HADOOP-18628 in production for quite sometime, everything works 
> fine as long as DNS servers in HA are available. Upgrading single NS server 
> at a time is also a common case, not problematic. Every DNS lookup takes 1ms 
> in general.
> However, recently we encountered a case where 2 out of 4 NS servers went down 
> (temporarily but it's a rare case). With small duration DNS cache and 2s of 
> NS fallback timeout configured in resolv.conf, now any client performing DNS 
> lookup can encounter 4s+ delay. This caused namenode outage as listener 
> thread is single threaded and it was not able to keep up with large num of 
> unique clients (in direct proportion with num of DNS resolutions every few 
> seconds) initiating connection on listener port.
> While having 2 out of 4 DNS servers offline is rare case and NS fallback 
> settings could also be improved, it is important to note that we don't need 
> to perform DNS resolution for every new connection if the intention is to 
> improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the 
> error for incompatible header or version mismatch. This would also help with 
> ~1ms extra time spent even for healthy DNS lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object

2024-07-02 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-19218:
--
Description: 
Been running HADOOP-18628 in production for quite some time; everything works 
fine as long as DNS servers in HA are available. Upgrading a single NS server at 
a time is also a common case, not problematic. Every DNS lookup takes 1ms in 
general.

However, recently we encountered a case where 2 out of 4 NS servers went down 
(temporarily but it's a rare case). With small duration DNS cache and 2s of NS 
fallback timeout configured in resolv.conf, now any client performing DNS 
lookup can encounter 4s+ delay. This caused namenode outage as listener thread 
is single threaded and it was not able to keep up with large num of unique 
clients (in direct proportion with num of DNS resolutions every few seconds) 
initiating connection on listener port.

While having 2 out of 4 DNS servers offline is a rare case and NS fallback 
settings could also be improved, it is important to note that we don't need to 
perform DNS resolution for every new connection if the intention is to improve 
the insights into VersionMismatch errors thrown by the server.

The proposal is to delay the DNS resolution until the server throws the error 
for an incompatible header or version mismatch. This would also help with the 
~1ms of extra time spent even on a healthy DNS lookup.

  was:
Been running HADOOP-18628 in production for quite some time; everything works 
fine as long as DNS servers in HA are available. Upgrading a single NS server at 
a time is also a common case, not problematic.

However, recently we encountered a case where 2 out of 4 NS servers went down 
(temporarily but it's a rare case). With small duration DNS cache and 2s of NS 
fallback timeout configured in resolv.conf, now any client performing DNS 
lookup can encounter 4s+ delay. This caused namenode outage as listener thread 
is single threaded and it was not able to keep up with large num of unique 
clients (in direct proportion with num of DNS resolutions every few seconds) 
initiating connection on listener port.

While having 2 out of 4 DNS servers offline is a rare case and NS fallback 
settings could also be improved, it is important to note that we don't need to 
perform DNS resolution for every new connection if the intention is to improve 
the insights into VersionMismatch errors thrown by the server.

The proposal is to delay the DNS resolution until the server throws the error 
for an incompatible header or version mismatch.


> Avoid DNS lookup while creating IPC Connection object
> -
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> Been running HADOOP-18628 in production for quite sometime, everything works 
> fine as long as DNS servers in HA are available. Upgrading single NS server 
> at a time is also a common case, not problematic. Every DNS lookup takes 1ms 
> in general.
> However, recently we encountered a case where 2 out of 4 NS servers went down 
> (temporarily but it's a rare case). With small duration DNS cache and 2s of 
> NS fallback timeout configured in resolv.conf, now any client performing DNS 
> lookup can encounter 4s+ delay. This caused namenode outage as listener 
> thread is single threaded and it was not able to keep up with large num of 
> unique clients (in direct proportion with num of DNS resolutions every few 
> seconds) initiating connection on listener port.
> While having 2 out of 4 DNS servers offline is rare case and NS fallback 
> settings could also be improved, it is important to note that we don't need 
> to perform DNS resolution for every new connection if the intention is to 
> improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the 
> error for incompatible header or version mismatch. This would also help with 
> ~1ms extra time spent even for healthy DNS lookup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object

2024-07-02 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-19218:
-

Assignee: Viraj Jasani

> Avoid DNS lookup while creating IPC Connection object
> -
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> Been running HADOOP-18628 in production for quite sometime, everything works 
> fine as long as DNS servers in HA are available. Upgrading single NS server 
> at a time is also a common case, not problematic.
> However, recently we encountered a case where 2 out of 4 NS servers went down 
> (temporarily but it's a rare case). With small duration DNS cache and 2s of 
> NS fallback timeout configured in resolv.conf, now any client performing DNS 
> lookup can encounter 4s+ delay. This caused namenode outage as listener 
> thread is single threaded and it was not able to keep up with large num of 
> unique clients (in direct proportion with num of DNS resolutions every few 
> seconds) initiating connection on listener port.
> While having 2 out of 4 DNS servers offline is rare case and NS fallback 
> settings could also be improved, it is important to note that we don't need 
> to perform DNS resolution for every new connection if the intention is to 
> improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the 
> error for incompatible header or version mismatch.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object

2024-07-02 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-19218:
-

 Summary: Avoid DNS lookup while creating IPC Connection object
 Key: HADOOP-19218
 URL: https://issues.apache.org/jira/browse/HADOOP-19218
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Viraj Jasani


Been running HADOOP-18628 in production for quite some time; everything works 
fine as long as DNS servers in HA are available. Upgrading a single NS server at 
a time is also a common case, not problematic.

However, recently we encountered a case where 2 out of 4 NS servers went down 
(temporarily but it's a rare case). With small duration DNS cache and 2s of NS 
fallback timeout configured in resolv.conf, now any client performing DNS 
lookup can encounter 4s+ delay. This caused namenode outage as listener thread 
is single threaded and it was not able to keep up with large num of unique 
clients (in direct proportion with num of DNS resolutions every few seconds) 
initiating connection on listener port.

While having 2 out of 4 DNS servers offline is a rare case and NS fallback 
settings could also be improved, it is important to note that we don't need to 
perform DNS resolution for every new connection if the intention is to improve 
the insights into VersionMismatch errors thrown by the server.

The proposal is to delay the DNS resolution until the server throws the error 
for an incompatible header or version mismatch.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19197) S3A: Support AWS KMS Encryption Context

2024-06-07 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853327#comment-17853327
 ] 

Viraj Jasani commented on HADOOP-19197:
---

Amazing, will take a look. Thanks for working on this!

> S3A: Support AWS KMS Encryption Context
> ---
>
> Key: HADOOP-19197
> URL: https://issues.apache.org/jira/browse/HADOOP-19197
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Raphael Azzolini
>Priority: Major
>  Labels: pull-request-available
>
> S3A properties allow users to choose the AWS KMS key 
> ({_}fs.s3a.encryption.key{_}) and S3 encryption algorithm to be used 
> ({_}fs.s3a.encryption.algorithm{_}). In addition to the AWS KMS Key, an 
> encryption context can be used as non-secret data that adds additional 
> integrity and authenticity to check the encrypted data. However, there is no 
> option to specify the [AWS KMS Encryption 
> Context|https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#encrypt_context]
>  in S3A.
> In AWS SDK v2 the encryption context in S3 requests is set by the parameter 
> [ssekmsEncryptionContext.|https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/model/CreateMultipartUploadRequest.Builder.html#ssekmsEncryptionContext(java.lang.String)]
>  It receives a base64-encoded UTF-8 string holding JSON with the encryption 
> context key-value pairs. The value of this parameter could be set by the user 
> in a new property {_}*fs.s3a.encryption.context*{_}, and be stored in the 
> [EncryptionSecrets|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/auth/delegation/EncryptionSecrets.java]
>  to later be used when setting the encryption parameters in 
> [RequestFactoryImpl|https://github.com/apache/hadoop/blob/f92a8ab8ae54f11946412904973eb60404dee7ff/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/RequestFactoryImpl.java].
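
As an illustration of the SDK parameter referenced above, a hedged sketch of passing an encryption context on a multipart upload request (bucket, key, KMS key ARN and context values are placeholders; "fs.s3a.encryption.context" is the property proposed in this issue):
{code:java}
import java.nio.charset.StandardCharsets;
import java.util.Base64;

import software.amazon.awssdk.services.s3.model.CreateMultipartUploadRequest;
import software.amazon.awssdk.services.s3.model.ServerSideEncryption;

public class KmsEncryptionContextExample {
  public static void main(String[] args) {
    // The encryption context is plain JSON key-value pairs, base64-encoded
    // before being handed to the SDK; the values here are placeholders.
    String contextJson = "{\"project\":\"hadoop\",\"team\":\"analytics\"}";
    String encoded = Base64.getEncoder()
        .encodeToString(contextJson.getBytes(StandardCharsets.UTF_8));

    CreateMultipartUploadRequest request = CreateMultipartUploadRequest.builder()
        .bucket("example-bucket")
        .key("data/part-0000")
        .serverSideEncryption(ServerSideEncryption.AWS_KMS)
        .ssekmsKeyId("arn:aws:kms:us-east-1:111122223333:key/placeholder")
        .ssekmsEncryptionContext(encoded)
        .build();
    System.out.println(request);
  }
}
{code}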



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-19197) S3A: Support AWS KMS Encryption Context

2024-06-07 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-19197:
-

Assignee: (was: Viraj Jasani)

> S3A: Support AWS KMS Encryption Context
> ---
>
> Key: HADOOP-19197
> URL: https://issues.apache.org/jira/browse/HADOOP-19197
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Raphael Azzolini
>Priority: Major
>  Labels: pull-request-available
>
> S3A properties allow users to choose the AWS KMS key 
> ({_}fs.s3a.encryption.key{_}) and S3 encryption algorithm to be used 
> (f{_}s.s3a.encryption.algorithm{_}). In addition to the AWS KMS Key, an 
> encryption context can be used as non-secret data that adds additional 
> integrity and authenticity to check the encrypted data. However, there is no 
> option to specify the [AWS KMS Encryption 
> Context|https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#encrypt_context]
>  in S3A.
> In AWS SDK v2 the encryption context in S3 requests is set by the parameter 
> [ssekmsEncryptionContext.|https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/model/CreateMultipartUploadRequest.Builder.html#ssekmsEncryptionContext(java.lang.String)]
>  It receives a base64-encoded UTF-8 string holding JSON with the encryption 
> context key-value pairs. The value of this parameter could be set by the user 
> in a new property {_}*fs.s3a.encryption.context*{_}, and be stored in the 
> [EncryptionSecrets|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/auth/delegation/EncryptionSecrets.java]
>  to later be used when setting the encryption parameters in 
> [RequestFactoryImpl|https://github.com/apache/hadoop/blob/f92a8ab8ae54f11946412904973eb60404dee7ff/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/RequestFactoryImpl.java].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-19197) S3A: Support AWS KMS Encryption Context

2024-06-07 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-19197:
-

Assignee: Viraj Jasani

> S3A: Support AWS KMS Encryption Context
> ---
>
> Key: HADOOP-19197
> URL: https://issues.apache.org/jira/browse/HADOOP-19197
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Raphael Azzolini
>Assignee: Viraj Jasani
>Priority: Major
>
> S3A properties allow users to choose the AWS KMS key 
> ({_}fs.s3a.encryption.key{_}) and S3 encryption algorithm to be used 
> (f{_}s.s3a.encryption.algorithm{_}). In addition to the AWS KMS Key, an 
> encryption context can be used as non-secret data that adds additional 
> integrity and authenticity to check the encrypted data. However, there is no 
> option to specify the [AWS KMS Encryption 
> Context|https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#encrypt_context]
>  in S3A.
> In AWS SDK v2 the encryption context in S3 requests is set by the parameter 
> [ssekmsEncryptionContext.|https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/model/CreateMultipartUploadRequest.Builder.html#ssekmsEncryptionContext(java.lang.String)]
>  It receives a base64-encoded UTF-8 string holding JSON with the encryption 
> context key-value pairs. The value of this parameter could be set by the user 
> in a new property {_}*fs.s3a.encryption.context*{_}, and be stored in the 
> [EncryptionSecrets|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/auth/delegation/EncryptionSecrets.java]
>  to later be used when setting the encryption parameters in 
> [RequestFactoryImpl|https://github.com/apache/hadoop/blob/f92a8ab8ae54f11946412904973eb60404dee7ff/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/RequestFactoryImpl.java].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19197) S3A: Support AWS KMS Encryption Context

2024-06-07 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853304#comment-17853304
 ] 

Viraj Jasani commented on HADOOP-19197:
---

How about we allow the user to configure _fs.s3a.encryption.context_ similar to how 
we allow {_}fs.s3a.aws.credentials.provider.mapping{_}? i.e. key-value pairs of 
String values, and let S3A take care of converting the key-value pairs to a 
Base64-encoded JSON string.

Given that the context is sent in plain text anyway (it's just a Base64-encoded 
JSON string, not a secret key), we can allow the user to configure plain-text 
key-value pairs separated by "=" with {_}fs.s3a.encryption.context{_}; a minimal 
parsing sketch follows below.
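
A rough sketch of that conversion, assuming a comma-separated list of key=value 
pairs (class and method names here are illustrative, not actual S3A code):
{code:java}
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

final class EncryptionContextSupport {

  // Parse "k1=v1,k2=v2" (the proposed fs.s3a.encryption.context value) into a map.
  static Map<String, String> parse(String raw) {
    Map<String, String> ctx = new LinkedHashMap<>();
    for (String pair : raw.split(",")) {
      String[] kv = pair.trim().split("=", 2);
      if (kv.length == 2 && !kv[0].trim().isEmpty()) {
        ctx.put(kv[0].trim(), kv[1].trim());
      }
    }
    return ctx;
  }

  // Build the Base64-encoded UTF-8 JSON string the SDK expects.
  // Note: quote/backslash escaping is omitted for brevity.
  static String toBase64Json(Map<String, String> ctx) {
    StringBuilder json = new StringBuilder("{");
    boolean first = true;
    for (Map.Entry<String, String> e : ctx.entrySet()) {
      if (!first) {
        json.append(',');
      }
      json.append('"').append(e.getKey()).append("\":\"").append(e.getValue()).append('"');
      first = false;
    }
    json.append('}');
    return Base64.getEncoder()
        .encodeToString(json.toString().getBytes(StandardCharsets.UTF_8));
  }
}
{code}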

 

Sample validation error when we pass anything other than Base64-encoded JSON:
{code:java}
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: The header 
'x-amz-server-side-encryption-context' shall be Base64-encoded UTF-8 string 
holding JSON which represents a string-string map (Service: S3, Status Code: 
400, Request ID: SC3CA6BGC8B8RBRD, Extended Request ID: 
8iCVA0qZsxlPXxkDpR49Gtah5LlcgTojtoHyvSEvdY25Kqow5/SPMtXIzuIKzgra16t5e23VQIc6iNle0FhcGw==){code}

> S3A: Support AWS KMS Encryption Context
> ---
>
> Key: HADOOP-19197
> URL: https://issues.apache.org/jira/browse/HADOOP-19197
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Raphael Azzolini
>Priority: Major
>
> S3A properties allow users to choose the AWS KMS key 
> ({_}fs.s3a.encryption.key{_}) and S3 encryption algorithm to be used 
> (f{_}s.s3a.encryption.algorithm{_}). In addition to the AWS KMS Key, an 
> encryption context can be used as non-secret data that adds additional 
> integrity and authenticity to check the encrypted data. However, there is no 
> option to specify the [AWS KMS Encryption 
> Context|https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#encrypt_context]
>  in S3A.
> In AWS SDK v2 the encryption context in S3 requests is set by the parameter 
> [ssekmsEncryptionContext.|https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/model/CreateMultipartUploadRequest.Builder.html#ssekmsEncryptionContext(java.lang.String)]
>  It receives a base64-encoded UTF-8 string holding JSON with the encryption 
> context key-value pairs. The value of this parameter could be set by the user 
> in a new property {_}*fs.s3a.encryption.context*{_}, and be stored in the 
> [EncryptionSecrets|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/auth/delegation/EncryptionSecrets.java]
>  to later be used when setting the encryption parameters in 
> [RequestFactoryImpl|https://github.com/apache/hadoop/blob/f92a8ab8ae54f11946412904973eb60404dee7ff/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/RequestFactoryImpl.java].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19197) S3A: Support AWS KMS Encryption Context

2024-06-06 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853016#comment-17853016
 ] 

Viraj Jasani commented on HADOOP-19197:
---

We need to use it in 3 places: CopyObjectRequest, PutObjectRequest and 
CreateMultipartUploadRequest.
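
For reference, a rough sketch of attaching an already-computed Base64-encoded 
context string to those three builders (assuming each builder exposes 
ssekmsEncryptionContext in the SDK version in use; verify against the actual SDK):
{code:java}
import software.amazon.awssdk.services.s3.model.CopyObjectRequest;
import software.amazon.awssdk.services.s3.model.CreateMultipartUploadRequest;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

final class EncryptionContextRequests {

  static PutObjectRequest.Builder withContext(PutObjectRequest.Builder b, String base64Context) {
    // base64Context is the Base64-encoded UTF-8 JSON of the key-value pairs
    return b.ssekmsEncryptionContext(base64Context);
  }

  static CopyObjectRequest.Builder withContext(CopyObjectRequest.Builder b, String base64Context) {
    return b.ssekmsEncryptionContext(base64Context);
  }

  static CreateMultipartUploadRequest.Builder withContext(
      CreateMultipartUploadRequest.Builder b, String base64Context) {
    return b.ssekmsEncryptionContext(base64Context);
  }
}
{code}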

> S3A: Support AWS KMS Encryption Context
> ---
>
> Key: HADOOP-19197
> URL: https://issues.apache.org/jira/browse/HADOOP-19197
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Raphael Azzolini
>Priority: Major
>
> S3A properties allow users to choose the AWS KMS key 
> ({_}fs.s3a.encryption.key{_}) and S3 encryption algorithm to be used 
> (f{_}s.s3a.encryption.algorithm{_}). In addition to the AWS KMS Key, an 
> encryption context can be used as non-secret data that adds additional 
> integrity and authenticity to check the encrypted data. However, there is no 
> option to specify the [AWS KMS Encryption 
> Context|https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#encrypt_context]
>  in S3A.
> In AWS SDK v2 the encryption context in S3 requests is set by the parameter 
> [ssekmsEncryptionContext.|https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/model/CreateMultipartUploadRequest.Builder.html#ssekmsEncryptionContext(java.lang.String)]
>  It receives a base64-encoded UTF-8 string holding JSON with the encryption 
> context key-value pairs. The value of this parameter could be set by the user 
> in a new property {_}*fs.s3a.encryption.context*{_}, and be stored in the 
> [EncryptionSecrets|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/auth/delegation/EncryptionSecrets.java]
>  to later be used when setting the encryption parameters in 
> [RequestFactoryImpl|https://github.com/apache/hadoop/blob/f92a8ab8ae54f11946412904973eb60404dee7ff/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/RequestFactoryImpl.java].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19148) Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298

2024-05-15 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17846806#comment-17846806
 ] 

Viraj Jasani commented on HADOOP-19148:
---

Build is fine and the dependency tree looks good (except that the zookeeper-jute 
transitive version comes in as 3.6.2 instead of 3.8.4); let me create a PR to run 
the whole build with tests.
{code:java}
[INFO] +- org.apache.solr:solr-solrj:jar:8.11.3:compile
[INFO] |  +- com.fasterxml.woodstox:woodstox-core:jar:5.4.0:compile
[INFO] |  +- commons-io:commons-io:jar:2.14.0:compile
[INFO] |  +- commons-lang:commons-lang:jar:2.6:compile
[INFO] |  +- io.netty:netty-buffer:jar:4.1.100.Final:compile
[INFO] |  +- io.netty:netty-codec:jar:4.1.100.Final:compile
[INFO] |  +- io.netty:netty-common:jar:4.1.100.Final:compile
[INFO] |  +- io.netty:netty-handler:jar:4.1.100.Final:compile
[INFO] |  +- io.netty:netty-resolver:jar:4.1.100.Final:compile
[INFO] |  +- io.netty:netty-transport:jar:4.1.100.Final:compile
[INFO] |  +- io.netty:netty-transport-native-epoll:jar:4.1.100.Final:compile
[INFO] |  +- 
io.netty:netty-transport-native-unix-common:jar:4.1.100.Final:compile
[INFO] |  +- org.apache.commons:commons-math3:jar:3.6.1:compile
[INFO] |  +- org.apache.httpcomponents:httpclient:jar:4.5.13:compile
[INFO] |  +- org.apache.httpcomponents:httpcore:jar:4.4.13:compile
[INFO] |  +- org.apache.httpcomponents:httpmime:jar:4.5.13:compile
[INFO] |  +- org.apache.zookeeper:zookeeper:jar:3.8.4:compile
[INFO] |  +- org.apache.zookeeper:zookeeper-jute:jar:3.6.2:compile
[INFO] |  +- org.codehaus.woodstox:stax2-api:jar:4.2.1:compile
...
... {code}

> Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298
> ---
>
> Key: HADOOP-19148
> URL: https://issues.apache.org/jira/browse/HADOOP-19148
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Brahma Reddy Battula
>Assignee: Viraj Jasani
>Priority: Major
>
> Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19148) Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298

2024-05-14 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17846458#comment-17846458
 ] 

Viraj Jasani commented on HADOOP-19148:
---

[~brahmareddy] is anyone picking this up? If not, shall I create the PR?

> Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298
> ---
>
> Key: HADOOP-19148
> URL: https://issues.apache.org/jira/browse/HADOOP-19148
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Brahma Reddy Battula
>Priority: Major
>
> Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails

2024-04-11 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-19146:
--
Component/s: test

> noaa-cors-pds bucket access with global endpoint fails
> --
>
> Key: HADOOP-19146
> URL: https://issues.apache.org/jira/browse/HADOOP-19146
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> All tests accessing noaa-cors-pds use us-east-1 region, as configured at 
> bucket level. If global endpoint is configured (e.g. us-west-2), they fail to 
> access the bucket.
>  
> Sample error:
> {code:java}
> org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect 
> response to region [us-east-1].  This likely indicates that the S3 region 
> configured in fs.s3a.endpoint.region does not match the AWS region containing 
> the bucket.: null (Service: S3, Status Code: 301, Request ID: 
> PMRWMQC9S91CNEJR, Extended Request ID: 
> 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922)
>     at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115)
>     at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349)
>     at org.apache.hadoop.fs.Globber.glob(Globber.java:202)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674)
>  {code}
> {code:java}
> Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null 
> (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended 
> Request ID: 
> 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43)
>    

[jira] [Created] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails

2024-04-11 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-19146:
-

 Summary: noaa-cors-pds bucket access with global endpoint fails
 Key: HADOOP-19146
 URL: https://issues.apache.org/jira/browse/HADOOP-19146
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Viraj Jasani


All tests accessing noaa-cors-pds use us-east-1 region, as configured at bucket 
level. If global endpoint is configured (e.g. us-west-2), they fail to access 
the bucket.
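
A minimal workaround sketch using the existing per-bucket override pattern 
(fs.s3a.bucket.<bucket>.<option>), so the bucket keeps its us-east-1 region even 
when the global region differs; illustrative only, the actual test fix may differ:
{code:java}
import org.apache.hadoop.conf.Configuration;

public class NoaaCorsPdsRegionExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Global default region for the deployment, e.g. us-west-2.
    conf.set("fs.s3a.endpoint.region", "us-west-2");
    // Per-bucket override so noaa-cors-pds is always addressed in us-east-1,
    // regardless of the global endpoint/region settings.
    conf.set("fs.s3a.bucket.noaa-cors-pds.endpoint.region", "us-east-1");
  }
}
{code}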

 

Sample error:
{code:java}
org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect 
response to region [us-east-1].  This likely indicates that the S3 region 
configured in fs.s3a.endpoint.region does not match the AWS region containing 
the bucket.: null (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, 
Extended Request ID: 
6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253)
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922)
    at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115)
    at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349)
    at org.apache.hadoop.fs.Globber.glob(Globber.java:202)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949)
    at 
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313)
    at 
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281)
    at 
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445)
    at 
org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311)
    at 
org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328)
    at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674)
 {code}
{code:java}
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null (Service: 
S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended Request ID: 
6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
    at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
    at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
    at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
    at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43)
    at 
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:93)
    at 
software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$7(BaseClientHandler.java:279)
    ...
    ...
    ...
    at 
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)

[jira] [Assigned] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails

2024-04-11 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-19146:
-

Assignee: Viraj Jasani

> noaa-cors-pds bucket access with global endpoint fails
> --
>
> Key: HADOOP-19146
> URL: https://issues.apache.org/jira/browse/HADOOP-19146
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> All tests accessing noaa-cors-pds use us-east-1 region, as configured at 
> bucket level. If global endpoint is configured (e.g. us-west-2), they fail to 
> access the bucket.
>  
> Sample error:
> {code:java}
> org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect 
> response to region [us-east-1].  This likely indicates that the S3 region 
> configured in fs.s3a.endpoint.region does not match the AWS region containing 
> the bucket.: null (Service: S3, Status Code: 301, Request ID: 
> PMRWMQC9S91CNEJR, Extended Request ID: 
> 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922)
>     at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115)
>     at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349)
>     at org.apache.hadoop.fs.Globber.glob(Globber.java:202)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674)
>  {code}
> {code:java}
> Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null 
> (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended 
> Request ID: 
> 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43)

[jira] [Commented] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint

2024-03-12 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825912#comment-17825912
 ] 

Viraj Jasani commented on HADOOP-19066:
---

Addendum PR: [https://github.com/apache/hadoop/pull/6624]

> AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
> --
>
> Key: HADOOP-19066
> URL: https://issues.apache.org/jira/browse/HADOOP-19066
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.5.0, 3.4.1
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK 
> considers overriding endpoint and enabling fips as mutually exclusive, we 
> fail fast if fs.s3a.endpoint is set with fips support (details on 
> HADOOP-18975).
> Now, we no longer override SDK endpoint for central endpoint since we enable 
> cross region access (details on HADOOP-19044) but we would still fail fast if 
> endpoint is central and fips is enabled.
> Changes proposed:
>  * S3A to fail fast only if FIPS is enabled and non-central endpoint is 
> configured.
>  * Tests to ensure S3 bucket is accessible with default region us-east-2 with 
> cross region access (expected with central endpoint).
>  * Document FIPS support with central endpoint on connecting.html.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18980) S3A credential provider remapping: make extensible

2024-02-11 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17816513#comment-17816513
 ] 

Viraj Jasani commented on HADOOP-18980:
---

Addressed edge cases with addendum PR: 
[https://github.com/apache/hadoop/pull/6546]

> S3A credential provider remapping: make extensible
> --
>
> Key: HADOOP-18980
> URL: https://issues.apache.org/jira/browse/HADOOP-18980
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.5.0, 3.4.1
>
>
> s3afs will now remap the common com.amazonaws credential providers to 
> equivalents in the v2 sdk or in hadoop-aws
> We could do the same for third party credential providers by taking a 
> key=value list in a configuration property and adding to the map. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint

2024-02-08 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-19066:
--
Status: Patch Available  (was: In Progress)

> AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
> --
>
> Key: HADOOP-19066
> URL: https://issues.apache.org/jira/browse/HADOOP-19066
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.5.0, 3.4.1
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK 
> considers overriding endpoint and enabling fips as mutually exclusive, we 
> fail fast if fs.s3a.endpoint is set with fips support (details on 
> HADOOP-18975).
> Now, we no longer override SDK endpoint for central endpoint since we enable 
> cross region access (details on HADOOP-19044) but we would still fail fast if 
> endpoint is central and fips is enabled.
> Changes proposed:
>  * S3A to fail fast only if FIPS is enabled and non-central endpoint is 
> configured.
>  * Tests to ensure S3 bucket is accessible with default region us-east-2 with 
> cross region access (expected with central endpoint).
>  * Document FIPS support with central endpoint on connecting.html.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-19072) S3A: expand optimisations on stores with "fs.s3a.create.performance"

2024-02-08 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-19072:
-

Assignee: Viraj Jasani

> S3A: expand optimisations on stores with "fs.s3a.create.performance"
> 
>
> Key: HADOOP-19072
> URL: https://issues.apache.org/jira/browse/HADOOP-19072
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Viraj Jasani
>Priority: Major
>
> on an s3a store with fs.s3a.create.performance set, speed up other operations
> *  mkdir to skip parent directory check: just do a HEAD to see if there's a 
> file at the target location



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19072) S3A: expand optimisations on stores with "fs.s3a.create.performance"

2024-02-08 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815822#comment-17815822
 ] 

Viraj Jasani commented on HADOOP-19072:
---

The improvement makes sense, as long as downstream users know where they are 
creating the dir.

> S3A: expand optimisations on stores with "fs.s3a.create.performance"
> 
>
> Key: HADOOP-19072
> URL: https://issues.apache.org/jira/browse/HADOOP-19072
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Major
>
> on an s3a store with fs.s3a.create.performance set, speed up other operations
> *  mkdir to skip parent directory check: just do a HEAD to see if there's a 
> file at the target location



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint

2024-02-05 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814576#comment-17814576
 ] 

Viraj Jasani commented on HADOOP-19066:
---

Indeed! Hopefully some final stabilization work.

> AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
> --
>
> Key: HADOOP-19066
> URL: https://issues.apache.org/jira/browse/HADOOP-19066
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.5.0, 3.4.1
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK 
> considers overriding endpoint and enabling fips as mutually exclusive, we 
> fail fast if fs.s3a.endpoint is set with fips support (details on 
> HADOOP-18975).
> Now, we no longer override SDK endpoint for central endpoint since we enable 
> cross region access (details on HADOOP-19044) but we would still fail fast if 
> endpoint is central and fips is enabled.
> Changes proposed:
>  * S3A to fail fast only if FIPS is enabled and non-central endpoint is 
> configured.
>  * Tests to ensure S3 bucket is accessible with default region us-east-2 with 
> cross region access (expected with central endpoint).
>  * Document FIPS support with central endpoint on connecting.html.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint

2024-02-04 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814171#comment-17814171
 ] 

Viraj Jasani commented on HADOOP-19066:
---

Will run the whole suite with FIPS support + central endpoint.

> AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
> --
>
> Key: HADOOP-19066
> URL: https://issues.apache.org/jira/browse/HADOOP-19066
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.5.0, 3.4.1
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK 
> considers overriding endpoint and enabling fips as mutually exclusive, we 
> fail fast if fs.s3a.endpoint is set with fips support (details on 
> HADOOP-18975).
> Now, we no longer override SDK endpoint for central endpoint since we enable 
> cross region access (details on HADOOP-19044) but we would still fail fast if 
> endpoint is central and fips is enabled.
> Changes proposed:
>  * S3A to fail fast only if FIPS is enabled and non-central endpoint is 
> configured.
>  * Tests to ensure S3 bucket is accessible with default region us-east-2 with 
> cross region access (expected with central endpoint).
>  * Document FIPS support with central endpoint on connecting.html.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint

2024-02-04 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-19066:
-

Assignee: Viraj Jasani

> AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
> --
>
> Key: HADOOP-19066
> URL: https://issues.apache.org/jira/browse/HADOOP-19066
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.5.0, 3.4.1
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK 
> considers overriding endpoint and enabling fips as mutually exclusive, we 
> fail fast if fs.s3a.endpoint is set with fips support (details on 
> HADOOP-18975).
> Now, we no longer override SDK endpoint for central endpoint since we enable 
> cross region access (details on HADOOP-19044) but we would still fail fast if 
> endpoint is central and fips is enabled.
> Changes proposed:
>  * S3A to fail fast only if FIPS is enabled and non-central endpoint is 
> configured.
>  * Tests to ensure S3 bucket is accessible with default region us-east-2 with 
> cross region access (expected with central endpoint).
>  * Document FIPS support with central endpoint on connecting.html.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint

2024-02-04 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-19066:
-

 Summary: AWS SDK V2 - Enabling FIPS should be allowed with central 
endpoint
 Key: HADOOP-19066
 URL: https://issues.apache.org/jira/browse/HADOOP-19066
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.5.0, 3.4.1
Reporter: Viraj Jasani


FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK 
considers overriding endpoint and enabling fips as mutually exclusive, we fail 
fast if fs.s3a.endpoint is set with fips support (details on HADOOP-18975).

Now, we no longer override SDK endpoint for central endpoint since we enable 
cross region access (details on HADOOP-19044) but we would still fail fast if 
endpoint is central and fips is enabled.

Changes proposed:
 * S3A to fail fast only if FIPS is enabled and non-central endpoint is 
configured.
 * Tests to ensure S3 bucket is accessible with default region us-east-2 with 
cross region access (expected with central endpoint).
 * Document FIPS support with central endpoint on connecting.html.
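
A minimal configuration sketch of the behaviour proposed above (assumption: the 
proposal is accepted as described; this is not actual S3A code):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class FipsCentralEndpointExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();

    // Allowed after this change: FIPS together with the central endpoint
    // (or no endpoint at all); cross region access remains enabled.
    conf.setBoolean("fs.s3a.endpoint.fips", true);
    conf.set("fs.s3a.endpoint", "s3.amazonaws.com");

    // Still expected to fail fast: FIPS combined with a non-central endpoint,
    // e.g. conf.set("fs.s3a.endpoint", "s3.eu-west-1.amazonaws.com");
  }
}
{code}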



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19022) S3A : ITestS3AConfiguration#testRequestTimeout failure

2024-01-29 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812142#comment-17812142
 ] 

Viraj Jasani commented on HADOOP-19022:
---

It's fine [~ste...@apache.org], I anyway need to make some changes for 
updating the cross region logic, so I can take care of that and also fix the timeout 
value for the current test (only if still required after your PR 
[https://github.com/apache/hadoop/pull/6470]) and then add some more coverage.

Once your PR gets merged and the cross region logic part is also done, I will 
re-run this with different endpoint/region settings and, if needed, I will take 
care of the ITestS3AConfiguration issues as part of this Jira; otherwise I will close 
the Jira.

> S3A : ITestS3AConfiguration#testRequestTimeout failure
> --
>
> Key: HADOOP-19022
> URL: https://issues.apache.org/jira/browse/HADOOP-19022
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Priority: Minor
>
> "fs.s3a.connection.request.timeout" should be specified in milliseconds as per
> {code:java}
> Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT,
> DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); 
> {code}
> The test fails consistently because it sets 120 ms timeout which is less than 
> 15s (min network operation duration), and hence gets reset to 15000 ms based 
> on the enforcement.
>  
> {code:java}
> [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration)  
> Time elapsed: 0.016 s  <<< FAILURE!
> java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is 
> different than what AWS sdk configuration uses internally expected:<12> 
> but was:<15000>
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.failNotEquals(Assert.java:835)
>   at org.junit.Assert.assertEquals(Assert.java:647)
>   at 
> org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18975) AWS SDK v2: extend support for FIPS endpoints

2024-01-22 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809636#comment-17809636
 ] 

Viraj Jasani commented on HADOOP-18975:
---

{quote}you must have set a global endpoint, rather than one for your test 
bucket -correct?
{quote}
Exactly.

> AWS SDK v2:  extend support for FIPS endpoints
> --
>
> Key: HADOOP-18975
> URL: https://issues.apache.org/jira/browse/HADOOP-18975
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> v1 SDK supported FIPS just by changing the endpoint.
> Now we have a new builder setting to use.
> * add new  fs.s3a.endpoint.fips option
> * pass it down
> * test



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-18975) AWS SDK v2: extend support for FIPS endpoints

2024-01-21 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809271#comment-17809271
 ] 

Viraj Jasani edited comment on HADOOP-18975 at 1/22/24 7:33 AM:


{code:java}
  
    fs.s3a.bucket.landsat-pds.endpoint.fips
    true
    Use the fips endpoint
   {code}
[~ste...@apache.org] [~ahmar] do we really need fips enabled for landsat in 
hadoop-tools/hadoop-aws/src/test/resources/core-site.xml ?

 

This is breaking several tests from full suite that i am running against 
us-west-2 for PR [https://github.com/apache/hadoop/pull/6479]

e.g.
{code:java}
[ERROR] 
testSelectOddRecordsIgnoreHeaderV1(org.apache.hadoop.fs.s3a.select.ITestS3Select)
  Time elapsed: 2.917 s  <<< ERROR!
java.lang.IllegalArgumentException: An endpoint cannot set when 
fs.s3a.endpoint.fips is true : https://s3-us-west-2.amazonaws.com
at 
org.apache.hadoop.util.Preconditions.checkArgument(Preconditions.java:213)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureEndpointAndRegion(DefaultS3ClientFactory.java:292)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureClientBuilder(DefaultS3ClientFactory.java:179)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:126)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:1063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:677)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3601)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:171)
at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3702)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3653)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:555)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:366)
at 
org.apache.hadoop.fs.s3a.select.AbstractS3SelectTest.setup(AbstractS3SelectTest.java:304)
at 
org.apache.hadoop.fs.s3a.select.ITestS3Select.setup(ITestS3Select.java:112) 
{code}
 

[ERROR] Tests run: 1264, Failures: 4, Errors: 87, Skipped: 164


was (Author: vjasani):
 
{code:java}
  
    fs.s3a.bucket.landsat-pds.endpoint.fips
    true
    Use the fips endpoint
   {code}
[~ste...@apache.org] [~ahmar] do we really need fips enabled for landsat in 
hadoop-tools/hadoop-aws/src/test/resources/core-site.xml ?

 

This is breaking several tests from full suite that i am running against 
us-west-2 for PR [https://github.com/apache/hadoop/pull/6479]

e.g.
{code:java}
[ERROR] 
testSelectOddRecordsIgnoreHeaderV1(org.apache.hadoop.fs.s3a.select.ITestS3Select)
  Time elapsed: 2.917 s  <<< ERROR!
java.lang.IllegalArgumentException: An endpoint cannot set when 
fs.s3a.endpoint.fips is true : https://s3-us-west-2.amazonaws.com
at 
org.apache.hadoop.util.Preconditions.checkArgument(Preconditions.java:213)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureEndpointAndRegion(DefaultS3ClientFactory.java:292)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureClientBuilder(DefaultS3ClientFactory.java:179)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:126)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:1063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:677)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3601)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:171)
at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3702)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3653)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:555)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:366)
at 
org.apache.hadoop.fs.s3a.select.AbstractS3SelectTest.setup(AbstractS3SelectTest.java:304)
at 
org.apache.hadoop.fs.s3a.select.ITestS3Select.setup(ITestS3Select.java:112) 
{code}

> AWS SDK v2:  extend support for FIPS endpoints
> --
>
> Key: HADOOP-18975
> URL: https://issues.apache.org/jira/browse/HADOOP-18975
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>
> v1 SDK supported FIPS just by changing the endpoint.
> Now we have a new builder setting to use.
> * add new  fs.s3a.endpoint.fips option
> * pass it down
> * test



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HADOOP-18975) AWS SDK v2: extend support for FIPS endpoints

2024-01-21 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809271#comment-17809271
 ] 

Viraj Jasani commented on HADOOP-18975:
---

 
{code:java}
  
    fs.s3a.bucket.landsat-pds.endpoint.fips
    true
    Use the fips endpoint
   {code}
[~ste...@apache.org] [~ahmar] do we really need fips enabled for landsat in 
hadoop-tools/hadoop-aws/src/test/resources/core-site.xml ?

 

This is breaking several tests from full suite that i am running against 
us-west-2 for PR [https://github.com/apache/hadoop/pull/6479]

e.g.
{code:java}
[ERROR] 
testSelectOddRecordsIgnoreHeaderV1(org.apache.hadoop.fs.s3a.select.ITestS3Select)
  Time elapsed: 2.917 s  <<< ERROR!
java.lang.IllegalArgumentException: An endpoint cannot set when 
fs.s3a.endpoint.fips is true : https://s3-us-west-2.amazonaws.com
at 
org.apache.hadoop.util.Preconditions.checkArgument(Preconditions.java:213)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureEndpointAndRegion(DefaultS3ClientFactory.java:292)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureClientBuilder(DefaultS3ClientFactory.java:179)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:126)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:1063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:677)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3601)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:171)
at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3702)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3653)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:555)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:366)
at 
org.apache.hadoop.fs.s3a.select.AbstractS3SelectTest.setup(AbstractS3SelectTest.java:304)
at 
org.apache.hadoop.fs.s3a.select.ITestS3Select.setup(ITestS3Select.java:112) 
{code}

> AWS SDK v2:  extend support for FIPS endpoints
> --
>
> Key: HADOOP-18975
> URL: https://issues.apache.org/jira/browse/HADOOP-18975
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>
> v1 SDK supported FIPS just by changing the endpoint.
> Now we have a new builder setting to use.
> * add new  fs.s3a.endpoint.fips option
> * pass it down
> * test



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-19044) AWS SDK V2 - Update S3A region logic

2024-01-20 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-19044:
-

Assignee: Viraj Jasani

> AWS SDK V2 - Update S3A region logic 
> -
>
> Key: HADOOP-19044
> URL: https://issues.apache.org/jira/browse/HADOOP-19044
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Viraj Jasani
>Priority: Major
>
> If both fs.s3a.endpoint & fs.s3a.endpoint.region are empty, Spark will set 
> fs.s3a.endpoint to 
> s3.amazonaws.com here:
> [https://github.com/apache/spark/blob/9a2f39318e3af8b3817dc5e4baf52e548d82063c/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala#L540]
>  
>  
> HADOOP-18908 updated the region logic such that if fs.s3a.endpoint.region is 
> set, or if a region can be parsed from fs.s3a.endpoint (which will happen in 
> this case, the region will be US_EAST_1), cross region access is not enabled. 
> This will cause 400 errors if the bucket is not in US_EAST_1. 
>  
> Proposed: Update the logic so that if the endpoint is the global 
> s3.amazonaws.com, cross region access is enabled.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19023) S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure

2024-01-07 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804132#comment-17804132
 ] 

Viraj Jasani commented on HADOOP-19023:
---

{quote} * make sure you've not got a site config with an aggressive 
timeout{quote}
Can confirm that this is not the case.
{quote} * do set version/component in the issue fields...it's not picked up 
from the parent{quote}
Sure, will keep this in mind.

 

While HADOOP-19022 has a test failure that is consistent, this one, 
testParallelRename, is an intermittent failure. It happened only when I ran the 
whole suite (-Dparallel-tests -DtestsThreadCount=8 -Dscale -Dprefetch) while 
the setup was connected to a VPN.

Running the test individually does not fail. Since testParallelRename is 
already aggressive, I think we might want to set a higher connection timeout for 
the test; a minimal sketch follows below.
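
A minimal sketch of the kind of override being suggested, using the property name 
from this thread (the exact value and where it is applied in the test are open):
{code:java}
import org.apache.hadoop.conf.Configuration;

// Sketch of a configuration tweak for ITestS3AConcurrentOps: bump the request
// timeout well above the 15s default so the parallel rename load does not trip
// ApiCallTimeoutException.
final class ParallelRenameTimeoutSketch {
  static Configuration withLongerRequestTimeout(Configuration conf) {
    conf.set("fs.s3a.connection.request.timeout", "60s");
    return conf;
  }
}
{code}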

> S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure
> ---
>
> Key: HADOOP-19023
> URL: https://issues.apache.org/jira/browse/HADOOP-19023
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Priority: Major
>
> Need to configure higher timeout for the test.
>  
> {code:java}
> [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 256.281 s <<< FAILURE! - in 
> org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps
> [ERROR] 
> testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps)  
> Time elapsed: 72.565 s  <<< ERROR!
> org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on 
> fork-0005/test/testParallelRename-source0: 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client 
> execution did not complete before the specified timeout configuration: 15000 
> millis
>   at 
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215)
>   at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
>   at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376)
>   at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
>   at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372)
>   at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347)
>   at 
> org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214)
>   at 
> org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532)
>   at 
> org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
>   at 
> org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
>   at 
> org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:750)
> Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: 
> Client execution did not complete before the specified timeout configuration: 
> 15000 millis
>   at 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97)
>   at 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
>   at 
> software.amazon.

[jira] [Commented] (HADOOP-19022) S3A : ITestS3AConfiguration#testRequestTimeout failure

2024-01-07 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804129#comment-17804129
 ] 

Viraj Jasani commented on HADOOP-19022:
---

 
{quote}have you explicitly set it in your site config?
{quote}
Can confirm that it is not set explicitly. The test fails consistently because 
the bare value "120" is read as 120 ms by default, and since that is less than 
15s, 15s is selected:
{code:java}
apiCallTimeout = enforceMinimumDuration(REQUEST_TIMEOUT,
apiCallTimeout, minimumOperationDuration); {code}
Here, minimumOperationDuration is 15s.

For this Jira, we can (see the sketch after this list):
 # Make the test use "120s" instead of "120" so that it is not reset to 15s by 
the minimum-duration enforcement.
 # Add a test with a timeout value smaller than 15s and verify that the actual 
timeout in the S3A client configuration object is 15s.
 # Add a test that sets "0" as the timeout and verify that 
SdkClientOption.API_CALL_ATTEMPT_TIMEOUT does not even get set.
 # Document "fs.s3a.connection.request.timeout" as defaulting to 15s whenever 
a client sets it to a value > 0 and < 15s.

WDYT?
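
To make the expected behaviour concrete, here is a self-contained sketch of 
the 15s floor; it re-implements the enforcement inline rather than calling the 
real S3A internals, so treat it as illustration only:
{code:java}
import java.time.Duration;
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.conf.Configuration;

public class RequestTimeoutFloorSketch {
  static final String REQUEST_TIMEOUT = "fs.s3a.connection.request.timeout";
  static final Duration MINIMUM = Duration.ofSeconds(15);

  public static void main(String[] args) {
    Configuration conf = new Configuration(false);

    // a bare "120" is read as 120 ms, below the floor, so 15s wins
    conf.set(REQUEST_TIMEOUT, "120");
    System.out.println(effectiveTimeout(conf)); // PT15S

    // "120s" is read as 120 seconds, above the floor, so it is kept
    conf.set(REQUEST_TIMEOUT, "120s");
    System.out.println(effectiveTimeout(conf)); // PT2M
  }

  // simplified stand-in for the enforceMinimumDuration() behaviour;
  // zero means "not set" and is deliberately left untouched
  static Duration effectiveTimeout(Configuration conf) {
    Duration parsed = Duration.ofMillis(
        conf.getTimeDuration(REQUEST_TIMEOUT, 0, TimeUnit.MILLISECONDS));
    return (!parsed.isZero() && parsed.compareTo(MINIMUM) < 0) ? MINIMUM : parsed;
  }
}
{code}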

> S3A : ITestS3AConfiguration#testRequestTimeout failure
> --
>
> Key: HADOOP-19022
> URL: https://issues.apache.org/jira/browse/HADOOP-19022
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Priority: Minor
>
> "fs.s3a.connection.request.timeout" should be specified in milliseconds as per
> {code:java}
> Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT,
> DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); 
> {code}
> The test fails consistently because it sets 120 ms timeout which is less than 
> 15s (min network operation duration), and hence gets reset to 15000 ms based 
> on the enforcement.
>  
> {code:java}
> [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration)  
> Time elapsed: 0.016 s  <<< FAILURE!
> java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is 
> different than what AWS sdk configuration uses internally expected:<12> 
> but was:<15000>
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.failNotEquals(Assert.java:835)
>   at org.junit.Assert.assertEquals(Assert.java:647)
>   at 
> org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19023) ITestS3AConcurrentOps#testParallelRename intermittent timeout failure

2024-01-07 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-19023:
--
Component/s: test

> ITestS3AConcurrentOps#testParallelRename intermittent timeout failure
> -
>
> Key: HADOOP-19023
> URL: https://issues.apache.org/jira/browse/HADOOP-19023
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Priority: Major
>
> Need to configure higher timeout for the test.
>  
> {code:java}
> [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 256.281 s <<< FAILURE! - in 
> org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps
> [ERROR] 
> testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps)  
> Time elapsed: 72.565 s  <<< ERROR!
> org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on 
> fork-0005/test/testParallelRename-source0: 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client 
> execution did not complete before the specified timeout configuration: 15000 
> millis
>   at 
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215)
>   at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
>   at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376)
>   at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
>   at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372)
>   at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347)
>   at 
> org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214)
>   at 
> org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532)
>   at 
> org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
>   at 
> org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
>   at 
> org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:750)
> Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: 
> Client execution did not complete before the specified timeout configuration: 
> 15000 millis
>   at 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97)
>   at 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExcept

[jira] [Updated] (HADOOP-19022) S3A : ITestS3AConfiguration#testRequestTimeout failure

2024-01-07 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-19022:
--
Summary: S3A : ITestS3AConfiguration#testRequestTimeout failure  (was: 
ITestS3AConfiguration#testRequestTimeout failure)

> S3A : ITestS3AConfiguration#testRequestTimeout failure
> --
>
> Key: HADOOP-19022
> URL: https://issues.apache.org/jira/browse/HADOOP-19022
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Priority: Minor
>
> "fs.s3a.connection.request.timeout" should be specified in milliseconds as per
> {code:java}
> Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT,
> DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); 
> {code}
> The test fails consistently because it sets 120 ms timeout which is less than 
> 15s (min network operation duration), and hence gets reset to 15000 ms based 
> on the enforcement.
>  
> {code:java}
> [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration)  
> Time elapsed: 0.016 s  <<< FAILURE!
> java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is 
> different than what AWS sdk configuration uses internally expected:<12> 
> but was:<15000>
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.failNotEquals(Assert.java:835)
>   at org.junit.Assert.assertEquals(Assert.java:647)
>   at 
> org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19023) S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure

2024-01-07 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-19023:
--
Summary: S3A : ITestS3AConcurrentOps#testParallelRename intermittent 
timeout failure  (was: ITestS3AConcurrentOps#testParallelRename intermittent 
timeout failure)

> S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure
> ---
>
> Key: HADOOP-19023
> URL: https://issues.apache.org/jira/browse/HADOOP-19023
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Priority: Major
>
> Need to configure higher timeout for the test.
>  
> {code:java}
> [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 256.281 s <<< FAILURE! - in 
> org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps
> [ERROR] 
> testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps)  
> Time elapsed: 72.565 s  <<< ERROR!
> org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on 
> fork-0005/test/testParallelRename-source0: 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client 
> execution did not complete before the specified timeout configuration: 15000 
> millis
>   at 
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215)
>   at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
>   at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376)
>   at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
>   at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372)
>   at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347)
>   at 
> org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214)
>   at 
> org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532)
>   at 
> org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
>   at 
> org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
>   at 
> org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:750)
> Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: 
> Client execution did not complete before the specified timeout configuration: 
> 15000 millis
>   at 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97)
>   at 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineB

[jira] [Updated] (HADOOP-18980) S3A credential provider remapping: make extensible

2024-01-04 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18980:
--
Status: Patch Available  (was: In Progress)

> S3A credential provider remapping: make extensible
> --
>
> Key: HADOOP-18980
> URL: https://issues.apache.org/jira/browse/HADOOP-18980
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
>
> s3afs will now remap the common com.amazonaws credential providers to 
> equivalents in the v2 sdk or in hadoop-aws
> We could do the same for third party credential providers by taking a 
> key=value list in a configuration property and adding to the map. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18959) Use builder for prefetch CachingBlockManager

2024-01-04 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17803401#comment-17803401
 ] 

Viraj Jasani commented on HADOOP-18959:
---

[~slfan1989] this is already committed to trunk; only the backport PR is still 
pending merge.

> Use builder for prefetch CachingBlockManager
> 
>
> Key: HADOOP-18959
> URL: https://issues.apache.org/jira/browse/HADOOP-18959
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> Some of the recent changes (HADOOP-18399, HADOOP-18291, HADOOP-18829 etc) 
> have added more params for prefetch CachingBlockManager c'tor to process 
> read/write block requests. They have added too many params and more are 
> likely to be introduced later. We should use builder pattern to pass params.
> This would also help consolidating required prefetch params into one single 
> place within S3ACachingInputStream, from scattered locations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19023) ITestS3AConcurrentOps#testParallelRename intermittent timeout failure

2024-01-03 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-19023:
-

 Summary: ITestS3AConcurrentOps#testParallelRename intermittent 
timeout failure
 Key: HADOOP-19023
 URL: https://issues.apache.org/jira/browse/HADOOP-19023
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani


Need to configure higher timeout for the test.

 
{code:java}
[ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 256.281 
s <<< FAILURE! - in org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps
[ERROR] 
testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps)  Time 
elapsed: 72.565 s  <<< ERROR!
org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on 
fork-0005/test/testParallelRename-source0: 
software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client execution 
did not complete before the specified timeout configuration: 15000 millis
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376)
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347)
at 
org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214)
at 
org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620)
at 
org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at 
org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at 
org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at 
org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
at 
org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: 
Client execution did not complete before the specified timeout configuration: 
15000 millis
at 
software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97)
at 
software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
at 
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at 
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
at 
software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:224)
at 
software.amazon.awssdk.core.intern

[jira] [Commented] (HADOOP-19022) ITestS3AConfiguration#testRequestTimeout failure

2024-01-03 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17802395#comment-17802395
 ] 

Viraj Jasani commented on HADOOP-19022:
---

It's a small test, but perhaps good to cover both cases: timeouts above 15s 
and below 15s.

> ITestS3AConfiguration#testRequestTimeout failure
> 
>
> Key: HADOOP-19022
> URL: https://issues.apache.org/jira/browse/HADOOP-19022
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Viraj Jasani
>Priority: Minor
>
> "fs.s3a.connection.request.timeout" should be specified in milliseconds as per
> {code:java}
> Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT,
> DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); 
> {code}
> The test fails consistently because it sets 120 ms timeout which is less than 
> 15s (min network operation duration), and hence gets reset to 15000 ms based 
> on the enforcement.
>  
> {code:java}
> [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration)  
> Time elapsed: 0.016 s  <<< FAILURE!
> java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is 
> different than what AWS sdk configuration uses internally expected:<12> 
> but was:<15000>
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.failNotEquals(Assert.java:835)
>   at org.junit.Assert.assertEquals(Assert.java:647)
>   at 
> org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19022) ITestS3AConfiguration#testRequestTimeout failure

2024-01-03 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-19022:
-

 Summary: ITestS3AConfiguration#testRequestTimeout failure
 Key: HADOOP-19022
 URL: https://issues.apache.org/jira/browse/HADOOP-19022
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani


"fs.s3a.connection.request.timeout" should be specified in milliseconds as per
{code:java}
Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT,
DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); 
{code}
The test fails consistently because it sets 120 ms timeout which is less than 
15s (min network operation duration), and hence gets reset to 15000 ms based on 
the enforcement.

 
{code:java}
[ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration)  
Time elapsed: 0.016 s  <<< FAILURE!
java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is 
different than what AWS sdk configuration uses internally expected:<12> but 
was:<15000>
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.failNotEquals(Assert.java:835)
at org.junit.Assert.assertEquals(Assert.java:647)
at 
org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444)
 {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18991) Remove commons-benautils dependency from Hadoop 3

2023-11-28 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17790816#comment-17790816
 ] 

Viraj Jasani commented on HADOOP-18991:
---

As per HADOOP-16542, if we remove this, the Hive build fails. Could Hive 
declare commons-beanutils as a direct dependency instead?

FYI [~weichiu] 

> Remove commons-benautils dependency from Hadoop 3
> -
>
> Key: HADOOP-18991
> URL: https://issues.apache.org/jira/browse/HADOOP-18991
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Istvan Toth
>Priority: Major
>
> Hadoop doesn't actually use it, and it pollutes the classpath of dependent 
> projects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18991) Remove commons-benautils dependency from Hadoop 3

2023-11-28 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17790788#comment-17790788
 ] 

Viraj Jasani commented on HADOOP-18991:
---

[~stoty] is this the reason Phoenix still has to manage it even after 
excluding it from Omid?

> Remove commons-benautils dependency from Hadoop 3
> -
>
> Key: HADOOP-18991
> URL: https://issues.apache.org/jira/browse/HADOOP-18991
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Istvan Toth
>Priority: Major
>
> Hadoop doesn't actually use it, and it pollutes the classpath of dependent 
> projects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18980) S3A credential provider remapping: make extensible

2023-11-20 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788128#comment-17788128
 ] 

Viraj Jasani commented on HADOOP-18980:
---

{quote}exactly; though i'd expect the remapping to be from com.amazonaws to 
software.amazonaws or private implementations

key goal: you can use the same credentials.provider list for v1 and v2 sdk 
clients.
{quote}
In addition to having the same credentials.provider list for the v1 and v2 
SDKs, maybe we can also remove the static v1-to-v2 credential provider mapping 
and let the new config carry the default key=value pairs:

 
{code:java}

  fs.s3a.aws.credentials.provider.mapping
  
   
com.amazonaws.auth.AnonymousAWSCredentials=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider,
   
com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider,
   
com.amazonaws.auth.InstanceProfileCredentialsProvider=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider,
   
com.amazonaws.auth.EnvironmentVariableCredentialsProvider=software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider,
   
com.amazonaws.auth.profile.ProfileCredentialsProvider=software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider
  
 {code}
 

With this as the default value, users can add any new third-party credential 
provider to the list. Does that sound good?
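
For illustration, a minimal sketch of how such a mapping property could be 
parsed into a lookup map (the property name matches the example above; the 
parsing rules are assumptions, not the committed implementation):
{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;

public final class CredentialProviderRemapSketch {

  static final String MAPPING_KEY = "fs.s3a.aws.credentials.provider.mapping";

  // builds "old provider class -> new provider class" from the comma-separated
  // key=value entries; getTrimmedStrings() strips whitespace and newlines
  static Map<String, String> buildRemap(Configuration conf) {
    Map<String, String> remap = new HashMap<>();
    for (String entry : conf.getTrimmedStrings(MAPPING_KEY)) {
      int idx = entry.indexOf('=');
      if (idx > 0) {
        remap.put(entry.substring(0, idx).trim(),
            entry.substring(idx + 1).trim());
      }
    }
    return remap;
  }
}
{code}
The v1-to-v2 defaults above could then simply become the default value of that 
property, so user-supplied entries extend the built-in map rather than replace 
it.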

 

> S3A credential provider remapping: make extensible
> --
>
> Key: HADOOP-18980
> URL: https://issues.apache.org/jira/browse/HADOOP-18980
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Minor
>
> s3afs will now remap the common com.amazonaws credential providers to 
> equivalents in the v2 sdk or in hadoop-aws
> We could do the same for third party credential providers by taking a 
> key=value list in a configuration property and adding to the map. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-18980) S3A credential provider remapping: make extensible

2023-11-20 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788128#comment-17788128
 ] 

Viraj Jasani edited comment on HADOOP-18980 at 11/20/23 6:44 PM:
-

In addition to having the same credentials.provider list for the v1 and v2 
SDKs, maybe we can also remove the static v1-to-v2 credential provider mapping 
and let the new config carry the default key=value pairs:
{code:java}

  fs.s3a.aws.credentials.provider.mapping
  
   
com.amazonaws.auth.AnonymousAWSCredentials=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider,
   
com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider,
   
com.amazonaws.auth.InstanceProfileCredentialsProvider=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider,
   
com.amazonaws.auth.EnvironmentVariableCredentialsProvider=software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider,
   
com.amazonaws.auth.profile.ProfileCredentialsProvider=software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider
  
 {code}
With this as the default value, users can add any new third-party credential 
provider to the list. Does that sound good?


was (Author: vjasani):
{quote}exactly; though i'd expect the remapping to be from com.amazonaws to 
software.amazonaws or private implementations

key goal: you can use the same credentials.provider list for v1 and v2 sdk 
clients.
{quote}
In addition to having same credentials.provider list for v1 and v2 sdk, maybe 
we can also remove static mapping for v1 to v2 credential providers and let new 
config have default key value pairs:

 
{code:java}

  fs.s3a.aws.credentials.provider.mapping
  
   
com.amazonaws.auth.AnonymousAWSCredentials=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider,
   
com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider,
   
com.amazonaws.auth.InstanceProfileCredentialsProvider=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider,
   
com.amazonaws.auth.EnvironmentVariableCredentialsProvider=software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider,
   
com.amazonaws.auth.profile.ProfileCredentialsProvider=software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider
  
 {code}
 

With this being default value, any new third-party credential provider can be 
added to this list by users. Does that sound good?

 

> S3A credential provider remapping: make extensible
> --
>
> Key: HADOOP-18980
> URL: https://issues.apache.org/jira/browse/HADOOP-18980
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Minor
>
> s3afs will now remap the common com.amazonaws credential providers to 
> equivalents in the v2 sdk or in hadoop-aws
> We could do the same for third party credential providers by taking a 
> key=value list in a configuration property and adding to the map. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18980) S3A credential provider remapping: make extensible

2023-11-19 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787854#comment-17787854
 ] 

Viraj Jasani commented on HADOOP-18980:
---

Something like this maybe?

 
{code:java}

  fs.s3a.aws.credentials.provider.mapping
  
    
com.amazon.xyz.auth.provider.key1=org.apache.hadoop.fs.s3a.CustomCredsProvider1,
    
com.amazon.xyz.auth.provider.key2=org.apache.hadoop.fs.s3a.CustomCredsProvider2,
    
com.amazon.xyz.auth.provider.key3=org.apache.hadoop.fs.s3a.CustomCredsProvider3
  



  fs.s3a.aws.credentials.provider
  
    com.amazon.xyz.auth.provider.key1,
    com.amazon.xyz.auth.provider.key2
  
 {code}
 

 

> S3A credential provider remapping: make extensible
> --
>
> Key: HADOOP-18980
> URL: https://issues.apache.org/jira/browse/HADOOP-18980
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Minor
>
> s3afs will now remap the common com.amazonaws credential providers to 
> equivalents in the v2 sdk or in hadoop-aws
> We could do the same for third party credential providers by taking a 
> key=value list in a configuration property and adding to the map. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18959) Use builder for prefetch CachingBlockManager

2023-10-29 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18959:
-

Assignee: Viraj Jasani

> Use builder for prefetch CachingBlockManager
> 
>
> Key: HADOOP-18959
> URL: https://issues.apache.org/jira/browse/HADOOP-18959
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> Some of the recent changes (HADOOP-18399, HADOOP-18291, HADOOP-18829 etc) 
> have added more params for prefetch CachingBlockManager c'tor to process 
> read/write block requests. They have added too many params and more are 
> likely to be introduced later. We should use builder pattern to pass params.
> This would also help consolidating required prefetch params into one single 
> place within S3ACachingInputStream, from scattered locations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18959) Use builder for prefetch CachingBlockManager

2023-10-29 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18959:
-

 Summary: Use builder for prefetch CachingBlockManager
 Key: HADOOP-18959
 URL: https://issues.apache.org/jira/browse/HADOOP-18959
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani


Some of the recent changes (HADOOP-18399, HADOOP-18291, HADOOP-18829 etc) have 
added more params for prefetch CachingBlockManager c'tor to process read/write 
block requests. They have added too many params and more are likely to be 
introduced later. We should use builder pattern to pass params.

This would also help consolidate the required prefetch params into one single 
place within S3ACachingInputStream, instead of scattered locations.
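
A hedged sketch of the shape such a builder could take (the class, field and 
method names here are illustrative, not the merged API):
{code:java}
import org.apache.hadoop.fs.Path;

// Illustrative parameter holder: the real change may carry different fields.
public final class BlockManagerParameters {
  private final int bufferPoolSize;
  private final int prefetchBlockSize;
  private final Path localCacheDir;

  private BlockManagerParameters(Builder b) {
    this.bufferPoolSize = b.bufferPoolSize;
    this.prefetchBlockSize = b.prefetchBlockSize;
    this.localCacheDir = b.localCacheDir;
  }

  public static Builder builder() {
    return new Builder();
  }

  public static final class Builder {
    private int bufferPoolSize;
    private int prefetchBlockSize;
    private Path localCacheDir;

    public Builder withBufferPoolSize(int size) {
      this.bufferPoolSize = size;
      return this;
    }

    public Builder withPrefetchBlockSize(int size) {
      this.prefetchBlockSize = size;
      return this;
    }

    public Builder withLocalCacheDir(Path dir) {
      this.localCacheDir = dir;
      return this;
    }

    public BlockManagerParameters build() {
      return new BlockManagerParameters(this);
    }
  }
}
{code}
CachingBlockManager would then take a single parameters object, and 
S3ACachingInputStream would populate the builder in one place.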



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18918) ITestS3GuardTool fails if SSE/DSSE encryption is used

2023-10-27 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18918:
--
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> ITestS3GuardTool fails if SSE/DSSE encryption is used
> -
>
> Key: HADOOP-18918
> URL: https://issues.apache.org/jira/browse/HADOOP-18918
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.6
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> {code:java}
> [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool
> [ERROR] 
> testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool)
>   Time elapsed: 0.807 s  <<< ERROR!
> 46: Bucket s3a://landsat-pds: required encryption is none but actual 
> encryption is DSSE-KMS
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.lang.Thread.run(Thread.java:750)
>  {code}
> Since landsat requires none encryption, the test should be skipped for any 
> encryption algorithm.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18952) FsCommand Stat class set the timeZone"UTC", which is different from the machine's timeZone

2023-10-26 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17780014#comment-17780014
 ] 

Viraj Jasani commented on HADOOP-18952:
---

This has been the case since the beginning:

Stat:
{code:java}
protected final SimpleDateFormat timeFmt;
{
  timeFmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
  timeFmt.setTimeZone(TimeZone.getTimeZone("UTC"));
}{code}
Ls:
{code:java}
protected final SimpleDateFormat dateFormat =
  new SimpleDateFormat("yyyy-MM-dd HH:mm"); {code}
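
A small self-contained sketch of the effect (illustrative only, not Hadoop 
code): the same modification time renders eight hours apart when one formatter 
is pinned to UTC and the other uses the machine's UTC+8 default.
{code:java}
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class StatVsLsTimeZoneSketch {
  public static void main(String[] args) {
    long mtime = 1697506985000L; // 2023-10-17 01:43:05 UTC

    // Stat-style formatter: pinned to UTC
    SimpleDateFormat statFmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    statFmt.setTimeZone(TimeZone.getTimeZone("UTC"));

    // Ls-style formatter: JVM default zone, forced to UTC+8 for the demo
    SimpleDateFormat lsFmt = new SimpleDateFormat("yyyy-MM-dd HH:mm");
    lsFmt.setTimeZone(TimeZone.getTimeZone("Asia/Shanghai"));

    System.out.println("stat: " + statFmt.format(new Date(mtime))); // 2023-10-17 01:43:05
    System.out.println("ls:   " + lsFmt.format(new Date(mtime)));   // 2023-10-17 09:43
  }
}
{code}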

> FsCommand Stat class set the timeZone"UTC", which is different from the 
> machine's timeZone
> --
>
> Key: HADOOP-18952
> URL: https://issues.apache.org/jira/browse/HADOOP-18952
> Project: Hadoop Common
>  Issue Type: Bug
> Environment: Using Hadoop 3.3.4-release
>Reporter: liang yu
>Priority: Major
> Attachments: image-2023-10-26-10-07-11-637.png
>
>
> Using Hadoop version 3.3.4
>  
> When executing Ls command and Stat command on the same hadoop file, I get two 
> timestamps.
>  
> {code:java}
> hdfs dfs -stat "modify_time %y, access_time%x" /path/to/file{code}
>  returns:
> modify_time {_}*2023-10-17 01:43:05*{_}, access_time _*2023-10-17 01:41:00*_ 
>  
> {code:java}
> hdfs dfs -ls /path/to/file{code}
>   returns:
> {-}rw{-}rw-r–+     3    user_name     user_group     247400339     
> _*2023-10-17 09:43*_     /path/to/file
>  
> these two timestamps has the difference 8hours.
> I am in China, the timezone is “UTC+8”, so the timestamp from LS command is 
> correct and timestamp from STAT command is wrong.
>  
> !image-2023-10-26-10-07-11-637.png!
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18829) s3a prefetch LRU cache eviction metric

2023-10-19 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HADOOP-18829.
---
Fix Version/s: 3.4.0
   3.3.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

> s3a prefetch LRU cache eviction metric
> --
>
> Key: HADOOP-18829
> URL: https://issues.apache.org/jira/browse/HADOOP-18829
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
>
> Follow-up from HADOOP-18291:
> Add new IO statistics metric to capture s3a prefetch LRU cache eviction.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-18931) FileSystem.getFileSystemClass() to log at debug the jar the .class came from

2023-10-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17776356#comment-17776356
 ] 

Viraj Jasani edited comment on HADOOP-18931 at 10/17/23 7:16 PM:
-

sounds good, it makes sense to log for all fs invocations while keeping that 
log separate from the heavy service-loading path.


was (Author: vjasani):
sounds good, it makes sense to log for all fs invocation

> FileSystem.getFileSystemClass() to log at debug the jar the .class came from
> 
>
> Key: HADOOP-18931
> URL: https://issues.apache.org/jira/browse/HADOOP-18931
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Priority: Minor
>
> we want to be able to log the jar the filesystem implementation class, so 
> that we can identify which version of a module the class came from.
> this is to help track down problems where different machines in the cluster 
> or the .tar.gz bundle is out of date. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18931) FileSystem.getFileSystemClass() to log at debug the jar the .class came from

2023-10-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17776356#comment-17776356
 ] 

Viraj Jasani commented on HADOOP-18931:
---

sounds good, it makes sense to log for all fs invocation

> FileSystem.getFileSystemClass() to log at debug the jar the .class came from
> 
>
> Key: HADOOP-18931
> URL: https://issues.apache.org/jira/browse/HADOOP-18931
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Priority: Minor
>
> we want to be able to log the jar the filesystem implementation class, so 
> that we can identify which version of a module the class came from.
> this is to help track down problems where different machines in the cluster 
> or the .tar.gz bundle is out of date. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18918) ITestS3GuardTool fails if SSE/DSSE encryption is used

2023-10-15 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18918:
--
Status: Patch Available  (was: In Progress)

> ITestS3GuardTool fails if SSE/DSSE encryption is used
> -
>
> Key: HADOOP-18918
> URL: https://issues.apache.org/jira/browse/HADOOP-18918
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.6
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
>
> {code:java}
> [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool
> [ERROR] 
> testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool)
>   Time elapsed: 0.807 s  <<< ERROR!
> 46: Bucket s3a://landsat-pds: required encryption is none but actual 
> encryption is DSSE-KMS
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.lang.Thread.run(Thread.java:750)
>  {code}
> Since landsat requires none encryption, the test should be skipped for any 
> encryption algorithm.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)

2023-10-15 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18850:
--
Status: Patch Available  (was: In Progress)

> Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
> -
>
> Key: HADOOP-18850
> URL: https://issues.apache.org/jira/browse/HADOOP-18850
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, security
>Reporter: Akira Ajisaka
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> Add support for DSSE-KMS
> https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18931) FileSystem.getFileSystemClass() to log at debug the jar the .class came from

2023-10-14 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775281#comment-17775281
 ] 

Viraj Jasani commented on HADOOP-18931:
---

I thought we were already logging it during the first-time init of the fs 
services for the given JVM:
{code:java}
try {
  SERVICE_FILE_SYSTEMS.put(fs.getScheme(), fs.getClass());
  if (LOGGER.isDebugEnabled()) {
LOGGER.debug("{}:// = {} from {}",
fs.getScheme(), fs.getClass(),
ClassUtil.findContainingJar(fs.getClass()));
  }
} catch (Exception e) {
  LOGGER.warn("Cannot load: {} from {}", fs,
  ClassUtil.findContainingJar(fs.getClass()));
  LOGGER.info("Full exception loading: {}", fs, e);
}
{code}
maybe you are suggesting that we should log it for every call to 
{_}getFileSystemClass(){_}, correct?
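
If so, a minimal sketch of what that per-call debug log could look like 
(placement and variable names are assumptions, not the final change):
{code:java}
// inside getFileSystemClass(), once the implementation class is resolved
if (LOGGER.isDebugEnabled()) {
  LOGGER.debug("Filesystem for scheme {} is {} from {}",
      scheme, clazz, ClassUtil.findContainingJar(clazz));
}
{code}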

> FileSystem.getFileSystemClass() to log at debug the jar the .class came from
> 
>
> Key: HADOOP-18931
> URL: https://issues.apache.org/jira/browse/HADOOP-18931
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Priority: Minor
>
> we want to be able to log the jar the filesystem implementation class, so 
> that we can identify which version of a module the class came from.
> this is to help track down problems where different machines in the cluster 
> or the .tar.gz bundle is out of date. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18918) ITestS3GuardTool fails if SSE/DSSE encryption is used

2023-10-09 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18918:
--
Summary: ITestS3GuardTool fails if SSE/DSSE encryption is used  (was: 
ITestS3GuardTool fails if SSE encryption is used)

> ITestS3GuardTool fails if SSE/DSSE encryption is used
> -
>
> Key: HADOOP-18918
> URL: https://issues.apache.org/jira/browse/HADOOP-18918
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.6
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>
> {code:java}
> [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool
> [ERROR] 
> testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool)
>   Time elapsed: 0.807 s  <<< ERROR!
> 46: Bucket s3a://landsat-pds: required encryption is none but actual 
> encryption is DSSE-KMS
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.lang.Thread.run(Thread.java:750)
>  {code}
> Since landsat requires none encryption, the test should be skipped for any 
> encryption algorithm.
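A hedged sketch of how such a skip could look with a plain JUnit assumption; the helper name is made up for illustration, and the configuration key is assumed to be the standard S3A encryption setting:

{code:java}
import static org.junit.Assume.assumeTrue;

import org.apache.hadoop.conf.Configuration;

public final class EncryptionSkipSketch {

  private EncryptionSkipSketch() {
  }

  /**
   * Skip an "unencrypted bucket" assertion when the test configuration
   * enables any server-side encryption algorithm (SSE-KMS, DSSE-KMS, ...).
   */
  public static void skipIfEncryptionConfigured(Configuration conf) {
    String algorithm = conf.getTrimmed("fs.s3a.encryption.algorithm", "");
    assumeTrue("Skipping: bucket uses encryption algorithm " + algorithm,
        algorithm.isEmpty() || "none".equalsIgnoreCase(algorithm));
  }
}
{code}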



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18918) ITestS3GuardTool fails if SSE encryption is used

2023-10-02 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18918:
--
Priority: Minor  (was: Major)

> ITestS3GuardTool fails if SSE encryption is used
> 
>
> Key: HADOOP-18918
> URL: https://issues.apache.org/jira/browse/HADOOP-18918
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.6
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>
> {code:java}
> [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool
> [ERROR] 
> testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool)
>   Time elapsed: 0.807 s  <<< ERROR!
> 46: Bucket s3a://landsat-pds: required encryption is none but actual 
> encryption is DSSE-KMS
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.lang.Thread.run(Thread.java:750)
>  {code}
> Since landsat requires none encryption, the test should be skipped for any 
> encryption algorithm.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18918) ITestS3GuardTool fails if SSE encryption is used

2023-10-01 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18918:
-

Assignee: Viraj Jasani

> ITestS3GuardTool fails if SSE encryption is used
> 
>
> Key: HADOOP-18918
> URL: https://issues.apache.org/jira/browse/HADOOP-18918
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.6
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> {code:java}
> [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool
> [ERROR] 
> testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool)
>   Time elapsed: 0.807 s  <<< ERROR!
> 46: Bucket s3a://landsat-pds: required encryption is none but actual 
> encryption is DSSE-KMS
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.lang.Thread.run(Thread.java:750)
>  {code}
> Since landsat requires none encryption, the test should be skipped for any 
> encryption algorithm.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18918) ITestS3GuardTool fails if SSE encryption is used

2023-10-01 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18918:
-

 Summary: ITestS3GuardTool fails if SSE encryption is used
 Key: HADOOP-18918
 URL: https://issues.apache.org/jira/browse/HADOOP-18918
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3, test
Affects Versions: 3.3.6
Reporter: Viraj Jasani


{code:java}
[ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 25.989 
s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool
[ERROR] 
testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool)
  Time elapsed: 0.807 s  <<< ERROR!
46: Bucket s3a://landsat-pds: required encryption is none but actual encryption 
is DSSE-KMS
    at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915)
    at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881)
    at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511)
    at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
    at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963)
    at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147)
    at 
org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114)
    at 
org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
    at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
    at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.lang.Thread.run(Thread.java:750)
 {code}
Since landsat requires none encryption, the test should be skipped for any 
encryption algorithm.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)

2023-09-30 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18850:
-

Assignee: Viraj Jasani

> Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
> -
>
> Key: HADOOP-18850
> URL: https://issues.apache.org/jira/browse/HADOOP-18850
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, security
>Reporter: Akira Ajisaka
>Assignee: Viraj Jasani
>Priority: Major
>
> Add support for DSSE-KMS
> https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html
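A hedged sketch of how a job or test configuration might opt in once support lands, assuming the new algorithm is exposed through the existing S3A encryption properties under the name DSSE-KMS; the key ARN is a placeholder:

{code:java}
import org.apache.hadoop.conf.Configuration;

public final class DsseKmsConfigSketch {

  private DsseKmsConfigSketch() {
  }

  /** Illustrative only: enable DSSE-KMS on an S3A configuration. */
  public static Configuration withDsseKms(Configuration conf, String kmsKeyArn) {
    conf.set("fs.s3a.encryption.algorithm", "DSSE-KMS"); // assumed value
    conf.set("fs.s3a.encryption.key", kmsKeyArn);        // e.g. a KMS key ARN
    return conf;
  }
}
{code}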



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18915) HTTP timeouts are not set correctly

2023-09-30 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17770745#comment-17770745
 ] 

Viraj Jasani commented on HADOOP-18915:
---

Nice find!

> HTTP timeouts are not set correctly
> ---
>
> Key: HADOOP-18915
> URL: https://issues.apache.org/jira/browse/HADOOP-18915
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> In the client config builders, when [setting 
> timeouts|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/AWSClientConfig.java#L120],
>  it uses Duration.ofSeconds(); the configs are all in milliseconds, so this needs 
> to be updated to Duration.ofMillis().
>  
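A hedged before/after sketch of the described fix; the builder code is simplified, and the config key and default shown here are illustrative:

{code:java}
import java.time.Duration;

import org.apache.hadoop.conf.Configuration;

public final class TimeoutConfigSketch {

  private TimeoutConfigSketch() {
  }

  /**
   * The configuration value is in milliseconds, so it must be wrapped with
   * Duration.ofMillis(); Duration.ofSeconds() would silently make the
   * timeout 1000x longer.
   */
  public static Duration socketTimeout(Configuration conf) {
    long millis = conf.getLong("fs.s3a.connection.timeout", 200_000L);
    // buggy variant: return Duration.ofSeconds(millis);
    return Duration.ofMillis(millis);
  }
}
{code}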



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18208) Remove all the log4j reference in modules other than hadoop-logging

2023-09-19 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18208:
-

Assignee: (was: Viraj Jasani)

> Remove all the log4j reference in modules other than hadoop-logging
> ---
>
> Key: HADOOP-18208
> URL: https://issues.apache.org/jira/browse/HADOOP-18208
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-16206) Migrate from Log4j1 to Log4j2

2023-09-19 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-16206:
-

Assignee: (was: Viraj Jasani)

> Migrate from Log4j1 to Log4j2
> -
>
> Key: HADOOP-16206
> URL: https://issues.apache.org/jira/browse/HADOOP-16206
> Project: Hadoop Common
>  Issue Type: Task
>Affects Versions: 3.3.0
>Reporter: Akira Ajisaka
>Priority: Major
> Attachments: HADOOP-16206-wip.001.patch
>
>
> This sub-task is to remove log4j1 dependency and add log4j2 dependency.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18207) Introduce hadoop-logging module

2023-09-19 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18207:
-

Assignee: (was: Viraj Jasani)

> Introduce hadoop-logging module
> ---
>
> Key: HADOOP-18207
> URL: https://issues.apache.org/jira/browse/HADOOP-18207
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> There are several goals here:
>  # Provide the ability to change log level, get log level, etc.
>  # Place all the appender implementation(?)
>  # Hide the real logging implementation.
>  # Later we could remove all the log4j references in other hadoop module.
>  # Move as much log4j usage to the module as possible.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-15984) Update jersey from 1.19 to 2.x

2023-09-19 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-15984:
-

Assignee: (was: Viraj Jasani)

> Update jersey from 1.19 to 2.x
> --
>
> Key: HADOOP-15984
> URL: https://issues.apache.org/jira/browse/HADOOP-15984
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> jersey-json 1.19 depends on Jackson 1.9.2. Let's upgrade.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)

2023-08-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755406#comment-17755406
 ] 

Viraj Jasani commented on HADOOP-18850:
---

[~ste...@apache.org] are you in favor of this before v2 sdk upgrade?

> Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
> -
>
> Key: HADOOP-18850
> URL: https://issues.apache.org/jira/browse/HADOOP-18850
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, security
>Reporter: Akira Ajisaka
>Priority: Major
>
> Add support for DSSE-KMS
> https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)

2023-08-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755394#comment-17755394
 ] 

Viraj Jasani commented on HADOOP-18850:
---

Only recently, HADOOP-18832 bumped the SDK bundle to 1.12.499, so it looks like we 
can support this.

> Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
> -
>
> Key: HADOOP-18850
> URL: https://issues.apache.org/jira/browse/HADOOP-18850
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, security
>Reporter: Akira Ajisaka
>Priority: Major
>
> Add support for DSSE-KMS
> https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)

2023-08-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755392#comment-17755392
 ] 

Viraj Jasani edited comment on HADOOP-18850 at 8/17/23 7:13 AM:


it seems SSEAlgorithm added DSSE as part of 1.12.488 release: 
[https://github.com/aws/aws-sdk-java/releases/tag/1.12.488]
{code:java}
public enum SSEAlgorithm {
AES256("AES256"),
KMS("aws:kms"),
DSSE("aws:kms:dsse"),
;{code}


was (Author: vjasani):
SSEAlgorithm added DSSE as part of 1.12.488 release: 
[https://github.com/aws/aws-sdk-java/releases/tag/1.12.488]
{code:java}
public enum SSEAlgorithm {
AES256("AES256"),
KMS("aws:kms"),
DSSE("aws:kms:dsse"),
;{code}

> Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
> -
>
> Key: HADOOP-18850
> URL: https://issues.apache.org/jira/browse/HADOOP-18850
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, security
>Reporter: Akira Ajisaka
>Priority: Major
>
> Add support for DSSE-KMS
> https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)

2023-08-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755392#comment-17755392
 ] 

Viraj Jasani commented on HADOOP-18850:
---

SSEAlgorithm added DSSE as part of 1.12.488 release: 
[https://github.com/aws/aws-sdk-java/releases/tag/1.12.488]
{code:java}
public enum SSEAlgorithm {
AES256("AES256"),
KMS("aws:kms"),
DSSE("aws:kms:dsse"),
;{code}

> Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
> -
>
> Key: HADOOP-18850
> URL: https://issues.apache.org/jira/browse/HADOOP-18850
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, security
>Reporter: Akira Ajisaka
>Priority: Major
>
> Add support for DSSE-KMS
> https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18852) S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look like random IO

2023-08-16 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755385#comment-17755385
 ] 

Viraj Jasani commented on HADOOP-18852:
---

{quote}for other reads, we may want a bigger prefetch count than 1, depending 
on: split start/end, file read policy (random, sequential, whole-file)
{quote}
this means we first need prefetch read policy (HADOOP-18791), correct?

> S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look 
> like random IO
> --
>
> Key: HADOOP-18852
> URL: https://issues.apache.org/jira/browse/HADOOP-18852
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Priority: Major
>
> noticed in HADOOP-18184, but I think it's a big enough issue to be dealt with 
> separately.
> # all seeks are lazy; no fetching is kicked off after an open
> # the first read is treated as an out of order read, so cancels any active 
> reads (don't think there are any) and then only asks for 1 block
> {code}
> if (outOfOrderRead) {
>   LOG.debug("lazy-seek({})", getOffsetStr(readPos));
>   blockManager.cancelPrefetches();
>   // We prefetch only 1 block immediately after a seek operation.
>   prefetchCount = 1;
> }
> {code}
> * for any read fully we should prefetch all blocks in the range requested
> * for other reads, we may want a bigger prefetch count than 1, depending on: 
> split start/end, file read policy (random, sequential, whole-file)
> * also, if a read is in a block other than the current one, but which is 
> already being fetched or cached, is this really an OOO read to the extent 
> that outstanding fetches should be cancelled?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18852) S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look like random IO

2023-08-16 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755384#comment-17755384
 ] 

Viraj Jasani commented on HADOOP-18852:
---

{quote}also, if a read is in a block other than the current one, but which is 
already being fetched or cached, is this really an OOO read to the extent that 
outstanding fetches should be cancelled?
{quote}
+1 to this. Now that I have checked some logs, I can see lazy-seek for every first 
seek + read on the given block:
{code:java}
DEBUG prefetch.S3ACachingInputStream 
(S3ACachingInputStream.java:ensureCurrentBuffer(141)) - lazy-seek(0:0)
DEBUG prefetch.S3ACachingInputStream 
(S3ACachingInputStream.java:ensureCurrentBuffer(141)) - lazy-seek(4:40960)
DEBUG prefetch.S3ACachingInputStream 
(S3ACachingInputStream.java:ensureCurrentBuffer(141)) - lazy-seek(3:30720)
DEBUG prefetch.S3ACachingInputStream 
(S3ACachingInputStream.java:ensureCurrentBuffer(141)) - lazy-seek(2:20480){code}
but it is also a valid question: if the block was already being cached, why cancel 
the outstanding fetches?
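A hedged sketch of that idea: only treat a seek as a cold out-of-order read, and cancel in-flight work, when the target block is neither cached nor already being fetched. The block-manager method names are assumptions, not the real S3A prefetching API:

{code:java}
/** Illustrative only: the BlockManager methods below are hypothetical. */
class LazySeekSketch {

  interface BlockManager {
    boolean isCached(int blockNumber);        // assumed cache lookup
    boolean isPrefetching(int blockNumber);   // assumed in-flight check
    void cancelPrefetches();
  }

  private final BlockManager blockManager;
  private int prefetchCount = 2;

  LazySeekSketch(BlockManager blockManager) {
    this.blockManager = blockManager;
  }

  /** Cancel outstanding prefetches only for genuinely cold out-of-order reads. */
  void onOutOfOrderRead(int targetBlock) {
    if (!blockManager.isCached(targetBlock)
        && !blockManager.isPrefetching(targetBlock)) {
      blockManager.cancelPrefetches();
      prefetchCount = 1;
    }
  }

  int getPrefetchCount() {
    return prefetchCount;
  }
}
{code}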

> S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look 
> like random IO
> --
>
> Key: HADOOP-18852
> URL: https://issues.apache.org/jira/browse/HADOOP-18852
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Priority: Major
>
> noticed in HADOOP-18184, but I think it's a big enough issue to be dealt with 
> separately.
> # all seeks are lazy; no fetching is kicked off after an open
> # the first read is treated as an out of order read, so cancels any active 
> reads (don't think there are any) and then only asks for 1 block
> {code}
> if (outOfOrderRead) {
>   LOG.debug("lazy-seek({})", getOffsetStr(readPos));
>   blockManager.cancelPrefetches();
>   // We prefetch only 1 block immediately after a seek operation.
>   prefetchCount = 1;
> }
> {code}
> * for any read fully we should prefetch all blocks in the range requested
> * for other reads, we may want a bigger prefetch count than 1, depending on: 
> split start/end, file read policy (random, sequential, whole-file)
> * also, if a read is in a block other than the current one, but which is 
> already being fetched or cached, is this really an OOO read to the extent 
> that outstanding fetches should be cancelled?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18829) s3a prefetch LRU cache eviction metric

2023-08-01 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17750035#comment-17750035
 ] 

Viraj Jasani commented on HADOOP-18829:
---

sure thing, i think this can wait for sure. thanks

> s3a prefetch LRU cache eviction metric
> --
>
> Key: HADOOP-18829
> URL: https://issues.apache.org/jira/browse/HADOOP-18829
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> Follow-up from HADOOP-18291:
> Add new IO statistics metric to capture s3a prefetch LRU cache eviction.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+

2023-07-30 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17748981#comment-17748981
 ] 

Viraj Jasani commented on HADOOP-18832:
---

ITestS3AFileContextStatistics#testStatistics is flaky:
{code:java}
[ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.983 s 
<<< FAILURE! - in 
org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextStatistics
[ERROR] 
testStatistics(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextStatistics)
  Time elapsed: 1.776 s  <<< FAILURE!
java.lang.AssertionError: expected:<512> but was:<448>
    at org.junit.Assert.fail(Assert.java:89)
    at org.junit.Assert.failNotEquals(Assert.java:835)
    at org.junit.Assert.assertEquals(Assert.java:647)
    at org.junit.Assert.assertEquals(Assert.java:633)
    at 
org.apache.hadoop.fs.FCStatisticsBaseTest.testStatistics(FCStatisticsBaseTest.java:108)
 {code}
This only happened once, now unable to reproduce it locally.

> Upgrade aws-java-sdk to 1.12.499+
> -
>
> Key: HADOOP-18832
> URL: https://issues.apache.org/jira/browse/HADOOP-18832
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence 
> showing up in security CVE scans (CVE-2023-34462). The safe version for netty 
> is 4.1.94.Final and this is used by aws-java-sdk:1.12.499+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+

2023-07-30 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17748980#comment-17748980
 ] 

Viraj Jasani commented on HADOOP-18832:
---

Testing in progress: Test results look good with -scale and -prefetch so far.

Now running some encryption tests (bucket with algo: SSE-KMS).

> Upgrade aws-java-sdk to 1.12.499+
> -
>
> Key: HADOOP-18832
> URL: https://issues.apache.org/jira/browse/HADOOP-18832
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence 
> showing up in security CVE scans (CVE-2023-34462). The safe version for netty 
> is 4.1.94.Final and this is used by aws-java-sdk:1.12.499+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+

2023-07-30 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18832:
--
Description: aws sdk versions < 1.12.499 uses a vulnerable version of netty 
and hence showing up in security CVE scans (CVE-2023-34462). The safe version 
for netty is 4.1.94.Final and this is used by aws-java-sdk:1.12.499+  (was: aws 
sdk versions < 1.12.499 uses a vulnerable version of netty and hence showing up 
in security CVE scans (CVE-2023-34462). The safe version for netty is 
4.1.94.Final and this is used by aws-java-adk:1.12.499+)

> Upgrade aws-java-sdk to 1.12.499+
> -
>
> Key: HADOOP-18832
> URL: https://issues.apache.org/jira/browse/HADOOP-18832
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence 
> showing up in security CVE scans (CVE-2023-34462). The safe version for netty 
> is 4.1.94.Final and this is used by aws-java-sdk:1.12.499+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+

2023-07-30 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18832:
-

Assignee: Viraj Jasani

> Upgrade aws-java-sdk to 1.12.499+
> -
>
> Key: HADOOP-18832
> URL: https://issues.apache.org/jira/browse/HADOOP-18832
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence 
> showing up in security CVE scans (CVE-2023-34462). The safe version for netty 
> is 4.1.94.Final and this is used by aws-java-adk:1.12.499+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+

2023-07-30 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18832:
-

 Summary: Upgrade aws-java-sdk to 1.12.499+
 Key: HADOOP-18832
 URL: https://issues.apache.org/jira/browse/HADOOP-18832
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Reporter: Viraj Jasani


aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence 
showing up in security CVE scans (CVE-2023-34462). The safe version for netty 
is 4.1.94.Final and this is used by aws-java-adk:1.12.499+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18829) s3a prefetch LRU cache eviction metric

2023-07-26 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18829:
-

 Summary: s3a prefetch LRU cache eviction metric
 Key: HADOOP-18829
 URL: https://issues.apache.org/jira/browse/HADOOP-18829
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani
Assignee: Viraj Jasani


Follow-up from HADOOP-18291:

Add new IO statistics metric to capture s3a prefetch LRU cache eviction.
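A simplified, hedged stand-in for the idea using only JDK classes (not the actual S3A prefetch cache or IOStatistics API): an access-ordered LRU map that counts evictions, where the counter is the value that would be surfaced as the new statistic.

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Simplified stand-in for an LRU block cache with an eviction counter. */
class CountingLruCache<K, V> extends LinkedHashMap<K, V> {

  private final int maxEntries;
  private final AtomicLong evictions = new AtomicLong();

  CountingLruCache(int maxEntries) {
    super(16, 0.75f, true); // access-order gives LRU behaviour
    this.maxEntries = maxEntries;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    boolean evict = size() > maxEntries;
    if (evict) {
      evictions.incrementAndGet(); // value to publish as the eviction metric
    }
    return evict;
  }

  long evictionCount() {
    return evictions.get();
  }
}
{code}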



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18809) s3a prefetch read/write file operations should guard channel close

2023-07-17 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18809:
-

 Summary: s3a prefetch read/write file operations should guard 
channel close
 Key: HADOOP-18809
 URL: https://issues.apache.org/jira/browse/HADOOP-18809
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani
Assignee: Viraj Jasani


As per Steve's suggestion from the s3a prefetch LRU cache review,

s3a prefetch disk-based cache file read and write operations should guard the close 
of FileChannel and WritableByteChannel, closing them even if the read/write 
operations throw IOException.
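A hedged sketch of that guard: with try-with-resources the channel is closed even when the read throws, and the original IOException is not masked. The file layout and buffer handling are illustrative:

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class GuardedChannelIoSketch {

  private GuardedChannelIoSketch() {
  }

  /** Read from the cache file into the buffer, always closing the channel. */
  static int readFully(Path cacheFile, ByteBuffer buffer) throws IOException {
    // try-with-resources closes the FileChannel even if read() throws
    try (FileChannel channel = FileChannel.open(cacheFile, StandardOpenOption.READ)) {
      int total = 0;
      int read;
      while (buffer.hasRemaining() && (read = channel.read(buffer)) > 0) {
        total += read;
      }
      return total;
    }
  }
}
{code}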



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-18805) s3a large file prefetch tests are too slow, don't validate data

2023-07-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17743344#comment-17743344
 ] 

Viraj Jasani edited comment on HADOOP-18805 at 7/17/23 8:15 PM:


sorry Steve, i was not aware you already created this Jira, i created PR for 
letting LRU tests use small files rather than landsat: 
[https://github.com/apache/hadoop/pull/5851]
{quote}also, and this is very, very important, they can't validate the data
{quote}
i was about to create a sub-task for this as i am planning to refactor Entry to 
it's own class and have the contents of the linked list data tested in UT 
(discussed with Mehakmeet in the earlier part of the review). i can take this 
up as new sub-task and for the current Jira, we can focus on tests using small 
files for the better break-down?

 

PR review discussion: 
[https://github.com/apache/hadoop/pull/5754#discussion_r1247476231]


was (Author: vjasani):
sorry Steve, i was not aware you already created this Jira, i created addendum 
for letting LRU test depend on small file rather than large one: 
[https://github.com/apache/hadoop/pull/5843]
{quote}also, and this is very, very important, they can't validate the data
{quote}
i was about to create a sub-task for this as i am planning to refactor Entry to 
it's own class and have the contents of the linked list data tested in UT 
(discussed with Mehakmeet in the earlier part of the review). maybe i can do 
the work as part of this Jira.

 

are you fine with?
 * the above addendum PR for using small file in the test (so that we don't 
need to put the test under -scale)
 * this Jira to refactor Entry and allowing a UT to test the contents of the 
linked list

 

if you think above PR is not good for an addendum and should rather be linked 
to this Jira, i can change PR title to reflect this Jira number and i can 
create another sub-task to write simple UT that can test contents of the linked 
list from head to tail.

> s3a large file prefetch tests are too slow, don't validate data
> ---
>
> Key: HADOOP-18805
> URL: https://issues.apache.org/jira/browse/HADOOP-18805
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.9
>Reporter: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>
> the large file prefetch tests (including LRU cache eviction) are really slow.
> moving under -scale may hide the problem for most runs, but they are still 
> too slow, can time out, etc etc.
> also, and this is very, very important, they can't validate the data.
> Better: 
> * test on smaller files by setting a very small block size (1k bytes or less) 
> just to force paged reads of a small 16k file.
> * with known contents so that the values of all forms of read can be validated
> * maybe the LRU tests can work with a fake remote object which can then be 
> used in a unit test
> * extend one of the huge file tests to read from there -including s3-CSE 
> encryption coverage.
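A hedged sketch of the small-file approach suggested above: write a 16k file of known bytes, force a tiny prefetch block size, and check every byte on read-back. The prefetch configuration keys are assumptions and the test scaffolding is simplified:

{code:java}
import static org.junit.Assert.assertArrayEquals;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.contract.ContractTestUtils;

public final class SmallFilePrefetchSketch {

  private SmallFilePrefetchSketch() {
  }

  /** Write a small file of known bytes and validate a prefetched read. */
  public static void validatePrefetchRead(Configuration conf, Path path)
      throws Exception {
    conf.setBoolean("fs.s3a.prefetch.enabled", true);  // assumed key
    conf.setLong("fs.s3a.prefetch.block.size", 1024);  // assumed key, 1k blocks
    byte[] expected = ContractTestUtils.dataset(16 * 1024, 'a', 26);
    FileSystem fs = path.getFileSystem(conf);
    ContractTestUtils.createFile(fs, path, true, expected);

    byte[] actual = new byte[expected.length];
    try (FSDataInputStream in = fs.open(path)) {
      in.readFully(0, actual); // every byte is checked against the known dataset
    }
    assertArrayEquals(expected, actual);
  }
}
{code}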



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-18805) s3a large file prefetch tests are too slow, don't validate data

2023-07-14 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17743344#comment-17743344
 ] 

Viraj Jasani edited comment on HADOOP-18805 at 7/15/23 6:48 AM:


sorry Steve, i was not aware you already created this Jira, i created addendum 
for letting LRU test depend on small file rather than large one: 
[https://github.com/apache/hadoop/pull/5843]
{quote}also, and this is very, very important, they can't validate the data
{quote}
i was about to create a sub-task for this as i am planning to refactor Entry to 
it's own class and have the contents of the linked list data tested in UT 
(discussed with Mehakmeet in the earlier part of the review). maybe i can do 
the work as part of this Jira.

 

are you fine with?
 * the above addendum PR for using small file in the test (so that we don't 
need to put the test under -scale)
 * this Jira to refactor Entry and allowing a UT to test the contents of the 
linked list

 

if you think above PR is not good for an addendum and should rather be linked 
to this Jira, i can change PR title to reflect this Jira number and i can 
create another sub-task to write simple UT that can test contents of the linked 
list from head to tail.


was (Author: vjasani):
sorry Steve, i was not aware you already created this Jira, i created addendum 
for letting LRU test depend on small file rather than large one: 
[https://github.com/apache/hadoop/pull/5843]
{quote}also, and this is very, very important, they can't validate the data
{quote}
i was about to create a sub-task for this as i am planning to refactor Entry to 
it's own class and have the contents of the linked list data tested in UT 
(discussed with Mehakmeet in the earlier part of the review). maybe i can do 
the work as part of this Jira.

 

are you fine with the above addendum PR taking care of using small file in the 
test (so that we don't need to put the test under -scale) and this Jira being 
used for refactoring Entry and allowing a UT to test the contents of the linked 
list?

> s3a large file prefetch tests are too slow, don't validate data
> ---
>
> Key: HADOOP-18805
> URL: https://issues.apache.org/jira/browse/HADOOP-18805
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.9
>Reporter: Steve Loughran
>Priority: Major
>
> the large file prefetch tests (including LRU cache eviction) are really slow.
> moving under -scale may hide the problem for most runs, but they are still 
> too slow, can time out, etc etc.
> also, and this is very, very important, they can't validate the data.
> Better: 
> * test on smaller files by setting a very small block size (1k bytes or less) 
> just to force paged reads of a small 16k file.
> * with known contents so that the values of all forms of read can be validated
> * maybe the LRU tests can work with a fake remote object which can then be 
> used in a unit test
> * extend one of the huge file tests to read from there -including s3-CSE 
> encryption coverage.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org


