[jira] [Assigned] (HADOOP-19148) Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298
[ https://issues.apache.org/jira/browse/HADOOP-19148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Jasani reassigned HADOOP-19148:
-------------------------------------

    Assignee: (was: Viraj Jasani)

> Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298
> ------------------------------------------------------------
>
> Key: HADOOP-19148
> URL: https://issues.apache.org/jira/browse/HADOOP-19148
> Project: Hadoop Common
> Issue Type: Improvement
> Components: common
> Reporter: Brahma Reddy Battula
> Priority: Major
> Labels: pull-request-available
>
> Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298
[jira] [Resolved] (HADOOP-19072) S3A: expand optimisations on stores with "fs.s3a.performance.flags" for mkdir
[ https://issues.apache.org/jira/browse/HADOOP-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Jasani resolved HADOOP-19072.
-----------------------------------
    Hadoop Flags: Reviewed
      Resolution: Fixed

> S3A: expand optimisations on stores with "fs.s3a.performance.flags" for mkdir
> ------------------------------------------------------------------------------
>
> Key: HADOOP-19072
> URL: https://issues.apache.org/jira/browse/HADOOP-19072
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Steve Loughran
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
> On an s3a store with fs.s3a.create.performance set, speed up other operations:
> * mkdir to skip the parent directory check: just do a HEAD to see if there's a file at the target location
[jira] [Commented] (HADOOP-19256) Support S3 Conditional Writes
[ https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17876367#comment-17876367 ]

Viraj Jasani commented on HADOOP-19256:
---------------------------------------

Ahmar, regarding the new SDK: IIUC, only PutObjectRequest and CompleteMultipartUploadRequest need the new input param "ifNoneMatch()", whereas GetObjectRequest, HeadObjectRequest and CopyObjectRequest already have the required inputs in the SDK, right?

> Support S3 Conditional Writes
> -----------------------------
>
> Key: HADOOP-19256
> URL: https://issues.apache.org/jira/browse/HADOOP-19256
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Ahmar Suhail
> Priority: Major
>
> S3 Conditional Write (Put-if-absent) capability is now generally available - [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
> S3A should allow passing in this put-if-absent header to prevent overwriting of files.
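(For context: a minimal sketch of the put-if-absent call shape with AWS SDK v2, assuming an SDK version whose PutObjectRequest builder exposes ifNoneMatch(), per the comment above; bucket and key are placeholders. CompleteMultipartUploadRequest takes the same parameter.)

{code:java}
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

public class PutIfAbsentSketch {
  public static void main(String[] args) {
    try (S3Client s3 = S3Client.create()) {
      // If-None-Match: "*" asks S3 to reject the PUT with 412
      // (PreconditionFailed) when an object already exists at the key.
      PutObjectRequest request = PutObjectRequest.builder()
          .bucket("example-bucket")   // placeholder bucket
          .key("data/part-0000")      // placeholder key
          .ifNoneMatch("*")
          .build();
      s3.putObject(request, RequestBody.fromString("payload"));
    }
  }
}
{code}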
[jira] [Comment Edited] (HADOOP-19256) Support S3 Conditional Writes
[ https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17876056#comment-17876056 ]

Viraj Jasani edited comment on HADOOP-19256 at 8/22/24 7:12 PM:
----------------------------------------------------------------

{quote}we have *exactly* this for openFile() and createFile()
{quote}
Ah, you mean the openFileWithOptions() category of APIs, right? I missed this; I never got to explore this API.
{quote}is a new SDK needed here?
{quote}
I skimmed through the docs yesterday and they do not seem to mention anything about a new SDK. Also, the options to provide these params in the header are already available in the SDK we use, e.g. we use ifMatch() for the default ChangeDetectionPolicy.

was (Author: vjasani):
{quote}we have *exactly* this for openFile() and createFile()
{quote}
Ah, you mean the openFileWithOptions() category of APIs, right? I missed this; I never got to explore this API.
{quote}is a new SDK needed here?
{quote}
I skimmed through the docs yesterday and they do not seem to mention anything about a new SDK. Also, the options to provide these params in the header are already available in the SDK we use.

> Support S3 Conditional Writes
> -----------------------------
>
> Key: HADOOP-19256
> URL: https://issues.apache.org/jira/browse/HADOOP-19256
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Ahmar Suhail
> Priority: Major
>
> S3 Conditional Write (Put-if-absent) capability is now generally available - [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
> S3A should allow passing in this put-if-absent header to prevent overwriting of files.
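(For readers unfamiliar with the builder API referenced above, a rough sketch of the openFile() option mechanism; the option key "fs.s3a.open.if-match" below is purely hypothetical, used only to illustrate how a conditional header could be threaded through the builder.)

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class OpenFileSketch {
  public static void main(String[] args) throws Exception {
    Path path = new Path("s3a://example-bucket/data/part-0000"); // placeholder
    FileSystem fs = path.getFileSystem(new Configuration());
    // openFile() returns a builder; opt() passes a hint the FS may ignore,
    // while must() fails if the FS does not understand the option.
    try (FSDataInputStream in = fs.openFile(path)
        .opt("fs.s3a.open.if-match", "<etag>")   // hypothetical option key
        .build().get()) {
      in.read();
    }
  }
}
{code}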
[jira] [Commented] (HADOOP-19256) Support S3 Conditional Writes
[ https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17876076#comment-17876076 ]

Viraj Jasani commented on HADOOP-19256:
---------------------------------------

{quote}we already use conditional headers in read operations, using version or etag of a file to ensure that every GET in an input stream either always picks up the same file version (versioned option) or just etag validation (default)
{quote}
Steve, I agree that we have the ChangeDetectionPolicy (fs.s3a.change.detection.source), but it is still a generic config, and the config name does not say that it is used by getObject only, correct? I was thinking about having an API-level header as an s3afs config, but now I think whatever header value we want to use for conditional writes, we want to use for all the APIs: getObject, headObject and copyObject headers, rather than only for select ones. Even in that case, does it still make sense to have a config that can take multiple key-value pairs like I mentioned above? Because as per the docs, multiple headers can also be provided, e.g. If-Match and If-Unmodified-Since.

> Support S3 Conditional Writes
> -----------------------------
>
> Key: HADOOP-19256
> URL: https://issues.apache.org/jira/browse/HADOOP-19256
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Ahmar Suhail
> Priority: Major
>
> S3 Conditional Write (Put-if-absent) capability is now generally available - [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
> S3A should allow passing in this put-if-absent header to prevent overwriting of files.
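(To illustrate the "multiple headers on one request" point, a minimal AWS SDK v2 sketch of a GET carrying both If-Match and If-Unmodified-Since; bucket, key and etag values are placeholders.)

{code:java}
import java.io.InputStream;
import java.time.Instant;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;

public class ConditionalGetSketch {
  public static void main(String[] args) throws Exception {
    try (S3Client s3 = S3Client.create()) {
      // Both conditions ride on the same GET: S3 serves the object only if
      // the etag matches AND the object is unmodified since the timestamp.
      GetObjectRequest request = GetObjectRequest.builder()
          .bucket("example-bucket")                     // placeholder
          .key("data/part-0000")                        // placeholder
          .ifMatch("\"<etag>\"")                        // placeholder etag
          .ifUnmodifiedSince(Instant.parse("2024-02-03T10:15:30.00Z"))
          .build();
      try (InputStream in = s3.getObject(request)) {
        in.read();
      }
    }
  }
}
{code}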
[jira] [Commented] (HADOOP-19256) Support S3 Conditional Writes
[ https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17876056#comment-17876056 ]

Viraj Jasani commented on HADOOP-19256:
---------------------------------------

{quote}we have *exactly* this for openFile() and createFile()
{quote}
Ah, you mean the openFileWithOptions() category of APIs, right? I missed this; I never got to explore this API.
{quote}is a new SDK needed here?
{quote}
I skimmed through the docs yesterday and they do not seem to mention anything about a new SDK. Also, the options to provide these params in the header are already available in the SDK we use.

> Support S3 Conditional Writes
> -----------------------------
>
> Key: HADOOP-19256
> URL: https://issues.apache.org/jira/browse/HADOOP-19256
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Ahmar Suhail
> Priority: Major
>
> S3 Conditional Write (Put-if-absent) capability is now generally available - [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
> S3A should allow passing in this put-if-absent header to prevent overwriting of files.
[jira] [Comment Edited] (HADOOP-19256) Support S3 Conditional Writes
[ https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17875670#comment-17875670 ]

Viraj Jasani edited comment on HADOOP-19256 at 8/21/24 10:15 PM:
-----------------------------------------------------------------

Does this mean we need S3A configs for each of the getObject, headObject and copyObject headers? Probably we can introduce something similar to "fs.s3a.aws.credentials.provider.mapping"? e.g.
{code:java}
fs.s3a.getobject.headers
If-Match=, If-Modified-Since=2024-02-03T10:15:30.00Z, If-None-Match=, If-Unmodified-Since=2024-02-03T10:15:30.00Z
{code}
Both "If-Modified-Since" and "If-Unmodified-Since" can be made relative values too, from the s3a viewpoint.

was (Author: vjasani):
Does this mean we need S3A configs for each of the getObject, headObject and copyObject headers? Probably we can introduce something similar to "fs.s3a.aws.credentials.provider.mapping"? e.g.
{code:java}
fs.s3a.getobject.headers
If-Match=, If-Modified-Since=2024-02-03T10:15:30.00Z, If-None-Match=, If-Unmodified-Since=2024-02-03T10:15:30.00Z
{code}

> Support S3 Conditional Writes
> -----------------------------
>
> Key: HADOOP-19256
> URL: https://issues.apache.org/jira/browse/HADOOP-19256
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Ahmar Suhail
> Priority: Major
>
> S3 Conditional Write (Put-if-absent) capability is now generally available - [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
> S3A should allow passing in this put-if-absent header to prevent overwriting of files.
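(A rough sketch of how such a property could be parsed on the S3A side, assuming the comma-separated key=value format proposed above; the property name "fs.s3a.getobject.headers" is the proposal, not an existing config.)

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;

public class HeaderConfigSketch {
  /** Parse "K1=V1, K2=V2, ..." into an ordered header map. */
  static Map<String, String> parseHeaders(Configuration conf, String key) {
    Map<String, String> headers = new LinkedHashMap<>();
    for (String pair : conf.getTrimmedStrings(key)) {  // splits on commas
      int eq = pair.indexOf('=');
      if (eq > 0) {
        headers.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
      }
    }
    return headers;
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration(false);
    conf.set("fs.s3a.getobject.headers",   // proposed, not an existing key
        "If-Match=\"<etag>\", If-Unmodified-Since=2024-02-03T10:15:30.00Z");
    System.out.println(parseHeaders(conf, "fs.s3a.getobject.headers"));
  }
}
{code}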
[jira] [Commented] (HADOOP-19256) Support S3 Conditional Writes
[ https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17875673#comment-17875673 ]

Viraj Jasani commented on HADOOP-19256:
---------------------------------------

FileSystem APIs do not have a "Map" type input for file operation metadata, otherwise S3A could leverage it. On the other hand, having a config means it will be applicable to all file operations performed on the given s3afs instance.

> Support S3 Conditional Writes
> -----------------------------
>
> Key: HADOOP-19256
> URL: https://issues.apache.org/jira/browse/HADOOP-19256
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Ahmar Suhail
> Priority: Major
>
> S3 Conditional Write (Put-if-absent) capability is now generally available - [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
> S3A should allow passing in this put-if-absent header to prevent overwriting of files.
[jira] [Commented] (HADOOP-19256) Support S3 Conditional Writes
[ https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17875670#comment-17875670 ]

Viraj Jasani commented on HADOOP-19256:
---------------------------------------

Does this mean we need S3A configs for each of the getObject, headObject and copyObject headers? Probably we can introduce something similar to "fs.s3a.aws.credentials.provider.mapping"? e.g.
{code:java}
fs.s3a.getobject.headers
If-Match=, If-Modified-Since=2024-02-03T10:15:30.00Z, If-None-Match=, If-Unmodified-Since=2024-02-03T10:15:30.00Z
{code}

> Support S3 Conditional Writes
> -----------------------------
>
> Key: HADOOP-19256
> URL: https://issues.apache.org/jira/browse/HADOOP-19256
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Ahmar Suhail
> Priority: Major
>
> S3 Conditional Write (Put-if-absent) capability is now generally available - [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
> S3A should allow passing in this put-if-absent header to prevent overwriting of files.
[jira] [Updated] (HADOOP-19072) S3A: expand optimisations on stores with "fs.s3a.performance.flags" for mkdir
[ https://issues.apache.org/jira/browse/HADOOP-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Jasani updated HADOOP-19072:
----------------------------------
    Summary: S3A: expand optimisations on stores with "fs.s3a.performance.flags" for mkdir  (was: S3A: expand optimisations on stores with "fs.s3a.create.performance")

> S3A: expand optimisations on stores with "fs.s3a.performance.flags" for mkdir
> ------------------------------------------------------------------------------
>
> Key: HADOOP-19072
> URL: https://issues.apache.org/jira/browse/HADOOP-19072
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Steve Loughran
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
>
> On an s3a store with fs.s3a.create.performance set, speed up other operations:
> * mkdir to skip the parent directory check: just do a HEAD to see if there's a file at the target location
[jira] [Commented] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object
[ https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867426#comment-17867426 ]

Viraj Jasani commented on HADOOP-19218:
---------------------------------------

Please review [https://github.com/apache/hadoop/pull/6951]

> Avoid DNS lookup while creating IPC Connection object
> -----------------------------------------------------
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
> Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic. Every DNS lookup takes 1ms in general.
> However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.
> While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch. This would also save the ~1ms spent even on a healthy DNS lookup.
[jira] [Comment Edited] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object
[ https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867232#comment-17867232 ]

Viraj Jasani edited comment on HADOOP-19218 at 7/19/24 8:10 AM:
----------------------------------------------------------------

Yeah, I am also a bit confused; I can create an addendum PR based on your decision [~hexiaoqiao] [~ayushtkn]
{quote}Is this issue only in 3.4.0 and trunk?
{quote}
That is correct, because the test and the improvement to log the longest lock holder (HDFS-15217) are available only since 3.4.0, whereas HADOOP-18628 has been present since 3.3.6/3.4.0.

was (Author: vjasani):
Yeah, I am also a bit confused; I can create an addendum PR based on your decision [~hexiaoqiao] [~ayushtkn]
{quote}Is this issue only in 3.4.0 and trunk?
{quote}
That is correct, because the test and the improvement to log the longest lock holder (HDFS-15217) are available only since 3.4.0.

> Avoid DNS lookup while creating IPC Connection object
> -----------------------------------------------------
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
> Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic. Every DNS lookup takes 1ms in general.
> However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.
> While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch. This would also save the ~1ms spent even on a healthy DNS lookup.
[jira] [Comment Edited] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object
[ https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867232#comment-17867232 ]

Viraj Jasani edited comment on HADOOP-19218 at 7/19/24 8:08 AM:
----------------------------------------------------------------

Yeah, I am also a bit confused; I can create an addendum PR based on your decision [~hexiaoqiao] [~ayushtkn]
{quote}Is this issue only in 3.4.0 and trunk?
{quote}
That is correct, because the test and the improvement to log the longest lock holder (HDFS-15217) are available only since 3.4.0.

was (Author: vjasani):
Yeah, I am also a bit confused; I can create an addendum PR based on your decision [~hexiaoqiao] [~ayushtkn]

> Avoid DNS lookup while creating IPC Connection object
> -----------------------------------------------------
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
> Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic. Every DNS lookup takes 1ms in general.
> However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.
> While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch. This would also save the ~1ms spent even on a healthy DNS lookup.
[jira] [Commented] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object
[ https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867232#comment-17867232 ]

Viraj Jasani commented on HADOOP-19218:
---------------------------------------

Yeah, I am also a bit confused; I can create an addendum PR based on your decision [~hexiaoqiao] [~ayushtkn]

> Avoid DNS lookup while creating IPC Connection object
> -----------------------------------------------------
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
> Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic. Every DNS lookup takes 1ms in general.
> However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.
> While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch. This would also save the ~1ms spent even on a healthy DNS lookup.
[jira] [Commented] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object
[ https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867231#comment-17867231 ]

Viraj Jasani commented on HADOOP-19218:
---------------------------------------

Anyway, if we want to keep the (host + ip) format (available since 3.4.0) for the longest lock holder (HDFS-15217), we can still make it happen with a simple patch:
{code:java}
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
index 2cb29dfef8e..4a308bce9cc 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
@@ -8838,6 +8838,9 @@ private Supplier<String> getLockReportInfoSupplier(String src, String dst,
     UserGroupInformation ugi = Server.getRemoteUser();
     String userName = ugi != null ? ugi.toString() : null;
     InetAddress addr = Server.getRemoteIp();
+    if (addr != null) {
+      addr.getHostName();
+    }
     StringBuilder sb = new StringBuilder();
     String s = escapeJava(src);
     String d = escapeJava(dst);
{code}
Otherwise, if we decide to follow the same format (ip only) for all types of audit logs, including the longest lock holder (HDFS-15217), then we will need to update the test. Though given that we have already rolled out 3.4.0 with HDFS-15217, we can go with the simple fix above. Having the host name is always useful for k8s environments; it's just that we can optimize by not performing the DNS lookup while creating the IPC Connection object, which was the main purpose of this Jira.

> Avoid DNS lookup while creating IPC Connection object
> -----------------------------------------------------
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
> Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic. Every DNS lookup takes 1ms in general.
> However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.
> While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch. This would also save the ~1ms spent even on a healthy DNS lookup.
[jira] [Commented] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object
[ https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867222#comment-17867222 ]

Viraj Jasani commented on HADOOP-19218:
---------------------------------------

For TestFSNamesystemLockReport, yes, it is broken by this patch. The audit log comparison in TestFSNamesystemLockReport was updated with [https://github.com/apache/hadoop/pull/5407]; the pattern "[a-zA-Z0-9.]+" was added to the test as per the feedback (and the fact that HDFS-15217 was meant for 3.4.0 only, which had not been released at that time). As far as the FSNamesystem audit log is concerned, no compatibility is broken. As far as HDFS-15217 is concerned, 3.4.0 was released with hostname + ip address (whereas the FSNamesystem audit log has always had the ip address only). If we compare with the FSNamesystem-style audit log, we could now say that HDFS-15217 should also follow the same pattern, but the only concern is that 3.4.0 is already released.

> Avoid DNS lookup while creating IPC Connection object
> -----------------------------------------------------
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
> Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic. Every DNS lookup takes 1ms in general.
> However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.
> While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch. This would also save the ~1ms spent even on a healthy DNS lookup.
[jira] [Commented] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object
[ https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867077#comment-17867077 ]

Viraj Jasani commented on HADOOP-19218:
---------------------------------------

Thank you once again!

> Avoid DNS lookup while creating IPC Connection object
> -----------------------------------------------------
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
> Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic. Every DNS lookup takes 1ms in general.
> However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.
> While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch. This would also save the ~1ms spent even on a healthy DNS lookup.
[jira] [Comment Edited] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object
[ https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866882#comment-17866882 ]

Viraj Jasani edited comment on HADOOP-19218 at 7/17/24 11:50 PM:
-----------------------------------------------------------------

Thanks for the reviews [~shahrs87] [~dmanning] [~hexiaoqiao], and thanks for merging the PR [~hexiaoqiao]! Could you please also help backport the commit to the 3.4 and 3.3 branches? It will be a clean backport. Or do you want me to create PRs for both the 3.4 and 3.3 branches?

was (Author: vjasani):
Thanks for the reviews [~shahrs87] [~dmanning] [~hexiaoqiao], and thanks for merging the PR [~hexiaoqiao]! Could you please also help backport the commit to the 3.4 and 3.3 branches? It will be a clean backport.

> Avoid DNS lookup while creating IPC Connection object
> -----------------------------------------------------
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.5.0
>
> Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic. Every DNS lookup takes 1ms in general.
> However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.
> While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch. This would also save the ~1ms spent even on a healthy DNS lookup.
[jira] [Commented] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object
[ https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866882#comment-17866882 ]

Viraj Jasani commented on HADOOP-19218:
---------------------------------------

Thanks for the reviews [~shahrs87] [~dmanning] [~hexiaoqiao], and thanks for merging the PR [~hexiaoqiao]! Could you please also help backport the commit to the 3.4 and 3.3 branches? It will be a clean backport.

> Avoid DNS lookup while creating IPC Connection object
> -----------------------------------------------------
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.5.0
>
> Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic. Every DNS lookup takes 1ms in general.
> However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.
> While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch. This would also save the ~1ms spent even on a healthy DNS lookup.
[jira] [Updated] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object
[ https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Jasani updated HADOOP-19218:
----------------------------------
    Component/s: ipc

> Avoid DNS lookup while creating IPC Connection object
> -----------------------------------------------------
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
>
> Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic. Every DNS lookup takes 1ms in general.
> However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.
> While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch. This would also save the ~1ms spent even on a healthy DNS lookup.
[jira] [Commented] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object
[ https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17862591#comment-17862591 ]

Viraj Jasani commented on HADOOP-19218:
---------------------------------------

FYI [~UselessCoder] [~dmanning] [~shahrs87] [~apurtell]

> Avoid DNS lookup while creating IPC Connection object
> -----------------------------------------------------
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
>
> Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic. Every DNS lookup takes 1ms in general.
> However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.
> While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch. This would also save the ~1ms spent even on a healthy DNS lookup.
[jira] [Commented] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object
[ https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17862579#comment-17862579 ]

Viraj Jasani commented on HADOOP-19218:
---------------------------------------

Thread dump ref:
{code:java}
"IPC Server listener on 8020" #92 daemon prio=5 os_prio=0 tid=0x7f23a9592800 nid=0x81744 runnable [0x7f23ad38a000]
   java.lang.Thread.State: RUNNABLE
        at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:867)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1302)
        at java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:815)
        - locked <0x7f2bc29c6a10> (a java.net.InetAddress$NameServiceAddresses)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1291)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1211)
        at java.net.InetAddress.getHostFromNameService(InetAddress.java:637)
        at java.net.InetAddress.getHostName(InetAddress.java:562)
        at java.net.InetAddress.getHostName(InetAddress.java:534)
        at org.apache.hadoop.ipc.Server$Connection.<init>(Server.java:1916)
        at org.apache.hadoop.ipc.Server$ConnectionManager.register(Server.java:3841)
        at org.apache.hadoop.ipc.Server$Listener.doAccept(Server.java:1448)
        at org.apache.hadoop.ipc.Server$Listener.run(Server.java:1389)
{code}

> Avoid DNS lookup while creating IPC Connection object
> -----------------------------------------------------
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
>
> Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic. Every DNS lookup takes 1ms in general.
> However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.
> While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch.
[jira] [Updated] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object
[ https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Jasani updated HADOOP-19218:
----------------------------------
    Description: 
Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic. Every DNS lookup takes 1ms in general.

However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.

While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.

The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch. This would also save the ~1ms spent even on a healthy DNS lookup.

  was:
Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic.

However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.

While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.

The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch.

> Avoid DNS lookup while creating IPC Connection object
> -----------------------------------------------------
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
>
> Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic. Every DNS lookup takes 1ms in general.
> However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.
> While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch. This would also save the ~1ms spent even on a healthy DNS lookup.
[jira] [Assigned] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object
[ https://issues.apache.org/jira/browse/HADOOP-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Jasani reassigned HADOOP-19218:
-------------------------------------

    Assignee: Viraj Jasani

> Avoid DNS lookup while creating IPC Connection object
> -----------------------------------------------------
>
> Key: HADOOP-19218
> URL: https://issues.apache.org/jira/browse/HADOOP-19218
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
>
> Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic.
> However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.
> While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.
> The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch.
[jira] [Created] (HADOOP-19218) Avoid DNS lookup while creating IPC Connection object
Viraj Jasani created HADOOP-19218:
----------------------------------

Summary: Avoid DNS lookup while creating IPC Connection object
Key: HADOOP-19218
URL: https://issues.apache.org/jira/browse/HADOOP-19218
Project: Hadoop Common
Issue Type: Improvement
Reporter: Viraj Jasani

Been running HADOOP-18628 in production for quite some time; everything works fine as long as the DNS servers in HA are available. Upgrading a single NS server at a time is also a common case, not problematic.

However, we recently encountered a case where 2 out of 4 NS servers went down (temporarily, but it's a rare case). With a small-duration DNS cache and a 2s NS fallback timeout configured in resolv.conf, any client performing a DNS lookup can encounter a 4s+ delay. This caused a namenode outage, as the listener thread is single-threaded and was not able to keep up with the large number of unique clients (in direct proportion to the number of DNS resolutions every few seconds) initiating connections on the listener port.

While having 2 out of 4 DNS servers offline is a rare case and the NS fallback settings could also be improved, it is important to note that we don't need to perform DNS resolution for every new connection if the intention is to improve the insights into VersionMismatch errors thrown by the server.

The proposal is to delay the DNS resolution until the server throws the error for an incompatible header or version mismatch.
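(A minimal sketch of the "resolve lazily" idea described above, not the actual HADOOP-19218 patch: capture the address cheaply and pay for the reverse DNS lookup only if the error path actually needs the hostname.)

{code:java}
import java.net.InetAddress;
import java.util.function.Supplier;

public class LazyHostnameSketch {
  /**
   * Defer the reverse DNS lookup: the Supplier only resolves the
   * hostname when (and if) get() is called on the error path.
   */
  static Supplier<String> lazyHostName(InetAddress addr) {
    return () -> addr != null ? addr.getHostName() : "unknown";
  }

  public static void main(String[] args) throws Exception {
    InetAddress addr = InetAddress.getByName("127.0.0.1");
    Supplier<String> hostName = lazyHostName(addr);
    // No reverse lookup has happened yet; it happens only on get():
    System.out.println("client: " + hostName.get());
  }
}
{code}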
[jira] [Commented] (HADOOP-19197) S3A: Support AWS KMS Encryption Context
[ https://issues.apache.org/jira/browse/HADOOP-19197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853327#comment-17853327 ]

Viraj Jasani commented on HADOOP-19197:
---------------------------------------

Amazing, will take a look. Thanks for working on this!

> S3A: Support AWS KMS Encryption Context
> ---------------------------------------
>
> Key: HADOOP-19197
> URL: https://issues.apache.org/jira/browse/HADOOP-19197
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Raphael Azzolini
> Priority: Major
> Labels: pull-request-available
>
> S3A properties allow users to choose the AWS KMS key ({_}fs.s3a.encryption.key{_}) and the S3 encryption algorithm to be used ({_}fs.s3a.encryption.algorithm{_}). In addition to the AWS KMS key, an encryption context can be used as non-secret data that adds additional integrity and authenticity checks to the encrypted data. However, there is no option to specify the [AWS KMS Encryption Context|https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#encrypt_context] in S3A.
> In AWS SDK v2 the encryption context in S3 requests is set by the parameter [ssekmsEncryptionContext|https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/model/CreateMultipartUploadRequest.Builder.html#ssekmsEncryptionContext(java.lang.String)]. It receives a base64-encoded UTF-8 string holding JSON with the encryption context key-value pairs. The value of this parameter could be set by the user in a new property {_}*fs.s3a.encryption.context*{_} and be stored in [EncryptionSecrets|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/auth/delegation/EncryptionSecrets.java], to be used later when setting the encryption parameters in [RequestFactoryImpl|https://github.com/apache/hadoop/blob/f92a8ab8ae54f11946412904973eb60404dee7ff/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/RequestFactoryImpl.java].
[jira] [Assigned] (HADOOP-19197) S3A: Support AWS KMS Encryption Context
[ https://issues.apache.org/jira/browse/HADOOP-19197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Jasani reassigned HADOOP-19197:
-------------------------------------

    Assignee: (was: Viraj Jasani)

> S3A: Support AWS KMS Encryption Context
> ---------------------------------------
>
> Key: HADOOP-19197
> URL: https://issues.apache.org/jira/browse/HADOOP-19197
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Raphael Azzolini
> Priority: Major
> Labels: pull-request-available
>
> S3A properties allow users to choose the AWS KMS key ({_}fs.s3a.encryption.key{_}) and the S3 encryption algorithm to be used ({_}fs.s3a.encryption.algorithm{_}). In addition to the AWS KMS key, an encryption context can be used as non-secret data that adds additional integrity and authenticity checks to the encrypted data. However, there is no option to specify the [AWS KMS Encryption Context|https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#encrypt_context] in S3A.
> In AWS SDK v2 the encryption context in S3 requests is set by the parameter [ssekmsEncryptionContext|https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/model/CreateMultipartUploadRequest.Builder.html#ssekmsEncryptionContext(java.lang.String)]. It receives a base64-encoded UTF-8 string holding JSON with the encryption context key-value pairs. The value of this parameter could be set by the user in a new property {_}*fs.s3a.encryption.context*{_} and be stored in [EncryptionSecrets|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/auth/delegation/EncryptionSecrets.java], to be used later when setting the encryption parameters in [RequestFactoryImpl|https://github.com/apache/hadoop/blob/f92a8ab8ae54f11946412904973eb60404dee7ff/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/RequestFactoryImpl.java].
[jira] [Assigned] (HADOOP-19197) S3A: Support AWS KMS Encryption Context
[ https://issues.apache.org/jira/browse/HADOOP-19197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Jasani reassigned HADOOP-19197:
-------------------------------------

    Assignee: Viraj Jasani

> S3A: Support AWS KMS Encryption Context
> ---------------------------------------
>
> Key: HADOOP-19197
> URL: https://issues.apache.org/jira/browse/HADOOP-19197
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Raphael Azzolini
> Assignee: Viraj Jasani
> Priority: Major
>
> S3A properties allow users to choose the AWS KMS key ({_}fs.s3a.encryption.key{_}) and the S3 encryption algorithm to be used ({_}fs.s3a.encryption.algorithm{_}). In addition to the AWS KMS key, an encryption context can be used as non-secret data that adds additional integrity and authenticity checks to the encrypted data. However, there is no option to specify the [AWS KMS Encryption Context|https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#encrypt_context] in S3A.
> In AWS SDK v2 the encryption context in S3 requests is set by the parameter [ssekmsEncryptionContext|https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/model/CreateMultipartUploadRequest.Builder.html#ssekmsEncryptionContext(java.lang.String)]. It receives a base64-encoded UTF-8 string holding JSON with the encryption context key-value pairs. The value of this parameter could be set by the user in a new property {_}*fs.s3a.encryption.context*{_} and be stored in [EncryptionSecrets|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/auth/delegation/EncryptionSecrets.java], to be used later when setting the encryption parameters in [RequestFactoryImpl|https://github.com/apache/hadoop/blob/f92a8ab8ae54f11946412904973eb60404dee7ff/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/RequestFactoryImpl.java].
[jira] [Commented] (HADOOP-19197) S3A: Support AWS KMS Encryption Context
[ https://issues.apache.org/jira/browse/HADOOP-19197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853304#comment-17853304 ] Viraj Jasani commented on HADOOP-19197: --- How about we allow the user to configure _fs.s3a.encryption.context_ similar to what we allow for {_}fs.s3a.aws.credentials.provider.mapping{_}? i.e. accept key-value pairs of String values, and let S3A take care of converting them to the Base64-encoded JSON of String key-value pairs. Given that the context is sent in plain text anyway (it's just a Base64-encoded JSON String, not a secret key), we can allow the user to configure plain-text key-value pairs separated by "=" with {_}fs.s3a.encryption.context{_}. Sample validation error when we pass anything other than Base64-encoded JSON: {code:java} Caused by: software.amazon.awssdk.services.s3.model.S3Exception: The header 'x-amz-server-side-encryption-context' shall be Base64-encoded UTF-8 string holding JSON which represents a string-string map (Service: S3, Status Code: 400, Request ID: SC3CA6BGC8B8RBRD, Extended Request ID: 8iCVA0qZsxlPXxkDpR49Gtah5LlcgTojtoHyvSEvdY25Kqow5/SPMtXIzuIKzgra16t5e23VQIc6iNle0FhcGw==){code} > S3A: Support AWS KMS Encryption Context > --- > > Key: HADOOP-19197 > URL: https://issues.apache.org/jira/browse/HADOOP-19197 > Project: Hadoop Common > Issue Type: New Feature > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Raphael Azzolini >Priority: Major > > S3A properties allow users to choose the AWS KMS key > ({_}fs.s3a.encryption.key{_}) and S3 encryption algorithm to be used > (f{_}s.s3a.encryption.algorithm{_}). In addition to the AWS KMS Key, an > encryption context can be used as non-secret data that adds additional > integrity and authenticity to check the encrypted data. However, there is no > option to specify the [AWS KMS Encryption > Context|https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#encrypt_context] > in S3A. > In AWS SDK v2 the encryption context in S3 requests is set by the parameter > [ssekmsEncryptionContext.|https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/model/CreateMultipartUploadRequest.Builder.html#ssekmsEncryptionContext(java.lang.String)] > It receives a base64-encoded UTF-8 string holding JSON with the encryption > context key-value pairs. The value of this parameter could be set by the user > in a new property {_}*fs.s3a.encryption.context*{_}, and be stored in the > [EncryptionSecrets|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/auth/delegation/EncryptionSecrets.java] > to later be used when setting the encryption parameters in > [RequestFactoryImpl|https://github.com/apache/hadoop/blob/f92a8ab8ae54f11946412904973eb60404dee7ff/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/RequestFactoryImpl.java]. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
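A minimal sketch of the conversion proposed in that comment, using only the JDK (class and method names are illustrative, and no quoting or escaping of special characters is attempted):
{code:java}
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public final class EncryptionContextCodec {
  /**
   * Turn "project=hadoop,team=s3a" into the Base64-encoded UTF-8 JSON
   * string-string map that the SDK expects for the encryption context.
   */
  public static String encode(String pairs) {
    Map<String, String> ctx = new LinkedHashMap<>();
    for (String pair : pairs.split(",")) {
      String[] kv = pair.split("=", 2);
      ctx.put(kv[0].trim(), kv[1].trim());
    }
    String json = ctx.entrySet().stream()
        .map(e -> "\"" + e.getKey() + "\":\"" + e.getValue() + "\"")
        .collect(Collectors.joining(",", "{", "}"));
    return Base64.getEncoder().encodeToString(json.getBytes(StandardCharsets.UTF_8));
  }
}
{code}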
[jira] [Commented] (HADOOP-19197) S3A: Support AWS KMS Encryption Context
[ https://issues.apache.org/jira/browse/HADOOP-19197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853016#comment-17853016 ] Viraj Jasani commented on HADOOP-19197: --- We need to use it in 3 places: CopyObjectRequest, PutObjectRequest and CreateMultipartUploadRequest. > S3A: Support AWS KMS Encryption Context > --- > > Key: HADOOP-19197 > URL: https://issues.apache.org/jira/browse/HADOOP-19197 > Project: Hadoop Common > Issue Type: New Feature > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Raphael Azzolini >Priority: Major > > S3A properties allow users to choose the AWS KMS key > ({_}fs.s3a.encryption.key{_}) and S3 encryption algorithm to be used > (f{_}s.s3a.encryption.algorithm{_}). In addition to the AWS KMS Key, an > encryption context can be used as non-secret data that adds additional > integrity and authenticity to check the encrypted data. However, there is no > option to specify the [AWS KMS Encryption > Context|https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#encrypt_context] > in S3A. > In AWS SDK v2 the encryption context in S3 requests is set by the parameter > [ssekmsEncryptionContext.|https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/model/CreateMultipartUploadRequest.Builder.html#ssekmsEncryptionContext(java.lang.String)] > It receives a base64-encoded UTF-8 string holding JSON with the encryption > context key-value pairs. The value of this parameter could be set by the user > in a new property {_}*fs.s3a.encryption.context*{_}, and be stored in the > [EncryptionSecrets|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/auth/delegation/EncryptionSecrets.java] > to later be used when setting the encryption parameters in > [RequestFactoryImpl|https://github.com/apache/hadoop/blob/f92a8ab8ae54f11946412904973eb60404dee7ff/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/RequestFactoryImpl.java]. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
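For reference, a sketch of wiring that parameter into the three request types. The ssekmsEncryptionContext builder method is the one named in the SDK Javadoc linked in the description; everything else, including how RequestFactoryImpl actually assembles requests, is simplified:
{code:java}
import software.amazon.awssdk.services.s3.model.CopyObjectRequest;
import software.amazon.awssdk.services.s3.model.CreateMultipartUploadRequest;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

// encryptionContext is the Base64-encoded JSON string described above.
PutObjectRequest put = PutObjectRequest.builder()
    .bucket(bucket).key(key)
    .ssekmsEncryptionContext(encryptionContext)
    .build();

CopyObjectRequest copy = CopyObjectRequest.builder()
    .sourceBucket(bucket).sourceKey(srcKey)
    .destinationBucket(bucket).destinationKey(dstKey)
    .ssekmsEncryptionContext(encryptionContext)
    .build();

CreateMultipartUploadRequest mpu = CreateMultipartUploadRequest.builder()
    .bucket(bucket).key(key)
    .ssekmsEncryptionContext(encryptionContext)
    .build();
{code}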
[jira] [Commented] (HADOOP-19148) Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298
[ https://issues.apache.org/jira/browse/HADOOP-19148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17846806#comment-17846806 ] Viraj Jasani commented on HADOOP-19148: --- Build is fine and the dependency tree looks good (except that the zookeeper-jute transitive version comes in as 3.6.2 instead of 3.8.4); let me create a PR to run the whole build with tests. {code:java} [INFO] +- org.apache.solr:solr-solrj:jar:8.11.3:compile [INFO] | +- com.fasterxml.woodstox:woodstox-core:jar:5.4.0:compile [INFO] | +- commons-io:commons-io:jar:2.14.0:compile [INFO] | +- commons-lang:commons-lang:jar:2.6:compile [INFO] | +- io.netty:netty-buffer:jar:4.1.100.Final:compile [INFO] | +- io.netty:netty-codec:jar:4.1.100.Final:compile [INFO] | +- io.netty:netty-common:jar:4.1.100.Final:compile [INFO] | +- io.netty:netty-handler:jar:4.1.100.Final:compile [INFO] | +- io.netty:netty-resolver:jar:4.1.100.Final:compile [INFO] | +- io.netty:netty-transport:jar:4.1.100.Final:compile [INFO] | +- io.netty:netty-transport-native-epoll:jar:4.1.100.Final:compile [INFO] | +- io.netty:netty-transport-native-unix-common:jar:4.1.100.Final:compile [INFO] | +- org.apache.commons:commons-math3:jar:3.6.1:compile [INFO] | +- org.apache.httpcomponents:httpclient:jar:4.5.13:compile [INFO] | +- org.apache.httpcomponents:httpcore:jar:4.4.13:compile [INFO] | +- org.apache.httpcomponents:httpmime:jar:4.5.13:compile [INFO] | +- org.apache.zookeeper:zookeeper:jar:3.8.4:compile [INFO] | +- org.apache.zookeeper:zookeeper-jute:jar:3.6.2:compile [INFO] | +- org.codehaus.woodstox:stax2-api:jar:4.2.1:compile ... ... {code} > Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298 > --- > > Key: HADOOP-19148 > URL: https://issues.apache.org/jira/browse/HADOOP-19148 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Reporter: Brahma Reddy Battula >Assignee: Viraj Jasani >Priority: Major > > Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
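If the zookeeper-jute mismatch called out above ever needs fixing, the usual Maven remedy is a dependencyManagement pin. A sketch only; which pom should carry the pin is an assumption:
{code:xml}
<dependencyManagement>
  <dependencies>
    <!-- Align the transitive zookeeper-jute with the zookeeper 3.8.4 line. -->
    <dependency>
      <groupId>org.apache.zookeeper</groupId>
      <artifactId>zookeeper-jute</artifactId>
      <version>3.8.4</version>
    </dependency>
  </dependencies>
</dependencyManagement>
{code}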
[jira] [Commented] (HADOOP-19148) Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298
[ https://issues.apache.org/jira/browse/HADOOP-19148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17846458#comment-17846458 ] Viraj Jasani commented on HADOOP-19148: --- [~brahmareddy], is anyone picking this up? If not, shall I create the PR? > Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298 > --- > > Key: HADOOP-19148 > URL: https://issues.apache.org/jira/browse/HADOOP-19148 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Reporter: Brahma Reddy Battula >Priority: Major > > Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails
[ https://issues.apache.org/jira/browse/HADOOP-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-19146: -- Component/s: test > noaa-cors-pds bucket access with global endpoint fails > -- > > Key: HADOOP-19146 > URL: https://issues.apache.org/jira/browse/HADOOP-19146 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3, test >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > All tests accessing noaa-cors-pds use us-east-1 region, as configured at > bucket level. If global endpoint is configured (e.g. us-west-2), they fail to > access to bucket. > > Sample error: > {code:java} > org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect > response to region [us-east-1]. This likely indicates that the S3 region > configured in fs.s3a.endpoint.region does not match the AWS region containing > the bucket.: null (Service: S3, Status Code: 301, Request ID: > PMRWMQC9S91CNEJR, Extended Request ID: > 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==) > at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253) > at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922) > at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115) > at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349) > at org.apache.hadoop.fs.Globber.glob(Globber.java:202) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949) > at > org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313) > at > org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281) > at > org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445) > at > org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311) > at > org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328) > at > 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674) > {code} > {code:java} > Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null > (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended > Request ID: > 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==) > at > software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156) > at > software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108) > at > software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85) > at > software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43) >
[jira] [Created] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails
Viraj Jasani created HADOOP-19146: - Summary: noaa-cors-pds bucket access with global endpoint fails Key: HADOOP-19146 URL: https://issues.apache.org/jira/browse/HADOOP-19146 Project: Hadoop Common Issue Type: Improvement Components: fs/s3 Affects Versions: 3.4.0 Reporter: Viraj Jasani All tests accessing noaa-cors-pds use us-east-1 region, as configured at bucket level. If global endpoint is configured (e.g. us-west-2), they fail to access to bucket. Sample error: {code:java} org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect response to region [us-east-1]. This likely indicates that the S3 region configured in fs.s3a.endpoint.region does not match the AWS region containing the bucket.: null (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended Request ID: 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==) at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253) at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155) at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041) at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947) at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716) at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735) at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922) at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115) at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349) at org.apache.hadoop.fs.Globber.glob(Globber.java:202) at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716) at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735) at org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949) at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313) at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281) at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445) at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311) at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674) {code} {code:java} Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null (Service: S3, Status Code: 301, Request ID: 
PMRWMQC9S91CNEJR, Extended Request ID: 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==) at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156) at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108) at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85) at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43) at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:93) at software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$7(BaseClientHandler.java:279) ... ... ... at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)
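A hedged sketch of the obvious test-side remedy: S3A's per-bucket configuration overrides the global value, so the shared public bucket can stay pinned to its actual region (the per-bucket key pattern is real S3A behavior; where the override lives is up to the test setup):
{code:java}
import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
// Global region for everything else...
conf.set("fs.s3a.endpoint.region", "us-west-2");
// ...but keep the shared public bucket pinned to where it actually lives.
conf.set("fs.s3a.bucket.noaa-cors-pds.endpoint.region", "us-east-1");
{code}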
[jira] [Assigned] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails
[ https://issues.apache.org/jira/browse/HADOOP-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-19146: - Assignee: Viraj Jasani > noaa-cors-pds bucket access with global endpoint fails > -- > > Key: HADOOP-19146 > URL: https://issues.apache.org/jira/browse/HADOOP-19146 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > All tests accessing noaa-cors-pds use us-east-1 region, as configured at > bucket level. If global endpoint is configured (e.g. us-west-2), they fail to > access to bucket. > > Sample error: > {code:java} > org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect > response to region [us-east-1]. This likely indicates that the S3 region > configured in fs.s3a.endpoint.region does not match the AWS region containing > the bucket.: null (Service: S3, Status Code: 301, Request ID: > PMRWMQC9S91CNEJR, Extended Request ID: > 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==) > at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253) > at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922) > at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115) > at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349) > at org.apache.hadoop.fs.Globber.glob(Globber.java:202) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949) > at > org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313) > at > org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281) > at > org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445) > at > org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311) > at > org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328) > at > 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674) > {code} > {code:java} > Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null > (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended > Request ID: > 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==) > at > software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156) > at > software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108) > at > software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85) > at > software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43)
[jira] [Commented] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
[ https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825912#comment-17825912 ] Viraj Jasani commented on HADOOP-19066: --- Addendum PR: [https://github.com/apache/hadoop/pull/6624] > AWS SDK V2 - Enabling FIPS should be allowed with central endpoint > -- > > Key: HADOOP-19066 > URL: https://issues.apache.org/jira/browse/HADOOP-19066 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.5.0, 3.4.1 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0 > > > FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK > considers overriding endpoint and enabling fips as mutually exclusive, we > fail fast if fs.s3a.endpoint is set with fips support (details on > HADOOP-18975). > Now, we no longer override SDK endpoint for central endpoint since we enable > cross region access (details on HADOOP-19044) but we would still fail fast if > endpoint is central and fips is enabled. > Changes proposed: > * S3A to fail fast only if FIPS is enabled and non-central endpoint is > configured. > * Tests to ensure S3 bucket is accessible with default region us-east-2 with > cross region access (expected with central endpoint). > * Document FIPS support with central endpoint on connecting.html. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18980) S3A credential provider remapping: make extensible
[ https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17816513#comment-17816513 ] Viraj Jasani commented on HADOOP-18980: --- Addressed edge cases with addendum PR: [https://github.com/apache/hadoop/pull/6546] > S3A credential provider remapping: make extensible > -- > > Key: HADOOP-18980 > URL: https://issues.apache.org/jira/browse/HADOOP-18980 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Viraj Jasani >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0, 3.5.0, 3.4.1 > > > s3afs will now remap the common com.amazonaws credential providers to > equivalents in the v2 sdk or in hadoop-aws > We could do the same for third party credential providers by taking a > key=value list in a configuration property and adding to the map. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
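As a usage illustration of the extensibility described above (the property key is real; the comma-separated key=value syntax is an assumption based on the "key=value list" wording in the description):
{code:java}
import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
// Remap a v1 SDK credential provider class name to its v2 equivalent.
conf.set("fs.s3a.aws.credentials.provider.mapping",
    "com.amazonaws.auth.EnvironmentVariableCredentialsProvider="
        + "software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider");
{code}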
[jira] [Updated] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
[ https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-19066: -- Status: Patch Available (was: In Progress) > AWS SDK V2 - Enabling FIPS should be allowed with central endpoint > -- > > Key: HADOOP-19066 > URL: https://issues.apache.org/jira/browse/HADOOP-19066 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.5.0, 3.4.1 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > > FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK > considers overriding endpoint and enabling fips as mutually exclusive, we > fail fast if fs.s3a.endpoint is set with fips support (details on > HADOOP-18975). > Now, we no longer override SDK endpoint for central endpoint since we enable > cross region access (details on HADOOP-19044) but we would still fail fast if > endpoint is central and fips is enabled. > Changes proposed: > * S3A to fail fast only if FIPS is enabled and non-central endpoint is > configured. > * Tests to ensure S3 bucket is accessible with default region us-east-2 with > cross region access (expected with central endpoint). > * Document FIPS support with central endpoint on connecting.html. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-19072) S3A: expand optimisations on stores with "fs.s3a.create.performance"
[ https://issues.apache.org/jira/browse/HADOOP-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-19072: - Assignee: Viraj Jasani > S3A: expand optimisations on stores with "fs.s3a.create.performance" > > > Key: HADOOP-19072 > URL: https://issues.apache.org/jira/browse/HADOOP-19072 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Viraj Jasani >Priority: Major > > on an s3a store with fs.s3a.create.performance set, speed up other operations > * mkdir to skip parent directory check: just do a HEAD to see if there's a > file at the target location -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19072) S3A: expand optimisations on stores with "fs.s3a.create.performance"
[ https://issues.apache.org/jira/browse/HADOOP-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815822#comment-17815822 ] Viraj Jasani commented on HADOOP-19072: --- The improvement makes sense, as long as the downstream user knows where they are creating the directory. > S3A: expand optimisations on stores with "fs.s3a.create.performance" > > > Key: HADOOP-19072 > URL: https://issues.apache.org/jira/browse/HADOOP-19072 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Priority: Major > > on an s3a store with fs.s3a.create.performance set, speed up other operations > * mkdir to skip parent directory check: just do a HEAD to see if there's a > file at the target location -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
[ https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814576#comment-17814576 ] Viraj Jasani commented on HADOOP-19066: --- Indeed! hopefully some final stabilization work. > AWS SDK V2 - Enabling FIPS should be allowed with central endpoint > -- > > Key: HADOOP-19066 > URL: https://issues.apache.org/jira/browse/HADOOP-19066 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.5.0, 3.4.1 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK > considers overriding endpoint and enabling fips as mutually exclusive, we > fail fast if fs.s3a.endpoint is set with fips support (details on > HADOOP-18975). > Now, we no longer override SDK endpoint for central endpoint since we enable > cross region access (details on HADOOP-19044) but we would still fail fast if > endpoint is central and fips is enabled. > Changes proposed: > * S3A to fail fast only if FIPS is enabled and non-central endpoint is > configured. > * Tests to ensure S3 bucket is accessible with default region us-east-2 with > cross region access (expected with central endpoint). > * Document FIPS support with central endpoint on connecting.html. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
[ https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814171#comment-17814171 ] Viraj Jasani commented on HADOOP-19066: --- Will run the whole suite with FIPS support + central endpoint. > AWS SDK V2 - Enabling FIPS should be allowed with central endpoint > -- > > Key: HADOOP-19066 > URL: https://issues.apache.org/jira/browse/HADOOP-19066 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.5.0, 3.4.1 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK > considers overriding endpoint and enabling fips as mutually exclusive, we > fail fast if fs.s3a.endpoint is set with fips support (details on > HADOOP-18975). > Now, we no longer override SDK endpoint for central endpoint since we enable > cross region access (details on HADOOP-19044) but we would still fail fast if > endpoint is central and fips is enabled. > Changes proposed: > * S3A to fail fast only if FIPS is enabled and non-central endpoint is > configured. > * Tests to ensure S3 bucket is accessible with default region us-east-2 with > cross region access (expected with central endpoint). > * Document FIPS support with central endpoint on connecting.html. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
[ https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-19066: - Assignee: Viraj Jasani > AWS SDK V2 - Enabling FIPS should be allowed with central endpoint > -- > > Key: HADOOP-19066 > URL: https://issues.apache.org/jira/browse/HADOOP-19066 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.5.0, 3.4.1 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK > considers overriding endpoint and enabling fips as mutually exclusive, we > fail fast if fs.s3a.endpoint is set with fips support (details on > HADOOP-18975). > Now, we no longer override SDK endpoint for central endpoint since we enable > cross region access (details on HADOOP-19044) but we would still fail fast if > endpoint is central and fips is enabled. > Changes proposed: > * S3A to fail fast only if FIPS is enabled and non-central endpoint is > configured. > * Tests to ensure S3 bucket is accessible with default region us-east-2 with > cross region access (expected with central endpoint). > * Document FIPS support with central endpoint on connecting.html. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
Viraj Jasani created HADOOP-19066: - Summary: AWS SDK V2 - Enabling FIPS should be allowed with central endpoint Key: HADOOP-19066 URL: https://issues.apache.org/jira/browse/HADOOP-19066 Project: Hadoop Common Issue Type: Sub-task Components: fs/s3 Affects Versions: 3.5.0, 3.4.1 Reporter: Viraj Jasani FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK considers overriding endpoint and enabling fips as mutually exclusive, we fail fast if fs.s3a.endpoint is set with fips support (details on HADOOP-18975). Now, we no longer override SDK endpoint for central endpoint since we enable cross region access (details on HADOOP-19044) but we would still fail fast if endpoint is central and fips is enabled. Changes proposed: * S3A to fail fast only if FIPS is enabled and non-central endpoint is configured. * Tests to ensure S3 bucket is accessible with default region us-east-2 with cross region access (expected with central endpoint). * Document FIPS support with central endpoint on connecting.html. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
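A minimal sketch of the proposed fail-fast check (variable names and the central-endpoint test are illustrative, not the actual S3A code; checkArgument is Hadoop's own org.apache.hadoop.util.Preconditions, the class visible in the related stack traces):
{code:java}
import static org.apache.hadoop.util.Preconditions.checkArgument;

boolean fipsEnabled = conf.getBoolean("fs.s3a.endpoint.fips", false);
String endpoint = conf.getTrimmed("fs.s3a.endpoint", "");
// Treat an empty endpoint or the global one as "central".
boolean centralEndpoint = endpoint.isEmpty() || "s3.amazonaws.com".equals(endpoint);
// Fail fast only when FIPS is combined with a non-central endpoint override.
checkArgument(!fipsEnabled || centralEndpoint,
    "An endpoint cannot be set when fs.s3a.endpoint.fips is true: %s", endpoint);
{code}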
[jira] [Commented] (HADOOP-19022) S3A : ITestS3AConfiguration#testRequestTimeout failure
[ https://issues.apache.org/jira/browse/HADOOP-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812142#comment-17812142 ] Viraj Jasani commented on HADOOP-19022: --- It's fine [~ste...@apache.org], I need to make some changes to update the cross-region logic anyway, so I can take care of that, fix the timeout value for the current test (only if still required after your PR [https://github.com/apache/hadoop/pull/6470]), and then add some more coverage. Once your PR gets merged and the cross-region logic part is also done, I will re-run this with different endpoint/region settings; if needed, I will take care of the ITestS3AConfiguration issues as part of this Jira, otherwise I will close it. > S3A : ITestS3AConfiguration#testRequestTimeout failure > -- > > Key: HADOOP-19022 > URL: https://issues.apache.org/jira/browse/HADOOP-19022 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Priority: Minor > > "fs.s3a.connection.request.timeout" should be specified in milliseconds as per > {code:java} > Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT, > DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); > {code} > The test fails consistently because it sets 120 ms timeout which is less than > 15s (min network operation duration), and hence gets reset to 15000 ms based > on the enforcement. > > {code:java} > [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration) > Time elapsed: 0.016 s <<< FAILURE! > java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is > different than what AWS sdk configuration uses internally expected:<12> > but was:<15000> > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.failNotEquals(Assert.java:835) > at org.junit.Assert.assertEquals(Assert.java:647) > at > org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18975) AWS SDK v2: extend support for FIPS endpoints
[ https://issues.apache.org/jira/browse/HADOOP-18975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809636#comment-17809636 ] Viraj Jasani commented on HADOOP-18975: --- {quote}you must have set a global endpoint, rather than one for your test bucket -correct? {quote} Exactly. > AWS SDK v2: extend support for FIPS endpoints > -- > > Key: HADOOP-18975 > URL: https://issues.apache.org/jira/browse/HADOOP-18975 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0, 3.4.1 > > > v1 SDK supported FIPS just by changing the endpoint. > Now we have a new builder setting to use. > * add new fs.s3a.endpoint.fips option > * pass it down > * test -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
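To make that failure mode concrete, a sketch of the conflicting combination (real S3A keys; the endpoint value is the one from the reported run):
{code:java}
import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
// A global endpoint applies to every bucket, including landsat-pds...
conf.set("fs.s3a.endpoint", "s3-us-west-2.amazonaws.com");
// ...and collides with the per-bucket FIPS flag from the test core-site.xml,
// so instantiating a FileSystem for s3a://landsat-pds fails fast.
conf.setBoolean("fs.s3a.bucket.landsat-pds.endpoint.fips", true);
{code}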
[jira] [Comment Edited] (HADOOP-18975) AWS SDK v2: extend support for FIPS endpoints
[ https://issues.apache.org/jira/browse/HADOOP-18975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809271#comment-17809271 ] Viraj Jasani edited comment on HADOOP-18975 at 1/22/24 7:33 AM: {code:java} <property> <name>fs.s3a.bucket.landsat-pds.endpoint.fips</name> <value>true</value> <description>Use the fips endpoint</description> </property> {code} [~ste...@apache.org] [~ahmar] do we really need fips enabled for landsat in hadoop-tools/hadoop-aws/src/test/resources/core-site.xml? This is breaking several tests from the full suite that I am running against us-west-2 for PR [https://github.com/apache/hadoop/pull/6479] e.g. {code:java} [ERROR] testSelectOddRecordsIgnoreHeaderV1(org.apache.hadoop.fs.s3a.select.ITestS3Select) Time elapsed: 2.917 s <<< ERROR! java.lang.IllegalArgumentException: An endpoint cannot set when fs.s3a.endpoint.fips is true : https://s3-us-west-2.amazonaws.com at org.apache.hadoop.util.Preconditions.checkArgument(Preconditions.java:213) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureEndpointAndRegion(DefaultS3ClientFactory.java:292) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureClientBuilder(DefaultS3ClientFactory.java:179) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:126) at org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:1063) at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:677) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3601) at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:171) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3702) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3653) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:555) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:366) at org.apache.hadoop.fs.s3a.select.AbstractS3SelectTest.setup(AbstractS3SelectTest.java:304) at org.apache.hadoop.fs.s3a.select.ITestS3Select.setup(ITestS3Select.java:112) {code} [ERROR] Tests run: 1264, Failures: 4, Errors: 87, Skipped: 164 was (Author: vjasani): {code:java} <property> <name>fs.s3a.bucket.landsat-pds.endpoint.fips</name> <value>true</value> <description>Use the fips endpoint</description> </property> {code} [~ste...@apache.org] [~ahmar] do we really need fips enabled for landsat in hadoop-tools/hadoop-aws/src/test/resources/core-site.xml? This is breaking several tests from the full suite that I am running against us-west-2 for PR [https://github.com/apache/hadoop/pull/6479] e.g. {code:java} [ERROR] testSelectOddRecordsIgnoreHeaderV1(org.apache.hadoop.fs.s3a.select.ITestS3Select) Time elapsed: 2.917 s <<< ERROR! 
java.lang.IllegalArgumentException: An endpoint cannot set when fs.s3a.endpoint.fips is true : https://s3-us-west-2.amazonaws.com at org.apache.hadoop.util.Preconditions.checkArgument(Preconditions.java:213) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureEndpointAndRegion(DefaultS3ClientFactory.java:292) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureClientBuilder(DefaultS3ClientFactory.java:179) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:126) at org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:1063) at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:677) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3601) at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:171) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3702) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3653) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:555) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:366) at org.apache.hadoop.fs.s3a.select.AbstractS3SelectTest.setup(AbstractS3SelectTest.java:304) at org.apache.hadoop.fs.s3a.select.ITestS3Select.setup(ITestS3Select.java:112) {code} > AWS SDK v2: extend support for FIPS endpoints > -- > > Key: HADOOP-18975 > URL: https://issues.apache.org/jira/browse/HADOOP-18975 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Labels: pull-request-available > > v1 SDK supported FIPS just by changing the endpoint. > Now we have a new builder setting to use. > * add new fs.s3a.endpoint.fips option > * pass it down > * test -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HADOOP-18975) AWS SDK v2: extend support for FIPS endpoints
[ https://issues.apache.org/jira/browse/HADOOP-18975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809271#comment-17809271 ] Viraj Jasani commented on HADOOP-18975: --- {code:java} <property> <name>fs.s3a.bucket.landsat-pds.endpoint.fips</name> <value>true</value> <description>Use the fips endpoint</description> </property> {code} [~ste...@apache.org] [~ahmar] do we really need fips enabled for landsat in hadoop-tools/hadoop-aws/src/test/resources/core-site.xml? This is breaking several tests from the full suite that I am running against us-west-2 for PR [https://github.com/apache/hadoop/pull/6479] e.g. {code:java} [ERROR] testSelectOddRecordsIgnoreHeaderV1(org.apache.hadoop.fs.s3a.select.ITestS3Select) Time elapsed: 2.917 s <<< ERROR! java.lang.IllegalArgumentException: An endpoint cannot set when fs.s3a.endpoint.fips is true : https://s3-us-west-2.amazonaws.com at org.apache.hadoop.util.Preconditions.checkArgument(Preconditions.java:213) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureEndpointAndRegion(DefaultS3ClientFactory.java:292) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureClientBuilder(DefaultS3ClientFactory.java:179) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:126) at org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:1063) at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:677) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3601) at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:171) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3702) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3653) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:555) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:366) at org.apache.hadoop.fs.s3a.select.AbstractS3SelectTest.setup(AbstractS3SelectTest.java:304) at org.apache.hadoop.fs.s3a.select.ITestS3Select.setup(ITestS3Select.java:112) {code} > AWS SDK v2: extend support for FIPS endpoints > -- > > Key: HADOOP-18975 > URL: https://issues.apache.org/jira/browse/HADOOP-18975 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Labels: pull-request-available > > v1 SDK supported FIPS just by changing the endpoint. > Now we have a new builder setting to use. > * add new fs.s3a.endpoint.fips option > * pass it down > * test -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-19044) AWS SDK V2 - Update S3A region logic
[ https://issues.apache.org/jira/browse/HADOOP-19044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-19044: - Assignee: Viraj Jasani > AWS SDK V2 - Update S3A region logic > - > > Key: HADOOP-19044 > URL: https://issues.apache.org/jira/browse/HADOOP-19044 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Ahmar Suhail >Assignee: Viraj Jasani >Priority: Major > > If both fs.s3a.endpoint & fs.s3a.endpoint.region are empty, Spark will set > fs.s3a.endpoint to > s3.amazonaws.com here: > [https://github.com/apache/spark/blob/9a2f39318e3af8b3817dc5e4baf52e548d82063c/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala#L540] > > > HADOOP-18908, updated the region logic such that if fs.s3a.endpoint.region is > set, or if a region can be parsed from fs.s3a.endpoint (which will happen in > this case, region will be US_EAST_1), cross region access is not enabled. > This will cause 400 errors if the bucket is not in US_EAST_1. > > Proposed: Updated the logic so that if the endpoint is the global > s3.amazonaws.com , cross region access is enabled. > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
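A hedged sketch of the resolution order the description above proposes (illustrative only; the real logic lives in DefaultS3ClientFactory, regionParsedFromEndpoint is a hypothetical helper, and crossRegionAccessEnabled is the SDK v2 builder switch):
{code:java}
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3ClientBuilder;

// endpoint/region hold the trimmed fs.s3a.endpoint / fs.s3a.endpoint.region values.
void configureRegion(S3ClientBuilder builder, String endpoint, String region) {
  if (!region.isEmpty()) {
    builder.region(Region.of(region));                    // explicit region wins
  } else if ("s3.amazonaws.com".equals(endpoint)) {
    builder.crossRegionAccessEnabled(true);               // global endpoint: let the SDK find the bucket
  } else if (!endpoint.isEmpty()) {
    builder.region(regionParsedFromEndpoint(endpoint));   // hypothetical helper
  }
}
{code}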
[jira] [Commented] (HADOOP-19023) S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure
[ https://issues.apache.org/jira/browse/HADOOP-19023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804132#comment-17804132 ] Viraj Jasani commented on HADOOP-19023: --- {quote} * make sure you've not got a site config with an aggressive timeout{quote} Can confirm that this is not the case. {quote} * do set version/component in the issue fields...it's not picked up from the parent{quote} Sure, will keep this in mind. While the HADOOP-19022 test failure is consistent, this testParallelRename failure is intermittent. It happened only when I ran the whole suite (-Dparallel-tests -DtestsThreadCount=8 -Dscale -Dprefetch) with the setup connected to a VPN; running the test individually does not fail. Since testParallelRename is already aggressive, I think we might want to set a higher connection timeout for the test. > S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure > --- > > Key: HADOOP-19023 > URL: https://issues.apache.org/jira/browse/HADOOP-19023 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Priority: Major > > Need to configure higher timeout for the test. > > {code:java} > [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 256.281 s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps > [ERROR] > testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps) > Time elapsed: 72.565 s <<< ERROR! > org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on > fork-0005/test/testParallelRename-source0: > software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client > execution did not complete before the specified timeout configuration: 15000 > millis > at > org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215) > at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124) > at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376) > at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468) > at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372) > at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347) > at > org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214) > at > org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532) > at > org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) > at > org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225) > at > org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750) > Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: > Client execution did not complete before the specified 
timeout configuration: > 15000 millis > at > software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97) > at > software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42) > at > software.amazon.
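A minimal sketch of the fix suggested in the comment above, applied in the test's configuration setup (the option name is the real S3A key; the value is an assumption):
{code:java}
import org.apache.hadoop.conf.Configuration;

// Give the parallel-rename scale test more headroom than the 15s minimum
// before the S3A FileSystem for the test is created.
Configuration conf = new Configuration();
conf.set("fs.s3a.connection.request.timeout", "60s");
{code}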
[jira] [Commented] (HADOOP-19022) S3A : ITestS3AConfiguration#testRequestTimeout failure
[ https://issues.apache.org/jira/browse/HADOOP-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804129#comment-17804129 ] Viraj Jasani commented on HADOOP-19022: --- {quote}have you explicitly set it in your site config? {quote} Can confirm that it is not set explicitly; this test fails consistently because "120" is taken as 120 ms by default, and since that is less than 15s, 15s is selected: {code:java} apiCallTimeout = enforceMinimumDuration(REQUEST_TIMEOUT, apiCallTimeout, minimumOperationDuration); {code} Here, minimumOperationDuration is 15s. For this Jira, we can: # Make the test use "120s" instead of "120" so that it will not get reset to 15s by default (illustrated in the sketch after this message). # Add a test with a timeout value smaller than 15s and verify that the actual timeout in the S3A client config object is 15s. # Add a test that sets "0" as the timeout and verify that SdkClientOption.API_CALL_ATTEMPT_TIMEOUT does not even get set. # Document "fs.s3a.connection.request.timeout" as defaulting to 15s if any client sets it with a value > 0 and < 15s. WDYT? > S3A : ITestS3AConfiguration#testRequestTimeout failure > -- > > Key: HADOOP-19022 > URL: https://issues.apache.org/jira/browse/HADOOP-19022 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Priority: Minor > > "fs.s3a.connection.request.timeout" should be specified in milliseconds as per > {code:java} > Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT, > DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); > {code} > The test fails consistently because it sets 120 ms timeout which is less than > 15s (min network operation duration), and hence gets reset to 15000 ms based > on the enforcement. > > {code:java} > [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration) > Time elapsed: 0.016 s <<< FAILURE! > java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is > different than what AWS sdk configuration uses internally expected:<12> > but was:<15000> > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.failNotEquals(Assert.java:835) > at org.junit.Assert.assertEquals(Assert.java:647) > at > org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
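The first point above, illustrated (real option name; the millisecond default and the 15s floor are as described in this thread):
{code:java}
import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
conf.set("fs.s3a.connection.request.timeout", "120");  // unitless: read as 120 ms, raised to the 15s floor
conf.set("fs.s3a.connection.request.timeout", "120s"); // explicit unit: 120s, kept as-is
{code}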
[jira] [Updated] (HADOOP-19023) ITestS3AConcurrentOps#testParallelRename intermittent timeout failure
[ https://issues.apache.org/jira/browse/HADOOP-19023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-19023: -- Component/s: test > ITestS3AConcurrentOps#testParallelRename intermittent timeout failure > - > > Key: HADOOP-19023 > URL: https://issues.apache.org/jira/browse/HADOOP-19023 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Priority: Major > > Need to configure higher timeout for the test. > > {code:java} > [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 256.281 s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps > [ERROR] > testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps) > Time elapsed: 72.565 s <<< ERROR! > org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on > fork-0005/test/testParallelRename-source0: > software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client > execution did not complete before the specified timeout configuration: 15000 > millis > at > org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215) > at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124) > at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376) > at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468) > at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372) > at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347) > at > org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214) > at > org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532) > at > org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) > at > org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225) > at > org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750) > Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: > Client execution did not complete before the specified timeout configuration: > 15000 millis > at > software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97) > at > software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139) > at > 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32) > at > software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) > at > software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExcept
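One way the suggested higher timeout could look in the test's configuration setup; this is a sketch only, assuming the value is set on the test configuration before the filesystem is created (the "10m" value is an assumption, not a patch):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class ScaleTestTimeoutSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration(false);
    // clear any inherited or per-bucket value so the setting below wins
    conf.unset("fs.s3a.connection.request.timeout");
    // well above the 15s minimum, giving parallel PUTs more headroom
    conf.set("fs.s3a.connection.request.timeout", "10m");
    System.out.println(conf.get("fs.s3a.connection.request.timeout")); // 10m
  }
}
{code}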
[jira] [Updated] (HADOOP-19022) S3A : ITestS3AConfiguration#testRequestTimeout failure
[ https://issues.apache.org/jira/browse/HADOOP-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-19022: -- Summary: S3A : ITestS3AConfiguration#testRequestTimeout failure (was: ITestS3AConfiguration#testRequestTimeout failure) > S3A : ITestS3AConfiguration#testRequestTimeout failure > -- > > Key: HADOOP-19022 > URL: https://issues.apache.org/jira/browse/HADOOP-19022 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Priority: Minor > > "fs.s3a.connection.request.timeout" should be specified in milliseconds as per > {code:java} > Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT, > DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); > {code} > The test fails consistently because it sets 120 ms timeout which is less than > 15s (min network operation duration), and hence gets reset to 15000 ms based > on the enforcement. > > {code:java} > [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration) > Time elapsed: 0.016 s <<< FAILURE! > java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is > different than what AWS sdk configuration uses internally expected:<12> > but was:<15000> > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.failNotEquals(Assert.java:835) > at org.junit.Assert.assertEquals(Assert.java:647) > at > org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-19023) S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure
[ https://issues.apache.org/jira/browse/HADOOP-19023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-19023: -- Summary: S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure (was: ITestS3AConcurrentOps#testParallelRename intermittent timeout failure) > S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure > --- > > Key: HADOOP-19023 > URL: https://issues.apache.org/jira/browse/HADOOP-19023 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Priority: Major > > Need to configure higher timeout for the test. > > {code:java} > [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 256.281 s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps > [ERROR] > testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps) > Time elapsed: 72.565 s <<< ERROR! > org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on > fork-0005/test/testParallelRename-source0: > software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client > execution did not complete before the specified timeout configuration: 15000 > millis > at > org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215) > at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124) > at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376) > at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468) > at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372) > at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347) > at > org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214) > at > org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532) > at > org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) > at > org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225) > at > org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750) > Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: > Client execution did not complete before the specified timeout configuration: > 15000 millis > at > software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97) > at > software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151) > at > 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32) > at > software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) > at > software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineB
[jira] [Updated] (HADOOP-18980) S3A credential provider remapping: make extensible
[ https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18980: -- Status: Patch Available (was: In Progress) > S3A credential provider remapping: make extensible > -- > > Key: HADOOP-18980 > URL: https://issues.apache.org/jira/browse/HADOOP-18980 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Viraj Jasani >Priority: Minor > Labels: pull-request-available > > s3afs will now remap the common com.amazonaws credential providers to > equivalents in the v2 sdk or in hadoop-aws > We could do the same for third party credential providers by taking a > key=value list in a configuration property and adding to the map. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18959) Use builder for prefetch CachingBlockManager
[ https://issues.apache.org/jira/browse/HADOOP-18959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17803401#comment-17803401 ] Viraj Jasani commented on HADOOP-18959: --- [~slfan1989] this is already committed to trunk, only backport PR is pending for merge. > Use builder for prefetch CachingBlockManager > > > Key: HADOOP-18959 > URL: https://issues.apache.org/jira/browse/HADOOP-18959 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > > Some of the recent changes (HADOOP-18399, HADOOP-18291, HADOOP-18829 etc) > have added more params for prefetch CachingBlockManager c'tor to process > read/write block requests. They have added too many params and more are > likely to be introduced later. We should use builder pattern to pass params. > This would also help consolidating required prefetch params into one single > place within S3ACachingInputStream, from scattered locations. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
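To make the builder proposal concrete, a cut-down sketch of a fluent parameter object; the field names here are invented for illustration and are not the actual CachingBlockManager parameters:
{code:java}
// Illustrative builder of the kind proposed above; field names are invented.
final class BlockManagerParamsSketch {
  private int bufferPoolSize;
  private int prefetchBlockCount;
  private String cacheDir;

  BlockManagerParamsSketch withBufferPoolSize(int size) {
    this.bufferPoolSize = size;
    return this;
  }

  BlockManagerParamsSketch withPrefetchBlockCount(int count) {
    this.prefetchBlockCount = count;
    return this;
  }

  BlockManagerParamsSketch withCacheDir(String dir) {
    this.cacheDir = dir;
    return this;
  }

  @Override
  public String toString() {
    return "pool=" + bufferPoolSize + " blocks=" + prefetchBlockCount
        + " dir=" + cacheDir;
  }

  public static void main(String[] args) {
    // a c'tor taking one params object stays stable as fields are added
    System.out.println(new BlockManagerParamsSketch()
        .withBufferPoolSize(16)
        .withPrefetchBlockCount(8)
        .withCacheDir("/tmp/prefetch"));
  }
}
{code}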
[jira] [Created] (HADOOP-19023) ITestS3AConcurrentOps#testParallelRename intermittent timeout failure
Viraj Jasani created HADOOP-19023: - Summary: ITestS3AConcurrentOps#testParallelRename intermittent timeout failure Key: HADOOP-19023 URL: https://issues.apache.org/jira/browse/HADOOP-19023 Project: Hadoop Common Issue Type: Sub-task Reporter: Viraj Jasani Need to configure higher timeout for the test. {code:java} [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 256.281 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps [ERROR] testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps) Time elapsed: 72.565 s <<< ERROR! org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on fork-0005/test/testParallelRename-source0: software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client execution did not complete before the specified timeout configuration: 15000 millis at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215) at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124) at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376) at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468) at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372) at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347) at org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214) at org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620) at org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) at org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69) at org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) at org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225) at org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client execution did not complete before the specified timeout configuration: 15000 millis at software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97) at software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42) at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32) at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37) at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26) at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:224) at software.amazon.awssdk.core.intern
[jira] [Commented] (HADOOP-19022) ITestS3AConfiguration#testRequestTimeout failure
[ https://issues.apache.org/jira/browse/HADOOP-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17802395#comment-17802395 ] Viraj Jasani commented on HADOOP-19022: --- It's a small test, but perhaps good to cover both cases: timeouts above 15s and below 15s. > ITestS3AConfiguration#testRequestTimeout failure > > > Key: HADOOP-19022 > URL: https://issues.apache.org/jira/browse/HADOOP-19022 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Viraj Jasani >Priority: Minor > > "fs.s3a.connection.request.timeout" should be specified in milliseconds as per > {code:java} > Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT, > DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); > {code} > The test fails consistently because it sets 120 ms timeout which is less than > 15s (min network operation duration), and hence gets reset to 15000 ms based > on the enforcement. > > {code:java} > [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration) > Time elapsed: 0.016 s <<< FAILURE! > java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is > different than what AWS sdk configuration uses internally expected:<12> > but was:<15000> > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.failNotEquals(Assert.java:835) > at org.junit.Assert.assertEquals(Assert.java:647) > at > org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19022) ITestS3AConfiguration#testRequestTimeout failure
Viraj Jasani created HADOOP-19022: - Summary: ITestS3AConfiguration#testRequestTimeout failure Key: HADOOP-19022 URL: https://issues.apache.org/jira/browse/HADOOP-19022 Project: Hadoop Common Issue Type: Sub-task Reporter: Viraj Jasani "fs.s3a.connection.request.timeout" should be specified in milliseconds as per {code:java} Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT, DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); {code} The test fails consistently because it sets 120 ms timeout which is less than 15s (min network operation duration), and hence gets reset to 15000 ms based on the enforcement. {code:java} [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration) Time elapsed: 0.016 s <<< FAILURE! java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is different than what AWS sdk configuration uses internally expected:<12> but was:<15000> at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.failNotEquals(Assert.java:835) at org.junit.Assert.assertEquals(Assert.java:647) at org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18991) Remove commons-beanutils dependency from Hadoop 3
[ https://issues.apache.org/jira/browse/HADOOP-18991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17790816#comment-17790816 ] Viraj Jasani commented on HADOOP-18991: --- As per HADOOP-16542, if we remove this, the Hive build fails. Can Hive explicitly use commons-beanutils directly? FYI [~weichiu] > Remove commons-beanutils dependency from Hadoop 3 > - > > Key: HADOOP-18991 > URL: https://issues.apache.org/jira/browse/HADOOP-18991 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Reporter: Istvan Toth >Priority: Major > > Hadoop doesn't actually use it, and it pollutes the classpath of dependent > projects. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18991) Remove commons-beanutils dependency from Hadoop 3
[ https://issues.apache.org/jira/browse/HADOOP-18991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17790788#comment-17790788 ] Viraj Jasani commented on HADOOP-18991: --- [~stoty] is this the reason for managing it in Phoenix even after excluding it from Omid? > Remove commons-beanutils dependency from Hadoop 3 > - > > Key: HADOOP-18991 > URL: https://issues.apache.org/jira/browse/HADOOP-18991 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Reporter: Istvan Toth >Priority: Major > > Hadoop doesn't actually use it, and it pollutes the classpath of dependent > projects. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18980) S3A credential provider remapping: make extensible
[ https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788128#comment-17788128 ] Viraj Jasani commented on HADOOP-18980: --- {quote}exactly; though i'd expect the remapping to be from com.amazonaws to software.amazonaws or private implementations. key goal: you can use the same credentials.provider list for v1 and v2 sdk clients. {quote} In addition to having the same credentials.provider list for v1 and v2 sdk clients, maybe we can also remove the static mapping from v1 to v2 credential providers and let the new config carry these default key=value pairs: {code:java} fs.s3a.aws.credentials.provider.mapping com.amazonaws.auth.AnonymousAWSCredentials=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider, com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider, com.amazonaws.auth.InstanceProfileCredentialsProvider=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider, com.amazonaws.auth.EnvironmentVariableCredentialsProvider=software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider, com.amazonaws.auth.profile.ProfileCredentialsProvider=software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider {code} With this as the default value, any new third-party credential provider can be added to this list by users. Does that sound good? > S3A credential provider remapping: make extensible > -- > > Key: HADOOP-18980 > URL: https://issues.apache.org/jira/browse/HADOOP-18980 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Priority: Minor > > s3afs will now remap the common com.amazonaws credential providers to > equivalents in the v2 sdk or in hadoop-aws > We could do the same for third party credential providers by taking a > key=value list in a configuration property and adding to the map. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
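The key=value list idea maps naturally onto a small parser; a self-contained sketch of how such a property value could be folded into a map (parseMapping is a hypothetical helper, not the committed implementation):
{code:java}
import java.util.HashMap;
import java.util.Map;

public final class ProviderMappingSketch {

  // parseMapping is a hypothetical helper: turns "k1=v1, k2=v2" into a map,
  // rejecting malformed entries rather than silently dropping them
  static Map<String, String> parseMapping(String value) {
    Map<String, String> mapping = new HashMap<>();
    if (value == null || value.trim().isEmpty()) {
      return mapping;
    }
    for (String entry : value.split(",")) {
      String[] kv = entry.trim().split("=", 2);
      if (kv.length != 2 || kv[0].trim().isEmpty() || kv[1].trim().isEmpty()) {
        throw new IllegalArgumentException("Malformed mapping entry: " + entry);
      }
      mapping.put(kv[0].trim(), kv[1].trim());
    }
    return mapping;
  }

  public static void main(String[] args) {
    System.out.println(parseMapping(
        "com.amazonaws.auth.EnvironmentVariableCredentialsProvider="
            + "software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider"));
  }
}
{code}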
[jira] [Comment Edited] (HADOOP-18980) S3A credential provider remapping: make extensible
[ https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788128#comment-17788128 ] Viraj Jasani edited comment on HADOOP-18980 at 11/20/23 6:44 PM: - In addition to having same credentials.provider list for v1 and v2 sdk, maybe we can also remove static mapping for v1 to v2 credential providers and let new config have default key value pairs: {code:java} fs.s3a.aws.credentials.provider.mapping com.amazonaws.auth.AnonymousAWSCredentials=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider, com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider, com.amazonaws.auth.InstanceProfileCredentialsProvider=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider, com.amazonaws.auth.EnvironmentVariableCredentialsProvider=software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider, com.amazonaws.auth.profile.ProfileCredentialsProvider=software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider {code} With this being default value, any new third-party credential provider can be added to this list by users. Does that sound good? was (Author: vjasani): {quote}exactly; though i'd expect the remapping to be from com.amazonaws to software.amazonaws or private implementations key goal: you can use the same credentials.provider list for v1 and v2 sdk clients. {quote} In addition to having same credentials.provider list for v1 and v2 sdk, maybe we can also remove static mapping for v1 to v2 credential providers and let new config have default key value pairs: {code:java} fs.s3a.aws.credentials.provider.mapping com.amazonaws.auth.AnonymousAWSCredentials=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider, com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider, com.amazonaws.auth.InstanceProfileCredentialsProvider=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider, com.amazonaws.auth.EnvironmentVariableCredentialsProvider=software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider, com.amazonaws.auth.profile.ProfileCredentialsProvider=software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider {code} With this being default value, any new third-party credential provider can be added to this list by users. Does that sound good? > S3A credential provider remapping: make extensible > -- > > Key: HADOOP-18980 > URL: https://issues.apache.org/jira/browse/HADOOP-18980 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Priority: Minor > > s3afs will now remap the common com.amazonaws credential providers to > equivalents in the v2 sdk or in hadoop-aws > We could do the same for third party credential providers by taking a > key=value list in a configuration property and adding to the map. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18980) S3A credential provider remapping: make extensible
[ https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787854#comment-17787854 ] Viraj Jasani commented on HADOOP-18980: --- Something like this maybe? {code:java} fs.s3a.aws.credentials.provider.mapping com.amazon.xyz.auth.provider.key1=org.apache.hadoop.fs.s3a.CustomCredsProvider1, com.amazon.xyz.auth.provider.key2=org.apache.hadoop.fs.s3a.CustomCredsProvider2, com.amazon.xyz.auth.provider.key3=org.apache.hadoop.fs.s3a.CustomCredsProvider3 fs.s3a.aws.credentials.provider com.amazon.xyz.auth.provider.key1, com.amazon.xyz.auth.provider.key2 {code} > S3A credential provider remapping: make extensible > -- > > Key: HADOOP-18980 > URL: https://issues.apache.org/jira/browse/HADOOP-18980 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Priority: Minor > > s3afs will now remap the common com.amazonaws credential providers to > equivalents in the v2 sdk or in hadoop-aws > We could do the same for third party credential providers by taking a > key=value list in a configuration property and adding to the map. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-18959) Use builder for prefetch CachingBlockManager
[ https://issues.apache.org/jira/browse/HADOOP-18959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18959: - Assignee: Viraj Jasani > Use builder for prefetch CachingBlockManager > > > Key: HADOOP-18959 > URL: https://issues.apache.org/jira/browse/HADOOP-18959 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > Some of the recent changes (HADOOP-18399, HADOOP-18291, HADOOP-18829 etc) > have added more params for prefetch CachingBlockManager c'tor to process > read/write block requests. They have added too many params and more are > likely to be introduced later. We should use builder pattern to pass params. > This would also help consolidating required prefetch params into one single > place within S3ACachingInputStream, from scattered locations. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-18959) Use builder for prefetch CachingBlockManager
Viraj Jasani created HADOOP-18959: - Summary: Use builder for prefetch CachingBlockManager Key: HADOOP-18959 URL: https://issues.apache.org/jira/browse/HADOOP-18959 Project: Hadoop Common Issue Type: Sub-task Reporter: Viraj Jasani Some of the recent changes (HADOOP-18399, HADOOP-18291, HADOOP-18829 etc) have added more params for prefetch CachingBlockManager c'tor to process read/write block requests. They have added too many params and more are likely to be introduced later. We should use builder pattern to pass params. This would also help consolidating required prefetch params into one single place within S3ACachingInputStream, from scattered locations. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-18918) ITestS3GuardTool fails if SSE/DSSE encryption is used
[ https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18918: -- Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > ITestS3GuardTool fails if SSE/DSSE encryption is used > - > > Key: HADOOP-18918 > URL: https://issues.apache.org/jira/browse/HADOOP-18918 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.6 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > {code:java} > [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool > [ERROR] > testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool) > Time elapsed: 0.807 s <<< ERROR! > 46: Bucket s3a://landsat-pds: required encryption is none but actual > encryption is DSSE-KMS > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147) > at > org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114) > at > org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:750) > {code} > Since landsat requires none encryption, the test should be skipped for any > encryption algorithm. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18952) FsCommand Stat class sets the timeZone "UTC", which is different from the machine's timeZone
[ https://issues.apache.org/jira/browse/HADOOP-18952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17780014#comment-17780014 ] Viraj Jasani commented on HADOOP-18952: --- This has been the case since the beginning: Stat: {code:java} protected final SimpleDateFormat timeFmt; { timeFmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); timeFmt.setTimeZone(TimeZone.getTimeZone("UTC")); }{code} Ls: {code:java} protected final SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm"); {code} > FsCommand Stat class sets the timeZone "UTC", which is different from the > machine's timeZone > -- > > Key: HADOOP-18952 > URL: https://issues.apache.org/jira/browse/HADOOP-18952 > Project: Hadoop Common > Issue Type: Bug > Environment: Using Hadoop 3.3.4-release >Reporter: liang yu >Priority: Major > Attachments: image-2023-10-26-10-07-11-637.png > > > Using Hadoop version 3.3.4 > > When executing the Ls command and the Stat command on the same hadoop file, I get two > different timestamps. > > {code:java} > hdfs dfs -stat "modify_time %y, access_time%x" /path/to/file{code} > returns: > modify_time 2023-10-17 01:43:05, access_time 2023-10-17 01:41:00 > > {code:java} > hdfs dfs -ls /path/to/file{code} > returns: > -rw-rw-r--+ 3 user_name user_group 247400339 > 2023-10-17 09:43 /path/to/file > > These two timestamps differ by 8 hours. > I am in China, the timezone is "UTC+8", so the timestamp from the LS command is > correct and the timestamp from the STAT command is wrong. > > !image-2023-10-26-10-07-11-637.png! > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
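The eight-hour gap reported above is exactly the UTC-versus-default-zone difference those two formatters produce; a self-contained demonstration (the epoch value corresponds to the reported mtime):
{code:java}
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class StatVsLsTimezone {
  public static void main(String[] args) {
    Date mtime = new Date(1697506985000L); // 2023-10-17 01:43:05 UTC

    SimpleDateFormat statFmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    statFmt.setTimeZone(TimeZone.getTimeZone("UTC")); // what Stat does

    // Ls sets no explicit zone, so it formats in the JVM default timezone
    SimpleDateFormat lsFmt = new SimpleDateFormat("yyyy-MM-dd HH:mm");

    System.out.println("stat: " + statFmt.format(mtime)); // 2023-10-17 01:43:05
    System.out.println("ls:   " + lsFmt.format(mtime));   // 2023-10-17 09:43 under UTC+8
  }
}
{code}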
[jira] [Resolved] (HADOOP-18829) s3a prefetch LRU cache eviction metric
[ https://issues.apache.org/jira/browse/HADOOP-18829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani resolved HADOOP-18829. --- Fix Version/s: 3.4.0 3.3.9 Hadoop Flags: Reviewed Resolution: Fixed > s3a prefetch LRU cache eviction metric > -- > > Key: HADOOP-18829 > URL: https://issues.apache.org/jira/browse/HADOOP-18829 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.9 > > > Follow-up from HADOOP-18291: > Add new IO statistics metric to capture s3a prefetch LRU cache eviction. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
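As a rough illustration of what such a metric involves: a sketch using Hadoop's IOStatistics store with a counter incremented from the eviction path (the statistic name here is an assumption; the committed name may differ):
{code:java}
import static org.apache.hadoop.fs.statistics.IOStatisticsLogging.ioStatisticsToString;
import static org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.iostatisticsStore;

import org.apache.hadoop.fs.statistics.impl.IOStatisticsStore;

public class LruEvictionMetricSketch {
  // hypothetical statistic name for this sketch; the committed name may differ
  static final String BLOCKS_EVICTED = "stream_file_cache_lru_eviction";

  public static void main(String[] args) {
    IOStatisticsStore stats = iostatisticsStore()
        .withCounters(BLOCKS_EVICTED)
        .build();
    // would be called from the LRU eviction path of the prefetch cache
    stats.incrementCounter(BLOCKS_EVICTED);
    System.out.println(ioStatisticsToString(stats));
  }
}
{code}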
[jira] [Comment Edited] (HADOOP-18931) FileSystem.getFileSystemClass() to log at debug the jar the .class came from
[ https://issues.apache.org/jira/browse/HADOOP-18931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17776356#comment-17776356 ] Viraj Jasani edited comment on HADOOP-18931 at 10/17/23 7:16 PM: - sounds good, it makes sense to log for all fs invocations, keeping that log separate from the heavy service-loading log. was (Author: vjasani): sounds good, it makes sense to log for all fs invocation > FileSystem.getFileSystemClass() to log at debug the jar the .class came from > > > Key: HADOOP-18931 > URL: https://issues.apache.org/jira/browse/HADOOP-18931 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 3.3.6 >Reporter: Steve Loughran >Priority: Minor > > we want to be able to log the jar the filesystem implementation class, so > that we can identify which version of a module the class came from. > this is to help track down problems where different machines in the cluster > or the .tar.gz bundle is out of date. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18931) FileSystem.getFileSystemClass() to log at debug the jar the .class came from
[ https://issues.apache.org/jira/browse/HADOOP-18931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17776356#comment-17776356 ] Viraj Jasani commented on HADOOP-18931: --- sounds good, it makes sense to log for all fs invocation > FileSystem.getFileSystemClass() to log at debug the jar the .class came from > > > Key: HADOOP-18931 > URL: https://issues.apache.org/jira/browse/HADOOP-18931 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 3.3.6 >Reporter: Steve Loughran >Priority: Minor > > we want to be able to log the jar the filesystem implementation class, so > that we can identify which version of a module the class came from. > this is to help track down problems where different machines in the cluster > or the .tar.gz bundle is out of date. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-18918) ITestS3GuardTool fails if SSE/DSSE encryption is used
[ https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18918: -- Status: Patch Available (was: In Progress) > ITestS3GuardTool fails if SSE/DSSE encryption is used > - > > Key: HADOOP-18918 > URL: https://issues.apache.org/jira/browse/HADOOP-18918 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.6 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Minor > Labels: pull-request-available > > {code:java} > [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool > [ERROR] > testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool) > Time elapsed: 0.807 s <<< ERROR! > 46: Bucket s3a://landsat-pds: required encryption is none but actual > encryption is DSSE-KMS > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147) > at > org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114) > at > org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:750) > {code} > Since landsat requires none encryption, the test should be skipped for any > encryption algorithm. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
[ https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18850: -- Status: Patch Available (was: In Progress) > Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS) > - > > Key: HADOOP-18850 > URL: https://issues.apache.org/jira/browse/HADOOP-18850 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, security >Reporter: Akira Ajisaka >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > > Add support for DSSE-KMS > https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18931) FileSystem.getFileSystemClass() to log at debug the jar the .class came from
[ https://issues.apache.org/jira/browse/HADOOP-18931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775281#comment-17775281 ] Viraj Jasani commented on HADOOP-18931: --- i thought we were already logging it during the first time init of fs for the given JVM {code:java} try { SERVICE_FILE_SYSTEMS.put(fs.getScheme(), fs.getClass()); if (LOGGER.isDebugEnabled()) { LOGGER.debug("{}:// = {} from {}", fs.getScheme(), fs.getClass(), ClassUtil.findContainingJar(fs.getClass())); } } catch (Exception e) { LOGGER.warn("Cannot load: {} from {}", fs, ClassUtil.findContainingJar(fs.getClass())); LOGGER.info("Full exception loading: {}", fs, e); } {code} maybe you are suggesting that we should log it for every call to {_}getFileSystemClass(){_}, correct? > FileSystem.getFileSystemClass() to log at debug the jar the .class came from > > > Key: HADOOP-18931 > URL: https://issues.apache.org/jira/browse/HADOOP-18931 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 3.3.6 >Reporter: Steve Loughran >Priority: Minor > > we want to be able to log the jar the filesystem implementation class, so > that we can identify which version of a module the class came from. > this is to help track down problems where different machines in the cluster > or the .tar.gz bundle is out of date. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
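A sketch of the per-call variant being discussed, reusing the real ClassUtil.findContainingJar helper from the snippet above; where exactly such a hook would sit inside getFileSystemClass() is an assumption:
{code:java}
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.util.ClassUtil;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public final class FsClassOriginSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(FsClassOriginSketch.class);

  // hypothetical hook, called with the class resolved by getFileSystemClass()
  static void logOrigin(String scheme, Class<? extends FileSystem> clazz) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Filesystem for {}:// is {} from {}",
          scheme, clazz.getName(), ClassUtil.findContainingJar(clazz));
    }
  }
}
{code}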
[jira] [Updated] (HADOOP-18918) ITestS3GuardTool fails if SSE/DSSE encryption is used
[ https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18918: -- Summary: ITestS3GuardTool fails if SSE/DSSE encryption is used (was: ITestS3GuardTool fails if SSE encryption is used) > ITestS3GuardTool fails if SSE/DSSE encryption is used > - > > Key: HADOOP-18918 > URL: https://issues.apache.org/jira/browse/HADOOP-18918 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.6 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Minor > > {code:java} > [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool > [ERROR] > testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool) > Time elapsed: 0.807 s <<< ERROR! > 46: Bucket s3a://landsat-pds: required encryption is none but actual > encryption is DSSE-KMS > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147) > at > org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114) > at > org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:750) > {code} > Since landsat requires none encryption, the test should be skipped for any > encryption algorithm. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-18918) ITestS3GuardTool fails if SSE encryption is used
[ https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18918: -- Priority: Minor (was: Major) > ITestS3GuardTool fails if SSE encryption is used > > > Key: HADOOP-18918 > URL: https://issues.apache.org/jira/browse/HADOOP-18918 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.6 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Minor > > {code:java} > [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool > [ERROR] > testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool) > Time elapsed: 0.807 s <<< ERROR! > 46: Bucket s3a://landsat-pds: required encryption is none but actual > encryption is DSSE-KMS > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147) > at > org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114) > at > org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:750) > {code} > Since landsat requires none encryption, the test should be skipped for any > encryption algorithm. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-18918) ITestS3GuardTool fails if SSE encryption is used
[ https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18918: - Assignee: Viraj Jasani > ITestS3GuardTool fails if SSE encryption is used > > > Key: HADOOP-18918 > URL: https://issues.apache.org/jira/browse/HADOOP-18918 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.6 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > {code:java} > [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool > [ERROR] > testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool) > Time elapsed: 0.807 s <<< ERROR! > 46: Bucket s3a://landsat-pds: required encryption is none but actual > encryption is DSSE-KMS > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147) > at > org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114) > at > org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:750) > {code} > Since landsat requires none encryption, the test should be skipped for any > encryption algorithm. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-18918) ITestS3GuardTool fails if SSE encryption is used
Viraj Jasani created HADOOP-18918: - Summary: ITestS3GuardTool fails if SSE encryption is used Key: HADOOP-18918 URL: https://issues.apache.org/jira/browse/HADOOP-18918 Project: Hadoop Common Issue Type: Sub-task Components: fs/s3, test Affects Versions: 3.3.6 Reporter: Viraj Jasani {code:java} [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool [ERROR] testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool) Time elapsed: 0.807 s <<< ERROR! 46: Bucket s3a://landsat-pds: required encryption is none but actual encryption is DSSE-KMS at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915) at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881) at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511) at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963) at org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147) at org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114) at org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.lang.Thread.run(Thread.java:750) {code} Since landsat requires none encryption, the test should be skipped for any encryption algorithm. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
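A hedged sketch of such a skip, using JUnit 4's assume mechanism; the direct config lookup shown here is illustrative, as the actual test base classes may expose their own helpers:
{code:java}
import static org.junit.Assume.assumeTrue;

import org.apache.hadoop.conf.Configuration;

public class SkipWhenEncryptedSketch {
  // skip the unencrypted-bucket assertion when the test config forces
  // SSE/DSSE on all requests; the landsat bucket itself is unencrypted
  static void skipIfEncryptionConfigured(Configuration conf) {
    String algorithm = conf.getTrimmed("fs.s3a.encryption.algorithm", "");
    assumeTrue("Skipping: client-side encryption configured as " + algorithm,
        algorithm.isEmpty());
  }
}
{code}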
[jira] [Assigned] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
[ https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18850: - Assignee: Viraj Jasani > Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS) > - > > Key: HADOOP-18850 > URL: https://issues.apache.org/jira/browse/HADOOP-18850 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, security >Reporter: Akira Ajisaka >Assignee: Viraj Jasani >Priority: Major > > Add support for DSSE-KMS > https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
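For reference, the shape client configuration could take once DSSE-KMS support lands, following the existing fs.s3a.encryption.* convention (the algorithm string and the key ARN below are assumptions until the change is committed):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class DsseKmsConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration(false);
    // assumption: DSSE-KMS slots in alongside the existing SSE-KMS options
    conf.set("fs.s3a.encryption.algorithm", "DSSE-KMS");
    // hypothetical key ARN for illustration only
    conf.set("fs.s3a.encryption.key",
        "arn:aws:kms:us-west-2:123456789012:key/example-key-id");
    System.out.println(conf.get("fs.s3a.encryption.algorithm"));
  }
}
{code}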
[jira] [Commented] (HADOOP-18915) HTTP timeouts are not set correctly
[ https://issues.apache.org/jira/browse/HADOOP-18915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17770745#comment-17770745 ] Viraj Jasani commented on HADOOP-18915: --- Nice find! > HTTP timeouts are not set correctly > --- > > Key: HADOOP-18915 > URL: https://issues.apache.org/jira/browse/HADOOP-18915 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Ahmar Suhail >Priority: Major > > In the client config builders, when [setting > timeouts|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/AWSClientConfig.java#L120], > it uses Duration.ofSeconds(), but the configs are all in milliseconds, so this needs to > be updated to Duration.ofMillis(). > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
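The unit mismatch is easiest to see side by side; a minimal sketch, assuming a millisecond-valued timeout such as fs.s3a.connection.timeout (the variable names here are illustrative):
{code:java}
import java.time.Duration;

// Sketch of the bug: a config value expressed in milliseconds being fed
// into a seconds-based factory method.
public final class TimeoutUnitsExample {
  public static void main(String[] args) {
    long connectionTimeoutMs = 200_000; // e.g. fs.s3a.connection.timeout, in ms

    // Buggy: treats 200,000 ms as 200,000 s, i.e. more than two days
    Duration wrong = Duration.ofSeconds(connectionTimeoutMs);

    // Fixed: interprets the value in the unit the configs actually use
    Duration right = Duration.ofMillis(connectionTimeoutMs);

    System.out.println(wrong + " vs " + right); // PT55H33M20S vs PT3M20S
  }
}
{code}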
[jira] [Assigned] (HADOOP-18208) Remove all the log4j reference in modules other than hadoop-logging
[ https://issues.apache.org/jira/browse/HADOOP-18208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18208: - Assignee: (was: Viraj Jasani) > Remove all the log4j reference in modules other than hadoop-logging > --- > > Key: HADOOP-18208 > URL: https://issues.apache.org/jira/browse/HADOOP-18208 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Duo Zhang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-16206) Migrate from Log4j1 to Log4j2
[ https://issues.apache.org/jira/browse/HADOOP-16206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-16206: - Assignee: (was: Viraj Jasani) > Migrate from Log4j1 to Log4j2 > - > > Key: HADOOP-16206 > URL: https://issues.apache.org/jira/browse/HADOOP-16206 > Project: Hadoop Common > Issue Type: Task >Affects Versions: 3.3.0 >Reporter: Akira Ajisaka >Priority: Major > Attachments: HADOOP-16206-wip.001.patch > > > This sub-task is to remove log4j1 dependency and add log4j2 dependency. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-18207) Introduce hadoop-logging module
[ https://issues.apache.org/jira/browse/HADOOP-18207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18207: - Assignee: (was: Viraj Jasani) > Introduce hadoop-logging module > --- > > Key: HADOOP-18207 > URL: https://issues.apache.org/jira/browse/HADOOP-18207 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Duo Zhang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > There are several goals here: > # Provide the ability to change log level, get log level, etc. > # Place all the appender implementations(?) > # Hide the real logging implementation. > # Later we could remove all the log4j references in other hadoop modules. > # Move as much log4j usage to the module as possible. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
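For the first goal, the module could expose a small facade so other modules never reference log4j classes directly. This is a hypothetical sketch against the log4j 1.x API; the class and method names are illustrative, not the committed design:
{code:java}
import org.apache.log4j.Level;
import org.apache.log4j.LogManager;

// Hypothetical facade: callers use this instead of log4j directly, so the
// logging backend can later be swapped behind a single module.
public final class HadoopLoggingFacade {
  private HadoopLoggingFacade() {
  }

  public static void setLogLevel(String loggerName, String level) {
    LogManager.getLogger(loggerName).setLevel(Level.toLevel(level));
  }

  public static String getEffectiveLevel(String loggerName) {
    return LogManager.getLogger(loggerName).getEffectiveLevel().toString();
  }
}
{code}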
[jira] [Assigned] (HADOOP-15984) Update jersey from 1.19 to 2.x
[ https://issues.apache.org/jira/browse/HADOOP-15984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-15984: - Assignee: (was: Viraj Jasani) > Update jersey from 1.19 to 2.x > -- > > Key: HADOOP-15984 > URL: https://issues.apache.org/jira/browse/HADOOP-15984 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Akira Ajisaka >Priority: Major > Labels: pull-request-available > Time Spent: 2h 10m > Remaining Estimate: 0h > > jersey-json 1.19 depends on Jackson 1.9.2. Let's upgrade. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
[ https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755406#comment-17755406 ] Viraj Jasani commented on HADOOP-18850: --- [~ste...@apache.org] are you in favor of doing this before the v2 SDK upgrade? > Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS) > - > > Key: HADOOP-18850 > URL: https://issues.apache.org/jira/browse/HADOOP-18850 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, security >Reporter: Akira Ajisaka >Priority: Major > > Add support for DSSE-KMS > https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
[ https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755394#comment-17755394 ] Viraj Jasani commented on HADOOP-18850: --- HADOOP-18832 only recently bumped the SDK bundle to 1.12.499, so it looks like we can support this > Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS) > - > > Key: HADOOP-18850 > URL: https://issues.apache.org/jira/browse/HADOOP-18850 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, security >Reporter: Akira Ajisaka >Priority: Major > > Add support for DSSE-KMS > https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
[ https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755392#comment-17755392 ] Viraj Jasani edited comment on HADOOP-18850 at 8/17/23 7:13 AM: it seems SSEAlgorithm added DSSE as part of 1.12.488 release: [https://github.com/aws/aws-sdk-java/releases/tag/1.12.488]
{code:java}
public enum SSEAlgorithm {
    AES256("AES256"),
    KMS("aws:kms"),
    DSSE("aws:kms:dsse"),
    ;
{code}
was (Author: vjasani): SSEAlgorithm added DSSE as part of 1.12.488 release: [https://github.com/aws/aws-sdk-java/releases/tag/1.12.488]
{code:java}
public enum SSEAlgorithm {
    AES256("AES256"),
    KMS("aws:kms"),
    DSSE("aws:kms:dsse"),
    ;
{code}
> Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS) > - > > Key: HADOOP-18850 > URL: https://issues.apache.org/jira/browse/HADOOP-18850 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, security >Reporter: Akira Ajisaka >Priority: Major > > Add support for DSSE-KMS > https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
[ https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755392#comment-17755392 ] Viraj Jasani commented on HADOOP-18850: --- SSEAlgorithm added DSSE as part of 1.12.488 release: [https://github.com/aws/aws-sdk-java/releases/tag/1.12.488]
{code:java}
public enum SSEAlgorithm {
    AES256("AES256"),
    KMS("aws:kms"),
    DSSE("aws:kms:dsse"),
    ;
{code}
> Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS) > - > > Key: HADOOP-18850 > URL: https://issues.apache.org/jira/browse/HADOOP-18850 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, security >Reporter: Akira Ajisaka >Priority: Major > > Add support for DSSE-KMS > https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
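With the v1 SDK at 1.12.488 or later, requesting DSSE-KMS on a PUT could then look like the sketch below. ObjectMetadata.setSSEAlgorithm() and the SSEAlgorithm enum are real v1 SDK API; the wrapper class is purely illustrative:
{code:java}
import java.io.ByteArrayInputStream;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PutObjectRequest;
import com.amazonaws.services.s3.model.SSEAlgorithm;

// Illustrative sketch: upload an object with aws:kms:dsse server-side
// encryption requested in the object metadata.
public final class DsseUploadExample {
  public static void putWithDsse(AmazonS3 s3, String bucket, String key,
      byte[] data) {
    ObjectMetadata meta = new ObjectMetadata();
    meta.setContentLength(data.length);
    meta.setSSEAlgorithm(SSEAlgorithm.DSSE.getAlgorithm()); // "aws:kms:dsse"
    s3.putObject(new PutObjectRequest(bucket, key,
        new ByteArrayInputStream(data), meta));
  }
}
{code}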
[jira] [Commented] (HADOOP-18852) S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look like random IO
[ https://issues.apache.org/jira/browse/HADOOP-18852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755385#comment-17755385 ] Viraj Jasani commented on HADOOP-18852: --- {quote}for other reads, we may want a bigger prefetch count than 1, depending on: split start/end, file read policy (random, sequential, whole-file) {quote} this means we first need the prefetch read policy (HADOOP-18791), correct? > S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look > like random IO > -- > > Key: HADOOP-18852 > URL: https://issues.apache.org/jira/browse/HADOOP-18852 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.6 >Reporter: Steve Loughran >Priority: Major > > noticed in HADOOP-18184, but I think it's a big enough issue to be dealt with > separately. > # all seeks are lazy; no fetching is kicked off after an open > # the first read is treated as an out of order read, so cancels any active > reads (don't think there are any) and then only asks for 1 block
> {code}
> if (outOfOrderRead) {
>   LOG.debug("lazy-seek({})", getOffsetStr(readPos));
>   blockManager.cancelPrefetches();
>   // We prefetch only 1 block immediately after a seek operation.
>   prefetchCount = 1;
> }
> {code}
> * for any read fully we should prefetch all blocks in the range requested > * for other reads, we may want a bigger prefetch count than 1, depending on: > split start/end, file read policy (random, sequential, whole-file) > * also, if a read is in a block other than the current one, but which is > already being fetched or cached, is this really an OOO read to the extent > that outstanding fetches should be cancelled? -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
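If such a policy existed, the prefetch depth could be derived from it. The following is purely illustrative: the enum and planner are hypothetical sketches in the direction the comment suggests, not HADOOP-18791's actual design:
{code:java}
// Hypothetical read policy, along the lines the issue describes.
enum ReadPolicy { RANDOM, SEQUENTIAL, WHOLE_FILE }

final class PrefetchPlanner {
  // blocksRemaining: blocks between the current position and split/file end.
  static int prefetchCount(ReadPolicy policy, int blocksRemaining,
      int maxPrefetch) {
    switch (policy) {
      case RANDOM:
        return 1; // random IO: fetch only the block actually needed
      case WHOLE_FILE:
        return Math.min(blocksRemaining, maxPrefetch); // aggressive read-ahead
      case SEQUENTIAL:
      default:
        return Math.min(2, blocksRemaining); // modest read-ahead
    }
  }
}
{code}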
[jira] [Commented] (HADOOP-18852) S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look like random IO
[ https://issues.apache.org/jira/browse/HADOOP-18852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755384#comment-17755384 ] Viraj Jasani commented on HADOOP-18852: --- {quote}also, if a read is in a block other than the current one, but which is already being fetched or cached, is this really an OOO read to the extent that outstanding fetches should be cancelled? {quote} +1 to this; now that I checked some logs, I can see a lazy-seek for every first seek + read on a given block:
{code:java}
DEBUG prefetch.S3ACachingInputStream (S3ACachingInputStream.java:ensureCurrentBuffer(141)) - lazy-seek(0:0)
DEBUG prefetch.S3ACachingInputStream (S3ACachingInputStream.java:ensureCurrentBuffer(141)) - lazy-seek(4:40960)
DEBUG prefetch.S3ACachingInputStream (S3ACachingInputStream.java:ensureCurrentBuffer(141)) - lazy-seek(3:30720)
DEBUG prefetch.S3ACachingInputStream (S3ACachingInputStream.java:ensureCurrentBuffer(141)) - lazy-seek(2:20480)
{code}
but it's also a valid point: if the block is already being fetched or cached, why cancel the outstanding fetches? > S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look > like random IO > -- > > Key: HADOOP-18852 > URL: https://issues.apache.org/jira/browse/HADOOP-18852 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.6 >Reporter: Steve Loughran >Priority: Major > > noticed in HADOOP-18184, but I think it's a big enough issue to be dealt with > separately. > # all seeks are lazy; no fetching is kicked off after an open > # the first read is treated as an out of order read, so cancels any active > reads (don't think there are any) and then only asks for 1 block
> {code}
> if (outOfOrderRead) {
>   LOG.debug("lazy-seek({})", getOffsetStr(readPos));
>   blockManager.cancelPrefetches();
>   // We prefetch only 1 block immediately after a seek operation.
>   prefetchCount = 1;
> }
> {code}
> * for any read fully we should prefetch all blocks in the range requested > * for other reads, we may want a bigger prefetch count than 1, depending on: > split start/end, file read policy (random, sequential, whole-file) > * also, if a read is in a block other than the current one, but which is > already being fetched or cached, is this really an OOO read to the extent > that outstanding fetches should be cancelled? -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
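One way to avoid that cancellation would be to ask the block manager whether the target block is already in flight before treating the seek as out-of-order. A minimal sketch, in which the isFetchingOrCached() query is hypothetical and would need to be added to the real BlockManager:
{code:java}
// Sketch only: plan the prefetch count for a read landing in targetBlock.
final class OutOfOrderReadSketch {

  interface HypotheticalBlockManager {
    boolean isFetchingOrCached(int blockNumber);
    void cancelPrefetches();
  }

  static int planPrefetch(boolean outOfOrderRead, int targetBlock,
      HypotheticalBlockManager blockManager, int defaultPrefetchCount) {
    if (outOfOrderRead && !blockManager.isFetchingOrCached(targetBlock)) {
      // genuinely random jump: drop stale fetches, ask for just one block
      blockManager.cancelPrefetches();
      return 1;
    }
    // block already being fetched or cached: keep the pipeline running
    return defaultPrefetchCount;
  }
}
{code}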
[jira] [Commented] (HADOOP-18829) s3a prefetch LRU cache eviction metric
[ https://issues.apache.org/jira/browse/HADOOP-18829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17750035#comment-17750035 ] Viraj Jasani commented on HADOOP-18829: --- Sure thing, I think this can wait. Thanks > s3a prefetch LRU cache eviction metric > -- > > Key: HADOOP-18829 > URL: https://issues.apache.org/jira/browse/HADOOP-18829 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > > Follow-up from HADOOP-18291: > Add new IO statistics metric to capture s3a prefetch LRU cache eviction. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+
[ https://issues.apache.org/jira/browse/HADOOP-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17748981#comment-17748981 ] Viraj Jasani commented on HADOOP-18832: --- ITestS3AFileContextStatistics#testStatistics is flaky:
{code:java}
[ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.983 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextStatistics
[ERROR] testStatistics(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextStatistics) Time elapsed: 1.776 s <<< FAILURE!
java.lang.AssertionError: expected:<512> but was:<448>
  at org.junit.Assert.fail(Assert.java:89)
  at org.junit.Assert.failNotEquals(Assert.java:835)
  at org.junit.Assert.assertEquals(Assert.java:647)
  at org.junit.Assert.assertEquals(Assert.java:633)
  at org.apache.hadoop.fs.FCStatisticsBaseTest.testStatistics(FCStatisticsBaseTest.java:108)
{code}
This only happened once; I am now unable to reproduce it locally. > Upgrade aws-java-sdk to 1.12.499+ > - > > Key: HADOOP-18832 > URL: https://issues.apache.org/jira/browse/HADOOP-18832 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > aws sdk versions < 1.12.499 use a vulnerable version of netty and hence > show up in security CVE scans (CVE-2023-34462). The safe version for netty > is 4.1.94.Final and this is used by aws-java-sdk:1.12.499+ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+
[ https://issues.apache.org/jira/browse/HADOOP-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17748980#comment-17748980 ] Viraj Jasani commented on HADOOP-18832: --- Testing in progress: test results look good with -scale and -prefetch so far. Now running some encryption tests (bucket with algo: SSE-KMS). > Upgrade aws-java-sdk to 1.12.499+ > - > > Key: HADOOP-18832 > URL: https://issues.apache.org/jira/browse/HADOOP-18832 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > aws sdk versions < 1.12.499 use a vulnerable version of netty and hence > show up in security CVE scans (CVE-2023-34462). The safe version for netty > is 4.1.94.Final and this is used by aws-java-sdk:1.12.499+ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+
[ https://issues.apache.org/jira/browse/HADOOP-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18832: -- Description: aws sdk versions < 1.12.499 use a vulnerable version of netty and hence show up in security CVE scans (CVE-2023-34462). The safe version for netty is 4.1.94.Final and this is used by aws-java-sdk:1.12.499+ (was: aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence showing up in security CVE scans (CVE-2023-34462). The safe version for netty is 4.1.94.Final and this is used by aws-java-adk:1.12.499+) > Upgrade aws-java-sdk to 1.12.499+ > - > > Key: HADOOP-18832 > URL: https://issues.apache.org/jira/browse/HADOOP-18832 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > aws sdk versions < 1.12.499 use a vulnerable version of netty and hence > show up in security CVE scans (CVE-2023-34462). The safe version for netty > is 4.1.94.Final and this is used by aws-java-sdk:1.12.499+ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+
[ https://issues.apache.org/jira/browse/HADOOP-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18832: - Assignee: Viraj Jasani > Upgrade aws-java-sdk to 1.12.499+ > - > > Key: HADOOP-18832 > URL: https://issues.apache.org/jira/browse/HADOOP-18832 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > aws sdk versions < 1.12.499 use a vulnerable version of netty and hence > show up in security CVE scans (CVE-2023-34462). The safe version for netty > is 4.1.94.Final and this is used by aws-java-adk:1.12.499+ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+
Viraj Jasani created HADOOP-18832: - Summary: Upgrade aws-java-sdk to 1.12.499+ Key: HADOOP-18832 URL: https://issues.apache.org/jira/browse/HADOOP-18832 Project: Hadoop Common Issue Type: Sub-task Components: fs/s3 Reporter: Viraj Jasani aws sdk versions < 1.12.499 use a vulnerable version of netty and hence show up in security CVE scans (CVE-2023-34462). The safe version for netty is 4.1.94.Final and this is used by aws-java-adk:1.12.499+ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-18829) s3a prefetch LRU cache eviction metric
Viraj Jasani created HADOOP-18829: - Summary: s3a prefetch LRU cache eviction metric Key: HADOOP-18829 URL: https://issues.apache.org/jira/browse/HADOOP-18829 Project: Hadoop Common Issue Type: Sub-task Reporter: Viraj Jasani Assignee: Viraj Jasani Follow-up from HADOOP-18291: Add new IO statistics metric to capture s3a prefetch LRU cache eviction. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-18809) s3a prefetch read/write file operations should guard channel close
Viraj Jasani created HADOOP-18809: - Summary: s3a prefetch read/write file operations should guard channel close Key: HADOOP-18809 URL: https://issues.apache.org/jira/browse/HADOOP-18809 Project: Hadoop Common Issue Type: Sub-task Reporter: Viraj Jasani Assignee: Viraj Jasani As per Steve's suggestion on the s3a prefetch LRU cache work, the s3a prefetch disk-based cache file read and write operations should guard the close of FileChannel and WritableByteChannel, closing them even if the read/write operations throw IOException. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
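The guarded pattern itself is straightforward; a minimal sketch using try-with-resources (the reader method is illustrative, not the actual patch):
{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch: read a cached block from disk, guaranteeing the channel is closed
// even when read() throws.
public final class GuardedChannelRead {
  public static int readBlock(Path cacheFile, ByteBuffer buffer)
      throws IOException {
    try (FileChannel channel =
        FileChannel.open(cacheFile, StandardOpenOption.READ)) {
      return channel.read(buffer);
    } // channel.close() runs here even if read() threw an IOException
  }
}
{code}
The same pattern applies on the write path with a WritableByteChannel.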
[jira] [Comment Edited] (HADOOP-18805) s3a large file prefetch tests are too slow, don't validate data
[ https://issues.apache.org/jira/browse/HADOOP-18805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17743344#comment-17743344 ] Viraj Jasani edited comment on HADOOP-18805 at 7/17/23 8:15 PM: sorry Steve, I was not aware you had already created this Jira; I created a PR to let the LRU tests use small files rather than landsat: [https://github.com/apache/hadoop/pull/5851] {quote}also, and this is very, very important, they can't validate the data {quote} I was about to create a sub-task for this, as I am planning to refactor Entry into its own class and have the contents of the linked list tested in a UT (discussed with Mehakmeet in the earlier part of the review). I can take this up as a new sub-task, and for the current Jira we can focus on tests using small files for a better break-down? PR review discussion: [https://github.com/apache/hadoop/pull/5754#discussion_r1247476231] was (Author: vjasani): sorry Steve, i was not aware you already created this Jira, i created addendum for letting LRU test depend on small file rather than large one: [https://github.com/apache/hadoop/pull/5843] {quote}also, and this is very, very important, they can't validate the data {quote} i was about to create a sub-task for this as i am planning to refactor Entry to it's own class and have the contents of the linked list data tested in UT (discussed with Mehakmeet in the earlier part of the review). maybe i can do the work as part of this Jira. are you fine with? * the above addendum PR for using small file in the test (so that we don't need to put the test under -scale) * this Jira to refactor Entry and allowing a UT to test the contents of the linked list if you think above PR is not good for an addendum and should rather be linked to this Jira, i can change PR title to reflect this Jira number and i can create another sub-task to write simple UT that can test contents of the linked list from head to tail. > s3a large file prefetch tests are too slow, don't validate data > --- > > Key: HADOOP-18805 > URL: https://issues.apache.org/jira/browse/HADOOP-18805 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.9 >Reporter: Steve Loughran >Priority: Major > Labels: pull-request-available > > the large file prefetch tests (including LRU cache eviction) are really slow. > moving under -scale may hide the problem for most runs, but they are still > too slow, can time out, etc etc. > also, and this is very, very important, they can't validate the data. > Better: > * test on smaller files by setting a very small block size (1k bytes or less) > just to force paged reads of a small 16k file. > * with known contents so the values of all forms of read can be validated > * maybe the LRU tests can work with a fake remote object which can then be > used in a unit test > * extend one of the huge file tests to read from there -including s3-CSE > encryption coverage. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
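The small-file approach in the first bullet needs only a tiny block size in the test configuration; a sketch assuming the standard S3A prefetch keys (the helper class is illustrative):
{code:java}
import org.apache.hadoop.conf.Configuration;

// Sketch: shrink the prefetch block size so a small file still spans many
// blocks, forcing paged reads without a multi-GB test object.
public final class SmallFilePrefetchTestSetup {
  public static Configuration smallBlockConf(Configuration conf) {
    conf.setBoolean("fs.s3a.prefetch.enabled", true);
    conf.setInt("fs.s3a.prefetch.block.size", 1024); // 1 KB blocks
    // a 16 KB file of known contents now spans 16 blocks, enough to
    // exercise paging and LRU eviction while validating every byte read
    return conf;
  }
}
{code}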
[jira] [Comment Edited] (HADOOP-18805) s3a large file prefetch tests are too slow, don't validate data
[ https://issues.apache.org/jira/browse/HADOOP-18805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17743344#comment-17743344 ] Viraj Jasani edited comment on HADOOP-18805 at 7/15/23 6:48 AM: sorry Steve, I was not aware you had already created this Jira; I created an addendum to let the LRU test depend on a small file rather than a large one: [https://github.com/apache/hadoop/pull/5843] {quote}also, and this is very, very important, they can't validate the data {quote} I was about to create a sub-task for this, as I am planning to refactor Entry into its own class and have the contents of the linked list tested in a UT (discussed with Mehakmeet in the earlier part of the review). Maybe I can do the work as part of this Jira. Are you fine with: * the above addendum PR for using a small file in the test (so that we don't need to put the test under -scale) * this Jira refactoring Entry and allowing a UT to test the contents of the linked list? If you think the above PR is not good for an addendum and should rather be linked to this Jira, I can change the PR title to reflect this Jira number and create another sub-task to write a simple UT that tests the contents of the linked list from head to tail. was (Author: vjasani): sorry Steve, i was not aware you already created this Jira, i created addendum for letting LRU test depend on small file rather than large one: [https://github.com/apache/hadoop/pull/5843] {quote}also, and this is very, very important, they can't validate the data {quote} i was about to create a sub-task for this as i am planning to refactor Entry to it's own class and have the contents of the linked list data tested in UT (discussed with Mehakmeet in the earlier part of the review). maybe i can do the work as part of this Jira. are you fine with the above addendum PR taking care of using small file in the test (so that we don't need to put the test under -scale) and this Jira being used for refactoring Entry and allowing a UT to test the contents of the linked list? > s3a large file prefetch tests are too slow, don't validate data > --- > > Key: HADOOP-18805 > URL: https://issues.apache.org/jira/browse/HADOOP-18805 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.9 >Reporter: Steve Loughran >Priority: Major > > the large file prefetch tests (including LRU cache eviction) are really slow. > moving under -scale may hide the problem for most runs, but they are still > too slow, can time out, etc etc. > also, and this is very, very important, they can't validate the data. > Better: > * test on smaller files by setting a very small block size (1k bytes or less) > just to force paged reads of a small 16k file. > * with known contents so the values of all forms of read can be validated > * maybe the LRU tests can work with a fake remote object which can then be > used in a unit test > * extend one of the huge file tests to read from there -including s3-CSE > encryption coverage. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org