[GitHub] [hadoop] jianghuazhu opened a new pull request #2741: HDFS-15855. Solve the problem of incorrect EC progress when loading FsImage.
jianghuazhu opened a new pull request #2741: URL: https://github.com/apache/hadoop/pull/2741 …Image. ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.) For more details, please see https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-17546) Update Description of hadoop-http-auth-signature-secret in HttpAuthentication.md
[ https://issues.apache.org/jira/browse/HADOOP-17546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated HADOOP-17546: --- Fix Version/s: 3.2.3 2.10.2 3.1.5 3.4.0 3.3.1 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to all the active branches. Thank you [~Sushma_28] for your contribution! > Update Description of hadoop-http-auth-signature-secret in > HttpAuthentication.md > > > Key: HADOOP-17546 > URL: https://issues.apache.org/jira/browse/HADOOP-17546 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Ravuri Sushma sree >Assignee: Ravuri Sushma sree >Priority: Minor > Fix For: 3.3.1, 3.4.0, 3.1.5, 2.10.2, 3.2.3 > > Attachments: HADOOP-17546.001.patch > > > The HttpAuthentication.md document says "The same secret should be used for > all nodes in the cluster, ResourceManager, NameNode, DataNode and > NodeManager" but the secret should be different for each service. This > description is updated in > [core-default.xml|https://github.com/apache/hadoop/commit/d82009599a2e9f48050e0c41440b36c759ec068f#diff-268b9968a4db21ac6eeb7bcaef10e4db744d00ba53989fc7251bb3e8d9eac7df] > but has to be updated in HttpAuthentication.md as well. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
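The corrected guidance (a distinct secret per service, rather than one shared secret) can be illustrated with a hedged core-site.xml sketch; the secret-file path below is purely an example, not a mandated location:

```xml
<!-- Illustrative sketch only: each daemon (NameNode, DataNode, ResourceManager,
     NodeManager) would point this property at its own secret file. -->
<property>
  <name>hadoop.http.authentication.signature.secret.file</name>
  <value>/etc/security/namenode-http-signature-secret</value>
</property>
```

The property name comes from core-default.xml; only the value differs per service.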
[jira] [Commented] (HADOOP-17546) Update Description of hadoop-http-auth-signature-secret in HttpAuthentication.md
[ https://issues.apache.org/jira/browse/HADOOP-17546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295008#comment-17295008 ] Akira Ajisaka commented on HADOOP-17546: +1
[jira] [Work logged] (HADOOP-17563) Update Bouncy Castle to 1.68
[ https://issues.apache.org/jira/browse/HADOOP-17563?focusedWorklogId=560834=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560834 ] ASF GitHub Bot logged work on HADOOP-17563: --- Author: ASF GitHub Bot Created on: 04/Mar/21 05:47 Start Date: 04/Mar/21 05:47 Worklog Time Spent: 10m Work Description: tasanuma opened a new pull request #2740: URL: https://github.com/apache/hadoop/pull/2740 Issue Time Tracking --- Worklog Id: (was: 560834) Time Spent: 0.5h (was: 20m) > Update Bouncy Castle to 1.68 > > > Key: HADOOP-17563 > URL: https://issues.apache.org/jira/browse/HADOOP-17563 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Bouncy Castle 1.60 has Hash Collision Vulnerability. Let's update to 1.68. > https://www.sourceclear.com/vulnerability-database/security/hash-collision/java/sid-6009
[jira] [Work logged] (HADOOP-17563) Update Bouncy Castle to 1.68
[ https://issues.apache.org/jira/browse/HADOOP-17563?focusedWorklogId=560835=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560835 ] ASF GitHub Bot logged work on HADOOP-17563: --- Author: ASF GitHub Bot Created on: 04/Mar/21 05:47 Start Date: 04/Mar/21 05:47 Worklog Time Spent: 10m Work Description: aajisaka closed pull request #2740: URL: https://github.com/apache/hadoop/pull/2740 Issue Time Tracking --- Worklog Id: (was: 560835) Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (HADOOP-17563) Update Bouncy Castle to 1.68
[ https://issues.apache.org/jira/browse/HADOOP-17563?focusedWorklogId=560833=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560833 ] ASF GitHub Bot logged work on HADOOP-17563: --- Author: ASF GitHub Bot Created on: 04/Mar/21 05:47 Start Date: 04/Mar/21 05:47 Worklog Time Spent: 10m Work Description: aajisaka commented on pull request #2740: URL: https://github.com/apache/hadoop/pull/2740#issuecomment-790311867 LGTM, pending Jenkins Issue Time Tracking --- Worklog Id: (was: 560833) Time Spent: 20m (was: 10m)
[GitHub] [hadoop] aajisaka closed pull request #2740: HADOOP-17563. Update Bouncy Castle to 1.68.
aajisaka closed pull request #2740: URL: https://github.com/apache/hadoop/pull/2740
[GitHub] [hadoop] aajisaka commented on pull request #2740: HADOOP-17563. Update Bouncy Castle to 1.68.
aajisaka commented on pull request #2740: URL: https://github.com/apache/hadoop/pull/2740#issuecomment-790311867 LGTM, pending Jenkins
[jira] [Created] (HADOOP-17564) Fix typo in UnixShellGuide.html
Takanobu Asanuma created HADOOP-17564: - Summary: Fix typo in UnixShellGuide.html Key: HADOOP-17564 URL: https://issues.apache.org/jira/browse/HADOOP-17564 Project: Hadoop Common Issue Type: Bug Reporter: Takanobu Asanuma The file name of hadoop-user-functions.sh.examples should be hadoop-user-functions.sh.example in UnixShellGuide.html. This is reported by [~aref.kh] in HADOOP-17561.
[GitHub] [hadoop] tomscut commented on pull request #2668: HDFS-15808. Add metrics for FSNamesystem read/write lock hold long time
tomscut commented on pull request #2668: URL: https://github.com/apache/hadoop/pull/2668#issuecomment-790235902 Hi @shvachko , I uploaded a patch for branch-3.3 in JIRA, please help to review it. Thank you.
[jira] [Work logged] (HADOOP-17563) Update Bouncy Castle to 1.68
[ https://issues.apache.org/jira/browse/HADOOP-17563?focusedWorklogId=560785=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560785 ] ASF GitHub Bot logged work on HADOOP-17563: --- Author: ASF GitHub Bot Created on: 04/Mar/21 02:06 Start Date: 04/Mar/21 02:06 Worklog Time Spent: 10m Work Description: tasanuma opened a new pull request #2740: URL: https://github.com/apache/hadoop/pull/2740 Issue Time Tracking --- Worklog Id: (was: 560785) Remaining Estimate: 0h Time Spent: 10m
[jira] [Updated] (HADOOP-17563) Update Bouncy Castle to 1.68
[ https://issues.apache.org/jira/browse/HADOOP-17563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HADOOP-17563: Labels: pull-request-available (was: )
[jira] [Updated] (HADOOP-17563) Update Bouncy Castle to 1.68
[ https://issues.apache.org/jira/browse/HADOOP-17563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takanobu Asanuma updated HADOOP-17563: -- Status: Patch Available (was: Open)
[GitHub] [hadoop] tasanuma opened a new pull request #2740: HADOOP-17563. Update Bouncy Castle to 1.68.
tasanuma opened a new pull request #2740: URL: https://github.com/apache/hadoop/pull/2740
[jira] [Created] (HADOOP-17563) Update Bouncy Castle to 1.68
Takanobu Asanuma created HADOOP-17563: - Summary: Update Bouncy Castle to 1.68 Key: HADOOP-17563 URL: https://issues.apache.org/jira/browse/HADOOP-17563 Project: Hadoop Common Issue Type: Improvement Reporter: Takanobu Asanuma Assignee: Takanobu Asanuma Bouncy Castle 1.60 has Hash Collision Vulnerability. Let's update to 1.68. https://www.sourceclear.com/vulnerability-database/security/hash-collision/java/sid-6009
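A version bump of this kind is typically a one-line change to the Maven build. A hedged sketch of what the change looks like; the `bouncycastle.version` property name follows the convention of Hadoop's root pom and is assumed here rather than quoted from the patch:

```xml
<!-- Sketch of the proposed bump in the root pom.xml; the property name
     is an assumption, not copied from the actual pull request. -->
<properties>
  <bouncycastle.version>1.68</bouncycastle.version>
</properties>
```

Modules that depend on `bcprov-jdk15on`/`bcpkix-jdk15on` would then pick up 1.68 through that single property.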
[jira] [Work logged] (HADOOP-17552) Change ipc.client.rpc-timeout.ms from 0 to 120000 by default to avoid potential hang
[ https://issues.apache.org/jira/browse/HADOOP-17552?focusedWorklogId=560764=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560764 ] ASF GitHub Bot logged work on HADOOP-17552: --- Author: ASF GitHub Bot Created on: 04/Mar/21 01:08 Start Date: 04/Mar/21 01:08 Worklog Time Spent: 10m Work Description: ferhui commented on pull request #2727: URL: https://github.com/apache/hadoop/pull/2727#issuecomment-790202955 @functioner As @iwasakims said, you can add `conf.setInt(CommonConfigurationKeys.IPC_CLIENT_RPC_TIMEOUT_KEY, 0);` before `assertEquals(Client.getTimeout(config), -1);` Issue Time Tracking --- Worklog Id: (was: 560764) Time Spent: 8h 20m (was: 8h 10m) > Change ipc.client.rpc-timeout.ms from 0 to 120000 by default to avoid > potential hang > > > Key: HADOOP-17552 > URL: https://issues.apache.org/jira/browse/HADOOP-17552 > Project: Hadoop Common > Issue Type: Bug > Components: ipc >Affects Versions: 3.2.2 >Reporter: Haoze Wu >Priority: Major > Labels: pull-request-available > Time Spent: 8h 20m > Remaining Estimate: 0h > > We are doing some systematic fault injection testing in Hadoop-3.2.2 and > when we try to run a client (e.g., `bin/hdfs dfs -ls /`) to our HDFS cluster > (1 NameNode, 2 DataNodes), the client gets stuck forever. After some > investigation, we believe that it’s a bug in `hadoop.ipc.Client` because the > read method of `hadoop.ipc.Client$Connection$PingInputStream` keeps > swallowing `java.net.SocketTimeoutException` due to the mistaken usage of the > `rpcTimeout` configuration in the `handleTimeout` method. > *Reproduction* > Start HDFS with the default configuration. Then execute a client (we used > the command `bin/hdfs dfs -ls /` in the terminal). 
While HDFS is trying to > accept the client’s socket, inject a socket error (java.net.SocketException > or java.io.IOException), specifically at line 1402 (line 1403 or 1404 will > also work). > We prepare the scripts for reproduction in a gist > ([https://gist.github.com/functioner/08bcd86491b8ff32860eafda8c140e24]). > *Diagnosis* > When the NameNode tries to accept a client’s socket, basically there are > 4 steps: > # accept the socket (line 1400) > # configure the socket (line 1402-1404) > # make the socket a Reader (after line 1404) > # swallow the possible IOException in line 1350 > {code:java} > //hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java > public void run() { > while (running) { > SelectionKey key = null; > try { > getSelector().select(); > Iterator<SelectionKey> iter = > getSelector().selectedKeys().iterator(); > while (iter.hasNext()) { > key = iter.next(); > iter.remove(); > try { > if (key.isValid()) { > if (key.isAcceptable()) > doAccept(key); > } > } catch (IOException e) { // line 1350 > } > key = null; > } > } catch (OutOfMemoryError e) { > // ... > } catch (Exception e) { > // ... > } > } > } > void doAccept(SelectionKey key) throws InterruptedException, IOException, > OutOfMemoryError { > ServerSocketChannel server = (ServerSocketChannel) key.channel(); > SocketChannel channel; > while ((channel = server.accept()) != null) { // line 1400 > channel.configureBlocking(false); // line 1402 > channel.socket().setTcpNoDelay(tcpNoDelay); // line 1403 > channel.socket().setKeepAlive(true); // line 1404 > > Reader reader = getReader(); > Connection c = connectionManager.register(channel, > this.listenPort, this.isOnAuxiliaryPort); > // If the connectionManager can't take it, close the connection. 
> if (c == null) { > if (channel.isOpen()) { > IOUtils.cleanup(null, channel); > } > connectionManager.droppedConnections.getAndIncrement(); > continue; > } > key.attach(c); // so closeCurrentConnection can get the object > reader.addConnection(c); > } > } > {code} > When
[GitHub] [hadoop] ferhui commented on pull request #2727: HADOOP-17552. Change ipc.client.rpc-timeout.ms from 0 to 120000 by default to avoid potential hang
ferhui commented on pull request #2727: URL: https://github.com/apache/hadoop/pull/2727#issuecomment-790202955 @functioner As @iwasakims said, you can add `conf.setInt(CommonConfigurationKeys.IPC_CLIENT_RPC_TIMEOUT_KEY, 0);` before `assertEquals(Client.getTimeout(config), -1);`
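The default change under discussion can be sketched as a core-site.xml fragment; this mirrors the issue title (0 means "no RPC timeout, ping forever", the proposal bounds each call at 120000 ms):

```xml
<!-- Sketch of the proposed default from HADOOP-17552: bound each RPC
     call at two minutes instead of pinging indefinitely (milliseconds). -->
<property>
  <name>ipc.client.rpc-timeout.ms</name>
  <value>120000</value>
</property>
```

With a value of 0, `PingInputStream` keeps swallowing `SocketTimeoutException` and retrying, which is the hang described in the issue.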
[jira] [Commented] (HADOOP-17546) Update Description of hadoop-http-auth-signature-secret in HttpAuthentication.md
[ https://issues.apache.org/jira/browse/HADOOP-17546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17294764#comment-17294764 ] Ravuri Sushma sree commented on HADOOP-17546: - I have uploaded a patch correcting the description in HttpAuthentication.md , Could you please Review [~aajisaka]
[GitHub] [hadoop] billierinaldi commented on a change in pull request #1925: HADOOP-16948. Support single writer dirs.
billierinaldi commented on a change in pull request #1925: URL: https://github.com/apache/hadoop/pull/1925#discussion_r586705157 ## File path: hadoop-tools/hadoop-azure/src/site/markdown/abfs.md ## @@ -877,6 +877,21 @@ enabled for your Azure Storage account." The directories can be specified as comma separated values. By default the value is "/hbase" +### Single Writer Options +`fs.azure.singlewriter.directories`: Directories for single writer support +can be specified comma separated in this config. By default, multiple +clients will be able to write to the same file simultaneously. When writing +to files contained within the directories specified in this config, the +client will obtain a lease on the file that will prevent any other clients +from writing to the file. The lease will be renewed by the client until the +output stream is closed, after which it will be released. To revoke a client's +write access for a file, the AzureBlobFilesystem breakLease method may be + called. + +`fs.azure.lease.threads`: This is the size of the thread pool that will be +used for lease operations for single writer directories. By default the value +is 0, so it must be set to at least 1 to support single writer directories. Review comment: Correction, single writer dirs accepts a list of paths, not URIs.
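Putting the two keys from the documentation diff together, a hedged configuration sketch (the directory list is illustrative; per the review comment, entries are paths, not URIs, and `fs.azure.lease.threads` must be at least 1 for leases to be taken):

```xml
<!-- Illustrative sketch based on the abfs.md diff in PR #1925;
     the directory values are example paths, not recommendations. -->
<property>
  <name>fs.azure.singlewriter.directories</name>
  <value>/hbase,/write-once-logs</value>
</property>
<property>
  <name>fs.azure.lease.threads</name>
  <value>2</value>
</property>
```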
[jira] [Work logged] (HADOOP-16948) ABFS: Support single writer dirs
[ https://issues.apache.org/jira/browse/HADOOP-16948?focusedWorklogId=560602=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560602 ] ASF GitHub Bot logged work on HADOOP-16948: --- Author: ASF GitHub Bot Created on: 03/Mar/21 19:09 Start Date: 03/Mar/21 19:09 Worklog Time Spent: 10m Work Description: billierinaldi commented on a change in pull request #1925: URL: https://github.com/apache/hadoop/pull/1925#discussion_r586705157 Issue Time Tracking --- Worklog Id: (was: 560602) Time Spent: 3h 40m (was: 3.5h) > ABFS: Support single writer dirs > > > Key: HADOOP-16948 > URL: https://issues.apache.org/jira/browse/HADOOP-16948 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Billie Rinaldi >Assignee: Billie Rinaldi >Priority: Minor > Labels: abfsactive, pull-request-available > Time Spent: 3h 40m > Remaining Estimate: 0h > > This would allow some directories to be configured as single writer > directories. The ABFS driver would obtain a lease when creating or opening a > file for writing and would automatically renew the lease and release the > lease when closing the file.
[jira] [Created] (HADOOP-17562) Provide mechanism for explicitly specifying the compression codec for input files
Nicholas Chammas created HADOOP-17562: - Summary: Provide mechanism for explicitly specifying the compression codec for input files Key: HADOOP-17562 URL: https://issues.apache.org/jira/browse/HADOOP-17562 Project: Hadoop Common Issue Type: Improvement Reporter: Nicholas Chammas I come to you via SPARK-29280. I am looking for the file _input_ equivalents of the following settings: {code:java} mapreduce.output.fileoutputformat.compress mapreduce.map.output.compress{code} Right now, I understand that Hadoop infers the codec to use when reading a file from the file's extension. However, in some cases the files may have the incorrect extension or no extension. There are links to some examples from SPARK-29280. Ideally, you should be able to explicitly specify the codec to use to read those files. I don't believe that's possible today. Instead, the current workaround appears to be to [create a custom codec class|https://stackoverflow.com/a/17152167/877069] and override the getDefaultExtension method to specify the extension to expect. Does it make sense to offer an explicit way to select the compression codec for file input, mirroring how things work for file output? 
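The extension-based inference the issue describes can be modeled in a few lines of plain Java. This is a hypothetical sketch of the behavior, not the actual `CompressionCodecFactory` code; the class and method names are illustrative. It shows why a file with a missing or wrong extension gets no codec, which is exactly the gap the issue asks to close:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CodecGuess {
    // Suffix-to-codec table, modeling how codecs register a default extension.
    static final Map<String, String> CODECS = new LinkedHashMap<>();
    static {
        CODECS.put(".gz", "GzipCodec");
        CODECS.put(".bz2", "BZip2Codec");
    }

    // Returns the codec name implied by the file suffix, or null when the
    // extension is missing or unknown -- the case the workaround (overriding
    // getDefaultExtension in a custom codec class) is meant to handle.
    static String inferCodec(String path) {
        for (Map.Entry<String, String> e : CODECS.entrySet()) {
            if (path.endsWith(e.getKey())) {
                return e.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(inferCodec("part-0000.gz"));   // suffix matches
        System.out.println(inferCodec("part-0000.data")); // no codec inferred
    }
}
```

An explicit input-codec setting would bypass this suffix lookup entirely, mirroring how the output side already names its codec in configuration.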
[jira] [Work logged] (HADOOP-17511) Add an Audit plugin point for S3A auditing/context
[ https://issues.apache.org/jira/browse/HADOOP-17511?focusedWorklogId=560560=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560560 ] ASF GitHub Bot logged work on HADOOP-17511: --- Author: ASF GitHub Bot Created on: 03/Mar/21 17:21 Start Date: 03/Mar/21 17:21 Worklog Time Spent: 10m Work Description: steveloughran commented on a change in pull request #2675: URL: https://github.com/apache/hadoop/pull/2675#discussion_r586621183 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java ## @@ -4532,8 +4814,10 @@ private HeaderProcessing getHeaderProcessing() { public RemoteIterator<LocatedFileStatus> listLocatedStatus(final Path f, final PathFilter filter) throws FileNotFoundException, IOException { -entryPoint(INVOCATION_LIST_LOCATED_STATUS); Path path = qualify(f); +// Unless that iterator is closed, the iterator wouldn't be closed +// there. +entryPoint(INVOCATION_LIST_LOCATED_STATUS, path); Review comment: Latest PR is moving back to try-with-resource/some closure mechanism. As close() only deactivates the span, it does not close it. Issue Time Tracking --- Worklog Id: (was: 560560) Time Spent: 8h 20m (was: 8h 10m) > Add an Audit plugin point for S3A auditing/context > -- > > Key: HADOOP-17511 > URL: https://issues.apache.org/jira/browse/HADOOP-17511 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.3.1 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Labels: pull-request-available > Time Spent: 8h 20m > Remaining Estimate: 0h > > Add a way for auditing tools to correlate S3 object calls with Hadoop FS API > calls. > Initially just to log/forward to an auditing service. 
> Later: let us attach them as parameters in S3 requests, such as opentrace > headers or (my initial idea: http referrer header -where it will get into > the log) > Challenges > * ensuring the audit span is created for every public entry point. That will > have to include those used in s3guard tools, some defacto public APIs > * and not re-entered for active spans. s3A code must not call back into the > FS API points > * Propagation across worker threads
[GitHub] [hadoop] steveloughran commented on a change in pull request #2675: HADOOP-17511. Add audit/telemetry logging to S3A connector
steveloughran commented on a change in pull request #2675: URL: https://github.com/apache/hadoop/pull/2675#discussion_r586621183 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java ## @@ -4532,8 +4814,10 @@ private HeaderProcessing getHeaderProcessing() { public RemoteIterator<LocatedFileStatus> listLocatedStatus(final Path f, final PathFilter filter) throws FileNotFoundException, IOException { -entryPoint(INVOCATION_LIST_LOCATED_STATUS); Path path = qualify(f); +// Unless that iterator is closed, the iterator wouldn't be closed +// there. +entryPoint(INVOCATION_LIST_LOCATED_STATUS, path); Review comment: Latest PR is moving back to try-with-resource/some closure mechanism. As close() only deactivates the span, it does not close it.
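The span lifecycle described in the review comment, where a try-with-resources `close()` only deactivates the span rather than ending it, can be sketched in plain Java. All names here are illustrative, not the actual S3A audit API:

```java
// Hedged sketch of the "close() deactivates, does not end" contract from
// the review comment. AuditSpan and its methods are hypothetical names.
public class SpanDemo {
    static class AuditSpan implements AutoCloseable {
        boolean active;
        boolean ended;
        AuditSpan activate() { active = true; return this; }
        @Override public void close() { active = false; } // deactivate only
        void end() { ended = true; }                      // explicit lifecycle end
    }

    public static void main(String[] args) {
        AuditSpan span = new AuditSpan();
        try (AuditSpan s = span.activate()) {
            // operations issued here would be attributed to the active span
        }
        // after the block: deactivated for this thread, but not ended,
        // so an iterator or worker thread could still reactivate/use it
        System.out.println(span.active + " " + span.ended); // prints "false false"
        span.end();
        System.out.println(span.ended); // prints "true"
    }
}
```

This separation is what makes try-with-resources safe for entry points that return long-lived iterators: leaving the block stops attribution without tearing the span down.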
[GitHub] [hadoop] tomscut commented on pull request #2739: HDFS-15870. Remove unused configuration dfs.namenode.stripe.min
tomscut commented on pull request #2739: URL: https://github.com/apache/hadoop/pull/2739#issuecomment-789750740 Thanks @tasanuma for merging.
[jira] [Commented] (HADOOP-17546) Update Description of hadoop-http-auth-signature-secret in HttpAuthentication.md
[ https://issues.apache.org/jira/browse/HADOOP-17546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17294547#comment-17294547 ] Hadoop QA commented on HADOOP-17546: ---

+1 overall

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 1m 16s | | Docker mode activated. |
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| 0 | markdownlint | 0m 0s | | markdownlint was not available. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
|| || trunk Compile Tests || || ||
| +1 | mvninstall | 23m 2s | | trunk passed |
| +1 | mvnsite | 1m 17s | | trunk passed |
| +1 | shadedclient | 39m 26s | | branch has no errors when building and testing our client artifacts. |
|| || Patch Compile Tests || || ||
| +1 | mvninstall | 0m 52s | | the patch passed |
| +1 | mvnsite | 1m 10s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 15m 12s | | patch has no errors when building and testing our client artifacts. |
|| || Other Tests || || ||
| +1 | asflicense | 0m 32s | | The patch does not generate ASF License warnings. |
| | | 59m 4s | | |

|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/158/artifact/out/Dockerfile |
| JIRA Issue | HADOOP-17546 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13021514/HADOOP-17546.001.patch |
| Optional Tests | dupname asflicense mvnsite markdownlint |
| uname | Linux c1f0f9841f2a 4.15.0-126-generic #129-Ubuntu SMP Mon Nov 23 18:53:38 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 8af56de1fa7 |
| Max. process+thread count | 515 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/158/console |
| versions | git=2.25.1 maven=3.6.3 |
| Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

> Update Description of hadoop-http-auth-signature-secret in HttpAuthentication.md
>
> Key: HADOOP-17546
> URL: https://issues.apache.org/jira/browse/HADOOP-17546
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Ravuri Sushma sree
> Assignee: Ravuri Sushma sree
> Priority: Minor
> Attachments: HADOOP-17546.001.patch
>
> The HttpAuthentication.md document says "The same secret should be used for all nodes in the cluster, ResourceManager, NameNode, DataNode and NodeManager" but the secret should be different for each service. This description is updated in [core-default.xml|https://github.com/apache/hadoop/commit/d82009599a2e9f48050e0c41440b36c759ec068f#diff-268b9968a4db21ac6eeb7bcaef10e4db744d00ba53989fc7251bb3e8d9eac7df] but has to be updated in HttpAuthentication.md as well.
[GitHub] [hadoop] tasanuma commented on pull request #2739: HDFS-15870. Remove unused configuration dfs.namenode.stripe.min
tasanuma commented on pull request #2739: URL: https://github.com/apache/hadoop/pull/2739#issuecomment-789717373 Thanks for your contribution, @tomscut.
[GitHub] [hadoop] tasanuma merged pull request #2739: HDFS-15870. Remove unused configuration dfs.namenode.stripe.min
tasanuma merged pull request #2739: URL: https://github.com/apache/hadoop/pull/2739
[GitHub] [hadoop] tasanuma commented on pull request #2739: HDFS-15870. Remove unused configuration dfs.namenode.stripe.min
tasanuma commented on pull request #2739: URL: https://github.com/apache/hadoop/pull/2739#issuecomment-789716946 The failed tests are not related. I will file them if necessary.
[jira] [Resolved] (HADOOP-17561) Where is hadoop-user-functions.sh.examples ?
[ https://issues.apache.org/jira/browse/HADOOP-17561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takanobu Asanuma resolved HADOOP-17561. --- Resolution: Invalid

> Where is hadoop-user-functions.sh.examples ?
>
> Key: HADOOP-17561
> URL: https://issues.apache.org/jira/browse/HADOOP-17561
> Project: Hadoop Common
> Issue Type: Improvement
> Components: common
> Affects Versions: 3.1.3
> Reporter: Aref Khandan
> Priority: Critical
>
> in UnixShellGuide page [https://hadoop.apache.org/docs/r3.1.3/hadoop-project-dist/hadoop-common/UnixShellGuide.html] it is mentioned that {{Examples of function replacement are in the ??hadoop-user-functions.sh.examples?? file.}} I've searched through the whole Hadoop directory and source code, but there is no trace of this file except: [hadoop-common-project/hadoop-common/src/site/markdown/UnixShellGuide.md] which only mentions ??Examples of function replacement are in the `hadoop-user-functions.sh.examples` file.??
[jira] [Commented] (HADOOP-17561) Where is hadoop-user-functions.sh.examples ?
[ https://issues.apache.org/jira/browse/HADOOP-17561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17294524#comment-17294524 ] Takanobu Asanuma commented on HADOOP-17561: --- The filename is a typo. Please refer to the following file. [https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/conf/hadoop-user-functions.sh.example] BTW, JIRA is for reporting bugs. Please use mailing lists for questions.

> Where is hadoop-user-functions.sh.examples ?
>
> Key: HADOOP-17561
> URL: https://issues.apache.org/jira/browse/HADOOP-17561
> Project: Hadoop Common
> Issue Type: Improvement
> Components: common
> Affects Versions: 3.1.3
> Reporter: Aref Khandan
> Priority: Critical
[jira] [Updated] (HADOOP-17546) Update Description of hadoop-http-auth-signature-secret in HttpAuthentication.md
[ https://issues.apache.org/jira/browse/HADOOP-17546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravuri Sushma sree updated HADOOP-17546: --- Attachment: HADOOP-17546.001.patch Status: Patch Available (was: Open)

> Update Description of hadoop-http-auth-signature-secret in HttpAuthentication.md
>
> Key: HADOOP-17546
> URL: https://issues.apache.org/jira/browse/HADOOP-17546
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Ravuri Sushma sree
> Assignee: Ravuri Sushma sree
> Priority: Minor
> Attachments: HADOOP-17546.001.patch
[jira] [Created] (HADOOP-17561) Where is hadoop-user-functions.sh.examples ?
Aref Khandan created HADOOP-17561: --- Summary: Where is hadoop-user-functions.sh.examples ? Key: HADOOP-17561 URL: https://issues.apache.org/jira/browse/HADOOP-17561 Project: Hadoop Common Issue Type: Improvement Components: common Affects Versions: 3.1.3 Reporter: Aref Khandan

in UnixShellGuide page [https://hadoop.apache.org/docs/r3.1.3/hadoop-project-dist/hadoop-common/UnixShellGuide.html] it is mentioned that {{Examples of function replacement are in the ??hadoop-user-functions.sh.examples?? file.}} I've searched through the whole Hadoop directory and source code, but there is no trace of this file except: [hadoop-common-project/hadoop-common/src/site/markdown/UnixShellGuide.md] which only mentions ??Examples of function replacement are in the `hadoop-user-functions.sh.examples` file.??
[jira] [Work logged] (HADOOP-17552) Change ipc.client.rpc-timeout.ms from 0 to 120000 by default to avoid potential hang
[ https://issues.apache.org/jira/browse/HADOOP-17552?focusedWorklogId=560418&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560418 ] ASF GitHub Bot logged work on HADOOP-17552: --- Author: ASF GitHub Bot Created on: 03/Mar/21 11:50 Start Date: 03/Mar/21 11:50 Worklog Time Spent: 10m Work Description: iwasakims commented on pull request #2727: URL: https://github.com/apache/hadoop/pull/2727#issuecomment-789658781

@functioner The `TestIPC#testClientGetTimeout` tests the deprecated `Client#getTimeout`, which was used before `ipc.client.rpc-timeout.ms` and `Client#getRpcTimeout` were introduced. Based on the context, testClientGetTimeout should check the value of `Client#getTimeout` when `ipc.client.rpc-timeout.ms` is set to 0 (-1 is expected if ipc.client.ping is true, the default).

Issue Time Tracking --- Worklog Id: (was: 560418) Time Spent: 8h 10m (was: 8h)

> Change ipc.client.rpc-timeout.ms from 0 to 120000 by default to avoid potential hang
>
> Key: HADOOP-17552
> URL: https://issues.apache.org/jira/browse/HADOOP-17552
> Project: Hadoop Common
> Issue Type: Bug
> Components: ipc
> Affects Versions: 3.2.2
> Reporter: Haoze Wu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 8h 10m
> Remaining Estimate: 0h
>
> We are doing some systematic fault injection testing in Hadoop-3.2.2, and when we try to run a client (e.g., `bin/hdfs dfs -ls /`) against our HDFS cluster (1 NameNode, 2 DataNodes), the client gets stuck forever. After some investigation, we believe that it's a bug in `hadoop.ipc.Client`: the read method of `hadoop.ipc.Client$Connection$PingInputStream` keeps swallowing `java.net.SocketTimeoutException` due to the mistaken usage of the `rpcTimeout` configuration in the `handleTimeout` method.
> *Reproduction*
> Start HDFS with the default configuration. Then execute a client (we used the command `bin/hdfs dfs -ls /` in the terminal). While HDFS is trying to accept the client's socket, inject a socket error (java.net.SocketException or java.io.IOException), specifically at line 1402 (line 1403 or 1404 will also work).
> We prepare the scripts for reproduction in a gist ([https://gist.github.com/functioner/08bcd86491b8ff32860eafda8c140e24]).
> *Diagnosis*
> When the NameNode tries to accept a client's socket, basically there are 4 steps:
> # accept the socket (line 1400)
> # configure the socket (line 1402-1404)
> # make the socket a Reader (after line 1404)
> # swallow the possible IOException in line 1350
> {code:java}
> // hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
> public void run() {
>   while (running) {
>     SelectionKey key = null;
>     try {
>       getSelector().select();
>       Iterator<SelectionKey> iter = getSelector().selectedKeys().iterator();
>       while (iter.hasNext()) {
>         key = iter.next();
>         iter.remove();
>         try {
>           if (key.isValid()) {
>             if (key.isAcceptable())
>               doAccept(key);
>           }
>         } catch (IOException e) { // line 1350
>         }
>         key = null;
>       }
>     } catch (OutOfMemoryError e) {
>       // ...
>     } catch (Exception e) {
>       // ...
>     }
>   }
> }
>
> void doAccept(SelectionKey key) throws InterruptedException, IOException, OutOfMemoryError {
>   ServerSocketChannel server = (ServerSocketChannel) key.channel();
>   SocketChannel channel;
>   while ((channel = server.accept()) != null) { // line 1400
>     channel.configureBlocking(false); // line 1402
>     channel.socket().setTcpNoDelay(tcpNoDelay); // line 1403
>     channel.socket().setKeepAlive(true); // line 1404
>
>     Reader reader = getReader();
>     Connection c = connectionManager.register(channel, this.listenPort, this.isOnAuxiliaryPort);
>     // If the connectionManager can't take it, close the connection.
>     if (c == null) {
>       if (channel.isOpen()) {
>         IOUtils.cleanup(null, channel);
>       }
>       connectionManager.droppedConnections.getAndIncrement();
>       continue;
>     }
>     key.attach(c); // so closeCurrentConnection can get the object
>     reader.addConnection(c);
>   }
> }
> {code}
[GitHub] [hadoop] iwasakims commented on pull request #2727: HADOOP-17552. Change ipc.client.rpc-timeout.ms from 0 to 120000 by default to avoid potential hang
iwasakims commented on pull request #2727: URL: https://github.com/apache/hadoop/pull/2727#issuecomment-789658781 @functioner The `TestIPC#testClientGetTimeout` tests the deprecated `Client#getTimeout`, which was used before `ipc.client.rpc-timeout.ms` and `Client#getRpcTimeout` were introduced. Based on the context, testClientGetTimeout should check the value of `Client#getTimeout` when `ipc.client.rpc-timeout.ms` is set to 0 (-1 is expected if ipc.client.ping is true, the default).
[jira] [Work logged] (HADOOP-17552) Change ipc.client.rpc-timeout.ms from 0 to 120000 by default to avoid potential hang
[ https://issues.apache.org/jira/browse/HADOOP-17552?focusedWorklogId=560396&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560396 ] ASF GitHub Bot logged work on HADOOP-17552: --- Author: ASF GitHub Bot Created on: 03/Mar/21 11:24 Start Date: 03/Mar/21 11:24 Worklog Time Spent: 10m Work Description: ferhui commented on pull request #2727: URL: https://github.com/apache/hadoop/pull/2727#issuecomment-789644619 @functioner That's OK

Issue Time Tracking --- Worklog Id: (was: 560396) Time Spent: 8h (was: 7h 50m)

> Change ipc.client.rpc-timeout.ms from 0 to 120000 by default to avoid potential hang
>
> Key: HADOOP-17552
> URL: https://issues.apache.org/jira/browse/HADOOP-17552
> Project: Hadoop Common
> Issue Type: Bug
> Components: ipc
> Affects Versions: 3.2.2
> Reporter: Haoze Wu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 8h
> Remaining Estimate: 0h
[GitHub] [hadoop] ferhui commented on pull request #2727: HADOOP-17552. Change ipc.client.rpc-timeout.ms from 0 to 120000 by default to avoid potential hang
ferhui commented on pull request #2727: URL: https://github.com/apache/hadoop/pull/2727#issuecomment-789644619 @functioner That's OK
[jira] [Work logged] (HADOOP-17552) Change ipc.client.rpc-timeout.ms from 0 to 120000 by default to avoid potential hang
[ https://issues.apache.org/jira/browse/HADOOP-17552?focusedWorklogId=560360&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560360 ] ASF GitHub Bot logged work on HADOOP-17552: --- Author: ASF GitHub Bot Created on: 03/Mar/21 09:53 Start Date: 03/Mar/21 09:53 Worklog Time Spent: 10m Work Description: functioner commented on pull request #2727: URL: https://github.com/apache/hadoop/pull/2727#issuecomment-789587539

> @functioner According to CI results, TestIPC#testClientGetTimeout fails. It is related, please check.

It fails at line 1459: https://github.com/apache/hadoop/blob/b4985c1ef277bcf51eec981385c56218ac41f09e/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ipc/TestIPC.java#L1456-L1460

`Client.getTimeout` is: https://github.com/apache/hadoop/blob/b4985c1ef277bcf51eec981385c56218ac41f09e/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java#L237-L258

Before we change the default rpcTimeout: rpcTimeout is 0, so it won't return at line 251. `CommonConfigurationKeys.IPC_CLIENT_PING_DEFAULT` is true, so it won't return at line 255 either. Finally, it returns -1 at line 257 and passes the test case.

After we change the default rpcTimeout to 120000: it returns at line 251, and the test fails because 120000 is not -1.

Conclusion: this test is essentially checking the default value of rpcTimeout. Since we modified this value, we should also modify this test to `assertThat(Client.getTimeout(config)).isEqualTo(120000)`. What do you think? @ferhui @iwasakims

Issue Time Tracking --- Worklog Id: (was: 560360) Time Spent: 7h 50m (was: 7h 40m)

> Change ipc.client.rpc-timeout.ms from 0 to 120000 by default to avoid potential hang
>
> Key: HADOOP-17552
> URL: https://issues.apache.org/jira/browse/HADOOP-17552
> Project: Hadoop Common
> Issue Type: Bug
> Components: ipc
> Affects Versions: 3.2.2
> Reporter: Haoze Wu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 7h 50m
> Remaining Estimate: 0h
[GitHub] [hadoop] functioner commented on pull request #2727: HADOOP-17552. Change ipc.client.rpc-timeout.ms from 0 to 120000 by default to avoid potential hang
functioner commented on pull request #2727: URL: https://github.com/apache/hadoop/pull/2727#issuecomment-789587539

> @functioner According to CI results, TestIPC#testClientGetTimeout fails. It is related, please check.

It fails at line 1459: https://github.com/apache/hadoop/blob/b4985c1ef277bcf51eec981385c56218ac41f09e/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ipc/TestIPC.java#L1456-L1460

`Client.getTimeout` is: https://github.com/apache/hadoop/blob/b4985c1ef277bcf51eec981385c56218ac41f09e/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java#L237-L258

Before we change the default rpcTimeout: rpcTimeout is 0, so it won't return at line 251. `CommonConfigurationKeys.IPC_CLIENT_PING_DEFAULT` is true, so it won't return at line 255 either. Finally, it returns -1 at line 257 and passes the test case.

After we change the default rpcTimeout to 120000: it returns at line 251, and the test fails because 120000 is not -1.

Conclusion: this test is essentially checking the default value of rpcTimeout. Since we modified this value, we should also modify this test to `assertThat(Client.getTimeout(config)).isEqualTo(120000)`. What do you think? @ferhui @iwasakims
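The resolution order functioner describes can be sketched as a standalone method. This is an illustrative reconstruction from the comment above, not Hadoop's actual `org.apache.hadoop.ipc.Client` code; in particular, the ping-disabled branch returning the ping interval is an assumption about what lines 252-256 do, inferred from the comment's description.

```java
public class TimeoutDemo {
    // Sketch of the getTimeout() resolution order described above:
    // a positive rpc timeout wins; with pinging disabled, fall back to the
    // ping interval; with pinging enabled and no rpc timeout, return -1
    // (block forever), which is what the old default produced.
    static int getTimeout(int rpcTimeoutMs, boolean pingEnabled,
                          int pingIntervalMs) {
        if (rpcTimeoutMs > 0) {
            return rpcTimeoutMs;     // explicit rpc timeout (line 251)
        }
        if (!pingEnabled) {
            return pingIntervalMs;   // assumed ping-disabled fallback
        }
        return -1;                   // ping enabled, no timeout (line 257)
    }

    public static void main(String[] args) {
        // Old default: rpcTimeout=0, ping=true, so the old test expected -1.
        System.out.println(getTimeout(0, true, 60000));      // -1
        // New default proposed in HADOOP-17552: rpcTimeout=120000.
        System.out.println(getTimeout(120000, true, 60000)); // 120000
    }
}
```

This makes the test breakage mechanical: once the default `ipc.client.rpc-timeout.ms` becomes 120000, the first branch fires and the old `-1` expectation in `TestIPC#testClientGetTimeout` can no longer hold.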
[jira] [Work logged] (HADOOP-17552) Change ipc.client.rpc-timeout.ms from 0 to 120000 by default to avoid potential hang
[ https://issues.apache.org/jira/browse/HADOOP-17552?focusedWorklogId=560359&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560359 ] ASF GitHub Bot logged work on HADOOP-17552: --- Author: ASF GitHub Bot Created on: 03/Mar/21 09:25 Start Date: 03/Mar/21 09:25 Worklog Time Spent: 10m Work Description: ferhui commented on pull request #2727: URL: https://github.com/apache/hadoop/pull/2727#issuecomment-789570037 @functioner According to CI results, TestIPC#testClientGetTimeout fails. It is related, please check.

Issue Time Tracking --- Worklog Id: (was: 560359) Time Spent: 7h 40m (was: 7.5h)

> Change ipc.client.rpc-timeout.ms from 0 to 120000 by default to avoid potential hang
>
> Key: HADOOP-17552
> URL: https://issues.apache.org/jira/browse/HADOOP-17552
> Project: Hadoop Common
> Issue Type: Bug
> Components: ipc
> Affects Versions: 3.2.2
> Reporter: Haoze Wu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 7h 40m
> Remaining Estimate: 0h
[GitHub] [hadoop] ferhui commented on pull request #2727: HADOOP-17552. Change ipc.client.rpc-timeout.ms from 0 to 120000 by default to avoid potential hang
ferhui commented on pull request #2727: URL: https://github.com/apache/hadoop/pull/2727#issuecomment-789570037 @functioner According to CI results, TestIPC#testClientGetTimeout fails. It is related, please check.