[jira] [Updated] (HUDI-7392) Fix connection leak causing lingering CLOSE_WAIT TCP connections
[ https://issues.apache.org/jira/browse/HUDI-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Guo updated HUDI-7392:
    Fix Version/s: 1.0.0

> Fix connection leak causing lingering CLOSE_WAIT TCP connections
>
>                 Key: HUDI-7392
>                 URL: https://issues.apache.org/jira/browse/HUDI-7392
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: voon
>            Assignee: voon
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.15.0, 1.0.0
>
> When consistent_hashing is enabled and a long-running Spark job
> (Deltastreamer) is created, we noticed a gradual increase in
> CLOSE_WAIT connections originating from the AM -> HDFS DN.
>
> Command to check for CLOSE_WAIT connections:
> {code:java}
> netstat -anlpt | grep CLOSE_WAIT | grep 50010{code}
> Result:
> {code:java}
> tcp6       1      0 10.1.2.3:45994      10.5.4.3:50010      CLOSE_WAIT  2446/java
> tcp6       1      0 10.1.2.3:48478      10.6.5.4:50010      CLOSE_WAIT  2446/java
> tcp6       1      0 10.1.2.3:49542      10.7.6.5:50010      CLOSE_WAIT  2446/java
> tcp6       1      0 10.1.2.3:47220      10.8.7.6:50010      CLOSE_WAIT  2446/java
> tcp6       1      0 10.1.2.3:49786      10.9.8.7:50010      CLOSE_WAIT  2446/java {code}
>
> So far, socket analysis using ss (last send) indicates that these
> CLOSE_WAITs are only created between the INFLIGHT and COMPLETED instants
> (inclusive). On top of that, this issue is only reproducible on tables
> using the consistent hashing index.
>
> To reproduce this:
>
> {code:java}
> CREATE TABLE dev_hudi.close_wait_issue_investigation (
>   id INT,
>   name STRING,
>   date_col STRING,
>   grass_region STRING
> ) USING hudi
> PARTITIONED BY (grass_region)
> tblproperties (
>   primaryKey = 'id',
>   type = 'mor',
>   precombineField = 'id',
>   hoodie.index.type = 'BUCKET',
>   hoodie.index.bucket.engine = 'CONSISTENT_HASHING',
>   hoodie.compact.inline = 'true'
> )
> LOCATION 'hdfs://DEV/close_wait_issue_investigation';
>
> INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (1, 'alex1', '2023-12-22', 'SG');
> INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (2, 'alex2', '2023-12-22', 'SG');
> INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (3, 'alex3', '2023-12-22', 'SG');
> INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (4, 'alex4', '2023-12-22', 'SG');
> INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (5, 'alex5', '2023-12-22', 'SG');{code}
>
> Observation:
> After every INSERT, there is 1 new CLOSE_WAIT connection.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
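The netstat check described in the ticket can be scripted so that the count of leaked sockets is easy to compare before and after each INSERT. The following is a minimal sketch, not part of the ticket: the function name `count_close_wait` and the `SAMPLE` constant are illustrative, the sample rows are copied from the ticket's netstat output, and the parser assumes the usual seven-column `netstat -anlpt` layout (proto, Recv-Q, Send-Q, local address, foreign address, state, PID/program).

```python
from collections import Counter

# Rows copied from the netstat output in the ticket (illustrative input).
SAMPLE = """\
tcp6 1 0 10.1.2.3:45994 10.5.4.3:50010 CLOSE_WAIT 2446/java
tcp6 1 0 10.1.2.3:48478 10.6.5.4:50010 CLOSE_WAIT 2446/java
tcp6 1 0 10.1.2.3:49542 10.7.6.5:50010 CLOSE_WAIT 2446/java
tcp6 1 0 10.1.2.3:47220 10.8.7.6:50010 CLOSE_WAIT 2446/java
tcp6 1 0 10.1.2.3:49786 10.9.8.7:50010 CLOSE_WAIT 2446/java
"""


def count_close_wait(netstat_output: str, port: int = 50010) -> Counter:
    """Count CLOSE_WAIT sockets per (remote host, PID/program) for a remote port.

    Assumes each data row has at least seven whitespace-separated columns;
    rows that do not match (headers, other states) are skipped.
    """
    counts: Counter = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 7 or fields[5] != "CLOSE_WAIT":
            continue
        # Split on the last ':' so IPv6-style addresses are handled too.
        remote_host, _, remote_port = fields[4].rpartition(":")
        if remote_port == str(port):
            counts[(remote_host, fields[6])] += 1
    return counts


if __name__ == "__main__":
    per_peer = count_close_wait(SAMPLE)
    print("total CLOSE_WAIT:", sum(per_peer.values()))
    for (host, owner), n in sorted(per_peer.items()):
        print(host, owner, n)
```

On a live host, the text passed to `count_close_wait` would be the output of `netstat -anlpt` captured after each INSERT; if the observation in the ticket holds, the total should grow by one per INSERT.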
[jira] [Updated] (HUDI-7392) Fix connection leak causing lingering CLOSE_WAIT TCP connections
[ https://issues.apache.org/jira/browse/HUDI-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Danny Chen updated HUDI-7392:
    Fix Version/s: 0.14.2

> Fix connection leak causing lingering CLOSE_WAIT TCP connections
>
>                 Key: HUDI-7392
>                 URL: https://issues.apache.org/jira/browse/HUDI-7392
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: voon
>            Assignee: voon
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.14.2
[jira] [Updated] (HUDI-7392) Fix connection leak causing lingering CLOSE_WAIT TCP connections
[ https://issues.apache.org/jira/browse/HUDI-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

voon updated HUDI-7392:
    Description:
When consistent_hashing is enabled and a long-running Spark job (Deltastreamer) is created, we noticed a gradual increase in CLOSE_WAIT connections originating from the AM -> HDFS DN.

Command to check for CLOSE_WAIT connections:
{code:java}
netstat -anlpt | grep CLOSE_WAIT | grep 50010{code}
Result:
{code:java}
tcp6       1      0 10.1.2.3:45994      10.5.4.3:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:48478      10.6.5.4:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:49542      10.7.6.5:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:47220      10.8.7.6:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:49786      10.9.8.7:50010      CLOSE_WAIT  2446/java {code}

So far, socket analysis using ss (last send) indicates that these CLOSE_WAITs are only created between the INFLIGHT and COMPLETED instants (inclusive). On top of that, this issue is only reproducible on tables using the consistent hashing index.

To reproduce this:

{code:java}
CREATE TABLE dev_hudi.close_wait_issue_investigation (
  id INT,
  name STRING,
  date_col STRING,
  grass_region STRING
) USING hudi
PARTITIONED BY (grass_region)
tblproperties (
  primaryKey = 'id',
  type = 'mor',
  precombineField = 'id',
  hoodie.index.type = 'BUCKET',
  hoodie.index.bucket.engine = 'CONSISTENT_HASHING',
  hoodie.compact.inline = 'true'
)
LOCATION 'hdfs://DEV/close_wait_issue_investigation';

INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (1, 'alex1', '2023-12-22', 'SG');
INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (2, 'alex2', '2023-12-22', 'SG');
INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (3, 'alex3', '2023-12-22', 'SG');
INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (4, 'alex4', '2023-12-22', 'SG');
INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (5, 'alex5', '2023-12-22', 'SG');{code}

Observation:
After every INSERT, there is 1 new CLOSE_WAIT connection.

  was:
When consistent_hashing is enabled and a long-running Spark job (Deltastreamer) is created, we noticed a gradual increase in CLOSE_WAIT connections originating from the AM -> HDFS DN.

Command to check for CLOSE_WAIT connections:
{code:java}
netstat -anlpt | grep CLOSE_WAIT | grep 50010{code}
Result:
{code:java}
tcp6       1      0 10.1.2.3:45994      10.5.4.3:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:48478      10.6.5.4:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:49542      10.7.6.5:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:47220      10.8.7.6:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:49786      10.9.8.7:50010      CLOSE_WAIT  2446/java {code}

To reproduce this:

{code:java}
CREATE TABLE dev_hudi.close_wait_issue_investigation (
  id INT,
  name STRING,
  date_col STRING,
  grass_region STRING
) USING hudi
PARTITIONED BY (grass_region)
tblproperties (
  primaryKey = 'id',
  type = 'mor',
  precombineField = 'id',
  hoodie.index.type = 'BUCKET',
  hoodie.index.bucket.engine = 'CONSISTENT_HASHING',
  hoodie.compact.inline = 'true'
)
LOCATION 'hdfs://DEV/close_wait_issue_investigation';

INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (1, 'alex1', '2023-12-22', 'SG');
INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (2, 'alex2', '2023-12-22', 'SG');
INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (3, 'alex3', '2023-12-22', 'SG');
INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (4, 'alex4', '2023-12-22', 'SG');
INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (5, 'alex5', '2023-12-22', 'SG');{code}

Observation:
After every INSERT, there is 1 new CLOSE_WAIT connection.

> Fix connection leak causing lingering CLOSE_WAIT TCP connections
>
>                 Key: HUDI-7392
>                 URL: https://issues.apache.org/jira/browse/HUDI-7392
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: voon
>            Assignee: voon
>            Priority: Major
>              Labels: pull-request-available
[jira] [Updated] (HUDI-7392) Fix connection leak causing lingering CLOSE_WAIT TCP connections
[ https://issues.apache.org/jira/browse/HUDI-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-7392:
    Labels: pull-request-available  (was: )

> Fix connection leak causing lingering CLOSE_WAIT TCP connections
>
>                 Key: HUDI-7392
>                 URL: https://issues.apache.org/jira/browse/HUDI-7392
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: voon
>            Assignee: voon
>            Priority: Major
>              Labels: pull-request-available
[jira] [Updated] (HUDI-7392) Fix connection leak causing lingering CLOSE_WAIT TCP connections
[ https://issues.apache.org/jira/browse/HUDI-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

voon updated HUDI-7392:
    Summary: Fix connection leak causing lingering CLOSE_WAIT TCP connections  (was: Fix Connection leak causing lingering CLOSE_WAIT TCP connections)

> Fix connection leak causing lingering CLOSE_WAIT TCP connections
>
>                 Key: HUDI-7392
>                 URL: https://issues.apache.org/jira/browse/HUDI-7392
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: voon
>            Assignee: voon
>            Priority: Major
[jira] [Updated] (HUDI-7392) Fix Connection leak causing lingering CLOSE_WAIT TCP connections
[ https://issues.apache.org/jira/browse/HUDI-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

voon updated HUDI-7392:
    Summary: Fix Connection leak causing lingering CLOSE_WAIT TCP connections  (was: Fix Connection leak causing CLOSE_WAIT TCP connections)

> Fix Connection leak causing lingering CLOSE_WAIT TCP connections
>
>                 Key: HUDI-7392
>                 URL: https://issues.apache.org/jira/browse/HUDI-7392
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: voon
>            Assignee: voon
>            Priority: Major