[jira] [Updated] (HUDI-7392) Fix connection leak causing lingering CLOSE_WAIT TCP connections

2024-06-06 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7392:

Fix Version/s: 1.0.0

> Fix connection leak causing lingering CLOSE_WAIT TCP connections
> 
>
> Key: HUDI-7392
> URL: https://issues.apache.org/jira/browse/HUDI-7392
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: voon
>Assignee: voon
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
>
> When the consistent hashing index is enabled and a long-running Spark job 
> (Deltastreamer) is started, we noticed a gradual increase in CLOSE_WAIT 
> connections from the AM to the HDFS DataNodes (DN).
>  
> Command to check for CLOSE_WAIT connections:
> {code:bash}
> netstat -anlpt | grep CLOSE_WAIT | grep 50010{code}
> Result:
> {code:none}
> tcp6       1      0 10.1.2.3:45994      10.5.4.3:50010      CLOSE_WAIT  2446/java
> tcp6       1      0 10.1.2.3:48478      10.6.5.4:50010      CLOSE_WAIT  2446/java
> tcp6       1      0 10.1.2.3:49542      10.7.6.5:50010      CLOSE_WAIT  2446/java
> tcp6       1      0 10.1.2.3:47220      10.8.7.6:50010      CLOSE_WAIT  2446/java
> tcp6       1      0 10.1.2.3:49786      10.9.8.7:50010      CLOSE_WAIT  2446/java {code}
>  
> Socket analysis using ss (last send) so far indicates that these CLOSE_WAIT 
> connections were created only between the INFLIGHT and COMPLETED instants 
> (inclusive). In addition, the issue is reproducible only in tables that use 
> the consistent hashing index.
>  
> To reproduce this:
>  
> {code:sql}
> CREATE TABLE dev_hudi.close_wait_issue_investigation (
>     id INT,
>     name STRING,
>     date_col STRING,
>     grass_region STRING
> ) USING hudi
> PARTITIONED BY (grass_region)
> tblproperties (
>     primaryKey = 'id',
>     type = 'mor',
>     precombineField = 'id',
>     hoodie.index.type = 'BUCKET',
>     hoodie.index.bucket.engine = 'CONSISTENT_HASHING',
>     hoodie.compact.inline = 'true'
> )
> LOCATION 'hdfs://DEV/close_wait_issue_investigation';
> INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (1, 'alex1', '2023-12-22', 'SG');
> INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (2, 'alex2', '2023-12-22', 'SG');
> INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (3, 'alex3', '2023-12-22', 'SG');
> INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (4, 'alex4', '2023-12-22', 'SG');
> INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (5, 'alex5', '2023-12-22', 'SG');{code}
>  
> Observation:
> After every INSERT, one new CLOSE_WAIT connection appears.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HUDI-7392) Fix connection leak causing lingering CLOSE_WAIT TCP connections

2024-02-07 Thread Danny Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated HUDI-7392:
-
Fix Version/s: 0.14.2

> Fix connection leak causing lingering CLOSE_WAIT TCP connections
> 
>
> Key: HUDI-7392
> URL: https://issues.apache.org/jira/browse/HUDI-7392
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: voon
>Assignee: voon
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.14.2
>
>





[jira] [Updated] (HUDI-7392) Fix connection leak causing lingering CLOSE_WAIT TCP connections

2024-02-06 Thread voon (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

voon updated HUDI-7392:
---
Description: 
When the consistent hashing index is enabled and a long-running Spark job 
(Deltastreamer) is started, we noticed a gradual increase in CLOSE_WAIT 
connections from the AM to the HDFS DataNodes (DN).

 

Command to check for CLOSE_WAIT connections:
{code:bash}
netstat -anlpt | grep CLOSE_WAIT | grep 50010{code}
Result:
{code:none}
tcp6       1      0 10.1.2.3:45994      10.5.4.3:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:48478      10.6.5.4:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:49542      10.7.6.5:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:47220      10.8.7.6:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:49786      10.9.8.7:50010      CLOSE_WAIT  2446/java {code}
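
To track the growth over time rather than eyeballing the netstat output, the rows can be tallied per remote DataNode. The following is a minimal sketch, assuming the `netstat -anlpt` column layout shown above; the function name `close_wait_by_remote` and the ESTABLISHED row in the sample are illustrative assumptions and not part of the original report.

```python
from collections import Counter


def close_wait_by_remote(netstat_output: str, port: int = 50010) -> Counter:
    """Count CLOSE_WAIT sockets per remote host for the given remote port."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, remote, state, [pid/program].
        # Wrapped continuation lines such as "2446/java" have too few fields and are skipped.
        if len(fields) < 6 or fields[5] != "CLOSE_WAIT":
            continue
        host, _, remote_port = fields[4].rpartition(":")
        if remote_port == str(port):
            counts[host] += 1
    return counts


# Two rows taken from the report, plus one made-up ESTABLISHED row
# to show that other states are ignored.
sample = """\
tcp6       1      0 10.1.2.3:45994      10.5.4.3:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:48478      10.6.5.4:50010      CLOSE_WAIT  2446/java
tcp6       0      0 10.1.2.3:50000      10.6.5.4:50010      ESTABLISHED 2446/java
"""

result = close_wait_by_remote(sample)
print(sum(result.values()), dict(result))
```

Running this periodically against the live netstat output and comparing the totals would show whether the count keeps growing.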
 

Socket analysis using ss (last send) so far indicates that these CLOSE_WAIT 
connections were created only between the INFLIGHT and COMPLETED instants 
(inclusive). In addition, the issue is reproducible only in tables that use 
the consistent hashing index.

 

To reproduce this:
 
{code:sql}
CREATE TABLE dev_hudi.close_wait_issue_investigation (
    id INT,
    name STRING,
    date_col STRING,
    grass_region STRING
) USING hudi
PARTITIONED BY (grass_region)
tblproperties (
    primaryKey = 'id',
    type = 'mor',
    precombineField = 'id',
    hoodie.index.type = 'BUCKET',
    hoodie.index.bucket.engine = 'CONSISTENT_HASHING',
    hoodie.compact.inline = 'true'
)
LOCATION 'hdfs://DEV/close_wait_issue_investigation';

INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (1, 'alex1', '2023-12-22', 'SG');
INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (2, 'alex2', '2023-12-22', 'SG');
INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (3, 'alex3', '2023-12-22', 'SG');
INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (4, 'alex4', '2023-12-22', 'SG');
INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (5, 'alex5', '2023-12-22', 'SG');{code}
 
Observation:

After every INSERT, one new CLOSE_WAIT connection appears.
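
A socket sits in CLOSE_WAIT when the remote peer has closed its side but the local process has not yet closed its own, so one new CLOSE_WAIT per INSERT is consistent with one stream per write being opened and never closed. The sketch below illustrates that general pattern and the usual remedy only. It is not Hudi's code, and `FakeStream`, `leaky_read`, and `safe_read` are hypothetical names invented for this example.

```python
class FakeStream:
    """Stand-in for a remote read stream; tracks how many are still open."""

    open_count = 0

    def __init__(self):
        FakeStream.open_count += 1
        self.closed = False

    def read(self) -> bytes:
        return b"metadata"

    def close(self) -> None:
        # Idempotent close: only decrement the counter once per stream.
        if not self.closed:
            self.closed = True
            FakeStream.open_count -= 1

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions


def leaky_read() -> bytes:
    # The stream is opened and read but never closed: one leak per call.
    stream = FakeStream()
    return stream.read()


def safe_read() -> bytes:
    # The context manager closes the stream even if read() raises.
    with FakeStream() as stream:
        return stream.read()


for _ in range(5):
    leaky_read()
leaked = FakeStream.open_count  # one unclosed stream per leaky call

for _ in range(5):
    safe_read()
after_safe = FakeStream.open_count  # unchanged: safe reads release their streams

print(leaked, after_safe)
```

In Java, the equivalent remedy is a try-with-resources block around the stream.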

 

  was:
When the consistent hashing index is enabled and a long-running Spark job 
(Deltastreamer) is started, we noticed a gradual increase in CLOSE_WAIT 
connections from the AM to the HDFS DataNodes (DN).

 

Command to check for CLOSE_WAIT connections:
{code:bash}
netstat -anlpt | grep CLOSE_WAIT | grep 50010{code}
Result:
{code:none}
tcp6       1      0 10.1.2.3:45994      10.5.4.3:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:48478      10.6.5.4:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:49542      10.7.6.5:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:47220      10.8.7.6:50010      CLOSE_WAIT  2446/java
tcp6       1      0 10.1.2.3:49786      10.9.8.7:50010      CLOSE_WAIT  2446/java {code}
 

To reproduce this:
 
{code:sql}
CREATE TABLE dev_hudi.close_wait_issue_investigation (
    id INT,
    name STRING,
    date_col STRING,
    grass_region STRING
) USING hudi
PARTITIONED BY (grass_region)
tblproperties (
    primaryKey = 'id',
    type = 'mor',
    precombineField = 'id',
    hoodie.index.type = 'BUCKET',
    hoodie.index.bucket.engine = 'CONSISTENT_HASHING',
    hoodie.compact.inline = 'true'
)
LOCATION 'hdfs://DEV/close_wait_issue_investigation';

INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (1, 'alex1', '2023-12-22', 'SG');
INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (2, 'alex2', '2023-12-22', 'SG');
INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (3, 'alex3', '2023-12-22', 'SG');
INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (4, 'alex4', '2023-12-22', 'SG');
INSERT INTO dev_hudi.close_wait_issue_investigation VALUES (5, 'alex5', '2023-12-22', 'SG');{code}
 
Observation:

After every INSERT, one new CLOSE_WAIT connection appears.

 


> Fix connection leak causing lingering CLOSE_WAIT TCP connections
> 
>
> Key: HUDI-7392
> URL: https://issues.apache.org/jira/browse/HUDI-7392
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: voon
>Assignee: voon
>Priority: Major
>  Labels: pull-request-available
>

[jira] [Updated] (HUDI-7392) Fix connection leak causing lingering CLOSE_WAIT TCP connections

2024-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-7392:
-
Labels: pull-request-available  (was: )

> Fix connection leak causing lingering CLOSE_WAIT TCP connections
> 
>
> Key: HUDI-7392
> URL: https://issues.apache.org/jira/browse/HUDI-7392
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: voon
>Assignee: voon
>Priority: Major
>  Labels: pull-request-available
>





[jira] [Updated] (HUDI-7392) Fix connection leak causing lingering CLOSE_WAIT TCP connections

2024-02-06 Thread voon (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

voon updated HUDI-7392:
---
Summary: Fix connection leak causing lingering CLOSE_WAIT TCP connections  
(was: Fix Connection leak causing lingering CLOSE_WAIT TCP connections)

> Fix connection leak causing lingering CLOSE_WAIT TCP connections
> 
>
> Key: HUDI-7392
> URL: https://issues.apache.org/jira/browse/HUDI-7392
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: voon
>Assignee: voon
>Priority: Major
>





[jira] [Updated] (HUDI-7392) Fix Connection leak causing lingering CLOSE_WAIT TCP connections

2024-02-06 Thread voon (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

voon updated HUDI-7392:
---
Summary: Fix Connection leak causing lingering CLOSE_WAIT TCP connections  
(was: Fix Connection leak causing CLOSE_WAIT TCP connections)

> Fix Connection leak causing lingering CLOSE_WAIT TCP connections
> 
>
> Key: HUDI-7392
> URL: https://issues.apache.org/jira/browse/HUDI-7392
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: voon
>Assignee: voon
>Priority: Major
>


