[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=676160=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-676160
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 02:03
Start Date: 04/Nov/21 02:03
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-960359159


   Thanks @tasanuma for the merge. Thanks all for your reviews and suggestions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 676160)
Time Spent: 9h 10m  (was: 9h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=676127=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-676127
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 02:01
Start Date: 04/Nov/21 02:01
Worklog Time Spent: 10m 
  Work Description: tasanuma commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-960354847






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 676127)
Time Spent: 9h  (was: 8h 50m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=675994=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675994
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:48
Start Date: 04/Nov/21 01:48
Worklog Time Spent: 10m 
  Work Description: tasanuma merged pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 675994)
Time Spent: 8h 50m  (was: 8h 40m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=675728=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675728
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:22
Start Date: 04/Nov/21 01:22
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-958648376






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 675728)
Time Spent: 8h 40m  (was: 8.5h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=675222=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675222
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:27
Start Date: 04/Nov/21 00:27
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-958648376


   Hi @tasanuma @jojochuang @aajisaka , could you please help merge this PR. 
And I will open a new JIRA based on it. Thanks a lot.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 675222)
Time Spent: 8.5h  (was: 8h 20m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=674377=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674377
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 04:30
Start Date: 03/Nov/21 04:30
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-958648376


   Hi @tasanuma @jojochuang @aajisaka , could you please help merge this PR. 
And I will open a new JIRA based on it. Thanks a lot.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674377)
Time Spent: 8h 20m  (was: 8h 10m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=674158=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674158
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:46
Start Date: 02/Nov/21 21:46
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-957445801


   > I will make some changes in the other JIRA if necessary, because this 
might involve the callContext from Router. What do you think of this? Thank you.
   
   Agreed. Let's discuss in a separate JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674158)
Time Spent: 8h 10m  (was: 8h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=674048=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674048
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:34
Start Date: 02/Nov/21 21:34
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-957045765






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674048)
Time Spent: 8h  (was: 7h 50m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=673504=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673504
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:04
Start Date: 02/Nov/21 18:04
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-957045765






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673504)
Time Spent: 7h 50m  (was: 7h 40m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=673371=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673371
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 17:50
Start Date: 02/Nov/21 17:50
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-957445801


   > I will make some changes in the other JIRA if necessary, because this 
might involve the callContext from Router. What do you think of this? Thank you.
   
   Agreed. Let's discuss in a separate JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673371)
Time Spent: 7h 40m  (was: 7.5h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=673132=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673132
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 12:29
Start Date: 02/Nov/21 12:29
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-957450485


   > > I will make some changes in the other JIRA if necessary, because this 
might involve the callContext from Router. What do you think of this? Thank you.
   > 
   > Agreed. Let's discuss in a separate JIRA.
   
   Thanks @aajisaka for your quick reply.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673132)
Time Spent: 7.5h  (was: 7h 20m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=673129=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673129
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 12:26
Start Date: 02/Nov/21 12:26
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-957445801


   > I will make some changes in the other JIRA if necessary, because this 
might involve the callContext from Router. What do you think of this? Thank you.
   
   Agreed. Let's discuss in a separate JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673129)
Time Spent: 7h 20m  (was: 7h 10m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=673119=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673119
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 12:05
Start Date: 02/Nov/21 12:05
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-957399841


   > routerClientIp
   
   Thanks @aajisaka for your review and comment. I will make some changes in 
the other JIRA if necessary, because this might involve the callContext from 
Router. What do you think of this? Thank you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673119)
Time Spent: 7h 10m  (was: 7h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=672941=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672941
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 02:38
Start Date: 02/Nov/21 02:38
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-957045765


   Hi @aajisaka , could you please review this again. Thanks a lot.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 672941)
Time Spent: 7h  (was: 6h 50m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=672515=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672515
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 01/Nov/21 02:29
Start Date: 01/Nov/21 02:29
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-955874908


   > @tomscut Thanks for your thoughts. That makes sense to me.
   
   Thanks @tasanuma for your reply and review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 672515)
Time Spent: 6h 50m  (was: 6h 40m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=672514=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672514
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 01/Nov/21 02:24
Start Date: 01/Nov/21 02:24
Worklog Time Spent: 10m 
  Work Description: tasanuma commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-955873323


   @tomscut Thanks for your thoughts. That makes sense to me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 672514)
Time Spent: 6h 40m  (was: 6.5h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=672260=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672260
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 30/Oct/21 00:08
Start Date: 30/Oct/21 00:08
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-955108462


   > How about using the word of "port" instead of "clientPort" here, and 
adding "clientPort" as the actual client server port for RBF in another JIRA?
   
   Thank you @tasanuma for your advice.
   
   Based on your suggestion, I have an idea like this:
   
   We leave the ClientPort field for now.
   
   1. When a client sends a request directly to Namenode, the ```clientPort``` 
records the port of the real client.
   
   2. When the client sends a request to Router and then forwards to NameNode, 
we set the ```clientPort``` (the real clientPort) in the ```CallerContext``` of 
the Router. Before namenode prints the audit log, if the ```CallerContext``` 
already contains the ```clientPort``` field, we will not set port again. (I 
plan to do this in another PR)
   
   In both cases, only one ```clientPort``` field is left in the CallerContext, 
which holds the actual clientPort. What do you think of this? Looking forward 
to your comments. Thank you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 672260)
Time Spent: 6h 20m  (was: 6h 10m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=672261=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672261
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 30/Oct/21 00:08
Start Date: 30/Oct/21 00:08
Worklog Time Spent: 10m 
  Work Description: tomscut edited a comment on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-955108462


   > How about using the word of "port" instead of "clientPort" here, and 
adding "clientPort" as the actual client server port for RBF in another JIRA?
   
   Thank you @tasanuma for your advice.
   
   Based on your suggestion, I have an idea like this:
   
   We leave the ClientPort field for now.
   
   1. When a client sends a request directly to Namenode, the ```clientPort``` 
records the port of the real client.
   
   2. When the client sends a request to Router and then forwards to NameNode, 
we set the ```clientPort``` (the real clientPort) in the ```CallerContext``` of 
the Router. Before NameNode prints the audit log, if the ```CallerContext``` 
already contains the ```clientPort``` field, we will not set port again. (I 
plan to do this in another PR)
   
   In both cases, only one ```clientPort``` field is left in the CallerContext, 
which holds the actual clientPort. What do you think of this? Looking forward 
to your comments. Thank you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 672261)
Time Spent: 6.5h  (was: 6h 20m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=672247=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672247
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 29/Oct/21 23:16
Start Date: 29/Oct/21 23:16
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-955096647


   > Thanks for updating it, @tomscut. I tried it with my RBF cluster. There is 
a client server (1.1.1.1), a DFS Router (2.2.2.2), and NameNode. When a client 
sends a request to the Router, NameNode logs the following.
   > 
   > ```
   > INFO FSNamesystem.audit: allowed=true   ugi=tasanuma ip=/2.2.2.2   
cmd=listStatus  src=/user/tasanuma  dst=nullperm=null   
proto=rpc   callerContext=CLI,clientIp:1.1.1.1,clientPort:33070
   > ```
   > 
   > In this case, `clientIp:1.1.1.1` is the IP of the client server, but 
`clientPort:33070` is the port of the DFS Router (2.2.2.2), not the one of the 
client server. It would be confusing for the users.
   
   Thank you @tasanuma  very much for your test. Because the router acts as a 
client of the Namenode, in this case the port of router is indeed regarded as a 
```clientport```.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 672247)
Time Spent: 6h 10m  (was: 6h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=671871=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671871
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 29/Oct/21 08:38
Start Date: 29/Oct/21 08:38
Worklog Time Spent: 10m 
  Work Description: tasanuma commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-954558413


   How about using the word of "port" instead of "clientPort" here, and adding 
"clientPort" as the actual client server port for RBF in another JIRA?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 671871)
Time Spent: 6h  (was: 5h 50m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=671868=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671868
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 29/Oct/21 08:19
Start Date: 29/Oct/21 08:19
Worklog Time Spent: 10m 
  Work Description: tasanuma commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-954546028


   Thanks for updating it, @tomscut.
   I tried it with my RBF cluster. There is a client server (1.1.1.1), a DFS 
Router (2.2.2.2), and NameNode. When a client sends a request to the Router, 
NameNode logs the following.
   ```
   INFO FSNamesystem.audit: allowed=true   ugi=tasanuma ip=/2.2.2.2   
cmd=listStatus  src=/user/tasanuma  dst=nullperm=null   
proto=rpc   callerContext=CLI,clientIp:1.1.1.1,clientPort:33070
   ```
   In this case, `clientIp:1.1.1.1` is the IP of the client server, but 
`clientPort:33070` is the port of the DFS Router (2.2.2.2), not the one of the 
client server. It would be confusing for the users.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 671868)
Time Spent: 5h 50m  (was: 5h 40m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=671784=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671784
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 29/Oct/21 03:30
Start Date: 29/Oct/21 03:30
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-954401714


   Hi @tasanuma @aajisaka @ferhui @ayushtkn , could you also take a look? 
Thanks a lot.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 671784)
Time Spent: 5h 40m  (was: 5.5h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=671780=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671780
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 29/Oct/21 03:27
Start Date: 29/Oct/21 03:27
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-954401049


   > lgtm +1
   
   Thanks @jojochuang for your comment.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 671780)
Time Spent: 5.5h  (was: 5h 20m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=671420=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671420
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 28/Oct/21 13:09
Start Date: 28/Oct/21 13:09
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-953828582


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  5s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 27s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  21m 40s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  22m 29s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  19m 15s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   3m 47s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 14s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 18s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 24s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   5m 43s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 34s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 22s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  21m 22s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 50s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  19m 50s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   3m 49s |  |  root: The patch generated 
0 new + 606 unchanged - 2 fixed = 606 total (was 608)  |
   | +1 :green_heart: |  mvnsite  |   3m 10s |  |  the patch passed  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  javadoc  |   2m 16s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 23s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   6m 10s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 53s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 58s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  | 388m 27s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   1m 12s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 612m 49s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/10/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3538 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell xml |
   | uname | Linux 63dfc831ddaf 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / cc82c7c89fa3be59996b5c809dbcaf136f3ccbd3 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 

[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=671177=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671177
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 28/Oct/21 02:55
Start Date: 28/Oct/21 02:55
Worklog Time Spent: 10m 
  Work Description: tomscut commented on a change in pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#discussion_r737983441



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
##
@@ -450,6 +460,25 @@ private void logAuditEvent(boolean succeeded,
 }
   }
 
+  private void appendClientPortToCallerContextIfAbsent() {
+String clientPortInfo = CLIENT_PORT_STR + ":" + Server.getRemotePort();
+final CallerContext ctx = CallerContext.getCurrent();
+if (isClientPortInfoAbsent(clientPortInfo, ctx)) {
+  String origContext = ctx == null ? null : ctx.getContext();
+  byte[] origSignature = ctx == null ? null : ctx.getSignature();
+  CallerContext.setCurrent(
+  new CallerContext.Builder(origContext, contextFieldSeparator)
+  .append(clientPortInfo)

Review comment:
   Thanks @jojochuang for your advice. I'll fix it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 671177)
Time Spent: 5h 10m  (was: 5h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=671162=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-671162
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 28/Oct/21 02:03
Start Date: 28/Oct/21 02:03
Worklog Time Spent: 10m 
  Work Description: jojochuang commented on a change in pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#discussion_r737965807



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
##
@@ -450,6 +460,25 @@ private void logAuditEvent(boolean succeeded,
 }
   }
 
+  private void appendClientPortToCallerContextIfAbsent() {
+String clientPortInfo = CLIENT_PORT_STR + ":" + Server.getRemotePort();
+final CallerContext ctx = CallerContext.getCurrent();
+if (isClientPortInfoAbsent(clientPortInfo, ctx)) {
+  String origContext = ctx == null ? null : ctx.getContext();
+  byte[] origSignature = ctx == null ? null : ctx.getSignature();
+  CallerContext.setCurrent(
+  new CallerContext.Builder(origContext, contextFieldSeparator)
+  .append(clientPortInfo)

Review comment:
   this is good. To make it more understandable, you may use
   append(CLIENT_PORT_STR, Server.getRemotePort())  instead. (will need to 
update TestAuditlogger accordingly)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 671162)
Time Spent: 5h  (was: 4h 50m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=670445=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-670445
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 27/Oct/21 02:05
Start Date: 27/Oct/21 02:05
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-952473223


   Hi @tasanuma @jojochuang @ayushtkn @ferhui , could you please take a look at 
this PR again. Thank you very much.
   
   The code was submitted with multiple versions, resulting in some invalid 
changes (removing some white space).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 670445)
Time Spent: 4h 50m  (was: 4h 40m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=670350=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-670350
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 26/Oct/21 20:54
Start Date: 26/Oct/21 20:54
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-952317143


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  3s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  12m 38s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m  7s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  22m 59s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  18m 55s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   3m 44s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 13s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 18s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 19s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   5m 51s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 22s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  7s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 46s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  20m 46s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 51s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  18m 51s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   3m 39s |  |  root: The patch generated 
0 new + 606 unchanged - 2 fixed = 606 total (was 608)  |
   | +1 :green_heart: |  mvnsite  |   3m 11s |  |  the patch passed  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  javadoc  |   2m 17s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 21s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   6m  3s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 31s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 28s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 398m 15s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/9/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 11s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 620m  1s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.web.TestWebHdfsFileSystemContract |
   |   | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/9/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3538 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell xml |
   | uname | Linux 522f9b2296f1 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 389d469a234ae6f858e069c797e48298efad6d3d |
   | 

[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=669909=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-669909
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 26/Oct/21 06:22
Start Date: 26/Oct/21 06:22
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-95165


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m  0s |  |  Docker mode activated.  |
   | -1 :x: |  patch  |   0m 21s |  |  
https://github.com/apache/hadoop/pull/3538 does not apply to trunk. Rebase 
required? Wrong Branch? See 
https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help.  
|
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/3538 |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/8/console |
   | versions | git=2.17.1 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 669909)
Time Spent: 4.5h  (was: 4h 20m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=669861=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-669861
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 26/Oct/21 00:41
Start Date: 26/Oct/21 00:41
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-951452758


   Previously, I didn't think it would be too much of an impact to append a 
port to the IP field if the feature was made configurable, but users might need 
to change the resolution rules.
   
   Considering compatibility, @tasanuma suggests adding fields, and @jojochuang 
suggests putting port in the CallerContext. If we put the port in the 
CallerContext, it will not affect field resolution. The content in the 
callerContext is also dynamic, which is more flexible. 
   
   Thank you all for your advice and help. I will update the PR later. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 669861)
Time Spent: 4h 20m  (was: 4h 10m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=669856=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-669856
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 26/Oct/21 00:30
Start Date: 26/Oct/21 00:30
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-951446234


   > The API is declared Public, Evolving. If it stays in Hadoop 3.4.0 I am 
fine with it.
   > 
   > We used to have an audit logger (Cloudera Navigator) that extends the 
AuditLogger interface. But we've moved away from that.
   > 
   > Performance: It would have a slight performance penalty because every 
audit log op will always convert InetAddress to a string, regardless if audit 
logger is off (audit log level = debug or dfs.namenode.audit.log.debug.cmdlist 
has the excluded op)). It's probably acceptable since audit is logged outside 
of namenode lock.
   > 
   > CallerContext: the caller context is probably a better option when you 
want to do fine-grained post-mortem anyway. Maybe we can modify the caller 
context to attach remote port so that it doesn't break api compatibility. Just 
a thought.
   
   Thanks @jojochuang for your careful consideration and advice.
   
   I think it's a very good idea to add remote port to the CallerContext, these 
will not affect the compatibility @tasanuma  mentioned. After the user enable 
the CallerContext, we add clientPort to the CallerContext, similar to how the 
Router sets clientIp to the CallerContext. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 669856)
Time Spent: 4h 10m  (was: 4h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=669855=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-669855
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 26/Oct/21 00:28
Start Date: 26/Oct/21 00:28
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-951445364


   > The API is declared Public, Evolving. If it stays in Hadoop 3.4.0 I am 
fine with it.
   > 
   > We used to have an audit logger (Cloudera Navigator) that extends the 
AuditLogger interface. But we've moved away from that.
   > 
   > Performance: It would have a slight performance penalty because every 
audit log op will always convert InetAddress to a string, regardless if audit 
logger is off (audit log level = debug or dfs.namenode.audit.log.debug.cmdlist 
has the excluded op)). It's probably acceptable since audit is logged outside 
of namenode lock.
   > 
   > CallerContext: the caller context is probably a better option when you 
want to do fine-grained post-mortem anyway. Maybe we can modify the caller 
context to attach remote port so that it doesn't break api compatibility. Just 
a thought.
   
   
   
   > I haven't gone through the entire discussion/code. Just that whether we 
should modify the existing field or add a new one. Technically both are correct 
and I don't see any serious issue with either(not thinking too deep). But I 
feel for the parsers to adapt, if there was a new field, it might be little bit 
more easy, Rather than trying to figure out whether the existing field has a 
port or not. Just my thoughts, I am Ok with whichever way most people tend to 
agree. Anyway whatever we do should be optional & guarded by a config.
   
   Thanks @ayushtkn for your comments and suggestions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 669855)
Time Spent: 4h  (was: 3h 50m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=669400=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-669400
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 25/Oct/21 08:11
Start Date: 25/Oct/21 08:11
Worklog Time Spent: 10m 
  Work Description: jojochuang commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-950643667


   The API is declared Public, Evolving. If it stays in Hadoop 3.4.0 I am fine 
with it.
   
   We used to have an audit logger (Cloudera Navigator) that extends the 
AuditLogger interface. But we've moved away from that.
   
   Performance:
   It would have a slight performance penalty because every audit log op will 
always convert InetAddress to a string, regardless if audit logger is off 
(audit log level = debug or dfs.namenode.audit.log.debug.cmdlist has the 
excluded op)). It's probably acceptable since audit is logged outside of 
namenode lock.
   
   CallerContext:
the caller context is probably a better option when you want to do 
fine-grained post-mortem anyway. Maybe we can modify the caller context to 
attach remote port so that it doesn't break api compatibility. Just a thought.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 669400)
Time Spent: 3h 50m  (was: 3h 40m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=669214=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-669214
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 23/Oct/21 15:43
Start Date: 23/Oct/21 15:43
Worklog Time Spent: 10m 
  Work Description: ayushtkn edited a comment on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-949246786


   I haven't gone through the entire discussion/code. Just that whether we 
should modify the existing field or add a new one. Technically both are correct 
and I don't see any serious issue with either(not thinking too deep). But I 
feel for the parsers to adapt, if there was a new field, it might be little bit 
more easy, Rather than trying to figure out whether the existing field has  a 
port or not. Just my thoughts, I am Ok with whichever way most people tend to 
agree.
   Anyway whatever we do should be optional & guarded by a config.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 669214)
Time Spent: 3h 40m  (was: 3.5h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=668768=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-668768
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 22/Oct/21 03:00
Start Date: 22/Oct/21 03:00
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-949246786


   I haven't gone through the entire discussion/code. Just that whether we 
should modify the existing field or add a new one. Technically both are correct 
and I don't see any serious issue with either(not thinking to deep). But I feel 
for the parsers to adapt, if there was a new field, it might be little bit more 
easy, Rather than trying to figure out whether the existing field has  a port 
or not. Just my thoughts, I am Ok with whichever way most people tend to agree.
   Anyway whatever we do should be optional & guarded by a config.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 668768)
Time Spent: 3.5h  (was: 3h 20m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=668307=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-668307
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 21/Oct/21 11:26
Start Date: 21/Oct/21 11:26
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-948516568


   > Thanks for updating the PR, @tomscut.
   > 
   > * I discussed with my colleagues, and they suggested that adding a new 
port field would have less impact on users who are analyzing the audit logs 
instead of expanding the existing IP field. What do you think?
   > * After [HDFS-13293](https://issues.apache.org/jira/browse/HDFS-13293), 
Router is forwarding client IP via CallerContext. How about adding the 
client-side port to the CallerContext as well? Maybe we can consider it in 
another JIRA.
   
   Thanks @tasanuma and your colleagues for your good advice. And sorry for the 
late reply.
   
   ```I discussed with my colleagues, and they suggested that adding a new port 
field would have less impact on users who are analyzing the audit logs instead 
of expanding the existing IP field. What do you think?```
   I think it would be nice to put the port in a separate field, but adding the 
port to the IP field is optional at the moment, so I'm a little confused which 
way is more appropriate. I'd like to ask a few other committers to look at this 
and give some suggestions. Anyway, I will update the PR in time.
   @aajisaka @iwasakims @ayushtkn @ferhui @Hexiaoqiao @goiri @jojochuang Could 
you please take a look at this and give some suggestions? Thank you very much!
   
   
   
   ```After [HDFS-13293](https://issues.apache.org/jira/browse/HDFS-13293), 
Router is forwarding client IP via CallerContext. How about adding the 
client-side port to the CallerContext as well? Maybe we can consider it in 
another JIRA.```
   I would like to open a new JIRA to do this. Thank you for pointing this out.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 668307)
Time Spent: 3h 20m  (was: 3h 10m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=668144=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-668144
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 21/Oct/21 08:18
Start Date: 21/Oct/21 08:18
Worklog Time Spent: 10m 
  Work Description: tasanuma commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-948371693


   Thanks for updating the PR, @tomscut.
   - I discussed with my colleagues, and they suggested that adding a new port 
field would have less impact on users who are analyzing the audit logs instead 
of expanding the existing IP field. What do you think?
   - After HDFS-13293, Router is forwarding client IP via CallerContext. How 
about adding the client-side port to the CallerContext as well? Maybe we can 
consider it in another JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 668144)
Time Spent: 3h 10m  (was: 3h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=667661=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-667661
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 20/Oct/21 11:51
Start Date: 20/Oct/21 11:51
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-947589017


   Hi @yakirgb @iwasakims @aajisaka @tasanuma @ayushtkn @ferhui, I made it 
configurable. Please help review the code. Thank you very much.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 667661)
Time Spent: 3h  (was: 2h 50m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=667657=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-667657
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 20/Oct/21 11:46
Start Date: 20/Oct/21 11:46
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-947586074


   ```
   Failed junit tests | 
hadoop.fs.viewfs.TestViewFSOverloadSchemeWithMountTableConfigInHDFS
   ```
   
   This failed unit test is unrelated to the change.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 667657)
Time Spent: 2h 50m  (was: 2h 40m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=667654=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-667654
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 20/Oct/21 11:39
Start Date: 20/Oct/21 11:39
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-947581649


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m 12s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 5 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  12m 58s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  24m 11s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  24m 50s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  21m 26s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   4m 21s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 21s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 19s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 26s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   6m 16s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 53s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  6s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 55s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  21m 55s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 12s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  19m 12s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   3m 50s |  |  root: The patch generated 
0 new + 683 unchanged - 3 fixed = 683 total (was 686)  |
   | +1 :green_heart: |  mvnsite  |   3m  8s |  |  the patch passed  |
   | +1 :green_heart: |  xml  |   0m  2s |  |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  javadoc  |   2m  5s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 16s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   6m  6s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 17s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 32s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 411m 55s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 23s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 641m 42s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.fs.viewfs.TestViewFSOverloadSchemeWithMountTableConfigInHDFS |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/6/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3538 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell xml |
   | uname | Linux f885e95ea1fc 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / ea1a14cd6307273b9bee07d2817e98b66e56de00 |
   | Default Java | Private 

[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=667089=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-667089
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 19/Oct/21 15:39
Start Date: 19/Oct/21 15:39
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-946849895


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  17m  6s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 5 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  12m 53s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  21m 24s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  22m 19s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  19m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   3m 46s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m  9s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 21s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   5m 52s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 56s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 40s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  21m 40s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 52s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  18m 52s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   3m 41s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/5/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 2 new + 683 unchanged - 3 fixed = 685 total (was 
686)  |
   | +1 :green_heart: |  mvnsite  |   3m  7s |  |  the patch passed  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  javadoc  |   2m 11s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m  9s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   6m 10s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 30s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 29s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 402m 27s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 11s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 637m 38s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.fs.viewfs.TestViewFSOverloadSchemeWithMountTableConfigInHDFS |
   |   | hadoop.hdfs.web.TestWebHdfsFileSystemContract |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/5/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3538 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell xml |
   | uname | Linux 79618fba0cb0 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |

[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=666735=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-666735
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 19/Oct/21 07:13
Start Date: 19/Oct/21 07:13
Worklog Time Spent: 10m 
  Work Description: tomscut commented on a change in pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#discussion_r731559954



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
##
@@ -369,11 +369,22 @@ public static int getCallRetryCount() {
   }
 
   /** Returns the remote side ip address when invoked inside an RPC 
-   *  Returns null incase of an error.
+   *  Returns null in case of an error.
*/
   public static InetAddress getRemoteIp() {
 Call call = CurCall.get();
-return (call != null ) ? call.getHostInetAddress() : null;
+return (call != null) ? call.getHostInetAddress() : null;
+  }
+
+  /**
+   * Returns the remote side port when invoked inside an RPC
+   * Returns an empty string in case of an error.
+   */
+  public static String getRemoteIpAndPort() {
+Call call = CurCall.get();
+return (call != null) ?
+new StringBuffer().append(call.getHostInetAddress()).
+append(":").append(call.getRemotePort()).toString() : "";

Review comment:
   > StringBuffer is thread safe and that's why it is slower than 
StringBuilder. Here I think using '+' is sufficient to concatenate strings 
because '+' is translated to StringBuilder when compiled.
   
   Thanks @aajisaka for your suggestion, I made some updates. PTAL. Thanks.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 666735)
Time Spent: 2h 20m  (was: 2h 10m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=666729=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-666729
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 19/Oct/21 07:03
Start Date: 19/Oct/21 07:03
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-946423052


   > It's good that port is optional.
   
   Thanks @ferhui for your comment. I made it configurable and committed just 
now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 666729)
Time Spent: 2h 10m  (was: 2h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=666725=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-666725
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 19/Oct/21 06:56
Start Date: 19/Oct/21 06:56
Worklog Time Spent: 10m 
  Work Description: ferhui commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-946419243


   It's good that port is optional.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 666725)
Time Spent: 2h  (was: 1h 50m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=666221=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-666221
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 18/Oct/21 10:33
Start Date: 18/Oct/21 10:33
Worklog Time Spent: 10m 
  Work Description: tomscut edited a comment on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-944128111


   Thanks @Akira Ajisaka ***@***.***> for your comments and
   suggestion. I will update it ASAP.
   
   Akira Ajisaka ***@***.***> 于2021年10月15日周五 下午3:27写道:
   
   > And we can make it optional as @iwasakims 
   > suggested. Thank you very much.
   >
   > +1 to make it optional.
   >
   > —
   > You are receiving this because you were mentioned.
   > Reply to this email directly, view it on GitHub
   > , or
   > unsubscribe
   > 

   > .
   > Triage notifications on the go with GitHub Mobile for iOS
   > 

   > or Android
   > 
.
   >
   >
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 666221)
Time Spent: 1h 50m  (was: 1h 40m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=663846=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-663846
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 12/Oct/21 02:18
Start Date: 12/Oct/21 02:18
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-940596784


   Hi @jojochuang @tasanuma @goiri @Hexiaoqiao, could you please also take a 
look at this? Do you think it is reasonable to add remote port into ip field 
like ```ip=/0.0.0.0:1234```? And we can make it optional as @iwasakims 
suggested. Thank you very much.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 663846)
Time Spent: 1h 40m  (was: 1.5h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=663776=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-663776
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 11/Oct/21 21:24
Start Date: 11/Oct/21 21:24
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-940453502


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 59s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 5 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  12m 41s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  20m 54s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 13s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  18m 36s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   3m 37s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 13s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 20s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   5m 48s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 50s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 32s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  20m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 38s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  18m 38s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   3m 37s |  |  root: The patch generated 
0 new + 484 unchanged - 3 fixed = 484 total (was 487)  |
   | +1 :green_heart: |  mvnsite  |   3m 11s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 16s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 27s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   6m  3s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 27s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 33s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 395m 21s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | -1 :x: |  asflicense  |   1m  8s | 
[/results-asflicense.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/4/artifact/out/results-asflicense.txt)
 |  The patch generated 1 ASF License warnings.  |
   |  |   | 612m 11s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestProcessCorruptBlocks |
   |   | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
   |   | hadoop.hdfs.server.namenode.TestFSDirectory |
   |   | hadoop.hdfs.server.namenode.TestNamenodeStorageDirectives |
   |   | hadoop.hdfs.server.namenode.TestReencryption |
   |   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
   |   | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
   |   | hadoop.hdfs.server.namenode.TestCacheDirectivesWithViewDFS |
   |   | hadoop.hdfs.TestViewDistributedFileSystemWithMountLinks |
   |   | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks |
   |   | hadoop.hdfs.server.namenode.TestEditLogRace |
   |   | hadoop.hdfs.server.namenode.TestNameNodeRecovery |
   
   
   | Subsystem | Report/Notes |

[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=663473=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-663473
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 11/Oct/21 11:50
Start Date: 11/Oct/21 11:50
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-939956618


   > Hi @tomscut Is it necessary to add port to audit log. I am afraid that 
this will have a big influence. Because many users analyze audit log as table 
and this maybe change format.
   
   Thanks @ferhui for your review.
   
   I also have similar concerns, so I published a discussion to HADOOP 
community, hoping to let the members of the community know about this matter 
and then give some suggestions.
   
   However, Adding port is to modify the internal content of the IP field, 
which has little impact on the overall layout. In our cluster, we parse the 
audit log through Vector and send the data to Kafka, which is unaffected.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 663473)
Time Spent: 1h 20m  (was: 1h 10m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=663470=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-663470
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 11/Oct/21 11:41
Start Date: 11/Oct/21 11:41
Worklog Time Spent: 10m 
  Work Description: ferhui commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-939950995


   Hi @tomscut  Is it necessary to add port to  audit log. I am afraid that 
this will have a big influence. Because many users analyze audit log as table 
and this maybe change format.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 663470)
Time Spent: 1h 10m  (was: 1h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=663437=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-663437
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 11/Oct/21 10:36
Start Date: 11/Oct/21 10:36
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-939903987


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  2s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 5 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  12m 49s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  21m 40s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  22m 38s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  21m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   4m  2s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 20s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 23s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 32s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   6m 11s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 27s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 23s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 21s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  24m 21s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  19m 30s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   3m 38s |  |  root: The patch generated 
0 new + 484 unchanged - 3 fixed = 484 total (was 487)  |
   | +1 :green_heart: |  mvnsite  |   3m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 21s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 22s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   6m 16s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 22s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 30s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 378m 41s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 10s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 608m 51s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestFsck |
   |   | hadoop.hdfs.web.TestWebHdfsFileSystemContract |
   |   | hadoop.hdfs.server.namenode.TestAuditLogger |
   |   | hadoop.hdfs.TestSnapshotCommands |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3538 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux cb588e754ddb 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 7a340d4aa2f435a6d68f0439c5f07465dc050c67 |
   | Default Java 

[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=663417=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-663417
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 11/Oct/21 09:52
Start Date: 11/Oct/21 09:52
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-939870062


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m 19s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 5 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  12m 42s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 42s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 18s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  19m 45s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   3m 58s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m  4s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m  9s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 11s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   5m 49s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 48s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 22s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 30s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  22m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 52s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  19m 52s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   3m 50s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/2/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 2 new + 485 unchanged - 2 fixed = 487 total (was 
487)  |
   | +1 :green_heart: |  mvnsite  |   3m  2s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m  2s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m  8s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   6m 10s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 17s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 14s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 359m  8s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  1s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 588m  6s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogger |
   |   | hadoop.hdfs.server.namenode.TestFsck |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3538 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux d2119707873b 4.15.0-143-generic #147-Ubuntu SMP Wed Apr 14 
16:10:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 

[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=663292=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-663292
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 11/Oct/21 00:14
Start Date: 11/Oct/21 00:14
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-939580772


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  4s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 5 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  12m 53s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  20m 54s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 25s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  18m 47s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   3m 45s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 14s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 20s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 25s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   5m 43s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  8s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 39s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  20m 39s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 33s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  18m 33s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   3m 36s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/1/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 2 new + 485 unchanged - 2 fixed = 487 total (was 
487)  |
   | +1 :green_heart: |  mvnsite  |   3m 13s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 19s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 22s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   6m  7s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 12s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 27s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 373m 24s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  2s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 589m 50s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.fs.viewfs.TestViewFSOverloadSchemeWithMountTableConfigInHDFS |
   |   | hadoop.hdfs.server.namenode.TestFsck |
   |   | hadoop.hdfs.server.namenode.TestAuditLogger |
   |   | 
hadoop.fs.viewfs.TestViewFileSystemOverloadSchemeHdfsFileSystemContract |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3538/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3538 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 65cac33ec84e 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 

[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=663290=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-663290
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 11/Oct/21 00:03
Start Date: 11/Oct/21 00:03
Worklog Time Spent: 10m 
  Work Description: tomscut commented on a change in pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#discussion_r725721616



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
##
@@ -409,9 +420,9 @@ public static String getAuxiliaryPortEstablishedQOP() {
 Call call = CurCall.get();
 return call != null ? call.clientId : RpcConstants.DUMMY_CLIENT_ID;
   }
-  
+
   /** Returns remote address as a string when invoked inside an RPC.
-   *  Returns null in case of an error.
+   *  Returns null in case of an error.n

Review comment:
   > Extra `n`
   
   Thanks for your review. I fixed it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 663290)
Time Spent: 0.5h  (was: 20m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=663259=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-663259
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 10/Oct/21 15:10
Start Date: 10/Oct/21 15:10
Worklog Time Spent: 10m 
  Work Description: yakirgb commented on a change in pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#discussion_r725653854



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
##
@@ -409,9 +420,9 @@ public static String getAuxiliaryPortEstablishedQOP() {
 Call call = CurCall.get();
 return call != null ? call.clientId : RpcConstants.DUMMY_CLIENT_ID;
   }
-  
+
   /** Returns remote address as a string when invoked inside an RPC.
-   *  Returns null in case of an error.
+   *  Returns null in case of an error.n

Review comment:
   Extra `n`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 663259)
Time Spent: 20m  (was: 10m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-10-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=663255=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-663255
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 10/Oct/21 14:23
Start Date: 10/Oct/21 14:23
Worklog Time Spent: 10m 
  Work Description: tomscut opened a new pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538


   JIRA: [HDFS-16266](https://issues.apache.org/jira/browse/HDFS-16266)
   
   In our production environment, we occasionally encounter a problem where a 
user submits an abnormal computation task, causing a sudden flood of requests, 
which causes the queueTime and processingTime of the Namenode to rise very 
high, causing a large backlog of tasks.
   
   We usually locate and kill specific Spark, Flink, or MapReduce tasks based 
on metrics and audit logs. Currently, IP and UGI are recorded in audit logs, 
but there is no port information, so it is difficult to locate specific 
processes sometimes. Therefore, I propose that we add the port information to 
the audit log, so that we can easily track the upstream process.
   
   Currently, some projects contain port information in audit logs, such as 
Hbase and Alluxio. I think it is also necessary to add port information for 
HDFS audit logs.
   
   Before:
   
![before-hdfs-audit-log](https://user-images.githubusercontent.com/55134131/136699770-e5c07e90-0046-43ba-8c1c-a0e94a02657d.jpg)
   
   
   After:
   
![hdfs-audit-log](https://user-images.githubusercontent.com/55134131/136699624-13bb3375-398b-473f-9f8a-212bdd5ec765.jpg)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 663255)
Remaining Estimate: 0h
Time Spent: 10m

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org