[jira] [Commented] (HDFS-15217) Add more information to longest write/read lock held log

2020-12-08 Thread Toshihiro Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17246225#comment-17246225
 ] 

Toshihiro Suzuki commented on HDFS-15217:
-

Sound good. Thank you [~ahussein].

> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful to 
> troubleshoot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15217) Add more information to longest write/read lock held log

2020-12-08 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245950#comment-17245950
 ] 

Ahmed Hussein commented on HDFS-15217:
--

Thank you [~brfrn169] and [~inigoiri] for the explanation.
> Do you have specific numbers?

As we were moving from 2.8 to 2.10, [~daryn] reported that 2.10 token ops are 
~1.3X slower.
His initial intuition is as follows:
* adding token ops to the audit log. In doing so, multiple ugis are 
accidentally created to merely get the user name when it already exists in the 
identifier.
* token operations are acquiring the fsn write lock. Like lease renewal and 
token expiration, the read lock is sufficient because the namesystem is not 
mutated.

My concern with {{getLockReportInfoSupplier()}} is that -- similar to first 
point-- it accidentally increased the overhead of creating ugis and detailed 
information that are already in the identifier.
The same information is going to be generated again in the audio log.
For example assume that we have ~1 million lock operations within an hour, then 
{{getLockReportInfoSupplier()}} allocates 1+ million objects to the java heap.
That's a significant impact on long-time JVM considering that only a few of 
those million objects used when the lock-time exceeds the threshold.

My proposal is that we have the following three optimizations to try:
# Try out [~daryn]'s suggestion replacing {{writeLock}} with {{readLock}} in 
some Ops.
# eliminate/reduce ugi creation within the ipc handlers due to the performance 
impact.
# Since [~brfrn169] asserts that lock information is important, then we can 
create a follow-up jira trying to re-evaluate the overhead (memory and time). 
If necessary, then maybe some optimizations can be done to reuse 
{{getLockReportInfoSupplier()}} for both {{auditLog}}, and {{unlock()}}. That 
way, allocating the supplier objects would pay off.

WDYT guys?


> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful to 
> troubleshoot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15217) Add more information to longest write/read lock held log

2020-12-08 Thread Toshihiro Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245749#comment-17245749
 ] 

Toshihiro Suzuki commented on HDFS-15217:
-

{quote}
The ops information can be retrieved from the AuditLog. Isn't the AuditLog 
enough to see the ops?
Was there a concern regarding a possible deadLock? Then, why not using debug 
instead of adding that overhead to hot production code?
{quote}

I don't think the AuditLog is enough to see the ops when there are huge number 
of opts. As mentioned in the Description, I faced the  long time write-lock 
held issue, but I was not able to identify which operation caused the long time 
write-lock held from the NN log and the AuditLog. That's why I thought that we 
needed this change and added more information to the longest write/read lock 
held log.

> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful to 
> troubleshoot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15217) Add more information to longest write/read lock held log

2020-12-08 Thread Toshihiro Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245740#comment-17245740
 ] 

Toshihiro Suzuki commented on HDFS-15217:
-

Thank you [~ahussein]. Are we sure the slowdown is caused by the changes in 
this Jira?

> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful to 
> troubleshoot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15217) Add more information to longest write/read lock held log

2020-12-07 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245698#comment-17245698
 ] 

Íñigo Goiri commented on HDFS-15217:


[~ahussein] that's concerning.
Do you have specific numbers?

> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful to 
> troubleshoot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15217) Add more information to longest write/read lock held log

2020-12-07 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245565#comment-17245565
 ] 

Ahmed Hussein commented on HDFS-15217:
--

We have seen a slowdown in token ops compared to older releases. So, we were 
looking into ways of optimizations into 3.x.
I came across this jira by chance.

[~brfrn169], I have couple of questions about the justifications behind adding 
that code:
* The ops information can be retrieved from the AuditLog. Isn't the AuditLog 
enough to see the ops?
* Was there a concern regarding a possible deadLock? Then, why not using 
{{debug}} instead of adding that overhead to hot production code?

I can see that all unlock operations are calling 
{{getLockReportInfoSupplier()}}. Even, {{getLockReportInfoSupplier(null)}} is 
not a no-op.
I saw quick evaluations about the overhead, but I am concerned for the 
following reasons:
* There is an overhead even though the string evaluation is lazy. The overhead 
is: allocating a new {{Supplier}} object even though the supplier get 
method is not being called. This is a capturing lambda. Therefore, it has to be 
evaluated every time.
* In a production system, Allocating an object while releasing a lock is 
dangerous because that last allocation could trigger a GC. This makes 
evaluating the patch tricky because there is a considerable large delta error 
depending on whether or not a GC has been triggered. The worst case scenario is 
to-trigger a GC while allocating an object that is going to be suppressed at 
the end.
* On a production system, this is "Hot" . Especially; 
{{getLockReportInfoSupplier()}} is added to all the token ops such as 
{{getDelegationToken()}}.
* The overhead of allocating the supplier could be useless because 
{{writeUnlock}} will suppress the info.


CC: [~ayushtkn], [~inigoiri]


> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful to 
> troubleshoot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15217) Add more information to longest write/read lock held log

2020-04-27 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17093197#comment-17093197
 ] 

Hudson commented on HDFS-15217:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18187 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18187/])
HDFS-15298 Fix the findbugs warnings introduced in HDFS-15217 (#1979) (github: 
rev 62c26b91fd06f505a6e64fd32a36e5e67d06fa30)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java


> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful to 
> troubleshoot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15217) Add more information to longest write/read lock held log

2020-04-23 Thread Toshihiro Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091055#comment-17091055
 ] 

Toshihiro Suzuki commented on HDFS-15217:
-

Filed: https://issues.apache.org/jira/browse/HDFS-15298

> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful to 
> troubleshoot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15217) Add more information to longest write/read lock held log

2020-04-23 Thread Toshihiro Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091054#comment-17091054
 ] 

Toshihiro Suzuki commented on HDFS-15217:
-

[~ayushtkn] I checked the Report again, and yes, the changes in this Jira 
introduced the findbugs warnings. Sorry I misunderstood the warnings. Will 
create a new JIRA to address the warnings. Thank you for pointing it out.

> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful to 
> troubleshoot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15217) Add more information to longest write/read lock held log

2020-04-23 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090971#comment-17090971
 ] 

Ayush Saxena commented on HDFS-15217:
-

[~brfrn169] Can you check the Report once again, that is because of your 
changes only :

{noformat}

Known null at FSNamesystem.java:[line 7434]  --> This Line is added in this 
commit only
{noformat}

Another way to confirm it is by this changes is In this yetus report which is 
from your PR :
https://github.com/apache/hadoop/pull/1954#issuecomment-613238448

The trunk Findbugs is passing but your patch Findbugs is failing. Let me know, 
if still any confusion.


> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful to 
> troubleshoot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15217) Add more information to longest write/read lock held log

2020-04-23 Thread Toshihiro Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090933#comment-17090933
 ] 

Toshihiro Suzuki commented on HDFS-15217:
-

[~ayushtkn] [~elgoiri] I was aware of the findbugs warnings actually but I 
don't think they were introduced in the changes in this Jira. To fix the 
findbugs warnings, we need to change the original code.

> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful to 
> troubleshoot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15217) Add more information to longest write/read lock held log

2020-04-23 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090797#comment-17090797
 ] 

Íñigo Goiri commented on HDFS-15217:


My bad, I had seen it at the beginning but forgot about it.
[~brfrn169] do you mind opening a JIRA with the fix?

> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful to 
> troubleshoot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15217) Add more information to longest write/read lock held log

2020-04-23 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090503#comment-17090503
 ] 

Ayush Saxena commented on HDFS-15217:
-

Hi [~brfrn169]  [~elgoiri]

Seems this commit has introduced findbugs warnings :

[https://builds.apache.org/job/hadoop-multibranch/job/PR-1954/5/artifact/out/new-findbugs-hadoop-hdfs-project_hadoop-hdfs.html]

It was there in the PR yetus report as well, can you check once.

> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful to 
> troubleshoot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15217) Add more information to longest write/read lock held log

2020-04-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17086654#comment-17086654
 ] 

Hudson commented on HDFS-15217:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18163 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18163/])
HDFS-15217 Add more information to longest write/read lock held log (github: 
rev 1824aee9da4056de0fb638906b2172e486bbebe7)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSNamesystemLockReport.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystemLock.java


> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful to 
> troubleshoot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15217) Add more information to longest write/read lock held log

2020-04-11 Thread Toshihiro Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17081309#comment-17081309
 ] 

Toshihiro Suzuki commented on HDFS-15217:
-

I created a PR for this. After this patch, we can see additional information in 
the lock report message as follows:
{code:java}
2020-04-11 23:04:36,020 [IPC Server handler 5 on default port 62641] INFO  
namenode.FSNamesystem (FSNamesystemLock.java:writeUnlock(321)) - Number of 
suppressed write-lock reports: 0
Longest write-lock held at 2020-04-11 23:04:36,020+0900 for 3ms by 
delete (ugi=bob (auth:SIMPLE),ip=/127.0.0.1,src=/file,dst=null,perm=null) via 
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:302)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:261)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1746)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3274)
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1130)
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:724)
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:529)
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1016)
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:944)
java.security.AccessController.doPrivileged(Native Method)
javax.security.auth.Subject.doAs(Subject.java:422)
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1845)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2948)

Total suppressed write-lock held time: 0.0

{code}
This patch adds the additional information *"by delete (ugi=bob 
(auth:SIMPLE),ip=/127.0.0.1,src=/file,dst=null,perm=null)"* which is similar to 
the audit log format.

> Add more information to longest write/read lock held log
> 
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
>
> Currently, we can see the stack trace in the longest write/read lock held 
> log, but sometimes we need more information, for example, a target path of 
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful to 
> troubleshoot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org