[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-20 Thread Maciej Smolenski (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649744#comment-17649744
 ] 

Maciej Smolenski commented on HADOOP-18526:
---

[~ste...@apache.org] 

Thanks for the explanations. I expected that there was some reason for fixing 
this only on newer versions. Now I understand the motivation better. 

> Leak of S3AInstrumentation instances via hadoop Metrics references
> --
>
> Key: HADOOP-18526
> URL: https://issues.apache.org/jira/browse/HADOOP-18526
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.4
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>
> A heap dump of a process running OOM shows that if a process creates then 
> destroys lots of S3AFS instances, you seem to run out of heap due to 
> references to S3AInstrumentation and the IOStatisticsStore kept via the 
> hadoop metrics registry.
> It doesn't look like S3AInstrumentation.close() is being invoked in 
> S3AFS.close(). It should be - with the IOStats being snapshotted to a local 
> reference before this happens, so that the stats of a closed fs can still be 
> examined.
> If you look at org.apache.hadoop.ipc.DecayRpcScheduler.MetricsProxy, it uses a 
> WeakReference to refer back to the larger object. We should do the same for 
> the abfs/s3a bindings, ideally with some template proxy class in hadoop common 
> that they can both use.
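
A rough sketch of the WeakReference idea described above, loosely modelled on
DecayRpcScheduler.MetricsProxy. It is an illustration only, not the actual
HADOOP-18526 patch: the class name is made up, while MetricsSource and
MetricsCollector are the real hadoop-common metrics2 interfaces.

    import java.lang.ref.WeakReference;

    import org.apache.hadoop.metrics2.MetricsCollector;
    import org.apache.hadoop.metrics2.MetricsSource;

    /**
     * Sketch: a metrics source holding only a WeakReference to the real
     * source, so that registration with the metrics system does not keep the
     * owning filesystem instance (and its statistics) alive.
     */
    public class WeakRefMetricsSourceSketch implements MetricsSource {

      private final WeakReference<MetricsSource> sourceRef;

      public WeakRefMetricsSourceSketch(MetricsSource source) {
        this.sourceRef = new WeakReference<>(source);
      }

      @Override
      public void getMetrics(MetricsCollector collector, boolean all) {
        MetricsSource source = sourceRef.get();
        if (source != null) {
          // delegate while the real source is still reachable
          source.getMetrics(collector, all);
        }
        // once the source has been garbage collected there is nothing to emit
      }
    }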



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-20 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649710#comment-17649710
 ] 

Steve Loughran commented on HADOOP-18526:
-

* Downgrade from fail to warn if trying to use dynamic partition output on a 
  committer which doesn't declare support (it works, just slowly)
* Pass up partitions to the partition list of the superclass (needs a new 
  addPartition() method)
* HadoopMapReduceCommitProtocol scaladocs to describe the abspath part of the 
  dynamic partition output protocol

Don't take my unwillingness to backport personally... I have to deal with that 
internally for too much of my life. The problem is that often you can't cleanly 
distinguish "fix" from "feature", and what seems a simple fix 
(fs.s3a.endpoint.region) is actually a significant feature which can't be 
backported without cherrypicking a whole chain of commits (HADOOP-17705). As 
any commit can cause a regression, you end up having to keep all branches up to 
date, and at that point you have to ask "why?". If all changes go back, there 
is no more stability in the older branch than the newer one - so if people want 
the fixes, they may as well upgrade, or take on the cherrypick, retest and 
followup fix maintenance themselves. We are open source and volunteers are 
welcome, but not me. Keeping trunk and 3.3.9 up to date and getting 3.3.5 out 
is enough for me.

I do have to deal with cherrypicking things internally. There we are mostly up 
to date with the s3a/abfs code, though a few things are missing (multipart 
api), and dependencies haven't been updated when shared with other modules, 
except when coordinated and the other groups are all happy. But we do end up 
having to do all that retesting, followup fixing, etc. One benefit of that 
work, however, is that the full stack tests often find regressions (example: 
HADOOP-18410) before they get into the ASF releases. But it does mean a very 
large fraction of my week is spent backporting changes to multiple internal 
branches and dealing with the consequences. Oh, and getting region support into 
older releases was a serious PITA until we found out about 
awssdk_config_default.json. 

To close then: you get to embrace backporting or, as I recommend, pick up the 
latest release with a full set of current fixes and the changes needed for them 
to apply.






[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-19 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649281#comment-17649281
 ] 

Steve Loughran commented on HADOOP-18526:
-

3.3.5 is the maintenance release of 3.3.x. 

For 3.2.x, if you want to do a backport you can put up the PR for both of these 
and, once yetus is happy, I can merge. 

Note that if you are working with cloud storage you should be on the latest 
hadoop release you can install. We have done a lot of work there, and 3.2 
shipped many years ago.




[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-18 Thread Maciej Smolenski (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649178#comment-17649178
 ] 

Maciej Smolenski commented on HADOOP-18526:
---

[~ste...@apache.org] 
This issue was also observed on 3.2.2 
(https://issues.apache.org/jira/browse/YARN-11397), so the affected version 
seems incorrect. What about fixing it on older versions?




[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648018#comment-17648018
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

steveloughran commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1352951683

   > Lol, I was being dumb.
   no comment







[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17647665#comment-17647665
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

steveloughran merged PR #5144:
URL: https://github.com/apache/hadoop/pull/5144







[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17647658#comment-17647658
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

steveloughran commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1351890211

   @mukund-thakur tried to clarify: the exception is created to build the stack 
trace, which is what we really want. 







[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17647656#comment-17647656
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

steveloughran commented on code in PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#discussion_r1048802203


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##
@@ -459,6 +458,13 @@ public void initialize(URI name, Configuration originalConf)
     AuditSpan span = null;
     try {
       LOG.debug("Initializing S3AFileSystem for {}", bucket);
+      if (LOG.isTraceEnabled()) {
+        // log a full trace for deep diagnostics of where an object is created,
+        // for tracking down memory leak issues.
+        LOG.trace("Filesystem for {} created; fs.s3a.impl.disable.cache = {}",
+            name, originalConf.getBoolean("fs.s3a.impl.disable.cache", false),
+            new RuntimeException(super.toString()));

Review Comment:
   1. generating that runtime exception creates a stack trace which then gets 
logged in the trace call, so we can see who is creating the fs instance
   2. calling super.toString() so we get the minimal instance id rather than 
the expanded toString() with all the stats and stuff. This is initialize(); most 
of those values are unset, so it would only confuse.
   
   it is that stack trace which is the most critical thing as it lets us track 
down where FS instances are being created and then not closed.
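
For readers unfamiliar with the trick, a self-contained sketch of the pattern
(class and method names here are assumptions, not the S3AFileSystem code).
SLF4J prints the stack trace of a Throwable passed as the final argument, so
the exception is constructed purely to record the creation site and is never
thrown:

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public final class CreationSiteLogger {

      private static final Logger LOG =
          LoggerFactory.getLogger(CreationSiteLogger.class);

      private CreationSiteLogger() {
      }

      /**
       * Record where a long-lived object was constructed; the stack trace of
       * the unthrown exception shows the call chain that created it.
       */
      public static void logCreationSite(String id) {
        if (LOG.isTraceEnabled()) {
          LOG.trace("Instance {} created", id,
              new RuntimeException("creation site of " + id));
        }
      }
    }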








[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17647627#comment-17647627
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

mukund-thakur commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1351814490

   
https://github.com/apache/hadoop/pull/5144/files/48974549308b1eb63cb832e3a04beef3f593018d#r1046453289
   I just had one clarification. 







[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17647506#comment-17647506
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

steveloughran commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1351515139

   I am still happy with this patch and want to get it into 3.3.5. Please can 
reviewers tell me if they are happy now. Thanks.







[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646351#comment-17646351
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

mukund-thakur commented on code in PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#discussion_r1046453289


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##
@@ -459,6 +458,13 @@ public void initialize(URI name, Configuration originalConf)
     AuditSpan span = null;
     try {
       LOG.debug("Initializing S3AFileSystem for {}", bucket);
+      if (LOG.isTraceEnabled()) {
+        // log a full trace for deep diagnostics of where an object is created,
+        // for tracking down memory leak issues.
+        LOG.trace("Filesystem for {} created; fs.s3a.impl.disable.cache = {}",
+            name, originalConf.getBoolean("fs.s3a.impl.disable.cache", false),
+            new RuntimeException(super.toString()));

Review Comment:
   Why not just print it? I mean I don't understand the reason behind wrapping 
it in a RuntimeException. 
   Also, base FileSystem doesn't implement toString(), so there won't be 
anything. Why not use this.toString()?
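
For context: assuming, as the comment above says, that the base FileSystem
class does not override toString(), super.toString() falls through to
java.lang.Object's default, which its javadoc describes as equivalent to the
following - the short "ClassName@hexHashCode" identifier referred to elsewhere
in this thread as the minimal instance id:

    // effectively what java.lang.Object#toString() returns
    public String toString() {
      return getClass().getName() + "@" + Integer.toHexString(hashCode());
    }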








[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646087#comment-17646087
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

steveloughran commented on code in PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#discussion_r1043226728


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##
@@ -459,6 +458,13 @@ public void initialize(URI name, Configuration originalConf)
     AuditSpan span = null;
     try {
       LOG.debug("Initializing S3AFileSystem for {}", bucket);
+      if (LOG.isTraceEnabled()) {
+        // log a full trace for deep diagnostics of where an object is created,
+        // for tracking down memory leak issues.
+        LOG.trace("Filesystem for {} created; fs.s3a.impl.disable.cache = {}",
+            name, originalConf.getBoolean("fs.s3a.impl.disable.cache", false),
+            new RuntimeException(super.toString()));

Review Comment:
   we don't throw it, just trace it. it can be anything. what is your 
suggestion?



##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##
@@ -3999,22 +4005,18 @@ public void close() throws IOException {
     }
     isClosed = true;
     LOG.debug("Filesystem {} is closed", uri);
-    if (getConf() != null) {
-      String iostatisticsLoggingLevel =
-          getConf().getTrimmed(IOSTATISTICS_LOGGING_LEVEL,
-              IOSTATISTICS_LOGGING_LEVEL_DEFAULT);
-      logIOStatisticsAtLevel(LOG, iostatisticsLoggingLevel, getIOStatistics());
-    }
     try {
       super.close();
     } finally {
       stopAllServices();
-    }
-    // Log IOStatistics at debug.
-    if (LOG.isDebugEnabled()) {
-      // robust extract and convert to string
-      LOG.debug("Statistics for {}: {}", uri,
-          IOStatisticsLogging.ioStatisticsToPrettyString(getIOStatistics()));
+      // log IO statistics, including of any file deletion during

Review Comment:
   it means "including iostatistics of any file deletion..." so IMO it's valid



##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##
@@ -3999,22 +4005,18 @@ public void close() throws IOException {
     }
     isClosed = true;
     LOG.debug("Filesystem {} is closed", uri);
-    if (getConf() != null) {
-      String iostatisticsLoggingLevel =
-          getConf().getTrimmed(IOSTATISTICS_LOGGING_LEVEL,
-              IOSTATISTICS_LOGGING_LEVEL_DEFAULT);
-      logIOStatisticsAtLevel(LOG, iostatisticsLoggingLevel, getIOStatistics());
-    }
     try {
       super.close();
     } finally {
       stopAllServices();

Review Comment:
   not worried there. the system tests verify that you can still call 
instrumentation methods safely, it is just unregistered



##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInstrumentation.java:
##
@@ -257,7 +275,8 @@ private void registerAsMetricsSource(URI name) {
       number = ++metricsSourceNameCounter;
     }
     String msName = METRICS_SOURCE_BASENAME + number;
-    metricsSourceName = msName + "-" + name.getHost();
+    String metricsSourceName = msName + "-" + name.getHost();
+    metricsSourceReference = new WeakRefMetricsSource(metricsSourceName, this);

Review Comment:
   not using this though, are we?








[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17645000#comment-17645000
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

mukund-thakur commented on code in PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#discussion_r1043859890


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##
@@ -3999,22 +4005,18 @@ public void close() throws IOException {
     }
     isClosed = true;
     LOG.debug("Filesystem {} is closed", uri);
-    if (getConf() != null) {
-      String iostatisticsLoggingLevel =
-          getConf().getTrimmed(IOSTATISTICS_LOGGING_LEVEL,
-              IOSTATISTICS_LOGGING_LEVEL_DEFAULT);
-      logIOStatisticsAtLevel(LOG, iostatisticsLoggingLevel, getIOStatistics());
-    }
     try {
       super.close();
     } finally {
       stopAllServices();

Review Comment:
   instrumentation is not set to null after closing. Same with auditManager. 








[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644932#comment-17644932
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

hadoop-yetus commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1343107306

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 50s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 20s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 39s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m  9s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |  21m 40s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m  1s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 25s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 35s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 17s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 56s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 54s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 29s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |  24m 29s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 43s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  21m 43s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   3m 51s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/7/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 1 new + 3 unchanged - 0 fixed = 4 total (was 3)  |
   | +1 :green_heart: |  mvnsite  |   2m 22s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 18s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m  4s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 46s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 19s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   2m 51s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   0m 52s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 238m 31s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/7/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5144 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 594476bdd14f 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 
18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 627910bb26cc21528c88509683fa639737aba26f |
   | Default Java | Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/7/testReport/ |
   | Max. process+thread count | 

[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644758#comment-17644758
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

mehakmeet commented on code in PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#discussion_r1043223695


##
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestInstrumentationLifecycle.java:
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a;
+
+import java.net.URI;
+
+import org.junit.Test;
+
+import org.apache.hadoop.metrics2.MetricsSystem;
+import org.apache.hadoop.test.AbstractHadoopTestBase;
+
+import static org.apache.hadoop.fs.s3a.S3AInstrumentation.getMetricsSystem;
+import static org.apache.hadoop.fs.s3a.Statistic.DIRECTORIES_CREATED;
+import static org.assertj.core.api.Assertions.assertThat;
+
+public class TestInstrumentationLifecycle extends AbstractHadoopTestBase {

Review Comment:
   Javadoc.
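
A class-level javadoc along these lines would probably address the comment; the
wording below is a guess based on the test's name and imports, not the text
that was actually committed:

    /**
     * Verify the lifecycle of S3AInstrumentation: it registers with the
     * shared metrics system when created and unregisters again when closed,
     * so that closed filesystem instances can be garbage collected.
     */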



##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##
@@ -3999,22 +4005,18 @@ public void close() throws IOException {
     }
     isClosed = true;
     LOG.debug("Filesystem {} is closed", uri);
-    if (getConf() != null) {
-      String iostatisticsLoggingLevel =
-          getConf().getTrimmed(IOSTATISTICS_LOGGING_LEVEL,
-              IOSTATISTICS_LOGGING_LEVEL_DEFAULT);
-      logIOStatisticsAtLevel(LOG, iostatisticsLoggingLevel, getIOStatistics());
-    }
     try {
       super.close();
     } finally {
       stopAllServices();
-    }
-    // Log IOStatistics at debug.
-    if (LOG.isDebugEnabled()) {
-      // robust extract and convert to string
-      LOG.debug("Statistics for {}: {}", uri,
-          IOStatisticsLogging.ioStatisticsToPrettyString(getIOStatistics()));
+      // log IO statistics, including of any file deletion during

Review Comment:
   typo: ~of~








[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644739#comment-17644739
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

steveloughran commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1342539682

   > I can't find where is instrumentation getting closed in fs.close()
   
   done in `stopAllServices()`
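
In other words, the service shutdown is where the instrumentation, and with it
the metrics registration, gets closed. A very rough sketch of that shape; the
field names and ordering below are assumptions, not the real S3AFileSystem
code:

    // sketch only: close owned services once the filesystem is closed
    private void stopAllServices() {
      // closing the instrumentation unregisters the metrics source, so the
      // metrics system no longer holds a reference back to this filesystem
      if (instrumentation != null) {
        instrumentation.close();
      }
      // ... also stop the audit manager, thread pools and other owned services
    }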







[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644547#comment-17644547
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

mukund-thakur commented on code in PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#discussion_r1042617624


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##
@@ -459,6 +458,13 @@ public void initialize(URI name, Configuration originalConf)
     AuditSpan span = null;
     try {
       LOG.debug("Initializing S3AFileSystem for {}", bucket);
+      if (LOG.isTraceEnabled()) {
+        // log a full trace for deep diagnostics of where an object is created,
+        // for tracking down memory leak issues.
+        LOG.trace("Filesystem for {} created; fs.s3a.impl.disable.cache = {}",
+            name, originalConf.getBoolean("fs.s3a.impl.disable.cache", false),
+            new RuntimeException(super.toString()));

Review Comment:
   Why RTE?








[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644028#comment-17644028
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

hadoop-yetus commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1339996002

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 41s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 21s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 41s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 12s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |  20m 34s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   3m 42s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 36s |  |  trunk passed  |
   | -1 :x: |  javadoc  |   1m 14s | 
[/branch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/6/artifact/out/branch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt)
 |  hadoop-common in trunk failed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.  |
   | +1 :green_heart: |  javadoc  |   1m 35s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 57s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 51s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 20s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |  22m 20s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 23s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  20m 23s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   3m 32s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 36s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m  5s | 
[/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/6/artifact/out/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt)
 |  hadoop-common in the patch failed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.  |
   | +1 :green_heart: |  javadoc  |   1m 35s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   4m  6s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m  4s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 18s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   2m 53s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   0m 59s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 225m 42s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/6/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5144 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 5c3da2b5ff42 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 
18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 48974549308b1eb63cb832e3a04beef3f593018d |
   | Default Java | Private 

[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642175#comment-17642175
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

hadoop-yetus commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1334467998

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 53s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  19m 41s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  29m 14s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m 27s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |  22m  6s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 34s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 56s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m  9s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 50s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 24s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 39s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 24s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 39s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 46s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |  24m 46s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 12s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  22m 12s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 12s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 53s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 59s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 48s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 36s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 23s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 49s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m  5s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 10s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 253m 29s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/5/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5144 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 5e4fc2e545c5 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 
01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / f8cdd3d3e5e17b5a49a6f57336446df065f9a82f |
   | Default Java | Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/5/testReport/ |
   | Max. process+thread count | 1349 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws 
U: . |
   | Console output | 

[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642055#comment-17642055
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

steveloughran commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1334090863

   Build VMs playing up: "Resource temporarily unavailable".







[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641988#comment-17641988
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

hadoop-yetus commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1333896168

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 47s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 11s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 57s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m 38s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |  22m 17s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 26s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m  7s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 10s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 56s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 39s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  28m 10s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 43s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m  3s |  |  the patch passed  |
   | -1 :x: |  compile  |  35m 24s | 
[/patch-compile-root-jdkUbuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/4/artifact/out/patch-compile-root-jdkUbuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04.txt)
 |  root in the patch failed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04.  |
   | -1 :x: |  javac  |  35m 24s | 
[/patch-compile-root-jdkUbuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/4/artifact/out/patch-compile-root-jdkUbuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04.txt)
 |  root in the patch failed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04.  |
   | -1 :x: |  compile  |  21m  1s | 
[/patch-compile-root-jdkPrivateBuild-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/4/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07.txt)
 |  root in the patch failed with JDK Private 
Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07.  |
   | -1 :x: |  javac  |  21m  1s | 
[/patch-compile-root-jdkPrivateBuild-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/4/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07.txt)
 |  root in the patch failed with JDK Private 
Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07.  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   1m  6s | 
[/buildtool-patch-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/4/artifact/out/buildtool-patch-checkstyle-root.txt)
 |  The patch fails to run checkstyle in root  |
   | -1 :x: |  mvnsite  |   1m 10s | 
[/patch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/4/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in the patch failed.  |
   | -1 :x: |  mvnsite  |   1m  7s | 
[/patch-mvnsite-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/4/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt)
 |  hadoop-aws in the patch failed.  |
   | -1 :x: |  javadoc  |   1m 15s | 
[/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/4/artifact/out/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04.txt)
 |  hadoop-common in the patch failed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04.  |
   | -1 :x: |  

[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-12-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641839#comment-17641839
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

steveloughran commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1333603252

   tested aws london w/ ` -Dparallel-tests -DtestsThreadCount=10 -Dmarkers=delete`
   
   the s3a fs was set to log at TRACE, so it printed a stack trace on every fs creation; just search for "creation stack" in the output. no problems seen when this tracing is enabled.
   
   saw HADOOP-18351 again, so have just rebased for future runs. 
   
   think this is ready for review now 
   
   + @mukund-thakur @mehakmeet @dannycjones @ashutoshcipher @HarshitGupta11 







[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-11-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641571#comment-17641571
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

hadoop-yetus commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1332789914

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 54s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  21m 51s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  33m  7s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m 43s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |  22m 13s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 24s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 58s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m  7s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 50s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 29s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 27s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 26s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 41s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 51s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |  24m 51s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 10s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  22m 10s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 16s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 53s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m  1s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 50s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 33s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 52s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 49s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m  9s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 10s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 260m 23s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5144 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux c5afc9ef472a 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 
01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 9811c0b795f2a3a99ee88e337833a4feeaeaffcc |
   | Default Java | Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/3/testReport/ |
   | Max. process+thread count | 3137 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws 
U: . |
   | Console output | 

[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-11-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641458#comment-17641458
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

steveloughran commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1332518475

   next PR revision logs the creation stack at TRACE; sample output:
   ```
   2022-11-30 17:38:32,062 [setup] DEBUG s3a.S3AFileSystem 
(S3AFileSystem.java:initialize(460)) - Initializing S3AFileSystem for 
stevel-london
   2022-11-30 17:38:32,062 [setup] TRACE s3a.S3AFileSystem 
(S3AFileSystem.java:initialize(464)) - Filesystem for s3a://stevel-london/ 
creation stack
   java.lang.RuntimeException: org.apache.hadoop.fs.s3a.S3AFileSystem@56b21fd5
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:465)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3595)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:171)
at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3696)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3647)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:555)
at 
org.apache.hadoop.fs.contract.AbstractBondedFSContract.init(AbstractBondedFSContract.java:72)
at 
org.apache.hadoop.fs.contract.AbstractFSContractTestBase.setup(AbstractFSContractTestBase.java:189)
at 
org.apache.hadoop.fs.s3a.AbstractS3ATestBase.setup(AbstractS3ATestBase.java:111)
at 
org.apache.hadoop.fs.s3a.ITestS3AUnbuffer.setup(ITestS3AUnbuffer.java:61)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.RunBefores.invokeMethod(RunBefores.java:33)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:750)
   ```
   







[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-11-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17638264#comment-17638264
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

steveloughran commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1326345217

   want to add one more feature here: have the s3a fs initialize() log a stack trace at TRACE; this helps track down what is creating fs instances and not cleaning them up properly.
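
A minimal sketch of what that could look like, assuming SLF4J logging; the class is illustrative rather than the actual S3AFileSystem change, though the message format matches the "creation stack" trace quoted elsewhere in this thread:

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Sketch only, not the real S3AFileSystem code: log the creation stack
 * at TRACE by handing the logger an exception that is never thrown.
 */
public class CreationStackLoggingSketch {

  private static final Logger LOG =
      LoggerFactory.getLogger(CreationStackLoggingSketch.class);

  public void initialize(String uri) {
    if (LOG.isTraceEnabled()) {
      // SLF4J appends the throwable's stack trace to the message, which is
      // how the "creation stack" traces quoted in this thread are produced.
      LOG.trace("Filesystem for {} creation stack", uri,
          new RuntimeException(toString()));
    }
  }
}
```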







[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-11-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17637955#comment-17637955
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

hadoop-yetus commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1325638170

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  6s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 29s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  30m 41s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  30m 34s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |  26m 50s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 54s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 16s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 12s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   2m  5s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m  8s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 51s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 47s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  27m 24s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |  27m 24s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 46s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  22m 46s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  1s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m 19s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/2/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 1 new + 3 unchanged - 0 fixed = 4 total (was 3)  |
   | +1 :green_heart: |  mvnsite  |   2m 57s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m  1s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 49s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 37s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 49s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 35s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 19s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 11s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 267m 45s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5144 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 6a31c290ec3b 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 
01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 05c3c346b4ad9299db8756fd50d1459ccf22bf9f |
   | Default Java | Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/2/testReport/ |
   | Max. process+thread count | 

[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-11-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17637832#comment-17637832
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

steveloughran commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1325255012

   latest test run: s3 london, params ` -Dparallel-tests -DtestsThreadCount=10 -Dscale`; all good.
   
   new tests verify that instrumentation.close() unregisters on the first call and is a no-op on the second: a standalone unit test, plus an itest covering the full lifecycle through the s3a fs.
   
   those showed that instrumentation is being unregistered on close(), so if leakage is reported, it comes from something creating fs instances but not closing them. the weak refs ensure the instrumentation itself isn't being held on to -but the thread pools are still there consuming resources. 
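
A rough sketch of the standalone half of such a test; the class and the fake instrumentation below are hypothetical stand-ins, not the types used in the actual patch:

```java
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

/** Sketch: close() must unregister on the first call and be a no-op afterwards. */
public class TestIdempotentCloseSketch {

  /** Illustrative stand-in for an instrumentation object registered with metrics. */
  static class FakeInstrumentation implements AutoCloseable {
    private boolean registered = true;
    private int closeCount;

    boolean isRegistered() { return registered; }

    int getCloseCount() { return closeCount; }

    @Override
    public void close() {
      closeCount++;
      if (registered) {
        // first call: this is where the metrics source would be unregistered
        registered = false;
      }
      // subsequent calls: nothing left to do
    }
  }

  @Test
  public void testCloseUnregistersOnceOnly() {
    FakeInstrumentation instrumentation = new FakeInstrumentation();
    assertTrue(instrumentation.isRegistered());

    instrumentation.close();
    assertFalse("first close() must unregister", instrumentation.isRegistered());

    instrumentation.close();
    assertFalse("second close() stays unregistered", instrumentation.isRegistered());
    assertEquals("close() must be safe to call twice", 2,
        instrumentation.getCloseCount());
  }
}
```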







[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-11-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17637132#comment-17637132
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

steveloughran commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1323403552

   testing ideas:
   * scale test to create many fs instances in parallel and verify all is good...need to make sure metrics is turned on first (a rough sketch follows this list)
   * add a probe method for the state of the s3a statistics; check it after the fs is closed
   * make sure iostats from a closed fs are valid
   * see what happens when you call any stats-collecting operation after the fs is closed
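
A rough sketch of the first idea above, under some assumptions: the bucket URI, instance count, and thread pool size are placeholders, and the final probe of the metrics registry is deliberately left out:

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Sketch: churn through many uncached filesystem instances in parallel.
 * A real test would then probe the metrics system to assert that nothing
 * stayed registered; that probe is not shown here.
 */
public class ManyFilesystemsSketch {
  public static void main(String[] args) throws Exception {
    URI bucket = new URI("s3a://example-bucket/");   // placeholder bucket
    Configuration conf = new Configuration();
    ExecutorService pool = Executors.newFixedThreadPool(8);
    List<Future<Void>> results = new ArrayList<>();
    for (int i = 0; i < 100; i++) {
      results.add(pool.submit((Callable<Void>) () -> {
        // newInstance() bypasses the FileSystem cache, so every call creates
        // (and must fully release) a fresh filesystem instance.
        try (FileSystem fs = FileSystem.newInstance(bucket, conf)) {
          fs.getFileStatus(new Path("/"));
        }
        return null;
      }));
    }
    for (Future<Void> result : results) {
      result.get();   // propagate any worker failure
    }
    pool.shutdown();
  }
}
```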







[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-11-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635539#comment-17635539
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

hadoop-yetus commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1319211956

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 54s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  33m  3s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  29m 17s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m 29s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |  22m  7s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 20s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m  0s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m  7s |  |  trunk passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 49s |  |  trunk passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 25s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 59s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 25s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 40s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 46s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |  24m 46s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m  6s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  22m  6s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 13s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   3m  2s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m  0s |  |  the patch passed with JDK 
Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 49s |  |  the patch passed with JDK 
Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 38s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 41s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 42s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   2m 57s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m  9s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 267m 37s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5144 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 75e690726977 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 
01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 61cf31b25393ca93c9faf5cf7a51c1b816c291fb |
   | Default Java | Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5144/1/testReport/ |
   | Max. process+thread count | 1253 (vs. ulimit of 5500) |
   | modules | C: 

[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-11-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635453#comment-17635453
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

steveloughran commented on PR #5144:
URL: https://github.com/apache/hadoop/pull/5144#issuecomment-1318921953

   tested, s3 london, parallel @ 8. one failure - created HADOOP-18531 for that







[jira] [Commented] (HADOOP-18526) Leak of S3AInstrumentation instances via hadoop Metrics references

2022-11-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635447#comment-17635447
 ] 

ASF GitHub Bot commented on HADOOP-18526:
-

steveloughran opened a new pull request, #5144:
URL: https://github.com/apache/hadoop/pull/5144

   
   This has triggered an OOM in a process which was churning through s3a fs 
instances; the increased memory footprint of IOStatistics amplified what must 
have been a long standing issue.
   
   * Makes sure instrumentation is closed when fs is closed
   * Uses a weak reference from metrics to instrumentation, so even if the fs wasn't closed (see HADOOP-18478), this ref would not be holding things back (an illustrative sketch of the pattern follows below).
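
An illustrative sketch of that weak-reference binding, modelled on the MetricsProxy pattern named in the issue description; the class name is hypothetical and this is not the patch code itself:

```java
import java.lang.ref.WeakReference;

import org.apache.hadoop.metrics2.MetricsCollector;
import org.apache.hadoop.metrics2.MetricsSource;

/**
 * Sketch: register this small proxy with the metrics system instead of the
 * instrumentation itself, holding the real source only via a WeakReference
 * so the metrics registry can never keep a forgotten filesystem's
 * instrumentation alive.
 */
class WeakMetricsSourceProxy implements MetricsSource {

  private final WeakReference<MetricsSource> realSource;

  WeakMetricsSourceProxy(MetricsSource realSource) {
    this.realSource = new WeakReference<>(realSource);
  }

  @Override
  public void getMetrics(MetricsCollector collector, boolean all) {
    MetricsSource source = realSource.get();
    if (source != null) {
      // instrumentation still live: forward the request
      source.getMetrics(collector, all);
    }
    // else: the instrumentation has been garbage collected; nothing to report
  }
}
```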
   
   
   ### How was this patch tested?
   
   cloud tests. no new tests...if there are suggestions, that'd be good
   
   ### For code changes:
   
   - [X] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
   - [X] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   



