[ 
https://issues.apache.org/jira/browse/HADOOP-18091?focusedWorklogId=722926&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-722926
 ]

ASF GitHub Bot logged work on HADOOP-18091:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Feb/22 15:37
            Start Date: 08/Feb/22 15:37
    Worklog Time Spent: 10m 
      Work Description: steveloughran commented on a change in pull request 
#3930:
URL: https://github.com/apache/hadoop/pull/3930#discussion_r801768736



##########
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/WeakReferenceMap.java
##########
@@ -0,0 +1,261 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.util;
+
+import java.lang.ref.WeakReference;
+import java.util.Iterator;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.atomic.AtomicLong;
+import java.util.function.Consumer;
+import java.util.function.Function;
+
+import javax.annotation.Nullable;
+
+import org.apache.hadoop.classification.InterfaceAudience;
+
+import static java.util.Objects.requireNonNull;
+
+/**
+ * A map of keys type K to objects of type V which uses weak references,
+ * so does lot leak memory through long-lived references
+ * <i>at the expense of losing references when GC takes place.</i>.
+ *
+ * This class is intended be used instead of ThreadLocal storage when
+ * references are to be cleaned up when the instance holding.
+ * In this use case, the key is the Long key.
+ *
+ * Concurrency.
+ * The class assumes that map entries are rarely contended for when writing,
+ * and that not blocking other threads is more important than atomicity.
+ * - a ConcurrentHashMap is used to map keys to weak references, with
+ *   all its guarantees.
+ * - there is no automatic pruning.
+ * - see {@link #create(Object)} for the concurrency semantics on entry 
creation.
+ */
+@InterfaceAudience.Private
+public class WeakReferenceMap<K, V> {
+
+  /**
+   * The reference map.
+   */
+  private final Map<K, WeakReference<V>> map = new ConcurrentHashMap<>();
+
+  /**
+   * Supplier of new instances.
+   */
+  private final Function<? super K, ? extends V> factory;
+
+  /**
+   * Nullable callback when a get on a key got a weak reference back.
+   * The assumption is that this is for logging/stats, which is why
+   * no attempt is made to use the call as a supplier of a new value.
+   */
+  private final Consumer<? super K> referenceLost;
+
+  /**
+   * Counter of references lost.
+   */
+  private final AtomicLong referenceLostCount = new AtomicLong();
+
+  /**
+   * Counter of entries created.
+   */
+  private final AtomicLong entriesCreatedCount = new AtomicLong();
+
+  /**
+   * instantiate.
+   * @param factory supplier of new instances
+   * @param referenceLost optional callback on lost references.
+   */
+  public WeakReferenceMap(
+      Function<? super K, ? extends V> factory,
+      @Nullable final Consumer<? super K> referenceLost) {
+
+    this.factory = requireNonNull(factory);
+    this.referenceLost = referenceLost;
+  }
+
+  @Override
+  public String toString() {
+    return "WeakReferenceMap{" +
+        "size=" + size() +
+        ", referenceLostCount=" + referenceLostCount +
+        ", entriesCreatedCount=" + entriesCreatedCount +
+        '}';
+  }
+
+  /**
+   * Map size.
+   * @return the current map size.
+   */
+  public int size() {
+    return map.size();
+  }
+
+  /**
+   * Clear all entries.
+   */
+  public void clear() {
+    map.clear();
+  }
+
+  /**
+   * look up the value, returning the possibly empty weak reference
+   * to a value, or null if no value was found.
+   * @param key key to look up
+   * @return null if there is no entry, a weak reference if found
+   */
+  public WeakReference<V> lookup(K key) {
+    return map.get(key);
+  }
+
+  /**
+   * Get the value, creating if needed.
+   * @param key key.
+   * @return an instance.
+   */
+  public V get(K key) {
+    final WeakReference<V> current = lookup(key);
+    V val = resolve(current);
+    if (val != null) {
+      // all good.
+      return  val;
+    }
+
+    // here, either no ref, or the value is null
+    if (current != null) {
+      noteLost(key);
+    }
+    return create(key);
+  }
+
+  /**
+   * Create a new instance under a key.
+   * The instance is created, added to the map and then the
+   * map value retrieved.
+   * This ensures that the reference returned is that in the map,
+   * even if there is more than one entry being created at the same time.
+   * @param key key
+   * @return the value
+   */
+  public V create(K key) {
+    entriesCreatedCount.incrementAndGet();
+    WeakReference<V> newRef = new WeakReference<>(
+        requireNonNull(factory.apply(key)));
+    map.put(key, newRef);
+    return map.get(key).get();
+  }
+
+  /**
+   * Put a value under the key.
+   * A null value can be put, though on a get() call
+   * a new entry is generated
+   *
+   * @param key key
+   * @param value value
+   * @return any old non-null reference.
+   */
+  public V put(K key, V value) {
+    return resolve(map.put(key, new WeakReference<>(value)));
+  }
+
+  /**
+   * Remove any value under the key.
+   * @param key key
+   * @return any old non-null reference.
+   */
+  public V remove(K key) {
+    return resolve(map.remove(key));
+  }
+
+  /**
+   * Does the map have a valid reference for this object?
+   * no-side effects: there's no attempt to notify or cleanup
+   * if the reference is null.
+   * @param key key to look up
+   * @return true if there is a valid reference.
+   */
+  public boolean containsKey(K key) {
+    final WeakReference<V> current = lookup(key);
+    return resolve(current) != null;
+  }
+
+  /**
+   * Given a possibly null weak reference, resolve
+   * its value.
+   * @param r reference to resolve
+   * @return the value or null
+   */
+  private V resolve(WeakReference<V> r) {
+    return r == null ? null : r.get();
+  }
+
+  /**
+   * Prune all null weak references, calling the referenceLost
+   * callback for each one.
+   *
+   * non-atomic and non-blocking.
+   * @return the number of entries pruned.
+   */
+  public int prune() {

Review comment:
       because the map will become full of threadId to null entries, which 
consumes small amounts of memory (one record per terminated thread). pruning 
just cleans it all out




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 722926)
    Time Spent: 6h 40m  (was: 6.5h)

> S3A auditing leaks memory through ThreadLocal references
> --------------------------------------------------------
>
>                 Key: HADOOP-18091
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18091
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.2
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> {{ActiveAuditManagerS3A}} uses thread locals to map to active audit spans, 
> which (because they are wrapped) include back reference to the audit manager 
> instance and the config it was created with.
> these *do not* get cleaned up when the FS instance is closed.
> if you have a long lived process creating and destroying many FS instances, 
> then memory gets used up. l



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Reply via email to