Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-21 Thread via GitHub


zhaomin1423 commented on PR #43502:
URL: https://github.com/apache/spark/pull/43502#issuecomment-1821078747

   > Merged into master for Spark 4.0. Thanks @zhaomin1423 ! Your work is 
greatly appreciated. Thank you for your review! @mridulm @beliefer
   
   Thank you very much for your help.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-21 Thread via GitHub


LuciferYang commented on PR #43502:
URL: https://github.com/apache/spark/pull/43502#issuecomment-1821074054

   Merged into master for Spark 4.0.
   Thanks @zhaomin1423 ! Your work is greatly appreciated.
   Thank you for your review! @mridulm @beliefer 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-21 Thread via GitHub


LuciferYang closed pull request #43502: [SPARK-45533][CORE] Use j.l.r.Cleaner 
instead of finalize for RocksDBIterator/LevelDBIterator
URL: https://github.com/apache/spark/pull/43502


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-20 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1400113583


##
.github/workflows/build_and_test.yml:
##
@@ -200,6 +200,7 @@ jobs:
   SKIP_UNIDOC: true
   SKIP_MIMA: true
   SKIP_PACKAGING: true
+  LIVE_UI_LOCAL_STORE_DIR: "/tmp/kvStore"

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-20 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1400076838


##
.github/workflows/build_and_test.yml:
##
@@ -200,6 +200,7 @@ jobs:
   SKIP_UNIDOC: true
   SKIP_MIMA: true
   SKIP_PACKAGING: true
+  LIVE_UI_LOCAL_STORE_DIR: "/tmp/kvStore"

Review Comment:
   @zhaomin1423 Please revert this change, thanks ~



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-20 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1399081856


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   Great, I see.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-19 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1398630510


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   Thanks @mridulm 
   @zhaomin1423 Is this ok for you?
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-19 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1398629711


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   (I am on vacation, but that sounds right @LuciferYang )



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-19 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1398623958


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   We can revert change of `closeIterator` method in `LevelDB` and refactor it 
as :
   
   ```java
 void closeIterator(DBIterator it) throws IOException {
   notifyIteratorClosed(it);
   synchronized (this._db) {
 DB _db = this._db.get();
 if (_db != null) {
   it.close();
 }
   }
 }
   ```
   
   then there we can call `closeIterator` directly like:
   
   ```java
   public void run() {
 if (started.compareAndSet(true, false)) {
   try {
 levelDB.closeIterator(dbIterator);
   } catch (IOException e) {
 // logXXX("Failed to close iterator", e);
   }
 }
   }
   ```
   
   Do you mean this idea? @mridulm 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-19 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1398623958


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   We can revert change of `closeIterator` method in `LevelDB` and refactor it 
as :
   
   ```java
 void closeIterator(DBIterator it) throws IOException {
   notifyIteratorClosed(it);
   synchronized (this._db) {
 DB _db = this._db.get();
 if (_db != null) {
   it.close();
 }
   }
 }
   ```
   
   then there we can call `closeIterator` directly like:
   
   ```
   public void run() {
 if (started.compareAndSet(true, false)) {
   try {
 levelDB.closeIterator(dbIterator);
   } catch (IOException e) {
 // logXXX("Failed to close iterator", e);
   }
 }
   }
   ```
   
   Do you mean this idea? @mridulm 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-17 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1398084223


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +302,43 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  static class ResourceCleaner implements Runnable {
+
+private final DBIterator dbIterator;
+
+private final LevelDB levelDB;
+
+private final AtomicBoolean started = new AtomicBoolean(true);
+
+ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) {
+  this.dbIterator = dbIterator;
+  this.levelDB = levelDB;
+}
+
+@Override
+public void run() {
+  if (started.compareAndSet(true, false)) {
+levelDB.notifyIteratorClosed(dbIterator);
+synchronized (levelDB.getLevelDB()) {
+  DB _db = levelDB.getLevelDB().get();
+  if (_db != null) {
+try {
+  dbIterator.close();
+} catch (IOException e) {
+  throw new UncheckedIOException(e);

Review Comment:
   Good suggestion, I re-examined it,the JniDBIterator  can't throw an 
IOException, could we just ignore it?
   
   https://github.com/apache/spark/assets/49054376/4c5c9e6c-4c06-4175-9a1f-e12577a6f86a;>
   
   



##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +302,43 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  static class ResourceCleaner implements Runnable {
+
+private final DBIterator dbIterator;
+
+private final LevelDB levelDB;
+
+private final AtomicBoolean started = new AtomicBoolean(true);
+
+ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) {
+  this.dbIterator = dbIterator;
+  this.levelDB = levelDB;
+}
+
+@Override
+public void run() {
+  if (started.compareAndSet(true, false)) {
+levelDB.notifyIteratorClosed(dbIterator);
+synchronized (levelDB.getLevelDB()) {
+  DB _db = levelDB.getLevelDB().get();
+  if (_db != null) {
+try {
+  dbIterator.close();
+} catch (IOException e) {
+  throw new UncheckedIOException(e);

Review Comment:
   Good suggestion, I re-examined it,the JniDBIterator  can't throw an 
IOException, could we just ignore it?
   
   https://github.com/apache/spark/assets/49054376/4c5c9e6c-4c06-4175-9a1f-e12577a6f86a;>
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-17 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1398042213


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   My understanding is that it is necessary to hold their references, and I 
have not yet thought of any other way to convey them



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-17 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1397977149


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   I am trying to understand why this needs to be moved, was there any specific 
reason to do so ?
   If no, why not revert it back ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-17 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1397140046


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   Are you worried about the impact of ResourceCleaner on them being cleaned up 
by the JVM



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-16 Thread via GitHub


LuciferYang commented on PR #43502:
URL: https://github.com/apache/spark/pull/43502#issuecomment-1815774008

   > There is one [open 
query](https://github.com/apache/spark/pull/43502#discussion_r1391335306) 
above, no ? (I skimmed the last few conversations)
   
   Sorry, I missing this query. We should reach a consensus on this before 
merging


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-16 Thread via GitHub


mridulm commented on PR #43502:
URL: https://github.com/apache/spark/pull/43502#issuecomment-1815744276

   It is not very important, so I am fine with merging the PR without it being 
addressed if you are find with that !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-16 Thread via GitHub


mridulm commented on PR #43502:
URL: https://github.com/apache/spark/pull/43502#issuecomment-1815743637

   There is one [open 
query](https://github.com/apache/spark/pull/43502#discussion_r1391335306) 
above, no ? (I skimmed the last few conversations)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-16 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1396681491


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +302,43 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  static class ResourceCleaner implements Runnable {
+
+private final DBIterator dbIterator;
+
+private final LevelDB levelDB;
+
+private final AtomicBoolean started = new AtomicBoolean(true);
+
+ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) {
+  this.dbIterator = dbIterator;
+  this.levelDB = levelDB;
+}
+
+@Override
+public void run() {
+  if (started.compareAndSet(true, false)) {
+levelDB.notifyIteratorClosed(dbIterator);
+synchronized (levelDB.getLevelDB()) {
+  DB _db = levelDB.getLevelDB().get();
+  if (_db != null) {
+try {
+  dbIterator.close();
+} catch (IOException e) {
+  throw new UncheckedIOException(e);

Review Comment:
   nit: Personally, perhaps printing a log is enough, as there should be no 
other places capturing this exception now.  but this is not important
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-16 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1396681491


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +302,43 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  static class ResourceCleaner implements Runnable {
+
+private final DBIterator dbIterator;
+
+private final LevelDB levelDB;
+
+private final AtomicBoolean started = new AtomicBoolean(true);
+
+ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) {
+  this.dbIterator = dbIterator;
+  this.levelDB = levelDB;
+}
+
+@Override
+public void run() {
+  if (started.compareAndSet(true, false)) {
+levelDB.notifyIteratorClosed(dbIterator);
+synchronized (levelDB.getLevelDB()) {
+  DB _db = levelDB.getLevelDB().get();
+  if (_db != null) {
+try {
+  dbIterator.close();
+} catch (IOException e) {
+  throw new UncheckedIOException(e);

Review Comment:
   nit: Personally, perhaps printing a log is enough, as there should be no 
other places capturing this exception now.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-16 Thread via GitHub


LuciferYang commented on PR #43502:
URL: https://github.com/apache/spark/pull/43502#issuecomment-1815716635

   @mridulm If there are no other questions, I will merge this PR as soon as 
possible :)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-16 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1396681491


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +302,43 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  static class ResourceCleaner implements Runnable {
+
+private final DBIterator dbIterator;
+
+private final LevelDB levelDB;
+
+private final AtomicBoolean started = new AtomicBoolean(true);
+
+ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) {
+  this.dbIterator = dbIterator;
+  this.levelDB = levelDB;
+}
+
+@Override
+public void run() {
+  if (started.compareAndSet(true, false)) {
+levelDB.notifyIteratorClosed(dbIterator);
+synchronized (levelDB.getLevelDB()) {
+  DB _db = levelDB.getLevelDB().get();
+  if (_db != null) {
+try {
+  dbIterator.close();
+} catch (IOException e) {
+  throw new UncheckedIOException(e);

Review Comment:
   nit: Personally: Perhaps printing a log is enough, as there should be no 
other places capturing this exception now.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-16 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1396674249


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,34 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } finally {
+closed = true;
+next = null;
+cancelResourceClean();
+  }
 }
   }
 
   /**
-   * Because it's tricky to expose closeable iterators through many internal 
APIs, especially
-   * when Scala wrappers are used, this makes sure that, hopefully, the JNI 
resources held by
-   * the iterator will eventually be released.
+   * Prevent ResourceCleaner from trying to release resources after close.
*/
-  @SuppressWarnings("deprecation")
-  @Override
-  protected void finalize() throws Throwable {
-db.closeIterator(this);
+  private void cancelResourceClean() {
+this.resourceCleaner.setStartedToFalse();
+this.cleanable.clean();
+  }
+
+  @VisibleForTesting

Review Comment:
   this is used by `notifyIteratorClosed`, not `@VisibleForTesting` now



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-16 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1396673560


##
.github/workflows/build_and_test.yml:
##
@@ -200,6 +200,7 @@ jobs:
   SKIP_UNIDOC: true
   SKIP_MIMA: true
   SKIP_PACKAGING: true
+  LIVE_UI_LOCAL_STORE_DIR: "/tmp/kvStore"

Review Comment:
   @zhaomin1423 All current tests have passed, and the additional verification 
purpose for rocksdb has been achieved. We need to remove the changes to this 
file before merging.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-16 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1396633408


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,34 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } finally {
+closed = true;
+next = null;
+cancelResourceClean();

Review Comment:
   Heh, forgot my own comment :-)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-15 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1395081107


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,34 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } finally {
+closed = true;
+next = null;
+cancelResourceClean();

Review Comment:
   Yes, we have discussed this issue. The reason for not directly calling 
`this.cleaner.clean()` is because the close process in `Cleaner` has added the 
operation of `synchronized (this._db)`, which is slightly different from the 
semantics of the original `close()` method. For specific discussions, please 
refer to this thread: 
   
   https://github.com/apache/spark/pull/43502#discussion_r1372954706



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-15 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1394409363


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,34 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } finally {
+closed = true;
+next = null;
+cancelResourceClean();

Review Comment:
   This was perhaps discussed earlier - please point me to it if I am missing 
it.
   Why not simply call ` this.cleanable.clean();` in `close` (in place of 
notify + it.close) ?



##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   resource cleaner has a reference to both `dbIterator` and `levelDB` - not 
sure what I am missing ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-15 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1394398171


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   The main reason is ResourceCleaner can't hold references to the 
LevelDBIterator, so change it. If you have a better suggestion, please let me 
know and I will be happy to change it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-15 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1394375372


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   The logic is consistent - and would remain the same with and without this 
move - I was wondering why the change ? Whether this was simply refactoring or 
there is any other reason for it.
   
   (Note: I am not advocating for reverting, just want to understand why this 
was done).
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-15 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1394375372


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   The logic is consistent - and would remain the same with and without this 
move - I was wondering why the change ? Whether this was simply refactoring or 
there is any other reason for it.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-15 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1394277068


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   It was originally in finalize method, now it has been moved to 
ResourceCleaner.run. I think this logic is consistent



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-15 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1394277068


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   It was originally used in finalize method, but now it has been moved to 
ResourceCleaner.run. I think this logic is consistent



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-14 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1392669927


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,36 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } catch (UncheckedIOException uncheckedIOException) {

Review Comment:
   thanks,done



##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,36 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } catch (UncheckedIOException uncheckedIOException) {

Review Comment:
   thanks,done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-14 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1392651580


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   What is your opinion on the comments here? @zhaomin1423 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-14 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1392615323


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,36 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } catch (UncheckedIOException uncheckedIOException) {

Review Comment:
   It should be impossible to catch an exception of type  
`UncheckedIOException` here.`
   
   



##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,36 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } catch (UncheckedIOException uncheckedIOException) {

Review Comment:
   It should be impossible to catch an exception of type  
`UncheckedIOException` here now
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-14 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1392600591


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -176,22 +183,33 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+closed = true;
+next = null;

Review Comment:
   Thanks, done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-13 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1391963989


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,36 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } catch (UncheckedIOException uncheckedIOException) {
+throw uncheckedIOException.getCause();
+  } finally {
+closed = true;
+next = null;
+cancelResourceClean();
+  }
 }
   }
 
   /**
-   * Because it's tricky to expose closeable iterators through many internal 
APIs, especially
-   * when Scala wrappers are used, this makes sure that, hopefully, the JNI 
resources held by
-   * the iterator will eventually be released.
+   * Prevent ResourceCleaner from actually releasing resources after close it.
*/
-  @SuppressWarnings("deprecation")
-  @Override
-  protected void finalize() throws Throwable {
-db.closeIterator(this);
+  private void cancelResourceClean() {
+this.resourceCleaner.setStartedToFalse();
+this.cleanable.clean();

Review Comment:
   It makes sense, let us keep it as is - I wanted to make sure I was not 
missing something !



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-13 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1391380331


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,36 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } catch (UncheckedIOException uncheckedIOException) {
+throw uncheckedIOException.getCause();
+  } finally {
+closed = true;
+next = null;
+cancelResourceClean();
+  }
 }
   }
 
   /**
-   * Because it's tricky to expose closeable iterators through many internal 
APIs, especially
-   * when Scala wrappers are used, this makes sure that, hopefully, the JNI 
resources held by
-   * the iterator will eventually be released.
+   * Prevent ResourceCleaner from actually releasing resources after close it.
*/
-  @SuppressWarnings("deprecation")
-  @Override
-  protected void finalize() throws Throwable {
-db.closeIterator(this);
+  private void cancelResourceClean() {
+this.resourceCleaner.setStartedToFalse();
+this.cleanable.clean();

Review Comment:
   My concern is: if we don't manually call the `clean` method here, when the 
`Cleaner` actively performs GC, it will accumulate a large number of `no-op` 
calls, which prevents tasks that really need to be recycled from being executed 
as soon as possible.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-13 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1391383180


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,36 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } catch (UncheckedIOException uncheckedIOException) {
+throw uncheckedIOException.getCause();
+  } finally {
+closed = true;
+next = null;
+cancelResourceClean();
+  }
 }
   }
 
   /**
-   * Because it's tricky to expose closeable iterators through many internal 
APIs, especially
-   * when Scala wrappers are used, this makes sure that, hopefully, the JNI 
resources held by
-   * the iterator will eventually be released.
+   * Prevent ResourceCleaner from actually releasing resources after close it.
*/
-  @SuppressWarnings("deprecation")
-  @Override
-  protected void finalize() throws Throwable {
-db.closeIterator(this);
+  private void cancelResourceClean() {
+this.resourceCleaner.setStartedToFalse();
+this.cleanable.clean();

Review Comment:
   However, I indeed don't have valid test data yet to support my viewpoint.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-13 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1391380331


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,36 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } catch (UncheckedIOException uncheckedIOException) {
+throw uncheckedIOException.getCause();
+  } finally {
+closed = true;
+next = null;
+cancelResourceClean();
+  }
 }
   }
 
   /**
-   * Because it's tricky to expose closeable iterators through many internal 
APIs, especially
-   * when Scala wrappers are used, this makes sure that, hopefully, the JNI 
resources held by
-   * the iterator will eventually be released.
+   * Prevent ResourceCleaner from actually releasing resources after close it.
*/
-  @SuppressWarnings("deprecation")
-  @Override
-  protected void finalize() throws Throwable {
-db.closeIterator(this);
+  private void cancelResourceClean() {
+this.resourceCleaner.setStartedToFalse();
+this.cleanable.clean();

Review Comment:
   My concern is: if we don't manually call the `clean` method here, when the 
`Cleaner` actively performs GC, it will accumulate a large number of 'no-op' 
calls, which prevents tasks that really need to be recycled from being executed 
as soon as possible.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-13 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1391368440


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,36 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } catch (UncheckedIOException uncheckedIOException) {
+throw uncheckedIOException.getCause();
+  } finally {
+closed = true;
+next = null;
+cancelResourceClean();
+  }
 }
   }
 
   /**
-   * Because it's tricky to expose closeable iterators through many internal 
APIs, especially
-   * when Scala wrappers are used, this makes sure that, hopefully, the JNI 
resources held by
-   * the iterator will eventually be released.
+   * Prevent ResourceCleaner from actually releasing resources after close it.
*/
-  @SuppressWarnings("deprecation")
-  @Override
-  protected void finalize() throws Throwable {
-db.closeIterator(this);
+  private void cancelResourceClean() {
+this.resourceCleaner.setStartedToFalse();
+this.cleanable.clean();

Review Comment:
   Yes, it's a no-op. Even if we don't call it, the Cleaner will invoke the 
clean method on objects that have been garbage collected. Personally, I prefer 
to have the Cleaner execute more effective resource reclamation when it calls 
ResourceCleaner, rather than such no-ops.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-13 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1391363608


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   Good point, this way we can reduce some code changes.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-13 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1391361132


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,36 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } catch (UncheckedIOException uncheckedIOException) {
+throw uncheckedIOException.getCause();
+  } finally {
+closed = true;
+next = null;
+cancelResourceClean();
+  }
 }
   }
 
   /**
-   * Because it's tricky to expose closeable iterators through many internal 
APIs, especially
-   * when Scala wrappers are used, this makes sure that, hopefully, the JNI 
resources held by
-   * the iterator will eventually be released.
+   * Prevent ResourceCleaner from actually releasing resources after close it.
*/
-  @SuppressWarnings("deprecation")
-  @Override
-  protected void finalize() throws Throwable {
-db.closeIterator(this);
+  private void cancelResourceClean() {
+this.resourceCleaner.setStartedToFalse();
+this.cleanable.clean();

Review Comment:
   In other words, what I am trying to understand is (should have phrased it 
better :) ) what is the overhead/issues with cleaner ending up invoking this as 
its normal case.
   
   (Agree with your rationale btw - makes sense to evict it from queue).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-13 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1391359057


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,36 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } catch (UncheckedIOException uncheckedIOException) {
+throw uncheckedIOException.getCause();
+  } finally {
+closed = true;
+next = null;
+cancelResourceClean();
+  }
 }
   }
 
   /**
-   * Because it's tricky to expose closeable iterators through many internal 
APIs, especially
-   * when Scala wrappers are used, this makes sure that, hopefully, the JNI 
resources held by
-   * the iterator will eventually be released.
+   * Prevent ResourceCleaner from actually releasing resources after close it.
*/
-  @SuppressWarnings("deprecation")
-  @Override
-  protected void finalize() throws Throwable {
-db.closeIterator(this);
+  private void cancelResourceClean() {
+this.resourceCleaner.setStartedToFalse();
+this.cleanable.clean();

Review Comment:
   Yes, but the clean is a noop right (since started has been force set to 
`false`) ? Is there a reason to call it ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-13 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1391358903


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -176,22 +183,33 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+closed = true;
+next = null;

Review Comment:
   +1



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-13 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1391355792


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,36 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } catch (UncheckedIOException uncheckedIOException) {
+throw uncheckedIOException.getCause();
+  } finally {
+closed = true;
+next = null;
+cancelResourceClean();
+  }
 }
   }
 
   /**
-   * Because it's tricky to expose closeable iterators through many internal 
APIs, especially
-   * when Scala wrappers are used, this makes sure that, hopefully, the JNI 
resources held by
-   * the iterator will eventually be released.
+   * Prevent ResourceCleaner from actually releasing resources after close it.
*/
-  @SuppressWarnings("deprecation")
-  @Override
-  protected void finalize() throws Throwable {
-db.closeIterator(this);
+  private void cancelResourceClean() {
+this.resourceCleaner.setStartedToFalse();
+this.cleanable.clean();

Review Comment:
   This was a suggestion I made earlier. Line 214 sets the `started` status of 
`ResourceCleaner` to false, while line 215 simply triggers ResourceCleaner to 
manually execute and be removed from the Cleaner's pending cleanup queue as 
soon as possible. In fact, since the `started` status is already `false`, it 
won't perform any operations. Of course, we could also just set the status here 
and then wait for the `Cleaner` to execute the registered ResourceCleaner on 
its own :)



##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,36 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } catch (UncheckedIOException uncheckedIOException) {
+throw uncheckedIOException.getCause();
+  } finally {
+closed = true;
+next = null;
+cancelResourceClean();
+  }
 }
   }
 
   /**
-   * Because it's tricky to expose closeable iterators through many internal 
APIs, especially
-   * when Scala wrappers are used, this makes sure that, hopefully, the JNI 
resources held by
-   * the iterator will eventually be released.
+   * Prevent ResourceCleaner from actually releasing resources after close it.
*/
-  @SuppressWarnings("deprecation")
-  @Override
-  protected void finalize() throws Throwable {
-db.closeIterator(this);
+  private void cancelResourceClean() {
+this.resourceCleaner.setStartedToFalse();
+this.cleanable.clean();

Review Comment:
   This was a suggestion I made earlier. Line 214 sets the `started` status of 
`ResourceCleaner` to false, while line 215 simply triggers ResourceCleaner to 
manually execute and be removed from the Cleaner's pending cleanup queue as 
soon as possible. In fact, since the `started` status is already `false`, it 
won't perform any operations. Of course, we could also just set the status here 
and then wait for the `Cleaner` to execute the registered ResourceCleaner on 
its own :)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-13 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1391335306


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java:
##
@@ -322,26 +323,15 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(LevelDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  DB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
-  }

Review Comment:
   Any particular reason to move this into `ResourceCleaner.run` ?



##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,36 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } catch (UncheckedIOException uncheckedIOException) {
+throw uncheckedIOException.getCause();
+  } finally {
+closed = true;
+next = null;
+cancelResourceClean();
+  }
 }
   }
 
   /**
-   * Because it's tricky to expose closeable iterators through many internal 
APIs, especially
-   * when Scala wrappers are used, this makes sure that, hopefully, the JNI 
resources held by
-   * the iterator will eventually be released.
+   * Prevent ResourceCleaner from actually releasing resources after close it.

Review Comment:
   nit: 
   ```suggestion
  * Prevent ResourceCleaner from trying to release resources after close.
   ```



##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -176,22 +183,33 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+closed = true;
+next = null;

Review Comment:
   Keep both level db and rocks db part consistent w.r.t finally ? (move 
setting closed/next to finally ?)



##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +193,36 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+  } catch (UncheckedIOException uncheckedIOException) {
+throw uncheckedIOException.getCause();
+  } finally {
+closed = true;
+next = null;
+cancelResourceClean();
+  }
 }
   }
 
   /**
-   * Because it's tricky to expose closeable iterators through many internal 
APIs, especially
-   * when Scala wrappers are used, this makes sure that, hopefully, the JNI 
resources held by
-   * the iterator will eventually be released.
+   * Prevent ResourceCleaner from actually releasing resources after close it.
*/
-  @SuppressWarnings("deprecation")
-  @Override
-  protected void finalize() throws Throwable {
-db.closeIterator(this);
+  private void cancelResourceClean() {
+this.resourceCleaner.setStartedToFalse();
+this.cleanable.clean();

Review Comment:
   We have already notified and closed the iterator before this method is 
invoked - and we are forcing a clean again ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-13 Thread via GitHub


zhaomin1423 commented on PR #43502:
URL: https://github.com/apache/spark/pull/43502#issuecomment-1808412311

   > @zhaomin1423 Do you have time to rebase the code and make changes to the 
LevelDB section?
   > 
   > 
   > 
   > 
   
   PTAL


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-13 Thread via GitHub


LuciferYang commented on PR #43502:
URL: https://github.com/apache/spark/pull/43502#issuecomment-1808299310

   @zhaomin1423 Do you have time to rebase the code and make changes to the 
LevelDB section?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-13 Thread via GitHub


LuciferYang commented on PR #43502:
URL: https://github.com/apache/spark/pull/43502#issuecomment-1808293353

   friendly ping @mridulm @beliefer Is the current change to the RocksDB 
section ok for you? Should we continue to push for changes in this pr?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-06 Thread via GitHub


LuciferYang commented on PR #43502:
URL: https://github.com/apache/spark/pull/43502#issuecomment-1794924200

   Rocksdb part is fine to me. friendly ping  @mridulm @beliefer Ccould you 
take a look again if you have time? Thanks 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-06 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1383381947


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -272,4 +290,37 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  static class ResourceCleaner implements Runnable {
+
+private final RocksIterator rocksIterator;
+private final RocksDB rocksDB;
+private final AtomicBoolean started = new AtomicBoolean(true);
+
+ResourceCleaner(RocksIterator rocksIterator, RocksDB rocksDB) {
+  this.rocksIterator = rocksIterator;
+  this.rocksDB = rocksDB;
+}
+
+@Override
+public void run() {
+  if (started.compareAndSet(true, false)) {
+rocksDB.notifyIteratorClosed(rocksIterator);
+synchronized (rocksDB.getRocksDB()) {
+  org.rocksdb.RocksDB _db = rocksDB.getRocksDB().get();
+  if (_db != null) {
+rocksIterator.close();
+  }
+}
+  }
+}
+
+void setStartedToFalse() {
+  started.set(false);
+}
+
+@VisibleForTesting
+boolean getStarted() {

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-05 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1382625914


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -272,4 +290,37 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  static class ResourceCleaner implements Runnable {
+
+private final RocksIterator rocksIterator;
+private final RocksDB rocksDB;
+private final AtomicBoolean started = new AtomicBoolean(true);
+
+ResourceCleaner(RocksIterator rocksIterator, RocksDB rocksDB) {
+  this.rocksIterator = rocksIterator;
+  this.rocksDB = rocksDB;
+}
+
+@Override
+public void run() {
+  if (started.compareAndSet(true, false)) {
+rocksDB.notifyIteratorClosed(rocksIterator);
+synchronized (rocksDB.getRocksDB()) {
+  org.rocksdb.RocksDB _db = rocksDB.getRocksDB().get();
+  if (_db != null) {
+rocksIterator.close();
+  }
+}
+  }
+}
+
+void setStartedToFalse() {
+  started.set(false);
+}
+
+@VisibleForTesting
+boolean getStarted() {

Review Comment:
   ```java
   boolean isCompleted() {
 return !started.get();
   }
   ```
?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-05 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1382625914


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -272,4 +290,37 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  static class ResourceCleaner implements Runnable {
+
+private final RocksIterator rocksIterator;
+private final RocksDB rocksDB;
+private final AtomicBoolean started = new AtomicBoolean(true);
+
+ResourceCleaner(RocksIterator rocksIterator, RocksDB rocksDB) {
+  this.rocksIterator = rocksIterator;
+  this.rocksDB = rocksDB;
+}
+
+@Override
+public void run() {
+  if (started.compareAndSet(true, false)) {
+rocksDB.notifyIteratorClosed(rocksIterator);
+synchronized (rocksDB.getRocksDB()) {
+  org.rocksdb.RocksDB _db = rocksDB.getRocksDB().get();
+  if (_db != null) {
+rocksIterator.close();
+  }
+}
+  }
+}
+
+void setStartedToFalse() {
+  started.set(false);
+}
+
+@VisibleForTesting
+boolean getStarted() {

Review Comment:
   ```java
   boolean isCompleted() {
 return !started.get();
   }
   ``` ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-04 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1382363360


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java:
##
@@ -355,26 +355,23 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
RocksDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(RocksDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  org.rocksdb.RocksDB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
+  AtomicReference getRocksDB() {

Review Comment:
   Added, how about?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-03 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1382330007


##
common/kvstore/src/test/java/org/apache/spark/util/kvstore/RocksDBSuite.java:
##
@@ -381,6 +382,55 @@ public void testSkipAfterDBClose() throws Exception {
 assertFalse(iter.skip(1));
   }
 
+  @Test
+  public void testResourceCleaner() throws Exception {
+File dbPathForCleanerTest = File.createTempFile(
+  "test_db_cleaner.", ".rdb");
+dbPathForCleanerTest.delete();
+
+RocksDB dbForCleanerTest = new RocksDB(dbPathForCleanerTest);
+try {
+  for (int i = 0; i < 8192; i++) {
+dbForCleanerTest.write(createCustomType1(i));
+  }
+  RocksDBIterator rocksDBIterator =
+(RocksDBIterator) 
dbForCleanerTest.view(CustomType1.class).iterator();
+  Reference> reference =

Review Comment:
   In this case, since we didn't manually close the `RocksDBIterator`, we can 
ultimately determine whether the `RocksDBIterator` is closed through 
`resourceCleaner.getStatus()`. 
   
   So here, `reference` is just an auxiliary tool to judge whether the 
`RocksDBIterator` has been GCed. So can we change it to manually wrap a 
`WeakReference` instead of getting it from the `iteratorTracker`,like `new 
WeakReference<>(rocksDBIterator) `, so that we don't need to add an `Accessor` 
for the `iteratorTracker`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-03 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1382327696


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -176,22 +183,30 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+closed = true;
+next = null;
+  } finally {
+cancelResourceClean();
+  }
 }
   }
 
-  /**
-   * Because it's tricky to expose closeable iterators through many internal 
APIs, especially
-   * when Scala wrappers are used, this makes sure that, hopefully, the JNI 
resources held by
-   * the iterator will eventually be released.
-   */
-  @Override
-  protected void finalize() throws Throwable {
-db.closeIterator(this);
+  private void cancelResourceClean() {
+this.resourceCleaner.statusToFalse();
+this.cleanable.clean();
+  }
+
+  @VisibleForTesting
+  ResourceCleaner getResourceCleaner() {
+return resourceCleaner;
+  }
+
+  public RocksIterator getRocksIterator() {

Review Comment:
   can we remove the `public`? and I like `internalIterator`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-03 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1382327255


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java:
##
@@ -355,26 +355,23 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
RocksDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(RocksDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  org.rocksdb.RocksDB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
+  AtomicReference getRocksDB() {

Review Comment:
   Some comments need to be added to explain the intent and scope of use of 
this method.



##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java:
##
@@ -355,26 +355,23 @@ public void close() throws IOException {
 }
   }
 
-  /**
-   * Closes the given iterator if the DB is still open. Trying to close a JNI 
RocksDB handle
-   * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
-   */
-  void closeIterator(RocksDBIterator it) throws IOException {
-notifyIteratorClosed(it);
-synchronized (this._db) {
-  org.rocksdb.RocksDB _db = this._db.get();
-  if (_db != null) {
-it.close();
-  }
-}
+  AtomicReference getRocksDB() {
+return _db;
+  }
+
+  ConcurrentLinkedQueue>> getIteratorTracker() {

Review Comment:
   ditto



##
common/kvstore/src/test/java/org/apache/spark/util/kvstore/RocksDBSuite.java:
##
@@ -381,6 +382,55 @@ public void testSkipAfterDBClose() throws Exception {
 assertFalse(iter.skip(1));
   }
 
+  @Test
+  public void testResourceCleaner() throws Exception {
+File dbPathForCleanerTest = File.createTempFile(
+  "test_db_cleaner.", ".rdb");
+dbPathForCleanerTest.delete();
+
+RocksDB dbForCleanerTest = new RocksDB(dbPathForCleanerTest);
+try {
+  for (int i = 0; i < 8192; i++) {
+dbForCleanerTest.write(createCustomType1(i));
+  }
+  RocksDBIterator rocksDBIterator =
+(RocksDBIterator) 
dbForCleanerTest.view(CustomType1.class).iterator();
+  Reference> reference =

Review Comment:
   In this case, since we didn't manually close the `RocksDBIterator`, we can 
ultimately determine whether the `RocksDBIterator` is closed through 
`resourceCleaner.getStatus()`. 
   
   So here, `reference` is just an auxiliary tool to judge whether the 
`RocksDBIterator` has been GCed. So can we change it to manually wrap a 
`WeakReference` instead of getting it from the `iteratorTracker`, so that we 
don't need to add an `Accessor` for the `iteratorTracker`?



##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -272,4 +287,37 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  static class ResourceCleaner implements Runnable {
+
+private final RocksIterator rocksIterator;
+private final RocksDB rocksDB;
+private final AtomicBoolean status = new AtomicBoolean(true);
+
+ResourceCleaner(RocksIterator rocksIterator, RocksDB rocksDB) {
+  this.rocksIterator = rocksIterator;
+  this.rocksDB = rocksDB;
+}
+
+@Override
+public void run() {
+  if (status.compareAndSet(true, false)) {
+rocksDB.notifyIteratorClosed(rocksIterator);
+synchronized (rocksDB.getRocksDB()) {
+  org.rocksdb.RocksDB _db = rocksDB.getRocksDB().get();
+  if (_db != null) {
+rocksIterator.close();
+  }
+}
+  }
+}
+
+void statusToFalse() {
+  status.set(false);
+}
+
+@VisibleForTesting
+AtomicBoolean getStatus() {

Review Comment:
   I suggest that
   
   ```java
   boolean getStatus() {
 return status.get();
}
   ```



##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -272,4 +287,37 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  static class ResourceCleaner implements Runnable {
+
+private final RocksIterator rocksIterator;
+private final RocksDB rocksDB;
+private final AtomicBoolean status = new AtomicBoolean(true);

Review Comment:
   Is there a more suitable variable name?
   
   



##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -176,22 +183,30 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
+db.notifyIteratorClosed(it);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+it.close();
+closed = true;
+next = null;
+  } finally {
+

Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-02 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1380093683


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -272,4 +287,32 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  static class ResourceCleaner implements Runnable {
+
+private final RocksIterator rocksIterator;
+private final RocksDB rocksDB;
+private final AtomicBoolean status = new AtomicBoolean(true);
+
+public ResourceCleaner(RocksIterator rocksIterator, RocksDB rocksDB) {
+  this.rocksIterator = rocksIterator;
+  this.rocksDB = rocksDB;
+}
+
+@Override
+public void run() {
+  if (status.compareAndSet(true, false)) {
+rocksDB.notifyIteratorClosed(rocksIterator);
+rocksIterator.close();

Review Comment:
   +1



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-01 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1379542967


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -272,4 +287,32 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  static class ResourceCleaner implements Runnable {
+
+private final RocksIterator rocksIterator;
+private final RocksDB rocksDB;
+private final AtomicBoolean status = new AtomicBoolean(true);
+
+public ResourceCleaner(RocksIterator rocksIterator, RocksDB rocksDB) {
+  this.rocksIterator = rocksIterator;
+  this.rocksDB = rocksDB;
+}
+
+@Override
+public void run() {
+  if (status.compareAndSet(true, false)) {
+rocksDB.notifyIteratorClosed(rocksIterator);
+rocksIterator.close();

Review Comment:
   I think this close still needs synchronized db...
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-01 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1379542967


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -272,4 +287,32 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  static class ResourceCleaner implements Runnable {
+
+private final RocksIterator rocksIterator;
+private final RocksDB rocksDB;
+private final AtomicBoolean status = new AtomicBoolean(true);
+
+public ResourceCleaner(RocksIterator rocksIterator, RocksDB rocksDB) {
+  this.rocksIterator = rocksIterator;
+  this.rocksDB = rocksDB;
+}
+
+@Override
+public void run() {
+  if (status.compareAndSet(true, false)) {
+rocksDB.notifyIteratorClosed(rocksIterator);
+rocksIterator.close();

Review Comment:
   I think this close still needs synchronized
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-01 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1379017731


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -272,4 +283,37 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  static class ResourceCleaner implements Runnable {
+
+private final RocksIterator rocksIterator;
+private final RocksDB rocksDB;
+private final AtomicBoolean status = new AtomicBoolean(true);
+
+public ResourceCleaner(RocksIterator rocksIterator, RocksDB rocksDB) {
+  this.rocksIterator = rocksIterator;
+  this.rocksDB = rocksDB;
+}
+
+@Override
+public void run() {
+  if (status.compareAndSet(true, false)) {
+rocksDB.getIteratorTracker().forEach(ref -> {

Review Comment:
   ```
   
   void notifyIteratorClosed(RocksIterator rocksIterator) {
   iteratorTracker.removeIf(ref -> {
 RocksDBIterator rocksDBIterator = ref.get();
 return rocksDBIterator != null && 
rocksIterator.equals(rocksDBIterator.getRocksIterator());
   });
 }
   
   ```
   
   How about?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-01 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1378970124


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -272,4 +283,37 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  static class ResourceCleaner implements Runnable {
+
+private final RocksIterator rocksIterator;
+private final RocksDB rocksDB;
+private final AtomicBoolean status = new AtomicBoolean(true);
+
+public ResourceCleaner(RocksIterator rocksIterator, RocksDB rocksDB) {
+  this.rocksIterator = rocksIterator;
+  this.rocksDB = rocksDB;
+}
+
+@Override
+public void run() {
+  if (status.compareAndSet(true, false)) {
+rocksDB.getIteratorTracker().forEach(ref -> {

Review Comment:
   hmm... Why not just change the input arg type of the 
`rocksDB.notifyIteratorClosed` method...
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-11-01 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1378942557


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -270,20 +280,30 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
-  private record ResourceCleaner(RocksIterator rocksIterator, RocksDB rocksDB) 
implements Runnable {
+  private static class ResourceCleaner implements Runnable {
+
+private final RocksIterator rocksIterator;
+private final RocksDB rocksDB;
+private final AtomicBoolean status = new AtomicBoolean(true);
+
+public ResourceCleaner(RocksIterator rocksIterator, RocksDB rocksDB) {
+  this.rocksIterator = rocksIterator;
+  this.rocksDB = rocksDB;
+}
 
 @Override
 public void run() {
-  rocksDB.getIteratorTracker().removeIf(ref -> {
-RocksDBIterator rocksDBIterator = ref.get();
-return rocksDBIterator != null && 
rocksIterator.equals(rocksDBIterator.it);
-  });
-  synchronized (rocksDB.getRocksDB()) {
-org.rocksdb.RocksDB _db = rocksDB.getRocksDB().get();
-if (_db != null) {
-  rocksIterator.close();
-}
+  if (status.compareAndSet(true, false)) {
+rocksDB.getIteratorTracker().removeIf(ref -> {

Review Comment:
   Thanks, updated, levels will update after rocksdb complete



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-31 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1378069029


##
common/kvstore/src/test/java/org/apache/spark/util/kvstore/RocksDBSuite.java:
##
@@ -381,6 +382,56 @@ public void testSkipAfterDBClose() throws Exception {
 assertFalse(iter.skip(1));
   }
 
+  @Test
+  public void testResourceCleaner() throws Exception {
+File dbPathForCleanerTest = File.createTempFile(
+  "test_db_cleaner.", ".rdb");
+dbPathForCleanerTest.delete();
+
+RocksDB dbForCleanerTest = new RocksDB(dbPathForCleanerTest);
+try {
+  for (int i = 0; i < 8192; i++) {
+dbForCleanerTest.write(createCustomType1(i));
+  }
+  RocksDBIterator rocksDBIterator =
+(RocksDBIterator) 
dbForCleanerTest.view(CustomType1.class).iterator();
+  Reference> reference =
+getRocksDBIteratorRef(rocksDBIterator, dbForCleanerTest);
+  assertNotNull(reference);
+  RocksIterator it = rocksDBIterator.internalIterator();
+  // it has not been closed yet, isOwningHandle should be true.
+  assertTrue(it.isOwningHandle());
+  // Manually set rocksDBIterator to null, to be GC.
+  rocksDBIterator = null;
+  // 100 times gc, the rocksDBIterator should be GCed.
+  int count = 0;
+  while (count < 100 && !reference.refersTo(null)) {
+System.gc();
+count++;
+Thread.sleep(100);
+  }
+  // check rocksDBIterator should be GCed
+  assertTrue(reference.refersTo(null));
+  // Verify that the Cleaner will be executed after a period of time,
+  // and it.isOwningHandle() will become false.
+  assertTimeout(java.time.Duration.ofSeconds(5), () -> 
assertFalse(it.isOwningHandle()));

Review Comment:
   Given the addition of a state variable, can the test be simplified?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-31 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1378067392


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -180,13 +183,20 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
+db.notifyIteratorClosed(this);
 if (!closed) {
-  cleanable.clean();
+  it.close();

Review Comment:
   ```java
   try {
   it.close();
   closed = true;
   next = null;
   } finally {
   cancelResourceClean();
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-31 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1378067392


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -180,13 +183,20 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
+db.notifyIteratorClosed(this);
 if (!closed) {
-  cleanable.clean();
+  it.close();

Review Comment:
   ```java
   try {
   it.close();
   } finally {
   closed = true;
   next = null;
   cancelResourceClean();
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-31 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1378065031


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java:
##
@@ -270,20 +280,30 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
-  private record ResourceCleaner(RocksIterator rocksIterator, RocksDB rocksDB) 
implements Runnable {
+  private static class ResourceCleaner implements Runnable {
+
+private final RocksIterator rocksIterator;
+private final RocksDB rocksDB;
+private final AtomicBoolean status = new AtomicBoolean(true);
+
+public ResourceCleaner(RocksIterator rocksIterator, RocksDB rocksDB) {
+  this.rocksIterator = rocksIterator;
+  this.rocksDB = rocksDB;
+}
 
 @Override
 public void run() {
-  rocksDB.getIteratorTracker().removeIf(ref -> {
-RocksDBIterator rocksDBIterator = ref.get();
-return rocksDBIterator != null && 
rocksIterator.equals(rocksDBIterator.it);
-  });
-  synchronized (rocksDB.getRocksDB()) {
-org.rocksdb.RocksDB _db = rocksDB.getRocksDB().get();
-if (_db != null) {
-  rocksIterator.close();
-}
+  if (status.compareAndSet(true, false)) {
+rocksDB.getIteratorTracker().removeIf(ref -> {

Review Comment:
   Is it possible to reuse the `db.notifyIteratorClosed`?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-31 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1377856270


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +189,21 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+cleanable.clean();

Review Comment:
   This is to have it removed from the Cleaner's queue as quickly as possible, 
but this is not a critical operation, at least that's what I think right now.
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-31 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1377858576


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +189,21 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+cleanable.clean();

Review Comment:
   Get, thanks



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-31 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1377856270


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +189,21 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+cleanable.clean();

Review Comment:
   This is to have it removed from the Cleaner's queue as quickly as possible.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-31 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1377851138


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +189,21 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+cleanable.clean();

Review Comment:
   ```
   private void cancelXXX() {
   this.resourceCleaner.statusToFalse();
   this.cleanable.clean();
 }
   ```
   is ```this.cleanable.clean() ``` redundant? it can't be executed because the 
status is false.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-31 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1377681397


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +189,21 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+cleanable.clean();

Review Comment:
   Thanks, sorry for late because I am busy recently, I will update it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-30 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1377001120


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +189,21 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+cleanable.clean();

Review Comment:
   @zhaomin1423 Overall, we can make the close method of `RocksDBIterator` look 
similar. The `notifyIteratorClosed ` method in `RocksDB`  needs to be retained 
(it can be refactored for reuse by `ResourceCleaner`).
   
   ```
 public synchronized void close() throws IOException {
   db.notifyIteratorClosed(this);
   if (!closed) {
 it.close();
 closed = true;
 next = null;
 this.cancelXXX();
   }
 }
   ```
   
   The purpose of `this.cancelXXX();`  is to prevent `ResourceCleaner` from 
executing repeatedly when manually closed.
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-30 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1377001120


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +189,21 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+cleanable.clean();

Review Comment:
   @zhaomin1423 Overall, I want to make the close method of `RocksDBIterator` 
look similar. The `notifyIteratorClosed ` method in `RocksDB`  needs to be 
retained (it can be refactored for reuse by `ResourceCleaner`).
   
   ```
 public synchronized void close() throws IOException {
   db.notifyIteratorClosed(this);
   if (!closed) {
 it.close();
 closed = true;
 next = null;
 this.cancelXXX();
   }
 }
   ```
   
   The purpose of `this.cancelXXX();`  is to prevent `ResourceCleaner` from 
executing repeatedly when manually closed.
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-30 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1377001116


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +189,21 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+cleanable.clean();

Review Comment:
   @zhaomin1423 Overall, we want to make the close method of `RocksDBIterator` 
look similar. The `notifyIteratorClosed ` method in `RocksDB`  needs to be 
retained (it can be refactored for reuse by `ResourceCleaner`).
   
   ```
 public synchronized void close() throws IOException {
   db.notifyIteratorClosed(this);
   if (!closed) {
 it.close();
 closed = true;
 next = null;
 this.cancelXXX();
   }
 }
   ```
   
   The purpose of `this.cancelXXX();`  is to prevent `ResourceCleaner` from 
executing repeatedly when manually closed.
   
   
   



##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +189,21 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+cleanable.clean();

Review Comment:
   @zhaomin1423 Overall, we want to make the close method of `RocksDBIterator` 
look similar. The `notifyIteratorClosed ` method in `RocksDB`  needs to be 
retained (it can be refactored for reuse by `ResourceCleaner`).
   
   ```
 public synchronized void close() throws IOException {
   db.notifyIteratorClosed(this);
   if (!closed) {
 it.close();
 closed = true;
 next = null;
 this.cancelXXX();
   }
 }
   ```
   
   The purpose of `this.cancelXXX();`  is to prevent `ResourceCleaner` from 
executing repeatedly when manually closed.
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-30 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1376996077


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +189,21 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+cleanable.clean();

Review Comment:
   Sounds good to me @LuciferYang, let us keep semantics as is. If there is a 
change in behavior we need, we can do it after.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-30 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1376483755


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +189,21 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+cleanable.clean();

Review Comment:
   > I personally prefer to keep the method of manually closing as it is 
through some refactoring, without locking the db.
   
   sorry, I don't undenstand this implementation how to do it?
   ```
   /**
  * Closes the given iterator if the DB is still open. Trying to close a 
JNI LevelDB handle
  * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
  */
 void closeIterator(LevelDBIterator it) throws IOException {
   notifyIteratorClosed(it);
   synchronized (this._db) {
 DB _db = this._db.get();
 if (_db != null) {
   it.close();
 }
   }
 }
   ``` 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-30 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1376442012


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +189,21 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+cleanable.clean();

Review Comment:
   > I personally prefer to keep the method of manually closing as it is 
through some refactoring, without locking the db.
   
   WDYT? @mridulm @beliefer @zhaomin1423 Thanks ~



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-28 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1375212924


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +189,21 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+cleanable.clean();

Review Comment:
   I personally prefer to keep the method of manually closing as it is through 
some refactoring, without locking the db.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-27 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1374106447


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +285,23 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  private record ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) 
implements Runnable {
+@Override
+public void run() {
+  levelDB.getIteratorTracker().removeIf(ref -> {
+LevelDBIterator levelDBIterator = ref.get();
+return levelDBIterator != null && 
dbIterator.equals(levelDBIterator.it);

Review Comment:
   sorry for late, do we need to change anything?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1374102628


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +285,23 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  private record ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) 
implements Runnable {
+@Override
+public void run() {
+  levelDB.getIteratorTracker().removeIf(ref -> {
+LevelDBIterator levelDBIterator = ref.get();
+return levelDBIterator != null && 
dbIterator.equals(levelDBIterator.it);

Review Comment:
   You are right, I missed that levelDBIterator is itself ref.get ... sigh, 
sorry about this !



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1374094014


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +285,23 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  private record ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) 
implements Runnable {
+@Override
+public void run() {
+  levelDB.getIteratorTracker().removeIf(ref -> {
+LevelDBIterator levelDBIterator = ref.get();
+return levelDBIterator != null && 
dbIterator.equals(levelDBIterator.it);

Review Comment:
   I simplified the condition of rocksdb part  into just  `dbIterator == 
rocksDBIterator.it` for local testing, there will be many test failures.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1374094014


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +285,23 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  private record ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) 
implements Runnable {
+@Override
+public void run() {
+  levelDB.getIteratorTracker().removeIf(ref -> {
+LevelDBIterator levelDBIterator = ref.get();
+return levelDBIterator != null && 
dbIterator.equals(levelDBIterator.it);

Review Comment:
   I simplified the condition of rocksdb part  into `dbIterator == 
rocksDBIterator.it` for local testing, there will be many test failures.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1374056091


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +285,23 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  private record ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) 
implements Runnable {
+@Override
+public void run() {
+  levelDB.getIteratorTracker().removeIf(ref -> {
+LevelDBIterator levelDBIterator = ref.get();
+return levelDBIterator != null && 
dbIterator.equals(levelDBIterator.it);

Review Comment:
   Agree +1, we can fix the leak in follow up



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1374090070


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +285,23 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  private record ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) 
implements Runnable {
+@Override
+public void run() {
+  levelDB.getIteratorTracker().removeIf(ref -> {
+LevelDBIterator levelDBIterator = ref.get();
+return levelDBIterator != null && 
dbIterator.equals(levelDBIterator.it);

Review Comment:
   > Given this is enhancement for finalize -> Cleaner, it is fine to simply 
use existing semantics and change this condition to simply dbIterator == 
levelDBIterator.it, thoughts ?
   
   hmm... `levelDBIterator` is `ref.get`, if `levelDBIterator` is null, 
`levelDBIterator.it` will throw NPE ...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1374056091


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +285,23 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  private record ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) 
implements Runnable {
+@Override
+public void run() {
+  levelDB.getIteratorTracker().removeIf(ref -> {
+LevelDBIterator levelDBIterator = ref.get();
+return levelDBIterator != null && 
dbIterator.equals(levelDBIterator.it);

Review Comment:
   Agree +1, we can fix it in follow up



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1374040714


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +285,23 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  private record ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) 
implements Runnable {
+@Override
+public void run() {
+  levelDB.getIteratorTracker().removeIf(ref -> {
+LevelDBIterator levelDBIterator = ref.get();
+return levelDBIterator != null && 
dbIterator.equals(levelDBIterator.it);

Review Comment:
   w.r.t change in logic ... in current code, I dont think 
`levelDB.getIteratorTracker()` will ever contain a `null` within it (the ref it 
points to can be `null`).
   
   Given this is enhancement for `finalize` -> `Cleaner`, it is fine to simply 
use existing semantics and change this condition to simply `dbIterator == 
levelDBIterator.it`, thoughts ?
   
   We can fix the leak in a subsequent pr



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1374040714


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +285,23 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  private record ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) 
implements Runnable {
+@Override
+public void run() {
+  levelDB.getIteratorTracker().removeIf(ref -> {
+LevelDBIterator levelDBIterator = ref.get();
+return levelDBIterator != null && 
dbIterator.equals(levelDBIterator.it);

Review Comment:
   w.r.t change in logic ... in current code, I dont think 
`levelDB.getIteratorTracker()` will ever contain a `null` within it (the ref it 
points to can be `null`).
   
   Given this is enhancement for `finalize` -> `Cleaner`, it is fine to simply 
use existing semantics and change this condition to simply `dbIterator == 
levelDBIterator.it`, thoughts ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1374039278


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +285,23 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  private record ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) 
implements Runnable {
+@Override
+public void run() {
+  levelDB.getIteratorTracker().removeIf(ref -> {
+LevelDBIterator levelDBIterator = ref.get();
+return levelDBIterator != null && 
dbIterator.equals(levelDBIterator.it);

Review Comment:
   With respect to `WeakReference` accumulating in that list, agree - that is 
an existing problem - this PR is not changing it.
   I noticed it while reviewing, and so pinged you @LuciferYang :-)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1374038863


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +189,21 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+cleanable.clean();

Review Comment:
   I am not necessarily saying it is an incorrect change - I have not analyzed 
in detail.
   But in context of `Cleaner` related change, I would have expected the 
modifications to be equivalent and not require MT safety impact :-)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1373003887


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +285,23 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  private record ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) 
implements Runnable {
+@Override
+public void run() {
+  levelDB.getIteratorTracker().removeIf(ref -> {
+LevelDBIterator levelDBIterator = ref.get();
+return levelDBIterator != null && 
dbIterator.equals(levelDBIterator.it);

Review Comment:
   > looks like there is a leak in the DB, iterator's weak ref is never cleaned 
up from the tracker ?
   
   
   I think this should be a old problem, and we have also performed null checks 
in the close method of RocksDB.
   
   
https://github.com/apache/spark/blob/7d7afb06f682c10f3900eb8adeab9fad6d49cb24/common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java#L341-L348
   
   
https://github.com/apache/spark/blob/7d7afb06f682c10f3900eb8adeab9fad6d49cb24/common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java#L307-L315
   
   If an iterator is no longer referenced and is not manually closed, then if 
GC occurs at this time, the iteratorTracker may still hold the Ref, but Ref#get 
is null.
   
   Should we change this condition to 
   
   ```
   return levelDBIterator == null || dbIterator == levelDBIterator.it; 
   ```
   
   Can this also clean up the invalid weakref in passing?
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1373038414


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +189,21 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+cleanable.clean();

Review Comment:
   @mridulm Are you referring to the addition of `synchronized 
(rocksDB.getRocksDB())` in `cleanable.clean()` compared to before? Perhaps we 
can let the `close()` method still use the original logic as follows:
   
   1.  Add a state variable for ResourceCleaner and let RocksDB/LevelDBIterator 
hold an instance of ResourceCleaner, like
   ```java
   private final AtomicBoolean status = new AtomicBoolean(true);
   void statusToFalse() {
  status.set(false);
   }
   
   @Override
   public void run() {
   if (status.compareAndSet(true, false)) {
 
}
   }
   ```
   
   2.  Add a method to RocksDB/LevelDBIterator to prevent ResourceCleaner from 
actually releasing resources, like
   
   ```java
   private void cancelXXX() {
   this.resourceCleaner.statusToFalse();
   this.cleanable.clean();
 }
   ```
   
   3. change the `close()` to 
   
   ```java
 public synchronized void close() throws IOException {
   db.notifyIteratorClosed(this);
   if (!closed) {
 it.close();
 closed = true;
 next = null;
 this.cancelXXX();
   }
 }
   ```
   
   
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1373003887


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +285,23 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  private record ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) 
implements Runnable {
+@Override
+public void run() {
+  levelDB.getIteratorTracker().removeIf(ref -> {
+LevelDBIterator levelDBIterator = ref.get();
+return levelDBIterator != null && 
dbIterator.equals(levelDBIterator.it);

Review Comment:
   > looks like there is a leak in the DB, iterator's weak ref is never cleaned 
up from the tracker ?
   
   
   I think this should be a old problem, and we have also performed null checks 
in the close method of RocksDB.
   
   
https://github.com/apache/spark/blob/7d7afb06f682c10f3900eb8adeab9fad6d49cb24/common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java#L341-L348
   
   
https://github.com/apache/spark/blob/7d7afb06f682c10f3900eb8adeab9fad6d49cb24/common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java#L307-L315
   
   If an iterator is no longer referenced and is not manually closed, then if 
GC occurs at this time, the iteratorTracker will still hold the Ref, but 
Ref#get is null.
   
   Should we change this condition to 
   
   ```
   return levelDBIterator == null || dbIterator == levelDBIterator.it; 
   ```
   
   Can this also clean up the invalid weakref in passing?
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1373003887


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +285,23 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  private record ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) 
implements Runnable {
+@Override
+public void run() {
+  levelDB.getIteratorTracker().removeIf(ref -> {
+LevelDBIterator levelDBIterator = ref.get();
+return levelDBIterator != null && 
dbIterator.equals(levelDBIterator.it);

Review Comment:
   > looks like there is a leak in the DB, iterator's weak ref is never cleaned 
up from the tracker ?
   
   
   I think this should be a old problem, and we have also performed null checks 
in the close method of RocksDB.
   
   
https://github.com/apache/spark/blob/7d7afb06f682c10f3900eb8adeab9fad6d49cb24/common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java#L341-L348
   
   
https://github.com/apache/spark/blob/7d7afb06f682c10f3900eb8adeab9fad6d49cb24/common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java#L307-L315
   
   If an iterator is no longer referenced and is not manually closed, then if 
GC occurs at this time, the iteratorTracker will still hold the Ref, but 
Ref#get is null.
   
   Should we change this condition to 
   
   ```
   return levelDBIterator == null || dbIterator.equals(levelDBIterator.it); 
   ```
   
   Can this also clean up the invalid weakref in passing?
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


LuciferYang commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1373003887


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +285,23 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  private record ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) 
implements Runnable {
+@Override
+public void run() {
+  levelDB.getIteratorTracker().removeIf(ref -> {
+LevelDBIterator levelDBIterator = ref.get();
+return levelDBIterator != null && 
dbIterator.equals(levelDBIterator.it);

Review Comment:
   ```
   looks like there is a leak in the DB, iterator's weak ref is never cleaned 
up from the tracker ?
   ```
   
   I think this should be a old problem, and we have also performed null checks 
in the close method of RocksDB.
   
   
https://github.com/apache/spark/blob/7d7afb06f682c10f3900eb8adeab9fad6d49cb24/common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java#L341-L348
   
   
https://github.com/apache/spark/blob/7d7afb06f682c10f3900eb8adeab9fad6d49cb24/common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java#L307-L315
   
   If an iterator is no longer referenced and is not manually closed, then if 
GC occurs at this time, the iteratorTracker will still hold the Ref, but 
Ref#get is null.
   
   Should we change this condition to 
   
   ```
   return levelDBIterator == null || dbIterator.equals(levelDBIterator.it); 
   ```
   
   Can this also clean up the invalid weakref in passing?
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


mridulm commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1372946640


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +285,23 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  private record ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) 
implements Runnable {
+@Override
+public void run() {
+  levelDB.getIteratorTracker().removeIf(ref -> {
+LevelDBIterator levelDBIterator = ref.get();
+return levelDBIterator != null && 
dbIterator.equals(levelDBIterator.it);

Review Comment:
   When is `levelDBIterator` going to be `null` ?
   Also, instead of `equals` here shouldnt it not be instance check (`==`) ?
   
   +CC @LuciferYang, looks like there is a leak in the DB, iterator's weak ref 
is never cleaned up from the tracker ?
   



##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -182,23 +189,21 @@ public boolean skip(long n) {
 
   @Override
   public synchronized void close() throws IOException {
-db.notifyIteratorClosed(this);
 if (!closed) {
-  it.close();
-  closed = true;
-  next = null;
+  try {
+cleanable.clean();

Review Comment:
   This is changing the locking semantics. Is this intentional ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


LuciferYang commented on PR #43502:
URL: https://github.com/apache/spark/pull/43502#issuecomment-1780625609

   > > SKIP_PACKAGING
   > 
   > done
   
   Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


zhaomin1423 commented on PR #43502:
URL: https://github.com/apache/spark/pull/43502#issuecomment-1780616800

   > SKIP_PACKAGING
   
   done


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


LuciferYang commented on PR #43502:
URL: https://github.com/apache/spark/pull/43502#issuecomment-1780536278

   @zhaomin1423 The code changes are fine to me.
   
   
https://github.com/apache/spark/blob/7d7afb06f682c10f3900eb8adeab9fad6d49cb24/.github/workflows/build_and_test.yml#L202
   
https://github.com/apache/spark/blob/7d7afb06f682c10f3900eb8adeab9fad6d49cb24/.github/workflows/build_and_test.yml#L381
   
   Can you add an environment variable `LIVE_UI_LOCAL_STORE_DIR: 
"/tmp/kvStore"` at the above two places? This will additionally test using 
RocksDB as LiveUI Backend. I want to verify that this pr do not affect Rocksdb 
LiveUI feature. We can remove this change after the test is successful.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-26 Thread via GitHub


zhaomin1423 commented on code in PR #43502:
URL: https://github.com/apache/spark/pull/43502#discussion_r1372676965


##
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java:
##
@@ -280,4 +285,24 @@ static int compare(byte[] a, byte[] b) {
 return a.length - b.length;
   }
 
+  private record ResourceCleaner(DBIterator dbIterator, LevelDB levelDB) 
implements Runnable {
+@Override
+public void run() {
+  levelDB.getIteratorTracker().removeIf(ref -> {
+LevelDBIterator levelDBIterator = ref.get();
+return levelDBIterator != null && 
dbIterator.equals(levelDBIterator.it);
+  });
+  synchronized (levelDB.getLevelDB()) {
+DB _db = levelDB.getLevelDB().get();
+if (_db == null) {

Review Comment:
   thanks, done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



  1   2   >