[GitHub] spark pull request #19556: [SPARK-22328][Core] ClosureCleaner should not mis...
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19556

---

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

---
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19556#discussion_r146896813

--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -91,6 +91,52 @@ private[spark] object ClosureCleaner extends Logging {
     (seen - obj.getClass).toList
   }
 
+  /** Initializes the accessed fields for outer classes and their super classes. */
+  private def initAccessedFields(
+      accessedFields: Map[Class[_], Set[String]],
+      outerClasses: Seq[Class[_]]): Unit = {
+    for (cls <- outerClasses) {
+      var currentClass = cls
+      assert(currentClass != null, "The outer class can't be null.")
+
+      while (currentClass != null) {
+        accessedFields(currentClass) = Set.empty[String]
+        currentClass = currentClass.getSuperclass()
+      }
+    }
+  }
+
+  /** Sets accessed fields for given class in clone object based on given object. */
+  private def setAccessedFields(
+      outerClass: Class[_],
+      clone: AnyRef,
+      obj: AnyRef,
+      accessedFields: Map[Class[_], Set[String]]): Unit = {
+    for (fieldName <- accessedFields(outerClass)) {
+      val field = outerClass.getDeclaredField(fieldName)
+      field.setAccessible(true)
+      val value = field.get(obj)
+      field.set(clone, value)
+    }
+  }
+
+  /** Clones a given object and sets accessed fields in cloned object. */
+  private def cloneAndSetFields(
+      parent: AnyRef,
+      obj: AnyRef,
+      outerClass: Class[_],
+      accessedFields: Map[Class[_], Set[String]]): AnyRef = {
+    val clone = instantiateClass(outerClass, parent)
+
+    var currentClass = outerClass
+    do {
--- End diff --

Ok. It's late, I will update this and the one below tomorrow. Thanks.

---
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19556#discussion_r146895110

--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -91,6 +91,52 @@ private[spark] object ClosureCleaner extends Logging {
     (seen - obj.getClass).toList
   }
 
+  /** Initializes the accessed fields for outer classes and their super classes. */
+  private def initAccessedFields(
+      accessedFields: Map[Class[_], Set[String]],
+      outerClasses: Seq[Class[_]]): Unit = {
+    for (cls <- outerClasses) {
+      var currentClass = cls
+      assert(currentClass != null, "The outer class can't be null.")
+
+      while (currentClass != null) {
+        accessedFields(currentClass) = Set.empty[String]
+        currentClass = currentClass.getSuperclass()
+      }
+    }
+  }
+
+  /** Sets accessed fields for given class in clone object based on given object. */
+  private def setAccessedFields(
+      outerClass: Class[_],
+      clone: AnyRef,
+      obj: AnyRef,
+      accessedFields: Map[Class[_], Set[String]]): Unit = {
+    for (fieldName <- accessedFields(outerClass)) {
+      val field = outerClass.getDeclaredField(fieldName)
+      field.setAccessible(true)
+      val value = field.get(obj)
+      field.set(clone, value)
+    }
+  }
+
+  /** Clones a given object and sets accessed fields in cloned object. */
+  private def cloneAndSetFields(
+      parent: AnyRef,
+      obj: AnyRef,
+      outerClass: Class[_],
+      accessedFields: Map[Class[_], Set[String]]): AnyRef = {
+    val clone = instantiateClass(outerClass, parent)
+
+    var currentClass = outerClass
+    do {
--- End diff --

please update this do-while loop too

---
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19556#discussion_r146895438

--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -395,8 +435,13 @@ private[util] class FieldAccessFinder(
           if (!visitedMethods.contains(m)) {
             // Keep track of visited methods to avoid potential infinite cycles
             visitedMethods += m
-            ClosureCleaner.getClassReader(cl).accept(
-              new FieldAccessFinder(fields, findTransitively, Some(m), visitedMethods), 0)
+
+            var currentClass = cl
+            do {
--- End diff --

here too

---
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19556#discussion_r146895161

--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -91,6 +91,50 @@ private[spark] object ClosureCleaner extends Logging {
     (seen - obj.getClass).toList
   }
 
+  /** Initializes the accessed fields for outer classes and their super classes. */
+  private def initAccessedFields(
+      accessedFields: Map[Class[_], Set[String]],
+      outerClasses: Seq[Class[_]]): Unit = {
+    for (cls <- outerClasses) {
+      var currentClass = cls
+      do {
+        accessedFields(currentClass) = Set.empty[String]
+        currentClass = currentClass.getSuperclass()
+      } while (currentClass != null)
+    }
+  }
+
+  /** Sets accessed fields for given class in clone object based on given object. */
+  private def setAccessedFields(
+      outerClass: Class[_],
+      clone: AnyRef,
+      obj: AnyRef,
+      accessedFields: Map[Class[_], Set[String]]): Unit = {
+    for (fieldName <- accessedFields(outerClass)) {
+      val field = outerClass.getDeclaredField(fieldName)
+      field.setAccessible(true)
+      val value = field.get(obj)
+      field.set(clone, value)
+    }
+  }
+
+  /** Clones a given object and sets accessed fields in cloned object. */
+  private def cloneAndSetFields(
+      parent: AnyRef,
+      obj: AnyRef,
+      outerClass: Class[_],
+      accessedFields: Map[Class[_], Set[String]]): AnyRef = {
+    val clone = instantiateClass(outerClass, parent)
+
+    var currentClass = outerClass
+    do {
+      setAccessedFields(currentClass, clone, obj, accessedFields)
--- End diff --

yea let's leave it

---
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19556#discussion_r146833044

--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -91,6 +91,50 @@ private[spark] object ClosureCleaner extends Logging {
     (seen - obj.getClass).toList
   }
 
+  /** Initializes the accessed fields for outer classes and their super classes. */
+  private def initAccessedFields(
+      accessedFields: Map[Class[_], Set[String]],
+      outerClasses: Seq[Class[_]]): Unit = {
+    for (cls <- outerClasses) {
+      var currentClass = cls
+      do {
+        accessedFields(currentClass) = Set.empty[String]
+        currentClass = currentClass.getSuperclass()
+      } while (currentClass != null)
+    }
+  }
+
+  /** Sets accessed fields for given class in clone object based on given object. */
+  private def setAccessedFields(
+      outerClass: Class[_],
+      clone: AnyRef,
+      obj: AnyRef,
+      accessedFields: Map[Class[_], Set[String]]): Unit = {
+    for (fieldName <- accessedFields(outerClass)) {
+      val field = outerClass.getDeclaredField(fieldName)
+      field.setAccessible(true)
+      val value = field.get(obj)
+      field.set(clone, value)
+    }
+  }
+
+  /** Clones a given object and sets accessed fields in cloned object. */
+  private def cloneAndSetFields(
+      parent: AnyRef,
+      obj: AnyRef,
+      outerClass: Class[_],
+      accessedFields: Map[Class[_], Set[String]]): AnyRef = {
+    val clone = instantiateClass(outerClass, parent)
+
+    var currentClass = outerClass
+    do {
+      setAccessedFields(currentClass, clone, obj, accessedFields)
--- End diff --

As this is not a regression, IIUC, will it block this change?

---
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19556#discussion_r146771631

--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -91,6 +91,50 @@ private[spark] object ClosureCleaner extends Logging {
     (seen - obj.getClass).toList
   }
 
+  /** Initializes the accessed fields for outer classes and their super classes. */
+  private def initAccessedFields(
+      accessedFields: Map[Class[_], Set[String]],
+      outerClasses: Seq[Class[_]]): Unit = {
+    for (cls <- outerClasses) {
+      var currentClass = cls
+      do {
--- End diff --

Ok. Updated.

---
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19556#discussion_r146769438

--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -91,6 +91,50 @@ private[spark] object ClosureCleaner extends Logging {
     (seen - obj.getClass).toList
   }
 
+  /** Initializes the accessed fields for outer classes and their super classes. */
+  private def initAccessedFields(
+      accessedFields: Map[Class[_], Set[String]],
+      outerClasses: Seq[Class[_]]): Unit = {
+    for (cls <- outerClasses) {
+      var currentClass = cls
+      do {
+        accessedFields(currentClass) = Set.empty[String]
+        currentClass = currentClass.getSuperclass()
+      } while (currentClass != null)
+    }
+  }
+
+  /** Sets accessed fields for given class in clone object based on given object. */
+  private def setAccessedFields(
+      outerClass: Class[_],
+      clone: AnyRef,
+      obj: AnyRef,
+      accessedFields: Map[Class[_], Set[String]]): Unit = {
+    for (fieldName <- accessedFields(outerClass)) {
+      val field = outerClass.getDeclaredField(fieldName)
+      field.setAccessible(true)
+      val value = field.get(obj)
+      field.set(clone, value)
+    }
+  }
+
+  /** Clones a given object and sets accessed fields in cloned object. */
+  private def cloneAndSetFields(
+      parent: AnyRef,
+      obj: AnyRef,
+      outerClass: Class[_],
+      accessedFields: Map[Class[_], Set[String]]): AnyRef = {
+    val clone = instantiateClass(outerClass, parent)
+
+    var currentClass = outerClass
+    do {
+      setAccessedFields(currentClass, clone, obj, accessedFields)
--- End diff --

Seems that is true. For a closure that only accesses `A.a`, we clone the whole `A` object, which contains both the `a` and `b` fields. That is already how the current `ClosureCleaner` behaves.

---
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19556#discussion_r146763677

--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -91,6 +91,50 @@ private[spark] object ClosureCleaner extends Logging {
     (seen - obj.getClass).toList
   }
 
+  /** Initializes the accessed fields for outer classes and their super classes. */
+  private def initAccessedFields(
+      accessedFields: Map[Class[_], Set[String]],
+      outerClasses: Seq[Class[_]]): Unit = {
+    for (cls <- outerClasses) {
+      var currentClass = cls
+      do {
+        accessedFields(currentClass) = Set.empty[String]
+        currentClass = currentClass.getSuperclass()
+      } while (currentClass != null)
+    }
+  }
+
+  /** Sets accessed fields for given class in clone object based on given object. */
+  private def setAccessedFields(
+      outerClass: Class[_],
+      clone: AnyRef,
+      obj: AnyRef,
+      accessedFields: Map[Class[_], Set[String]]): Unit = {
+    for (fieldName <- accessedFields(outerClass)) {
+      val field = outerClass.getDeclaredField(fieldName)
+      field.setAccessible(true)
+      val value = field.get(obj)
+      field.set(clone, value)
+    }
+  }
+
+  /** Clones a given object and sets accessed fields in cloned object. */
+  private def cloneAndSetFields(
+      parent: AnyRef,
+      obj: AnyRef,
+      outerClass: Class[_],
+      accessedFields: Map[Class[_], Set[String]]): AnyRef = {
+    val clone = instantiateClass(outerClass, parent)
+
+    var currentClass = outerClass
+    do {
+      setAccessedFields(currentClass, clone, obj, accessedFields)
--- End diff --

This also seems to be an issue for the `outerClasses`; maybe I missed something...

---
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19556#discussion_r146760101

--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -91,6 +91,50 @@ private[spark] object ClosureCleaner extends Logging {
     (seen - obj.getClass).toList
   }
 
+  /** Initializes the accessed fields for outer classes and their super classes. */
+  private def initAccessedFields(
+      accessedFields: Map[Class[_], Set[String]],
+      outerClasses: Seq[Class[_]]): Unit = {
+    for (cls <- outerClasses) {
+      var currentClass = cls
+      do {
+        accessedFields(currentClass) = Set.empty[String]
+        currentClass = currentClass.getSuperclass()
+      } while (currentClass != null)
+    }
+  }
+
+  /** Sets accessed fields for given class in clone object based on given object. */
+  private def setAccessedFields(
+      outerClass: Class[_],
+      clone: AnyRef,
+      obj: AnyRef,
+      accessedFields: Map[Class[_], Set[String]]): Unit = {
+    for (fieldName <- accessedFields(outerClass)) {
+      val field = outerClass.getDeclaredField(fieldName)
+      field.setAccessible(true)
+      val value = field.get(obj)
+      field.set(clone, value)
+    }
+  }
+
+  /** Clones a given object and sets accessed fields in cloned object. */
+  private def cloneAndSetFields(
+      parent: AnyRef,
+      obj: AnyRef,
+      outerClass: Class[_],
+      accessedFields: Map[Class[_], Set[String]]): AnyRef = {
+    val clone = instantiateClass(outerClass, parent)
+
+    var currentClass = outerClass
+    do {
+      setAccessedFields(currentClass, clone, obj, accessedFields)
--- End diff --

Assume we have classes `A` and `B` sharing the same parent class `P`, and `P` has two fields, `a` and `b`. The closure accesses `A.a` and `B.b`, so when we clone the `A` object we should only set field `a`, and when we clone the `B` object we should only set field `b`. However, here it seems we set both fields `a` and `b` on the `A` and `B` objects, which is sub-optimal.

---
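The scenario in the comment above can be made concrete with a small Java sketch (the names `P`, `A`, and `B` come from the comment; the `copyFields` helper and the accessed-field set are illustrative stand-ins for ClosureCleaner's `setAccessedFields` and `accessedFields`, not Spark code). Copying only the fields actually recorded for each concrete class leaves the other inherited fields at their defaults:

```java
import java.lang.reflect.Field;
import java.util.Set;

class P { int a; int b; }
class A extends P {}
class B extends P {}

public class SelectiveCopy {
    // Copy only the named fields from src to dst, resolving each name
    // against whichever superclass actually declares it.
    static void copyFields(Object src, Object dst, Set<String> accessed) throws Exception {
        for (String name : accessed) {
            Field f = null;
            for (Class<?> c = src.getClass(); c != null && f == null; c = c.getSuperclass()) {
                try {
                    f = c.getDeclaredField(name);
                } catch (NoSuchFieldException ignored) {
                    // not declared on this class; keep climbing
                }
            }
            if (f == null) {
                throw new NoSuchFieldException(name);
            }
            f.setAccessible(true);
            f.set(dst, f.get(src));
        }
    }

    public static void main(String[] args) throws Exception {
        A original = new A();
        original.a = 1;
        original.b = 2;
        A clone = new A();
        // The closure only accessed A.a, so only 'a' is copied;
        // clone.b stays at its default value 0.
        copyFields(original, clone, Set.of("a"));
        System.out.println(clone.a + " " + clone.b); // 1 0
    }
}
```

Under this per-class bookkeeping, cloning `B` with only `"b"` recorded would symmetrically leave `clone.a` unset, which is the optimization the reviewer is describing.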
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19556#discussion_r146759438

--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -91,6 +91,50 @@ private[spark] object ClosureCleaner extends Logging {
     (seen - obj.getClass).toList
   }
 
+  /** Initializes the accessed fields for outer classes and their super classes. */
+  private def initAccessedFields(
+      accessedFields: Map[Class[_], Set[String]],
+      outerClasses: Seq[Class[_]]): Unit = {
+    for (cls <- outerClasses) {
+      var currentClass = cls
+      do {
--- End diff --

I feel it's better to use a `while` loop here. Programmatically the loop requires `currentClass != null`, even on the first iteration. To completely keep the previous behavior, we can add an `assert(cls != null)` before the loop.

---
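The while-plus-assert shape suggested here separates the precondition (a non-null starting class) from the loop condition, whereas a do-while would dereference a null `cls` on the first iteration before ever checking it. A small Java sketch of the traversal (the `superChain` helper is hypothetical, written only to show the loop shape):

```java
import java.util.ArrayList;
import java.util.List;

public class WalkChain {
    // Collect cls and all of its superclasses up to Object.
    // The assert documents the precondition; the while-loop guard means the
    // code is still null-safe even when assertions are disabled at runtime.
    static List<Class<?>> superChain(Class<?> cls) {
        assert cls != null : "The outer class can't be null.";
        List<Class<?>> chain = new ArrayList<>();
        Class<?> current = cls;
        while (current != null) {
            chain.add(current);
            current = current.getSuperclass();
        }
        return chain;
    }

    public static void main(String[] args) {
        // Integer extends Number extends Object, so the chain has 3 entries.
        System.out.println(superChain(Integer.class));
    }
}
```

With a do-while, a null `cls` would throw a `NullPointerException` inside the first iteration; with this shape it trips the assert (or, with assertions off, simply returns an empty list), which is the behavioral distinction the reviewer is drawing.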
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19556#discussion_r146581086

--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -91,6 +91,50 @@ private[spark] object ClosureCleaner extends Logging {
     (seen - obj.getClass).toList
   }
 
+  /** Initializes the accessed fields for outer classes and their super classes. */
+  private def initAccessedFields(
+      accessedFields: Map[Class[_], Set[String]],
+      outerClasses: Seq[Class[_]]): Unit = {
--- End diff --

I added a related test. Please see if it addresses your concern.

---
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19556#discussion_r146577760

--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -91,6 +91,50 @@ private[spark] object ClosureCleaner extends Logging {
     (seen - obj.getClass).toList
   }
 
+  /** Initializes the accessed fields for outer classes and their super classes. */
+  private def initAccessedFields(
+      accessedFields: Map[Class[_], Set[String]],
+      outerClasses: Seq[Class[_]]): Unit = {
--- End diff --

I think from the closure's point of view, even if multiple outer classes share the same parent class, the accesses to the fields in the parent class shouldn't conflict.

---
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19556#discussion_r146534666

--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -91,6 +91,50 @@ private[spark] object ClosureCleaner extends Logging {
     (seen - obj.getClass).toList
   }
 
+  /** Initializes the accessed fields for outer classes and their super classes. */
+  private def initAccessedFields(
+      accessedFields: Map[Class[_], Set[String]],
+      outerClasses: Seq[Class[_]]): Unit = {
--- End diff --

what if multiple outer classes have the same parent class?

---
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19556#discussion_r146426012

--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -91,6 +91,52 @@ private[spark] object ClosureCleaner extends Logging {
     (seen - obj.getClass).toList
   }
 
+  /** Initializes the accessed fields for outer classes and their super classes. */
+  private def initAccessedFields(
+      accessedFields: Map[Class[_], Set[String]],
+      outerClasses: Seq[Class[_]]): Unit = {
+    for (cls <- outerClasses) {
+      accessedFields(cls) = Set.empty[String]
+
+      var superClass = cls.getSuperclass()
+      while (superClass != null) {
+        accessedFields(superClass) = Set.empty[String]
+        superClass = superClass.getSuperclass()
+      }
+    }
+  }
+
+  /** Sets accessed fields for given class in clone object based on given object. */
+  private def setAccessedFields(
+      outerClass: Class[_],
+      clone: AnyRef,
+      obj: AnyRef,
+      accessedFields: Map[Class[_], Set[String]]): Unit = {
+    for (fieldName <- accessedFields(outerClass)) {
+      val field = outerClass.getDeclaredField(fieldName)
+      field.setAccessible(true)
+      val value = field.get(obj)
+      field.set(clone, value)
+    }
+  }
+
+  /** Clones a given object and sets accessed fields in cloned object. */
+  private def cloneAndSetFields(
+      parent: AnyRef,
+      obj: AnyRef,
+      outerClass: Class[_],
+      accessedFields: Map[Class[_], Set[String]]): AnyRef = {
+    val clone = instantiateClass(outerClass, parent)
+    setAccessedFields(outerClass, clone, obj, accessedFields)
+
+    var superClass = outerClass.getSuperclass()
+    while (superClass != null) {
--- End diff --

Thanks. Looks good.

---
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19556#discussion_r146346452

--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -91,6 +91,52 @@ private[spark] object ClosureCleaner extends Logging {
     (seen - obj.getClass).toList
   }
 
+  /** Initializes the accessed fields for outer classes and their super classes. */
+  private def initAccessedFields(
+      accessedFields: Map[Class[_], Set[String]],
+      outerClasses: Seq[Class[_]]): Unit = {
+    for (cls <- outerClasses) {
+      accessedFields(cls) = Set.empty[String]
+
+      var superClass = cls.getSuperclass()
+      while (superClass != null) {
+        accessedFields(superClass) = Set.empty[String]
+        superClass = superClass.getSuperclass()
+      }
+    }
+  }
+
+  /** Sets accessed fields for given class in clone object based on given object. */
+  private def setAccessedFields(
+      outerClass: Class[_],
+      clone: AnyRef,
+      obj: AnyRef,
+      accessedFields: Map[Class[_], Set[String]]): Unit = {
+    for (fieldName <- accessedFields(outerClass)) {
+      val field = outerClass.getDeclaredField(fieldName)
+      field.setAccessible(true)
+      val value = field.get(obj)
+      field.set(clone, value)
+    }
+  }
+
+  /** Clones a given object and sets accessed fields in cloned object. */
+  private def cloneAndSetFields(
+      parent: AnyRef,
+      obj: AnyRef,
+      outerClass: Class[_],
+      accessedFields: Map[Class[_], Set[String]]): AnyRef = {
+    val clone = instantiateClass(outerClass, parent)
+    setAccessedFields(outerClass, clone, obj, accessedFields)
+
+    var superClass = outerClass.getSuperclass()
+    while (superClass != null) {
--- End diff --

Can this be something more like ...

```
var currentClass = outerClass
do {
  setAccessedFields(currentClass, clone, obj, accessedFields)
  currentClass = currentClass.getSuperclass()
} while (currentClass != null)
```

Just avoids repeating the key method call here. Same above and below.

---
GitHub user viirya opened a pull request:

    https://github.com/apache/spark/pull/19556

[SPARK-22328][Core] ClosureCleaner should not miss referenced superclass fields

## What changes were proposed in this pull request?

When a closure uses fields defined in a super class, `ClosureCleaner` can't find them and doesn't set them properly, so those fields end up null.

## How was this patch tested?

Added test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 SPARK-22328

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19556.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #19556

commit 29c5d7301fc3271d723ddd4447c13b61705dcd44
Author: Liang-Chi Hsieh
Date: 2017-10-23T07:03:32Z

    ClosureCleaner should fill referenced superclass fields.

---
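The failure mode described in the PR comes from how Java reflection scopes field lookups: `Class#getDeclaredField` only sees fields declared directly on the class it is called on, never inherited ones, so a cleaner that doesn't walk `getSuperclass()` silently skips superclass fields. A minimal Java demonstration (the `Parent`/`Child` classes and `findField` helper are made up for illustration; the chain walk mirrors the fix's approach):

```java
import java.lang.reflect.Field;

class Parent { int inherited = 42; }
class Child extends Parent { int own = 7; }

public class SuperclassFields {
    // Look up a field on cls itself, then climb the superclass chain --
    // the traversal the fix adds so superclass fields are not missed.
    static Field findField(Class<?> cls, String name) throws NoSuchFieldException {
        for (Class<?> c = cls; c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredField(name);
            } catch (NoSuchFieldException e) {
                // not declared on this class; keep climbing
            }
        }
        throw new NoSuchFieldException(name);
    }

    public static void main(String[] args) throws Exception {
        // getDeclaredField on Child cannot see the inherited field at all:
        boolean missed = false;
        try {
            Child.class.getDeclaredField("inherited");
        } catch (NoSuchFieldException e) {
            missed = true;
        }
        System.out.println(missed); // true -- the root cause of SPARK-22328

        // Walking the superclass chain finds and reads it:
        Field f = findField(Child.class, "inherited");
        f.setAccessible(true);
        System.out.println(f.getInt(new Child())); // 42
    }
}
```

A cleaner that only ever calls `getDeclaredField` on the outer class hits the first branch, never copies the value, and the cloned object's superclass field stays at its default (null for references), which is exactly the symptom the PR fixes.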