cloud-fan commented on code in PR #52513:
URL: https://github.com/apache/spark/pull/52513#discussion_r3240780339
##########
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowDeserializer.scala:
##########
@@ -416,6 +414,16 @@ object ArrowDeserializers {
}
}
+ private def findSetter(refc: Class[_], name: String, ftype: Class[_]):
MethodHandle = {
+ try {
+ methodLookup.findVirtual(refc, name,
MethodType.methodType(classOf[Unit], ftype))
+ } catch {
+ case e: NoSuchMethodException =>
+ val superClass: Class[_] = ftype.getSuperclass
+ if (superClass != null) findSetter(refc, name, superClass) else throw e
+ }
+ }
Review Comment:
The rest of Spark looks up methods on generic beans through Apache Commons
Lang's matching helpers — `ScalaReflection.findConstructor` uses
`ConstructorUtils.getMatchingAccessibleConstructor`, and `Invoke.findMethod`
(used by the non-connect `JavaBeanEncoder` serializer in
`SerializerBuildHelper`) uses `MethodUtils.getMatchingAccessibleMethod`. Both
do full assignment-compatibility matching, including interfaces, in one call.
Could the connect setter (and the new serializer-side
`unreflect(getMethod(...))`) use the same?
```scala
val setter = methodLookup.unreflect(
MethodUtils.getMatchingAccessibleMethod(refc, name, ftype))
```
Beyond removing the `findVirtual` + iteration logic, this also picks up the
interface-erasure case that the current `getSuperclass()` walk misses: `class
Foo[T <: Comparable[T]]` erases to `setValue(Comparable)`, and
`Class.getSuperclass` never returns interfaces — so for `Foo[Integer]` we'd
walk `Integer → Number → Object → null` and never find the setter. (The
non-connect `InitializeJavaBean` has the same gap today, only falling back to
`Object`; switching connect over would also document a cleaner pattern for a
future fix there.) A short Scaladoc summarizing the workaround would also help
future readers either way.
##########
sql/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/arrow/ArrowEncoderSuite.scala:
##########
@@ -940,6 +940,20 @@ class ArrowEncoderSuite extends ConnectFunSuite with
BeforeAndAfterAll {
}
}
+ test("SPARK-53790: bean encoders with specific generics") {
+ val encoder =
JavaTypeInference.encoderFor(classOf[JavaBeanWithGenericsWrapper])
+ roundTripAndCheckIdentical(encoder) { () =>
+ val maybeNull = MaybeNull(3)
+ Iterator.tabulate(10)(i => {
+ val bean = new JavaBeanWithGenericsWrapper()
+ val inner = new JavaBeanWithGenerics[String]()
+ inner.setValue(maybeNull(i.toString))
+ bean.setValue(maybeNull(inner))
+ bean
+ })
+ }
+ }
Review Comment:
The test only exercises an unbounded `T` (erases to `Object`), so the walk
trivially succeeds at `String → Object`. Worth adding (a) a bounded case like
`class Foo[T <: Number]` used as `Foo[Integer]` to actually exercise the
iteration, and (b) an interface-bounded case like `class Foo[T <:
Comparable[T]]` used as `Foo[Integer]` — either to document that it's
intentionally supported, or to expose the gap noted on `findSetter`.
##########
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowDeserializer.scala:
##########
@@ -416,6 +414,16 @@ object ArrowDeserializers {
}
}
+ private def findSetter(refc: Class[_], name: String, ftype: Class[_]):
MethodHandle = {
+ try {
+ methodLookup.findVirtual(refc, name,
MethodType.methodType(classOf[Unit], ftype))
+ } catch {
+ case e: NoSuchMethodException =>
+ val superClass: Class[_] = ftype.getSuperclass
+ if (superClass != null) findSetter(refc, name, superClass) else throw e
Review Comment:
When the walk exhausts the chain, the `e` rethrown here is the
`NoSuchMethodException` for the last (parent) type tried, not the
originally-requested one. If a user genuinely misnames a setter, the error will
say `setValue(java.lang.Object)` rather than the parameter type they actually
used. Consider capturing the first exception and rethrowing that (or chaining
via `initCause`). Becomes moot if you switch to
`MethodUtils.getMatchingAccessibleMethod` as suggested above.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]