Github user bdrillard commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22878#discussion_r229332500
  
    --- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
 ---
    @@ -1617,6 +1617,58 @@ case class InitializeJavaBean(beanInstance: 
Expression, setters: Map[String, Exp
       }
     }
     
    +/**
    + * Initializes an Avro Record instance (that implements the IndexedRecord 
interface) by calling
    + * the `put` method on a the Record instance with the provided position 
and value arguments
    + *
    + * @param objectInstance an expression that will evaluate to the Record 
instance
    + * @param args a sequence of expression pairs that will respectively 
evaluate to the index of
    + *             the record in which to insert, and the argument value to 
insert
    + */
    +case class InitializeAvroObject(
    --- End diff --
    
    It's possible to refactor the `NewInstance` expression, also in this 
`objects.scala` file, to support construction of Avro classes, which would 
eliminate the need for a separate `InitializeAvroObject`. Interestingly, the 
same refactor would generalize far enough to let us remove the separate 
`InitializeJavaBean` expression as well.
    
    To summarize the change: `NewInstance` would accept a `Seq` of `Expression` 
for the arguments to the instance's constructor, but _also_ a `Seq` of 
`(String, Seq[Expression])` tuples, being an ordered list of setter methods and 
the methods' respective arguments to call _after_ the object has been 
constructed.
    
    This covers both the creation of Java beans and the creation and 
initialization of `SpecificRecord` instances.
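    
    As a rough illustration of the shape described above, here is a minimal 
sketch using plain Java reflection. It is not the Catalyst implementation: the 
`NewInstanceSketch` object, its `newInstance` helper, and the parameter names 
are all made up for this example, and it works on already-evaluated values 
rather than on `Expression`s. `java.util.ArrayList.add(int, Object)` stands in 
for `IndexedRecord.put`, since both take an index and a value.

```scala
// Illustrative only: a reflection-based stand-in for the proposed NewInstance shape.
// All names here are hypothetical and none of this is Spark's Catalyst API.
object NewInstanceSketch {

  // Builds an instance of `cls` from `ctorArgs`, then calls each
  // (methodName, args) pair on it, in the order given.
  // Overloads are resolved by argument count only, which is enough for a sketch.
  def newInstance[T <: AnyRef](
      cls: Class[T],
      ctorArgs: Seq[AnyRef],
      initializations: Seq[(String, Seq[AnyRef])]): T = {
    val ctor = cls.getConstructors
      .find(_.getParameterCount == ctorArgs.length)
      .getOrElse(throw new NoSuchMethodException(
        s"no ${ctorArgs.length}-argument constructor on ${cls.getName}"))
    val instance = ctor.newInstance(ctorArgs: _*).asInstanceOf[T]

    initializations.foreach { case (methodName, args) =>
      val method = cls.getMethods
        .find(m => m.getName == methodName && m.getParameterCount == args.length)
        .getOrElse(throw new NoSuchMethodException(methodName))
      method.invoke(instance, args: _*)
    }
    instance
  }

  // Example: construct an empty list, then insert by (index, value),
  // much like calling put(index, value) on an Avro record.
  def demo(): java.util.ArrayList[String] =
    newInstance(
      classOf[java.util.ArrayList[String]],
      Seq.empty,
      Seq(
        ("add", Seq(Integer.valueOf(0), "b")),
        ("add", Seq(Integer.valueOf(0), "a"))))
}
```

    A Java bean is the special case where every entry is a one-argument 
setter, e.g. `("setName", Seq(value))`, while an Avro record would use 
two-argument entries such as `("put", Seq(index, value))`.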
    
    See the necessary changes to `NewInstance` 
[here](https://github.com/apache/spark/pull/21348/files#diff-e436c96ea839dfe446837ab2a3531f93R447).
    
    There is also an additional clause in `TreeNode` 
[here](https://github.com/apache/spark/pull/21348/files#diff-eac5b02bb450a235fef5e902a2671254R361).
    
    And then there are the changes to `JavaTypeInference` 
[here](https://github.com/apache/spark/pull/21348/files#diff-031a812c8799b92eeecab0cbc9ac8f25).
    
    If this refactor is considered too complicated for this PR, we can start 
with an `InitializeAvroObject` and do the cleanup in a follow-up. As 
background, this refactor was initially suggested by @cloud-fan; see this 
[comment](https://github.com/apache/spark/pull/20085#issuecomment-364043282).


