Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/3640#issuecomment-66251380
  
    Appreciate a lot for fixing this case! The serialization wrapper class 
makes sense. However, would like to make some refactoring. A summary of my 
comments above:
    
    1. Renaming `HiveFunctionCache` to `HiveFunctionWrapper`
    2. Moving `HiveFunctionWrapper` to the shim layer, and keep Hive 0.12.0 
code path exactly the same as before.
    
       Considering Hive 0.12.0 is not affected by this issue, and 1.2.0 release 
is really close, I'd like to lower the possibility of breaking 0.12.0 code 
paths as much as possible. 
    
    3. Add a `HiveSimpleUdfWrapper`, which inherits from `HiveFunctionWrapper`.
    
       As you've mentioned in the code, `HiveSimpleUdf` is a special case, it 
shouldn't be serialized, and should always create a fresh object on executor 
side. Currently this special case complicates `HiveFunctionWrapper` 
implementation and makes it somewhat confusing. Defining a special subclass for 
`HvieSimpleUdf` helps making `HiveFunctionWrapper` simpler (e.g. no need to 
serialize the boolean flag any more).
    
    4. Avoid using `Utilities.de/serializePlan` by mimicking 
`Utilities.de/serializeObjectByKryo` in the `read/writeExternal` methods of the 
wrapper class, so that we can remove the expensive `HiveConf` instantiation.
    
    In a word, we can add two classes, `HiveFunctionWrapper` and 
`HiveSImpleUdfWrapper` into the shim layer, and make sure that the 0.12.0 
version behaves exactly the same as before.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to