[ https://issues.apache.org/jira/browse/HIVE-21096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adam Szita reassigned HIVE-21096:
---------------------------------

> Remove unnecessary Spark dependency from HS2 process
> ----------------------------------------------------
>
> Key: HIVE-21096
> URL: https://issues.apache.org/jira/browse/HIVE-21096
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2, Spark
> Reporter: Adam Szita
> Assignee: Adam Szita
> Priority: Major
>
> When a HiveOnSpark job is kicked off, most of the work is done by the
> RemoteDriver, which is a separate process. There are a couple of smaller
> parts of the code where the HS2 process depends on Spark jars. These
> include, for example, receiving stats from the driver and putting together
> a Spark conf object, both used mostly during communication with the
> RemoteDriver.
> We can limit the data types used for such communication so that we don't use
> (and serialize) types that are in the Spark codebase, and hence we can
> refactor our code to use Spark jars only in the RemoteDriver process.
> I think this would be cleaner from a dependency point of view, and also
> less error-prone when users have to assemble the classpath for their HS2
> processes.
> (E.g. due to a change between Spark 2.2 and 2.4 we also had to include
> spark-unsafe*.jar, even though that was an internal change to Spark.)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
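
To illustrate the proposal in the description, here is a minimal sketch of a stats payload that uses only JDK types, so the receiving side (HS2 in the issue's terms) could deserialize it without any Spark jar on its classpath. The class name `PlainJobMetrics`, its fields, and the `recordsRead` counter are hypothetical and not taken from the Hive codebase. Java serialization is used only as a stand-in for whatever wire format the actual RPC layer uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical DTO (not an actual Hive class): carries job stats using
// only JDK types, so no Spark class is needed to read it back.
final class PlainJobMetrics implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int jobId;
    private final HashMap<String, Long> counters;

    PlainJobMetrics(int jobId, Map<String, Long> counters) {
        this.jobId = jobId;
        this.counters = new HashMap<>(counters); // defensive copy
    }

    int getJobId() {
        return jobId;
    }

    Map<String, Long> getCounters() {
        return Collections.unmodifiableMap(counters);
    }

    // Driver side (assumed): flatten the metrics into bytes for transport.
    static byte[] toBytes(PlainJobMetrics metrics) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(metrics);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Receiving side (assumed): rebuild the metrics from the bytes.
    static PlainJobMetrics fromBytes(byte[] bytes) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (PlainJobMetrics) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Long> counters = new HashMap<>();
        counters.put("recordsRead", 42L);
        PlainJobMetrics received =
            fromBytes(toBytes(new PlainJobMetrics(7, counters)));
        System.out.println(received.getJobId() + " " + received.getCounters());
    }
}
```

The design point is that any conversion from Spark's own metric or conf types into such a plain object would happen inside the RemoteDriver process, which already has the Spark jars available.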