Xin Hao created HIVE-13634:
------------------------------

    Summary: Hive-on-Spark performed worse than Hive-on-MR, for queries with external scripts
        Key: HIVE-13634
        URL: https://issues.apache.org/jira/browse/HIVE-13634
    Project: Hive
 Issue Type: Bug
   Reporter: Xin Hao

Hive-on-Spark performs worse than Hive-on-MR for queries that use external scripts.

TPCx-BB Q2, Q3, and Q4 are Python streaming cases: they call external scripts to handle the reduce tasks. For these three queries, we found that Hive-on-Spark is slower than Hive-on-MR when processing reduce tasks with external (Python) scripts.

"Improve HoS performance for queries with external scripts" therefore looks like a performance optimization opportunity.
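
For context, external scripts of this kind are plain stdin/stdout filters: Hive's TRANSFORM / REDUCE ... USING clause streams rows to the script as tab-separated lines and reads tab-separated rows back. The following is only a minimal sketch of such a reduce-side script, not one of the actual TPCx-BB scripts. The function names, the two-column input layout, and the per-key count are assumptions made for illustration, and it assumes the query clusters rows by key so that equal keys arrive consecutively.

```python
import sys
from itertools import groupby


def parse(line):
    """Split one tab-separated input row into (key, value). Hypothetical layout."""
    key, value = line.rstrip("\n").split("\t", 1)
    return key, value


def reduce_stream(lines):
    """Yield one 'key<TAB>count' row per run of consecutive equal keys."""
    rows = (parse(line) for line in lines if line.strip())
    for key, group in groupby(rows, key=lambda row: row[0]):
        # Count the rows in this run; a real script would do its own work here.
        yield f"{key}\t{sum(1 for _ in group)}"


def main():
    # Rows arrive on stdin; result rows are written to stdout.
    for out_row in reduce_stream(sys.stdin):
        print(out_row)


if __name__ == "__main__":
    main()
```

A script like this can be exercised outside Hive by piping sorted, tab-separated rows into it, which may help separate the cost of the script itself from the cost of the engine feeding it.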