Osama Suleiman created IMPALA-6500:
--------------------------------------

             Summary: Impala crashes randomly on different queries with GROUP BY
                 Key: IMPALA-6500
                 URL: https://issues.apache.org/jira/browse/IMPALA-6500
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 2.10.0
         Environment: RHEL 6.5 (Santiago), Kernel version: 2.6.32-431.el6.x86_64, CDH 5.13.1 single node, Impala 2.10
            Reporter: Osama Suleiman
         Attachments: hs_err_pid9910.log

I have a Parquet table created by Hive, and I am running several different queries against it, such as:

SELECT product_category,
       SUM(CAST(profit AS DECIMAL(15,2))) AS total_profit,
       SUM(CAST(sales AS DECIMAL(15,2))) AS total_sales
FROM copy_orders
GROUP BY product_category;

and:

SELECT customer_name,
       SUM(CAST(profit AS DECIMAL(15,2))) AS total_profit,
       SUM(CAST(sales AS DECIMAL(15,2))) AS total_sales
FROM copy_orders
GROUP BY customer_name
ORDER BY total_profit DESC
LIMIT 10;

These two queries succeed only on rare occasions. Most of the time, running them in HUE's Impala query editor returns:

Could not connect to hostname:21050 (code THRIFTTRANSPORT): TTransportException('Could not connect to hostname:21050',)

At the same time, Cloudera Manager reports that the Impala daemon has crashed. The daemon is working again approximately one minute later. Meanwhile, other simple queries run successfully.

I have attached a log file from a sample run of one of the queries, since they all generate relevant logs.

I have tried running SET disable_codegen=1 before the queries, but the problem persisted.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)