Re: Tune hive query launched thru spark-yarn job.

2019-09-05 Thread Himali Patel
From: Sathi Chowdhury Date: Thursday, 5 September 2019 at 8:10 PM To: Himali Patel, "user@spark.apache.org" Subject: Re: Tune hive query launched thru spark-yarn job. What I can immediately think of is, as you are doing IN in the where clause for a series of timestamps, if you can

Tune hive query launched thru spark-yarn job.

2019-09-05 Thread Himali Patel
Hello all, We have one use-case where we are aggregating billions of rows. It does a huge shuffle. Example: as per the ‘Job’ tab on the YARN UI, when the input size is around 350 GB, the shuffle size is more than 3 TB. This increases Non-DFS usage beyond the warning limit and thus affects the entire cluster. It seems we need

Test mail

2019-09-05 Thread Himali Patel