Github user jiexiong commented on the issue: https://github.com/apache/spark/pull/15722 Here is the query: Here is the query: INSERT OVERWRITE TABLE lookalike_trainer_campaign_conv_users_with_country_shadow PARTITION(ds='2016-10-19') SELECT c.source_id, c.country, c.user_id, c.conversion_time FROM ( SELECT b.source_id, b.country, b.user_id, b.conversion_time, FB_NUMBER_ROWS(b.country, b.source_id) as rank FROM ( SELECT source_id, country, user_id, MAX(conversion_time) / 1000 AS conversion_time FROM ( SELECT v.campaigngroup_id, v.campaign_id, v.adgroup_id, v.user_id, Y.country, v.last_conversion_time AS conversion_time FROM dim_all_users_fast:bi Y JOIN lookalike_trainer_campaign_conv_raw v ON v.user_id = Y.userid WHERE v.ds='2016-10-19' AND Y.ds = '2016-10-19' AND Y.country IS NOT NULL ) a LATERAL VIEW EXPLODE(ARRAY(campaigngroup_id, campaign_id, adgroup_id)) s AS source_id GROUP BY country, source_id, user_id DISTRIBUTE by country, source_id SORT BY country, source_id, conversion_time DESC ) b ) c WHERE rank <= 60000 Before the fix, it would fail from OOM error. After the fix, the OOM error went away.
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org