[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a memory lea...
Github user jiexiong commented on the issue: https://github.com/apache/spark/pull/15722 This is a production query, so I am sorry, but I cannot share it. It performs a join between two big tables. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a memory lea...
Github user jiexiong commented on the issue: https://github.com/apache/spark/pull/15722

Here is the query:

    INSERT OVERWRITE TABLE lookalike_trainer_campaign_conv_users_with_country_shadow
    PARTITION(ds='2016-10-19')
    SELECT c.source_id, c.country, c.user_id, c.conversion_time
    FROM (
      SELECT b.source_id, b.country, b.user_id, b.conversion_time,
             FB_NUMBER_ROWS(b.country, b.source_id) as rank
      FROM (
        SELECT source_id, country, user_id,
               MAX(conversion_time) / 1000 AS conversion_time
        FROM (
          SELECT v.campaigngroup_id, v.campaign_id, v.adgroup_id, v.user_id,
                 Y.country, v.last_conversion_time AS conversion_time
          FROM dim_all_users_fast:bi Y
          JOIN lookalike_trainer_campaign_conv_raw v
            ON v.user_id = Y.userid
          WHERE v.ds='2016-10-19'
            AND Y.ds = '2016-10-19'
            AND Y.country IS NOT NULL
        ) a
        LATERAL VIEW EXPLODE(ARRAY(campaigngroup_id, campaign_id, adgroup_id)) s AS source_id
        GROUP BY country, source_id, user_id
        DISTRIBUTE by country, source_id
        SORT BY country, source_id, conversion_time DESC
      ) b
    ) c
    WHERE rank <= 6

Before the fix, the query failed with an OOM error. After the fix, the OOM error went away.
[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a memory lea...
Github user davies commented on the issue: https://github.com/apache/spark/pull/15722 OK to test.
[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a memory lea...
Github user davies commented on the issue: https://github.com/apache/spark/pull/15722 @jiexiong I don't think this is a memory leak. BytesToBytesMap does not release all of its memory on each spill, on the assumption that the memory will be acquired back soon. What's the query that makes you think this is a leak?
[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a memory lea...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15722 Can one of the admins verify this patch?