Github user jiexiong commented on the issue:

    https://github.com/apache/spark/pull/15722
  
    Here is the query:
    Here is the query:
    
    INSERT OVERWRITE TABLE 
lookalike_trainer_campaign_conv_users_with_country_shadow
    PARTITION(ds='2016-10-19')
    SELECT c.source_id,
    c.country,
    c.user_id,
    c.conversion_time
    FROM (
    SELECT b.source_id,
    b.country,
    b.user_id,
    b.conversion_time,
    FB_NUMBER_ROWS(b.country, b.source_id) as rank
    FROM (
    SELECT source_id,
    country,
    user_id,
    MAX(conversion_time) / 1000 AS conversion_time
    FROM (
    SELECT v.campaigngroup_id,
    v.campaign_id,
    v.adgroup_id,
    v.user_id,
    Y.country,
    v.last_conversion_time AS conversion_time
    FROM dim_all_users_fast:bi Y
    JOIN lookalike_trainer_campaign_conv_raw v
    ON  v.user_id = Y.userid
    WHERE v.ds='2016-10-19'
    AND Y.ds = '2016-10-19'
    AND Y.country IS NOT NULL
    ) a LATERAL VIEW
    EXPLODE(ARRAY(campaigngroup_id, campaign_id, adgroup_id)) s
    AS source_id
    GROUP BY country, source_id, user_id
    DISTRIBUTE by country, source_id
    SORT BY country, source_id, conversion_time DESC
    ) b
    ) c  WHERE rank <= 60000
    
    Before the fix, it would fail from OOM error. After the fix, the OOM error 
went away.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to