[ https://issues.apache.org/jira/browse/SPARK-23192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Li updated SPARK-23192:
----------------------------
    Target Version/s: 2.3.0

> Hint is lost after using cached data
> ------------------------------------
>
>                 Key: SPARK-23192
>                 URL: https://issues.apache.org/jira/browse/SPARK-23192
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.1, 2.3.0
>            Reporter: Xiao Li
>            Assignee: Xiao Li
>            Priority: Critical
>
> The hint on a plan segment is lost if that plan segment is replaced by
> cached data.
> {noformat}
> // Needed for broadcast(); `spark` is the session provided by spark-shell.
> import org.apache.spark.sql.functions.broadcast
>
> val df1 = spark.createDataFrame(Seq((1, "4"), (2, "2"))).toDF("key", "value")
> val df2 = spark.createDataFrame(Seq((1, "1"), (2, "2"))).toDF("key", "value")
> df2.cache()
> val df3 = df1.join(broadcast(df2), Seq("key"), "inner")
> {noformat}
> The hint is lost in {{df3}}. Because of this loss, the physical join
> algorithm will not respect the hint.
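
One way to observe the loss is sketched below, under assumptions that are not part of the report. With toy data this small, Spark's size-based automatic broadcast would probably choose a broadcast join regardless of the hint, so the sketch first sets spark.sql.autoBroadcastJoinThreshold to -1. The local SparkSession, the object and app names, and the string check on the physical plan are illustrative choices only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object Spark23192Check {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("SPARK-23192-check")
      .getOrCreate()

    // Assumption: disable size-based automatic broadcast so that only the
    // explicit hint can lead to a broadcast join.
    spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")

    // Same data and join as in the report.
    val df1 = spark.createDataFrame(Seq((1, "4"), (2, "2"))).toDF("key", "value")
    val df2 = spark.createDataFrame(Seq((1, "1"), (2, "2"))).toDF("key", "value")
    df2.cache()
    val df3 = df1.join(broadcast(df2), Seq("key"), "inner")

    // Print the parsed, analyzed, optimized, and physical plans.
    df3.explain(true)

    // Heuristic check: does the physical plan mention a broadcast hash join?
    val usesBroadcast =
      df3.queryExecution.executedPlan.toString.contains("BroadcastHashJoin")
    println(s"broadcast hash join chosen: $usesBroadcast")

    spark.stop()
  }
}
```

If the hint survives the cache substitution, the physical plan should contain a broadcast hash join. If the hint is lost as described, a different join strategy would be expected with the threshold disabled.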